When considering using performance counters on my company's .NET-based site, I was wondering how big the overhead of using them is.
Do I want my site to update its counters continuously, or am I better off updating them only when I measure?
6 solutions
#1
20
The performance impact of updating is negligible. Microsoft's intent is that you always write to the performance counters. It's the monitoring (or capturing) of those performance counters that degrades performance, so you pay that cost only when you use something like perfmon to capture the data.
In effect, the performance counter objects already behave as if you were only "doing it when you measure."
#2
28
The overhead of setting up the performance counters is generally not high enough to worry about (setting up a shared memory region and some .NET objects, along with CLR overhead because the CLR actually does the management for you). Here I'm referring to classes like PerformanceCounter.
The overhead of registering the performance counters can be fairly high, but that is generally not a concern: registration changes machine-wide state, so it is intended to happen once, at setup time. It will be dwarfed by any copying that you do during setup. It's not generally something you want to do at runtime. Here I'm referring to PerformanceCounterInstaller.
The overhead of updating a performance counter generally comes down to the cost of performing an Interlocked operation on the shared memory. This is slower than a normal memory access but is a processor primitive (that's how it gets atomic operations across the entire memory subsystem, including caches). Generally this cost is not high enough to worry about. It could be 10 times the cost of a normal memory operation, potentially worse depending on the update and on how much contention there is across threads and CPUs. But consider this: it's literally impossible to do any better than interlocked operations for cross-process communication with atomic updates, and no locks are held. Here I refer to PerformanceCounter.Increment and similar methods.
The overhead of reading a performance counter is generally a read from shared memory. As others have said, you want to sample at a reasonable interval (just like any other sampling). Think of PerfMon and keep the sampling on a human scale (seconds instead of milliseconds), and you probably won't have any problems.
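To make the sampling point concrete: a reader of a rate-style counter typically takes two raw readings a few seconds apart and divides the difference by the elapsed time, so the writer never does this work. Below is a minimal sketch of that arithmetic, written in Python as a language-neutral illustration rather than .NET code. `Sample` and `rate_per_second` are hypothetical names and are not part of the PerformanceCounter API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One reading of a monotonically increasing raw counter (hypothetical type)."""
    raw_value: int      # total events counted so far
    timestamp_s: float  # when the reading was taken, in seconds


def rate_per_second(older: Sample, newer: Sample) -> float:
    """Events per second between two samples: delta count / delta time."""
    elapsed = newer.timestamp_s - older.timestamp_s
    if elapsed <= 0:
        raise ValueError("samples must be taken at increasing times")
    return (newer.raw_value - older.raw_value) / elapsed


# Two readings taken 5 seconds apart on a "requests served" style counter.
first = Sample(raw_value=12_000, timestamp_s=100.0)
second = Sample(raw_value=12_750, timestamp_s=105.0)
print(rate_per_second(first, second))  # 750 events over 5 s -> 150.0
```

With a 5-second interval the reader touches the shared value once every 5 seconds, no matter how often the application increments it.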
Finally, an appeal to experience: Performance counters are so lightweight that they are used everywhere in Windows, from the kernel to drivers to user applications. Microsoft relies on them internally.
Advice: The real challenge with performance counters is the learning curve in understanding them (which is moderate) and in measuring the right things (which seems easy, but you often get it wrong).
#3
8
I've tested these a LOT.
On an old Compaq 1 GHz single-processor machine, I was able to create about 10,000 counters and monitor them remotely for about 20% CPU usage. These weren't custom counters, just built-in ones checking CPU or whatever.
Basically, you can monitor all the counters on any decent newer machine with very little impact.
Instantiating the object can take a long time, though: a few seconds to a few minutes. I suggest you multithread this for all the counters you collect; otherwise your app will sit there forever creating these objects. I'm not sure what MS does during creation that takes so long, but you can do it for 1000 counters with 1000 threads in the same time you can do it for 1 counter and 1 thread.
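The multithreading advice can be sketched as follows. This is a Python illustration, not .NET code: `open_counter` is a hypothetical stand-in that only sleeps to simulate a slow, blocking constructor. It also uses a bounded thread pool instead of the one-thread-per-counter approach described above, which is a substitution made for the sketch.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def open_counter(name):
    """Hypothetical stand-in for a slow, blocking counter constructor."""
    time.sleep(0.05)  # simulate the setup delay
    return {"name": name, "ready": True}


def open_all(names, max_workers=16):
    # The threads overlap their waiting; map() yields results in input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(open_counter, names))


names = [f"counter-{i}" for i in range(16)]
start = time.perf_counter()
handles = open_all(names)
elapsed = time.perf_counter() - start
print(len(handles), f"opened in {elapsed:.2f}s")
```

Because the delay is spent waiting rather than computing, opening the counters concurrently should take roughly as long as the slowest single one, not the sum of all of them.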
#4
7
A performance counter is just a pointer to 4/8 bytes in shared memory (aka a memory-mapped file), so its cost is very similar to that of accessing an int/long variable.
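As a rough picture of "8 bytes in a memory-mapped region", here is a minimal Python sketch. It assumes an anonymous mapping as a stand-in for the named shared-memory region, and `read_counter` and `increment_counter` are hypothetical helpers. Unlike the interlocked operations discussed in another answer, the increment here is a plain read-modify-write and is not atomic.

```python
import mmap
import struct

SLOT = struct.Struct("<q")  # one signed 64-bit little-endian integer (8 bytes)

# Anonymous mapping standing in for the shared-memory region (zero-filled).
region = mmap.mmap(-1, SLOT.size)


def read_counter(buf, offset=0):
    """Read the 8-byte counter value stored at the given offset."""
    return SLOT.unpack_from(buf, offset)[0]


def increment_counter(buf, offset=0, by=1):
    """Add to the counter in place. NOT atomic: plain read-modify-write."""
    value = read_counter(buf, offset) + by
    SLOT.pack_into(buf, offset, value)
    return value


for _ in range(3):
    increment_counter(region)
increment_counter(region, by=10)
print(read_counter(region))  # 3 single increments plus 10 -> 13
```

Reading and writing the counter is just a fixed-offset access into the mapping, which is why the cost resembles that of an ordinary integer variable.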
#5
2
I agree with famoushamsandwich, but would add that as long as your sampling interval is reasonable (5 seconds or more) and you monitor a reasonable set of counters, the impact of measuring is negligible as well (in most cases).
#6
1
What I have found is that it is not that slow for the majority of applications. I wouldn't put one in a tight loop, though, or in something that is called thousands of times a second.
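One possible way to follow this advice, not something the answer itself prescribes, is to count locally inside the hot loop and publish to the shared counter in batches. The sketch below is in Python, not .NET. `BatchedCounter` is a hypothetical helper, and `publish` stands in for whatever call adds an amount to the real counter. The class is not thread-safe as written.

```python
class BatchedCounter:
    """Accumulates increments locally and publishes them in batches."""

    def __init__(self, publish, batch_size=1000):
        self._publish = publish        # callable taking the amount to add
        self._batch_size = batch_size
        self._pending = 0

    def increment(self):
        self._pending += 1
        if self._pending >= self._batch_size:
            self.flush()

    def flush(self):
        """Publish whatever has accumulated, if anything."""
        if self._pending:
            self._publish(self._pending)
            self._pending = 0


published = []  # records each call made to the (simulated) shared counter
counter = BatchedCounter(published.append, batch_size=1000)
for _ in range(2500):  # the tight loop
    counter.increment()
counter.flush()        # publish the remainder
print(published)       # [1000, 1000, 500]
```

Here 2,500 loop iterations turn into three updates of the shared counter, at the price of the published value lagging by up to one batch.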
Secondly, I found that creating the performance counters programmatically is very slow, so make sure that you create them beforehand rather than in your application code.