Making the CPU Usage Curve Obey My Commands

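Recently I was put in charge of introducing AWS's Auto Scaling feature to a study group at my company. AWS can scale based on the CPU utilization of its instances (virtual machines).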

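For the demo I needed the CPU on an instance to do what I tell it. The simple way, of course, is to write an infinite loop and pin the CPU at 100%. But to make things more interesting: what if I want CPU usage to vary within a chosen range?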

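I had previously read 《编程之美》 (The Beauty of Programming) by the great Zou Xin, whose very first problem is "make the CPU usage curve obey your commands". The book presents several solutions, and the most impressive one even makes the curve follow a sine wave. I brought this up over lunch with my colleague Jason san, and to my surprise he had implemented a dynamically adapting solution in both C++ and Python by that afternoon. Let's go through the solutions to this fun problem.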

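First, the book describes a simple way to hold the CPU at 50%:

Within each sampling interval (determined by Task Manager's refresh rate), let the CPU alternate between a busy loop and an idle period. The ratio of busy time to idle time sets the CPU usage.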

Now, for an empty loop

for(i = 0; i < n; i++);

how do we estimate the most suitable value of n? First, let's write the empty loop as (pseudo) assembly:

next:
    mov eax, dword ptr [i]    ; load i into a register
    add eax, 1                ; increment the register
    mov dword ptr [i], eax    ; store the register back to i
    cmp eax, dword ptr [n]    ; compare i with n
    jl next                   ; loop again while i < n

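Suppose this code runs on a P4 CPU clocked at 2.4 GHz (2.4 × 10^9 clock cycles per second). A modern CPU can execute two or more instructions per clock cycle. Taking an average of two, we get

(2 400 000 000 * 2) / 5 = 960 000 000 (loops per second). (The division by 5 is there because the assembly above has five instructions per iteration. The book doesn't spell this out, which left a newbie like me puzzling over it for quite a while...) In other words, the CPU can run the empty loop 960 000 000 times per second. Still, we can't simply set n = 960 000 000 and call Sleep(1000). If the CPU works for one second and then rests for one second, the curve will most likely be a sawtooth: it climbs to a peak (>50%) and then drops to a very low usage.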
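The estimate is easy to do by hand, but writing it as a small function makes the assumptions explicit. This is only an illustrative sketch: the names `loops_per_second` and `loops_for_interval` are made up for this post, and "two instructions per cycle" is the same rough guess used above, not a measured value.

```python
def loops_per_second(clock_hz, instructions_per_cycle, instructions_per_loop):
    """Rough count of empty-loop iterations one core finishes in a second."""
    return clock_hz * instructions_per_cycle // instructions_per_loop


def loops_for_interval(per_second, busy_ms):
    """Iterations needed to keep the core busy for roughly busy_ms milliseconds."""
    return per_second * busy_ms // 1000


# Assumed figures from the text: 2.4 GHz, 2 instructions per cycle, 5 per loop.
per_second = loops_per_second(2_400_000_000, 2, 5)
print(per_second)                           # 960000000
print(loops_for_interval(per_second, 10))   # 9600000
```

Changing any of the three inputs (a different clock speed, say) changes n, which is exactly why a hard-coded n does not carry over to another machine.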

So let's go down two orders of magnitude: set n = 9 600 000 and shorten the sleep to 10 milliseconds accordingly. The code looks like this:

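int main()
{
    for(;;)
    {
        // busy phase: about 10 ms of empty looping on the machine assumed above
        for(int i = 0; i < 9600000; i++)
            ;
        // idle phase: sleep for 10 ms
        Sleep(10);
    }
    return 0;
}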

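After some more parameter tuning, you get a roughly flat CPU usage line.

The biggest problem with this approach is that the parameters are derived for one specific machine and have to be retuned on any other, so a smarter method is needed.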

The book then gives two more solutions:

Solution 2: use the system APIs GetTickCount() and Sleep()

Solution 3: use the tool Perfmon.exe

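I won't list the details here. Read the book if you're interested. (Now that I've advertised the book this much, surely the author won't sue me for infringement... ^_^)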
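To give a flavor of the clock-based idea without copying the book's code, here is my own minimal sketch in Python. It times the busy phase with a clock instead of counting loop iterations, so there is no machine-specific n. The function names (`burn`, `sine_ratio`, `run`), the 0.1 s slice and the 60 s period are all assumptions of mine, and it uses Python's `time` module rather than GetTickCount() and Sleep().

```python
import math
import time


def burn(ratio, slice_s=0.1):
    """Spend `ratio` of one time slice spinning and sleep for the rest."""
    busy_until = time.perf_counter() + ratio * slice_s
    while time.perf_counter() < busy_until:
        pass  # busy phase: spin until the clock says stop
    time.sleep(max(0.0, (1.0 - ratio) * slice_s))  # idle phase


def sine_ratio(elapsed_s, period_s=60.0):
    """Target load in [0, 1] that follows a sine wave centred on 50%."""
    return 0.5 + 0.5 * math.sin(2.0 * math.pi * elapsed_s / period_s)


def run(ratio_fn, duration_s, slice_s=0.1):
    """Drive one core with the load returned by ratio_fn(elapsed seconds)."""
    start = time.perf_counter()
    slices = 0
    while True:
        elapsed = time.perf_counter() - start
        if elapsed >= duration_s:
            return slices
        burn(ratio_fn(elapsed), slice_s)
        slices += 1


if __name__ == "__main__":
    run(lambda _elapsed: 0.5, duration_s=1.0)  # flat line near 50% on one core
    run(sine_ratio, duration_s=1.0)            # the start of a sine-shaped curve
```

Like the fixed-n version, this loads only a single core, so on a multi-core machine the overall usage shown by Task Manager will be correspondingly lower.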

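However, none of these methods takes multiple cores or multiple CPUs into account. For the multi-CPU case, the book says you first need the system's CPU information: GetProcessorInfo() can be used to obtain information about the processors, and SetThreadAffinityMask() is then used to choose which processor the work runs on.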
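As a rough illustration of the affinity part, the sketch below builds the bit mask for one core and, on Windows only, passes it to SetThreadAffinityMask() through ctypes. The helper names `affinity_mask` and `pin_current_thread` are hypothetical, and I have not taken this from the book. It is just one way the call could be wired up from Python.

```python
import sys


def affinity_mask(core_index):
    """Bit mask with only the bit for core_index set (core 0 -> 0b1)."""
    if core_index < 0:
        raise ValueError("core_index must be non-negative")
    return 1 << core_index


def pin_current_thread(core_index):
    """Pin the calling thread to one core; returns False when not on Windows."""
    if sys.platform != "win32":
        return False  # this sketch only covers the Windows API named above
    import ctypes
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetCurrentThread.restype = wintypes.HANDLE
    # DWORD_PTR is pointer-sized, so c_size_t is used for the mask.
    kernel32.SetThreadAffinityMask.argtypes = [wintypes.HANDLE, ctypes.c_size_t]
    kernel32.SetThreadAffinityMask.restype = ctypes.c_size_t
    previous = kernel32.SetThreadAffinityMask(
        kernel32.GetCurrentThread(), affinity_mask(core_index)
    )
    return previous != 0  # the API returns 0 on failure


print(bin(affinity_mask(2)))
```

With a busy/idle loop pinned to each core in turn, every core's curve can be controlled separately instead of letting the scheduler spread one loop across all of them.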

In addition, the RDTSC instruction can be used to read the number of cycles the current CPU core has run.

On x86, define the function:

inline unsigned __int64 GetCPUTickCount()
{
    __asm
    {
        rdtsc;    // the 64-bit counter is returned in EDX:EAX
    }
}

On x64, define:

#define GetCPUTickCount() __rdtsc()

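Use the CallNtPowerInformation API to obtain the CPU frequency, which lets you convert the cycle count into milliseconds. (I admit this part gets a bit hard to follow...)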
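The conversion itself is plain arithmetic, which the following sketch spells out. The function names, the 2.4 GHz frequency and the two counter readings are all made-up values for illustration, and the sketch assumes the counter ticks at a constant rate, which is not guaranteed on every CPU.

```python
def cycles_to_ms(cycles, cpu_hz):
    """Convert a cycle count to milliseconds at a fixed clock frequency."""
    return cycles * 1000.0 / cpu_hz


def ms_to_cycles(ms, cpu_hz):
    """Cycles to wait for when a busy phase should last `ms` milliseconds."""
    return int(ms * cpu_hz / 1000)


ASSUMED_HZ = 2_400_000_000  # hypothetical 2.4 GHz clock

start, end = 1_000_000, 25_000_000  # two made-up counter readings
print(cycles_to_ms(end - start, ASSUMED_HZ))  # 10.0
print(ms_to_cycles(10, ASSUMED_HZ))           # 24000000
```

So a busy phase of 10 ms means spinning until the counter has advanced by about 24 million cycles on such a machine, independent of how many instructions the loop body contains.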

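Here are two programs, one in C++ and one in Python. Each starts one worker per CPU, reads the system's CPU state dynamically, and adjusts how long the workers spin between sleeps so that overall CPU usage holds at a chosen value. Multiple CPUs are taken into account.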

#define _WIN32_WINNT 0x0502
#include <cstdio>
#include <cstdlib>
#include <ctime>
#include <Windows.h>
#include <process.h>

// Target overall CPU usage, in percent.
#define TARGET_CPU_RATE (80.0)

extern "C" {
    typedef struct _CONTROL_PARAM
    {
        volatile LONG m_exit;
        // Busy-loop iterations each worker runs between sleeps.
        volatile LONGLONG m_rest;
    } CONTROL_PARAM;
};

static CONTROL_PARAM g_param;

// Worker thread: spin for m_rest iterations, sleep briefly, repeat.
unsigned __stdcall task_thread(void *pparam)
{
    if (!pparam)
        return 0;
    CONTROL_PARAM *pctrl = (CONTROL_PARAM *)pparam;
    LONGLONG rest64 = 0;
    while (true)
    {
        if (rest64 > pctrl->m_rest)
        {
            Sleep(1);       // same 1 ms pause as the Python version
            rest64 = 0;
        }
        else
            rest64++;
    }
    return 0;
}

// Combine the two 32-bit halves of a FILETIME into one 64-bit value.
inline unsigned __int64 u64_time(FILETIME &_ft)
{
    unsigned __int64 u64;
    u64 = _ft.dwHighDateTime;
    u64 = u64 << 32;
    u64 |= _ft.dwLowDateTime;
    return u64;
}

int main()
{
    SYSTEM_INFO sys_info;
    ZeroMemory(&sys_info, sizeof(SYSTEM_INFO));
    GetSystemInfo(&sys_info);
    int cpu_cnt = (int)sys_info.dwNumberOfProcessors;
    if (0 == cpu_cnt)
        cpu_cnt = 1;
    printf("cpu count: %d\n", cpu_cnt);

    // Start from a very large busy count and let the loop below shrink it.
    g_param.m_rest = (DWORD)-1;

    // One worker thread per processor.
    for (int i = 0; i < cpu_cnt; ++i)
    {
        _beginthreadex(NULL, 0, task_thread, &g_param, 0, NULL);
    }

    FILETIME idleTime;
    FILETIME kernelTime;
    FILETIME userTime;
    FILETIME last_idleTime;
    FILETIME last_kernelTime;
    FILETIME last_userTime;
    bool initialized = false;

    while (true)
    {
        if (GetSystemTimes(&idleTime, &kernelTime, &userTime))
        {
            if (initialized)
            {
                unsigned __int64 usr = u64_time(userTime) - u64_time(last_userTime);
                unsigned __int64 ker = u64_time(kernelTime) - u64_time(last_kernelTime);
                unsigned __int64 idl = u64_time(idleTime) - u64_time(last_idleTime);
                // Kernel time reported by GetSystemTimes includes idle time.
                double sys = (double)(ker + usr);
                if (sys > 0.0)
                {
                    double cpu = (sys - (double)idl) / sys * 100.0;
                    double dif = TARGET_CPU_RATE - cpu;
                    // Scale the busy count in proportion to the error.
                    LONGLONG rest = (LONGLONG)((double)g_param.m_rest * (1.0 + dif / 100.0));
                    // Keep it at 1 or more so the scaling can recover from a low value.
                    g_param.m_rest = rest < 1 ? 1 : rest;
                    printf("rest = %I64d, cpu = %d\n", g_param.m_rest, (int)cpu);
                }
            }
            else
                initialized = true;
            last_idleTime = idleTime;
            last_kernelTime = kernelTime;
            last_userTime = userTime;
        }
        Sleep(300);         // sampling interval, same as the Python version
    }
    return getchar();
}

The Python program:

from ctypes import windll, Structure, byref, c_longlong
from ctypes.wintypes import DWORD
import win32api
import multiprocessing

WinGetSystemTimes = windll.kernel32.GetSystemTimes

class FILETIME(Structure):
    _fields_ = [("dwLowDateTime", DWORD), ("dwHighDateTime", DWORD)]

def pyGetSystemTimes():
    """Return the system-wide (idle, kernel, user) times as FILETIMEs."""
    idleTime, kernelTime, userTime = FILETIME(), FILETIME(), FILETIME()
    WinGetSystemTimes(byref(idleTime), byref(kernelTime), byref(userTime))
    return (idleTime, kernelTime, userTime)

def longTime(ft):
    """Combine the two 32-bit halves of a FILETIME into one integer."""
    tm = ft.dwHighDateTime
    tm = tm << 32
    tm = tm | ft.dwLowDateTime
    return tm

# Target overall CPU usage, in percent.
TARGET_CPU_RATE = 70.0

def worker(val):
    """Spin for val.value iterations, sleep 1 ms, repeat."""
    rest = 0
    while True:
        if rest > val.value:
            rest = 0
            win32api.Sleep(1)
        else:
            rest += 1

if __name__ == '__main__':
    sys_info = win32api.GetSystemInfo()
    cpu_cnt = sys_info[5]   # dwNumberOfProcessors
    # Shared busy count, read by every worker and written by this loop.
    val = multiprocessing.Value(c_longlong, 100, lock=False)
    print(type(val.value))
    threads = []
    # One worker process per processor.
    for i in range(cpu_cnt):
        p = multiprocessing.Process(target=worker, args=(val,))
        p.start()
        threads.append(p)
    initialized = False
    last_times = (FILETIME(), FILETIME(), FILETIME())
    while True:
        cur_times = pyGetSystemTimes()
        if initialized:
            usr = longTime(cur_times[2]) - longTime(last_times[2])
            ker = longTime(cur_times[1]) - longTime(last_times[1])
            idl = longTime(cur_times[0]) - longTime(last_times[0])
            # Kernel time reported by GetSystemTimes includes idle time.
            total = float(ker + usr)
            if total > 0:
                cpu = (total - float(idl)) / total * 100.0
                dif = TARGET_CPU_RATE - cpu
                # Scale the busy count in proportion to the error; keep it >= 1.
                val.value = max(1, int(float(val.value) * (1.0 + dif / 100.0)))
                print("rest=", val.value, ", cpu=", int(cpu))
        else:
            initialized = True
        last_times = cur_times
        win32api.Sleep(300)
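Both programs share the same update rule: multiply the busy count by (1 + error/100). To see why this settles near the target, the rule can be tried against a toy model instead of a real machine. The sketch below is my own simplification. It assumes a worker spins for `rest` iterations and then sleeps about 1 ms, and `iters_per_ms` (how many iterations fit into 1 ms) is an invented number. The clamp to 1 stops the busy count from getting stuck at zero.

```python
def simulated_cpu(rest, iters_per_ms):
    """Toy model: a worker spins `rest` iterations, then sleeps about 1 ms."""
    busy_ms = rest / iters_per_ms
    return 100.0 * busy_ms / (busy_ms + 1.0)


def next_rest(rest, cpu, target):
    """Proportional update of the busy count, clamped so it cannot reach 0."""
    return max(1, int(rest * (1.0 + (target - cpu) / 100.0)))


def simulate(target=70.0, rest=100, iters_per_ms=50_000, steps=80):
    """Return the simulated CPU usage seen at each sampling step."""
    history = []
    for _ in range(steps):
        cpu = simulated_cpu(rest, iters_per_ms)
        history.append(cpu)
        rest = next_rest(rest, cpu, target)
    return history


history = simulate()
print([round(cpu) for cpu in history[::10]])
```

In this model the usage starts near zero, climbs quickly while the busy count grows by a large factor each step, and then approaches the target in ever smaller steps because the correction shrinks along with the error. A real machine is noisier, since other processes and timer granularity also affect the measured usage.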
