[Pinned] A Detailed Look at the clock() Function in CUDA

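In CUDA development we often need to know how long a piece of code takes to run on the GPU. On the CPU this is easy: call clock() before the function, call it again afterwards, and subtract. The difference is in clock ticks, so divide it by CLOCKS_PER_SEC to get seconds:

    clock_t start = clock();
    fun();
    clock_t end = clock();
    /* (double)(end - start) / CLOCKS_PER_SEC is the time fun() took, in seconds */

On the GPU, clock() behaves differently. In device code it returns the value of a per-multiprocessor counter that is incremented every clock cycle, so end - start is the number of clock cycles the calling thread spent between the two reads, not a time. To get a time, divide that count by the GPU clock frequency, which can be queried like this:

    int get_GPU_Rate()
    {
        cudaDeviceProp deviceProp;               // CUDA struct that holds the GPU's properties
        cudaGetDeviceProperties(&deviceProp, 0); // CUDA runtime call; 0 is the device index
        return deviceProp.clockRate;             // clock frequency in kHz
    }

The kernel records its own cycle count:

    __global__ void fun(int *num, int *result, clock_t *time)
    {
        // A function marked __global__ is a kernel: it runs on the GPU, not the CPU.
        // Every launched thread executes the kernel body, so it may run many times.
        int i, temp = 0;
        clock_t start = clock();

        for (i = 0; i < DATA_SIZE; i++) {
            temp += num[i];
        }
        *result = temp;
        *time = clock() - start;
        // clock() here counts the GPU clock cycles this thread spent; to get a
        // time, the value still has to be divided by the GPU clock frequency.
    }

On the host, launch the kernel, copy the cycle count back, and divide:

    fun<<<1, 1>>>(num, result, time);   // num, result and time must point to device memory
    clock_t cycles;                      // host copy of *time
    cudaMemcpy(&cycles, time, sizeof(clock_t), cudaMemcpyDeviceToHost);
    int GPU_Rate = get_GPU_Rate();
    // clockRate is in kHz, so cycles / GPU_Rate is the time fun took, in milliseconds
    double elapsed_ms = (double)cycles / GPU_Rate;

Because clockRate is the nominal frequency and the GPU may actually run at a different clock, the result is an estimate.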
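To tie the pieces together, here is a minimal sketch of the whole flow: time a kernel in cycles on the device, copy the count back, and convert it to milliseconds. The names cycles_to_ms, timed_sum and measure_sum_ms are made up for this illustration and are not part of CUDA. Two things differ from the snippets in this article and are my own choices: clock64() replaces clock() as a precaution against counter wrap-around on longer runs, and the clock rate is read with cudaDeviceGetAttribute(cudaDevAttrClockRate), which reports the rate in kHz. Only a single thread is launched, so there is no race on the output pointers.

```cuda
#include <cassert>
#include <cuda_runtime.h>

// Host helper (hypothetical name): a rate in kHz is "cycles per millisecond",
// so cycles / kHz is a duration in milliseconds.
double cycles_to_ms(long long cycles, int clock_rate_khz) {
    assert(clock_rate_khz > 0);
    return static_cast<double>(cycles) / clock_rate_khz;
}

// Kernel (hypothetical name): one thread sums n ints and records how many
// clock cycles elapsed between the two counter reads.
__global__ void timed_sum(const int *num, int n, int *result, long long *cycles) {
    long long start = clock64();
    int temp = 0;
    for (int i = 0; i < n; i++) {
        temp += num[i];
    }
    *result = temp;
    *cycles = clock64() - start;
}

// Host wrapper (hypothetical name): returns the estimated kernel time in
// milliseconds, or -1.0 if any CUDA call fails.
double measure_sum_ms(const int *host_num, int n, int *host_result) {
    int *d_num = nullptr, *d_result = nullptr;
    long long *d_cycles = nullptr;
    long long cycles = 0;
    int khz = 0;

    bool ok = cudaMalloc(&d_num, n * sizeof(int)) == cudaSuccess
           && cudaMalloc(&d_result, sizeof(int)) == cudaSuccess
           && cudaMalloc(&d_cycles, sizeof(long long)) == cudaSuccess
           && cudaMemcpy(d_num, host_num, n * sizeof(int),
                         cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        timed_sum<<<1, 1>>>(d_num, n, d_result, d_cycles);
        // The blocking cudaMemcpy calls below wait for the kernel to finish.
        ok = cudaGetLastError() == cudaSuccess
          && cudaMemcpy(host_result, d_result, sizeof(int),
                        cudaMemcpyDeviceToHost) == cudaSuccess
          && cudaMemcpy(&cycles, d_cycles, sizeof(long long),
                        cudaMemcpyDeviceToHost) == cudaSuccess
          && cudaDeviceGetAttribute(&khz, cudaDevAttrClockRate, 0) == cudaSuccess;
    }

    // cudaFree on a null pointer does nothing, so this is safe on early failure.
    cudaFree(d_num);
    cudaFree(d_result);
    cudaFree(d_cycles);
    return ok ? cycles_to_ms(cycles, khz) : -1.0;
}
```

Calling measure_sum_ms(data, n, &sum) from main fills sum and returns the estimated time in milliseconds. As a sanity check on the units, 1,500,000 cycles at 1,500,000 kHz (1.5 GHz) comes out to exactly 1 ms.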
