Viewing GPU Information and Usage on Linux

Date: 2023-03-09 07:29:52

To list the graphics cards on a Linux system:

lspci | grep -i vga

On a machine with NVIDIA GPUs, filter on the vendor name instead:

lspci | grep -i nvidia

[root@gpu-server-002 ~]# lspci | grep -i nvidia
02:00.0 VGA compatible controller: NVIDIA Corporation Device 1b06 (rev a1)
02:00.1 Audio device: NVIDIA Corporation Device 10ef (rev a1)
03:00.0 VGA compatible controller: NVIDIA Corporation Device 1b06 (rev a1)
03:00.1 Audio device: NVIDIA Corporation Device 10ef (rev a1)
82:00.0 VGA compatible controller: NVIDIA Corporation Device 1b06 (rev a1)
82:00.1 Audio device: NVIDIA Corporation Device 10ef (rev a1)
83:00.0 VGA compatible controller: NVIDIA Corporation Device 1b06 (rev a1)
83:00.1 Audio device: NVIDIA Corporation Device 10ef (rev a1)

  

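The field at the start of each line (for example "02:00.0") is the device's PCI address in bus:device.function form. On a virtual machine it may look like "00:0f.0" instead.

To see detailed information for one card, pass its address to lspci with -s:

lspci -v -s 02:00.0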
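
The two steps above (find the NVIDIA devices, then inspect one by address) can be chained. The following is a minimal sketch, not part of the original article: the function name nvidia_gpu_addrs is hypothetical, and it assumes the GPUs appear in lspci as either "VGA compatible controller" or "3D controller" (the class some compute-only cards report, which grep -i vga alone would miss).

```shell
# Read `lspci` output on stdin and print the PCI address of every NVIDIA
# display device, skipping the HDMI audio functions (xx:xx.1).
# nvidia_gpu_addrs is a hypothetical helper name.
nvidia_gpu_addrs() {
    grep -i 'nvidia' \
        | grep -Ei 'vga compatible controller|3d controller' \
        | awk '{print $1}'
}

# Usage on a real machine:
#   lspci | nvidia_gpu_addrs | while read -r addr; do
#       lspci -v -s "$addr"
#   done
```

With the lspci output shown earlier as input, this prints the four addresses 02:00.0, 03:00.0, 82:00.0 and 83:00.0.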

Viewing NVIDIA GPU information and usage on Linux

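The NVIDIA driver ships with a command-line tool, nvidia-smi, that reports GPU memory usage along with other status information: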

nvidia-smi

[root@gpu-server-002 ~]# nvidia-smi
Tue Nov 27 00:20:51 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.98                 Driver Version: 384.98                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  On   | 00000000:02:00.0 Off |                  N/A |
| 66%   85C    P2   175W / 250W |  10795MiB / 11172MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 108...  On   | 00000000:03:00.0 Off |                  N/A |
| 56%   83C    P2   162W / 250W |  10795MiB / 11172MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 108...  On   | 00000000:82:00.0 Off |                  N/A |
| 52%   82C    P2   250W / 250W |  10795MiB / 11172MiB |     90%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 108...  On   | 00000000:83:00.0 Off |                  N/A |
| 54%   83C    P2   126W / 250W |  10795MiB / 11172MiB |     82%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     11161      C   python                                     10785MiB |
|    1     11161      C   python                                     10785MiB |
|    2     12049      C   python                                     10785MiB |
|    3     12049      C   python                                     10785MiB |
+-----------------------------------------------------------------------------+
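
In the process table above, each PID appears on two GPUs. To total the GPU memory per process without reading the table by eye, nvidia-smi can emit the same data as CSV through --query-compute-apps. The sketch below is an illustration, not part of the original article: the function name gpu_mem_by_pid is hypothetical, and it assumes input lines of the form "pid, process_name, used_memory" with the unit suffix removed.

```shell
# Sum GPU memory per PID from CSV lines of "pid, process_name, used_memory"
# (values in MiB, no unit suffix). Reads stdin.
# gpu_mem_by_pid is a hypothetical helper name.
gpu_mem_by_pid() {
    awk -F', *' 'NF >= 3 { mem[$1] += $3; name[$1] = $2 }
        END { for (p in mem) printf "%s %s %d MiB\n", p, name[p], mem[p] }' \
        | sort -n
}

# Usage on a machine with the NVIDIA driver installed:
#   nvidia-smi --query-compute-apps=pid,process_name,used_memory \
#              --format=csv,noheader,nounits | gpu_mem_by_pid
```

For the four processes shown above this reports 21570 MiB for each of PID 11161 and PID 12049.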

  

Column meanings:

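Fan: fan speed as a percentage (0 to 100%) of the maximum. This is the speed the driver is requesting rather than a measured value, so it can differ from the real speed if the fan is blocked or faulty. N/A means the card has no fan to report, for example a passively cooled card;
Temp: GPU core temperature in degrees Celsius;
Perf: performance state, from P0 (maximum performance) to P12 (minimum performance);
Pwr:Usage/Cap: current power draw and the power limit, in watts;
Bus-Id: the GPU's PCI address in domain:bus:device.function form, the same address lspci shows with the PCI domain prepended;
Disp.A: Display Active, whether a display has been initialized on this GPU;
Memory-Usage: GPU memory in use and total GPU memory, in MiB;
GPU-Util: GPU utilization, the percentage of the recent sample period during which a kernel was running on the GPU. The word "Volatile" on the header line above it belongs to the "Volatile Uncorr. ECC" column (uncorrected ECC errors counted since the driver was last loaded), not to GPU-Util;
Compute M.: compute mode, such as Default;
The Processes table at the bottom lists the GPU memory used by each process on each GPU.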
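
The Memory-Usage column is easier to act on as a percentage. As a sketch (not from the original article), nvidia-smi can print selected fields as CSV with --query-gpu, and a small filter can turn them into a per-GPU summary. The function name gpu_mem_percent is hypothetical; the input is assumed to be lines of "index, memory.used, memory.total" without unit suffixes.

```shell
# Convert CSV lines of "index, memory.used, memory.total" (MiB, no units)
# into a one-line summary per GPU. Reads stdin.
# gpu_mem_percent is a hypothetical helper name.
gpu_mem_percent() {
    awk -F', *' 'NF >= 3 && $3 > 0 {
        printf "GPU %s: %d/%d MiB (%.1f%%)\n", $1, $2, $3, 100 * $2 / $3
    }'
}

# Usage on a machine with the NVIDIA driver installed:
#   nvidia-smi --query-gpu=index,memory.used,memory.total \
#              --format=csv,noheader,nounits | gpu_mem_percent
```

For GPU 0 in the sample output (10795 MiB of 11172 MiB) this prints "GPU 0: 10795/11172 MiB (96.6%)".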

To print GPU usage at regular intervals, wrap the command in watch:

watch -n 10 nvidia-smi
The -n option sets the interval between runs, in seconds.
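
watch redraws the screen and keeps no history. When a record over time is wanted, one option is a small loop that timestamps each sample so the output can be appended to a file. The sketch below is an illustration under that assumption; sample_every is a hypothetical helper, called as sample_every INTERVAL COUNT COMMAND... (nvidia-smi also has its own -l SEC loop option, which repeats the report without timestamps on each line.)

```shell
# Run COMMAND a fixed number of times, INTERVAL seconds apart, and prefix
# every output line with the time the sample was taken.
# sample_every is a hypothetical helper: sample_every INTERVAL COUNT CMD...
sample_every() {
    interval=$1
    count=$2
    shift 2
    i=0
    while [ "$i" -lt "$count" ]; do
        ts=$(date '+%Y-%m-%d %H:%M:%S')
        "$@" | while IFS= read -r line; do
            printf '%s %s\n' "$ts" "$line"
        done
        i=$((i + 1))
        # Sleep between samples, but not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
    return 0
}

# Usage: six samples, ten seconds apart, appended to a log file:
#   sample_every 10 6 nvidia-smi \
#       --query-gpu=index,utilization.gpu,memory.used \
#       --format=csv,noheader >> gpu.log
```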

---------------------
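Author: -牧野-
Source: CSDN
Original article: https://blog.csdn.net/dcrmg/article/details/78146797
Copyright notice: This is an original article by the blogger. Please include a link to the original post when reposting.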