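The installed cuDNN version is 7.1.0.3, but the required cuDNN version is 7.3.0.0.
Switching TensorFlow from 1.5 to 1.8 let the program run (that is, upgrading TensorFlow solved the problem).
Use ll to inspect the symlinks under /usr/local/cuda/lib64. Then, from inside that directory, link the real library file (for example libcudnn.so.7.4.2) to libcudnn.so.7, and link that in turn to libcudnn.so. For cuDNN 5.1.10 the commands are: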
sudo ln -sf libcudnn.so.5.1.10 libcudnn.so.5
sudo ln -sf libcudnn.so.5 libcudnn.so
sudo ldconfig
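The symlink chain above can be wrapped in a small function so that the same steps work for any cuDNN version. This is only a sketch: relink_cudnn is a hypothetical helper name, not something shipped with CUDA or cuDNN, and it assumes the real library file is already in the given directory.

```shell
# Hypothetical helper (not part of CUDA/cuDNN): rebuild the chain
#   libcudnn.so -> libcudnn.so.MAJOR -> libcudnn.so.MAJOR.MINOR.PATCH
relink_cudnn() {
    local lib_dir="$1"    # e.g. /usr/local/cuda/lib64
    local version="$2"    # e.g. 5.1.10
    local major="${version%%.*}"

    # Refuse to create dangling links if the real library is missing.
    if [ ! -f "${lib_dir}/libcudnn.so.${version}" ]; then
        echo "libcudnn.so.${version} not found in ${lib_dir}" >&2
        return 1
    fi

    # Use relative targets, exactly like the manual ln -sf commands.
    ( cd "${lib_dir}" &&
      ln -sf "libcudnn.so.${version}" "libcudnn.so.${major}" &&
      ln -sf "libcudnn.so.${major}" "libcudnn.so" )
}
```

For a system directory such as /usr/local/cuda/lib64 the function has to run in a root shell, and sudo ldconfig is still needed afterwards.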
To check GPU usage, use the nvidia-smi command:
$ nvidia-smi
This prints a single snapshot. For a live view, combine it with watch so that it refreshes once per second:
$ watch -n 1 nvidia-smi
Installing multiple versions of CUDA and cuDNN on Ubuntu 16.04
2018-06-28 16:33:58 tiankong_hut
i7-7700K + TITAN X + 16 GB DDR4-2400 + 256 GB SSD
Ubuntu 16.04 + Anaconda3 + tensorflow-gpu (0.12.1)
The machine already has CUDA 9 + cuDNN 7.1 installed; this post adds CUDA 8.0.44 + cuDNN 5.1.
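Useful commands:
Check the CUDA version : nvcc -V
Locate nvcc : which nvcc
Watch NVIDIA GPU usage live : watch -n 1 nvidia-smi
CUDA version : cat /usr/local/cuda/version.txt
cuDNN version : cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
NVIDIA driver version : cat /proc/driver/nvidia/version
Show environment variables : env
Trace how the dynamic linker resolves shared libraries : LD_DEBUG=all cat
Uninstall CUDA : sudo /usr/local/cuda-8.0/bin/uninstall_cuda_8.0.pl
Uninstall the NVIDIA driver : sudo /usr/bin/nvidia-uninstall
Switch between CUDA versions (remove the old symlink, then create one of the two links):
sudo rm -rf /usr/local/cuda
sudo ln -s /usr/local/cuda-8.0 /usr/local/cuda # to use CUDA 8.0
sudo ln -s /usr/local/cuda-9.1 /usr/local/cuda # or, to use CUDA 9.1
Show directory attributes : ls -l <directory>
File manager with administrator privileges, e.g. for moving folders : sudo nautilus
Add -R to apply the permissions to subfolders as well : chmod -R 777 /home/mypackage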
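The cudnn.h check in the list above prints three #define lines that still have to be read by eye. A minimal sketch that turns them into a single version string is shown here; cudnn_header_version is a hypothetical helper name, and it assumes the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL, as the grep command above implies.

```shell
# Hypothetical helper: print MAJOR.MINOR.PATCHLEVEL from a cudnn.h file.
cudnn_header_version() {
    local header="${1:-/usr/local/cuda/include/cudnn.h}"
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END {
            # Fail if the header does not define a version at all.
            if (major == "") exit 1
            printf "%s.%s.%s\n", major, minor, patch
        }
    ' "${header}"
}
```

Calling cudnn_header_version /usr/local/cuda-8.0/include/cudnn.h checks one specific toolkit regardless of where the /usr/local/cuda symlink currently points.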
**********************************************************************************************************
I downloaded a program from GitHub that targets tensorflow-gpu 0.12. Running it on the GPU fails, apparently because the installed CUDA (and with it cuDNN) is too new for that TensorFlow build:
E tensorflow/stream_executor/cuda/cuda_dnn.cc:378] Loaded runtime CuDNN library: 7103 (compatibility version 7100) but source was compiled with 5105 (compatibility version 5100). If using a binary install, upgrade your CuDNN library to match. If building from sources, make sure the library loaded at runtime matches a compatible version specified during compile configuration.
F tensorflow/core/kernels/conv_ops.cc:532] Check failed: stream->parent()->GetConvolveAlgorithms(&algorithms)
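The integers in the log are easier to compare once decoded. For the cuDNN 5 and 7 releases involved here, cudnn.h defines CUDNN_VERSION as MAJOR * 1000 + MINOR * 100 + PATCHLEVEL, so a rough decoder (decode_cudnn_version is a hypothetical name, and the formula is assumed to hold only for this numbering scheme) could look like this:

```shell
# Hypothetical helper: decode an integer cuDNN version such as 7103
# into MAJOR.MINOR.PATCHLEVEL, assuming MAJOR*1000 + MINOR*100 + PATCHLEVEL.
decode_cudnn_version() {
    local v="$1"
    echo "$(( v / 1000 )).$(( (v % 1000) / 100 )).$(( v % 100 ))"
}

decode_cudnn_version 7103   # prints 7.1.3 (the library loaded at runtime)
decode_cudnn_version 5105   # prints 5.1.5 (what TensorFlow was compiled against)
```

Read this way, the message says that the TensorFlow binary expects a cuDNN 5.1.x library while cuDNN 7.1.3 is the one being loaded.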
https://blog.csdn.net/tunhuzhuang1836/article/details/79545625 : Installing multiple versions of CUDA and cuDNN on Ubuntu 16.04
https://blog.csdn.net/Mr_KkTian/article/details/78632756 : Installing CUDA 8.0 and CUDA 7.0 side by side on Ubuntu 16.04
https://blog.csdn.net/maple2014/article/details/78574275 : Installing multiple CUDA versions and switching between them
https://blog.csdn.net/mumoDM/article/details/79462604 : Problems with multiple CUDA versions
https://blog.csdn.net/liangyihuai/article/details/78688228 : Installing tensorflow-gpu on Windows
0. Official TensorFlow GPU installation guide:
https://www.tensorflow.org/install/install_windows
Download and install CUDA:
All CUDA versions : https://developer.nvidia.com/cuda-toolkit-archive
Download cuDNN (registration required)
cuDNN download page : https://developer.nvidia.com/cudnn
Installation Guide for Linux : the official installation instructions for cuda_8.0.44
Installing CUDA 8.0 and cuDNN 5.1:
After downloading cuDNN, extract the archive on the command line, then copy the contents of its lib64 and include folders into /usr/local/cuda-8.0 with the following commands:
# Installing from a Tar File
tar -zxvf <archive-name>.tar.gz
sudo cp cuda/include/cudnn.h /usr/local/cuda-8.0/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda-8.0/lib64
sudo chmod a+r /usr/local/cuda-8.0/include/cudnn.h /usr/local/cuda-8.0/lib64/libcudnn*
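With several toolkits installed side by side it is easy to copy cuDNN into one directory and fix permissions in another. The copy steps can be sketched as one function that takes the target toolkit as a parameter. install_cudnn is a hypothetical helper name, and the layout (include/cudnn.h and lib64/libcudnn* under the extracted folder) is assumed from the commands above.

```shell
# Hypothetical helper: copy an extracted cuDNN tree into one CUDA toolkit.
install_cudnn() {
    local src="$1"     # extracted archive root, e.g. ./cuda
    local cuda="$2"    # target toolkit, e.g. /usr/local/cuda-8.0

    [ -f "${src}/include/cudnn.h" ] || {
        echo "no include/cudnn.h under ${src}" >&2; return 1; }
    [ -d "${cuda}/include" ] && [ -d "${cuda}/lib64" ] || {
        echo "${cuda} does not look like a CUDA toolkit directory" >&2; return 1; }

    cp "${src}/include/cudnn.h" "${cuda}/include/" &&
    # -P copies the libcudnn.so symlinks as symlinks instead of duplicating the library.
    cp -P "${src}"/lib64/libcudnn* "${cuda}/lib64/" &&
    chmod a+r "${cuda}/include/cudnn.h" "${cuda}"/lib64/libcudnn*
}
```

Because the toolkit path appears only once, the header, the libraries and the chmod always refer to the same CUDA version. Writing into /usr/local requires a root shell.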
Switching CUDA versions
gedit ~/.bashrc # edit ~/.bashrc and add these two lines
export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
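The ${VAR:+:${VAR}} form in these two lines expands to a colon plus the old value only when the variable is already set and non-empty, so no stray colon is produced when it is empty. A small sketch of the same expansion, using a hypothetical prepend_path function for illustration:

```shell
# Hypothetical helper: prepend DIR to a colon-separated list without
# leaving a trailing ":" when the existing list is empty.
prepend_path() {
    local dir="$1"
    local current="$2"
    echo "${dir}${current:+:${current}}"
}

prepend_path /usr/local/cuda/bin ""              # prints /usr/local/cuda/bin
prepend_path /usr/local/cuda/bin /usr/bin:/bin   # prints /usr/local/cuda/bin:/usr/bin:/bin
```

This matters mostly for LD_LIBRARY_PATH, where an empty entry is interpreted as the current directory.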
The following form did not work, presumably because appended entries are searched last, so the cuda-9.1 paths already set in /etc/profile still win:
export PATH="$PATH:/usr/local/cuda/bin"
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64/"
sudo gedit /etc/profile # /etc/profile must be changed as well, and the change only took effect after a reboot (source /etc/profile did not work)
Change
export PATH=/usr/local/cuda-9.1/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-9.1/lib64:$LD_LIBRARY_PATH
to:
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
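Switching from CUDA 9.1 to CUDA 8.0:
sudo rm -rf /usr/local/cuda # remove the previously created symlink
sudo ln -s /usr/local/cuda-8.0 /usr/local/cuda # create the symlink to the new CUDA
Switching from CUDA 8.0 to CUDA 9.1:
sudo rm -rf /usr/local/cuda # remove the previously created symlink
sudo ln -s /usr/local/cuda-9.1 /usr/local/cuda # create the symlink to the new CUDA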
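Both directions are the same two commands with a different version number, so they can be folded into one function. The sketch below is an assumption-laden illustration: switch_cuda is a hypothetical helper name, the toolkits are assumed to live in PREFIX/cuda-VERSION, and unlike the plain rm -rf it refuses to touch /usr/local/cuda if that path is a real directory rather than a symlink.

```shell
# Hypothetical helper: repoint PREFIX/cuda at PREFIX/cuda-VERSION.
switch_cuda() {
    local version="$1"                # e.g. 8.0 or 9.1
    local prefix="${2:-/usr/local}"   # assumed install prefix
    local target="${prefix}/cuda-${version}"
    local link="${prefix}/cuda"

    if [ ! -d "${target}" ]; then
        echo "no such toolkit directory: ${target}" >&2
        return 1
    fi
    # Only replace a symlink (or nothing); never delete a real directory.
    if [ -e "${link}" ] && [ ! -L "${link}" ]; then
        echo "${link} exists and is not a symlink; refusing to replace it" >&2
        return 1
    fi
    rm -f "${link}"
    ln -s "${target}" "${link}"
}
```

In a root shell, switch_cuda 8.0 or switch_cuda 9.1 would then replace the two manual commands.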
To check that the switch took effect:
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176
which nvcc : shows where nvcc is located
CUDA 8.0 + cuDNN 5.1, incomplete installation (installer output):
Installing the CUDA Toolkit in /usr/local/cuda-8.0 ...
Installing the CUDA Samples in /home/human-machine ...
Copying samples to /home/human-machine/NVIDIA_CUDA-8.0_Samples now...
Finished copying samples.
===========
= Summary =
===========
Driver: Not Selected
Toolkit: Installed in /usr/local/cuda-8.0
Samples: Installed in /home/human-machine
Please make sure that
- PATH includes /usr/local/cuda-8.0/bin
- LD_LIBRARY_PATH includes /usr/local/cuda-8.0/lib64, or, add /usr/local/cuda-8.0/lib64 to /etc/ld.so.conf and run ldconfig as root
To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-8.0/bin
Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-8.0/doc/pdf for detailed information on setting up CUDA.
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 361.00 is required for CUDA 8.0 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
sudo <CudaInstaller>.run -silent -driver
Logfile is /tmp/cuda_install_1534.log
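# Build and test the device with deviceQuery:
Change to the directory where the samples are stored. The default is ~/NVIDIA_CUDA-<version>_Samples, which is ~/NVIDIA_CUDA-8.0_Samples for this install.
Then run in the terminal : $ make
Run the compiled binaries.
By default the compiled binaries are placed under ~/NVIDIA_CUDA-8.0_Samples/bin
Change directory : $ cd ~/NVIDIA_CUDA-8.0_Samples/bin/x86_64/linux/release
Run in the terminal : $ ./deviceQuery
# Build and test bandwidth with bandwidthTest:
cd ../bandwidthTest
sudo make
./bandwidthTest
If both tests end with Result = PASS, CUDA was installed successfully.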
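The PASS check can also be done by the shell instead of by eye. A minimal sketch, assuming the samples print the literal text "Result = PASS" as described above (sample_passed is a hypothetical helper name):

```shell
# Hypothetical helper: succeed only if the sample output read from stdin
# contains the line fragment "Result = PASS".
sample_passed() {
    grep -q "Result = PASS"
}
```

For example, ./deviceQuery | sample_passed && echo "deviceQuery OK" prints the confirmation only when the test passed.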
With CUDA 8.0 + cuDNN 5.1, building the samples fails with the errors below, but tensorflow-gpu still runs and TensorBoard works too.
~/NVIDIA_CUDA-8.0_Samples$ make
make[1]: Entering directory '/home/human-machine/NVIDIA_CUDA-8.0_Samples/0_Simple/simpleVoteIntrinsics_nvrtc'
find: `/usr/local/cuda-8.0/lib64/stubs': No such file or directory
>>> WARNING - libcuda.so not found, CUDA Driver is not installed. Please re-install the driver. <<<
[@] g++ -I../../common/inc -I/usr/local/cuda-8.0/include -o simpleVoteIntrinsics.o -c simpleVoteIntrinsics.cpp
[@] g++ -L/usr/local/cuda-8.0/lib64 -L/usr/local/cuda-8.0/lib64/stubs -o simpleVoteIntrinsics_nvrtc simpleVoteIntrinsics.o -lcuda -lnvrtc
[@] mkdir -p ../../bin/x86_64/linux/release
[@] cp simpleVoteIntrinsics_nvrtc ../../bin/x86_64/linux/release
make[1]: Leaving directory '/home/human-machine/NVIDIA_CUDA-8.0_Samples/0_Simple/simpleVoteIntrinsics_nvrtc'
make[1]: Entering directory '/home/human-machine/NVIDIA_CUDA-8.0_Samples/0_Simple/matrixMul'
"/usr/local/cuda-8.0"/bin/nvcc -ccbin g++ -I../../common/inc -m64 -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_60,code=compute_60 -o matrixMul.o -c matrixMul.cu
nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
cc1plus: fatal error: cuda_runtime.h: No such file or directory
compilation terminated.
Makefile:250: recipe for target 'matrixMul.o' failed
make[1]: *** [matrixMul.o] Error 1
make[1]: Leaving directory '/home/human-machine/NVIDIA_CUDA-8.0_Samples/0_Simple/matrixMul'
Makefile:52: recipe for target '0_Simple/matrixMul/Makefile.ph_build' failed
make: *** [0_Simple/matrixMul/Makefile.ph_build] Error 2
With CUDA 9.1 + cuDNN 7.1 the samples build and the tests pass.
https://www.linuxidc.com/Linux/2017-08/146391.htm : follows the official documentation, solid practical content
https://blog.csdn.net/weixin_32820767/article/details/80421913
http://www.sohu.com/a/225953058_491081