Setting up a TensorFlow deep learning environment on Ubuntu

Installing the NVIDIA driver

Blacklist the nouveau driver
sudo nano /etc/modprobe.d/blacklist-nouveau.conf
Add the following lines:

blacklist nouveau
blacklist lbm-nouveau
options nouveau modeset=0
alias nouveau off
alias lbm-nouveau off
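
To confirm the file was saved with all five entries, a few lines of Python can compare it against the list above. This is only a sketch: the helper name `missing_blacklist_lines` is made up for illustration, and the path is the one opened with nano in this section.

```python
# Sketch: report which expected lines are absent from the blacklist file.
# `missing_blacklist_lines` is a hypothetical helper, not part of any tool.
from pathlib import Path

EXPECTED = [
    "blacklist nouveau",
    "blacklist lbm-nouveau",
    "options nouveau modeset=0",
    "alias nouveau off",
    "alias lbm-nouveau off",
]


def missing_blacklist_lines(text):
    """Return the expected lines that do not appear in `text`."""
    present = {line.strip() for line in text.splitlines()}
    return [line for line in EXPECTED if line not in present]


if __name__ == "__main__":
    path = Path("/etc/modprobe.d/blacklist-nouveau.conf")
    if path.exists():
        print(missing_blacklist_lines(path.read_text()) or "all lines present")
    else:
        print(f"{path} not found")
```

An empty result means every entry is in place.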

Disable the nouveau kernel module

echo options nouveau modeset=0 | sudo tee -a /etc/modprobe.d/nouveau-kms.conf
sudo update-initramfs -u
sudo reboot
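
After the reboot it can be worth checking that nouveau really is no longer loaded. The sketch below reads `/proc/modules`, where the first field of each line is a loaded module's name. The function name `module_loaded` is an illustrative choice, not an existing API.

```python
# Sketch: check whether a kernel module appears in /proc/modules.
# `module_loaded` is a hypothetical helper name used for illustration.
from pathlib import Path


def module_loaded(name, modules_text):
    """True if `name` is the first field of any line in /proc/modules text."""
    return any(
        line.split()[0] == name
        for line in modules_text.splitlines()
        if line.strip()
    )


if __name__ == "__main__":
    proc = Path("/proc/modules")
    if proc.exists():
        loaded = module_loaded("nouveau", proc.read_text())
        print("nouveau is still loaded" if loaded else "nouveau is not loaded")
    else:
        print("/proc/modules not available on this system")
```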

After the reboot, purge any previously installed NVIDIA packages and add the driver PPA:
sudo apt-get remove --purge 'nvidia*'


sudo add-apt-repository ppa:graphics-drivers/ppa

sudo apt-get update

Some distributions already include the driver after installation, for example the Linux Mint system I installed.

sudo apt-get install nvidia-375   # 375 is the driver version you looked up for your card

sudo apt-get install mesa-common-dev

sudo apt-get install freeglut3-dev

Run nvidia-smi to view the GPU information

(anaconda3-4.4.0) feel@feel ~ $ nvidia-smi
Fri Aug 4 23:35:12 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.66                 Driver Version: 375.66                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    Off  | 0000:01:00.0      On |                  N/A |
|  0%   39C    P8     7W / 166W |    290MiB /  8113MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      1358    G   /usr/lib/xorg/Xorg                             187MiB |
|    0      2148    G   cinnamon                                        53MiB |
|    0      2225    G   fcitx-qimpanel                                   6MiB |
|    0     14661    G   ...el-token=EB93CBA024396B73A9BB8EB7603278B5    41MiB |
+-----------------------------------------------------------------------------+
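
If you want to check the installed driver version from a script rather than by eye, the banner line of the output above can be parsed with a regular expression. This is a minimal sketch; `parse_driver_version` is a name invented here, and it only assumes the "Driver Version:" label shown in the output above.

```python
# Sketch: pull the driver version out of nvidia-smi's banner line.
# `parse_driver_version` is a hypothetical helper, not an NVIDIA API.
import re
import shutil
import subprocess


def parse_driver_version(smi_text):
    """Return the 'Driver Version' field from nvidia-smi output, or None."""
    match = re.search(r"Driver Version:\s*([\d.]+)", smi_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    if shutil.which("nvidia-smi"):
        out = subprocess.run(
            ["nvidia-smi"], stdout=subprocess.PIPE, universal_newlines=True
        ).stdout
        print(parse_driver_version(out))
    else:
        print("nvidia-smi not found on PATH")
```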

Installing and configuring CUDA

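Ubuntu's gcc is 5.4.0, which this guide treats as too new for CUDA 8.0, so the compiler is downgraded to 4.9. (NVIDIA's CUDA 8.0 installation guide lists gcc 5.3.1 for Ubuntu 16.04, so the downgrade may not be strictly necessary on every system.)
Run the following in a terminal: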

sudo apt-get install gcc-4.9 gcc-5 g++-4.9 g++-5

sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.9 20

sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-5 10

sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-4.9 20

sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-5 10

sudo update-alternatives --install /usr/bin/cc cc /usr/bin/gcc 30

sudo update-alternatives --set cc /usr/bin/gcc

sudo update-alternatives --install /usr/bin/c++ c++ /usr/bin/g++ 30

sudo update-alternatives --set c++ /usr/bin/g++
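
Because gcc-4.9 is registered with the higher priority (20 versus 10), it should become the default `gcc`. The sketch below checks that by parsing the output of `gcc -dumpversion`. The helper names and the `max_major=4` threshold are assumptions chosen to match the 4.9 target used here, not something the CUDA toolkit provides.

```python
# Sketch: verify that the default gcc is the downgraded 4.x compiler.
# `gcc_major` and `gcc_selected_ok` are hypothetical helper names.
import shutil
import subprocess


def gcc_major(version_text):
    """Return the major version from `gcc -dumpversion` output such as '4.9.4'."""
    return int(version_text.strip().split(".")[0])


def gcc_selected_ok(version_text, max_major=4):
    """True if the major version is at most `max_major` (assumed target: 4.9)."""
    return gcc_major(version_text) <= max_major


if __name__ == "__main__":
    if shutil.which("gcc"):
        out = subprocess.run(
            ["gcc", "-dumpversion"], stdout=subprocess.PIPE, universal_newlines=True
        ).stdout
        print(out.strip(), "ok" if gcc_selected_ok(out) else "newer than expected")
    else:
        print("gcc not found on PATH")
```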

Install CUDA from the downloaded local repository packages:
sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-cublas-performance-update_8.0.61-1_amd64.deb
sudo apt-get update
sudo apt install cuda

Switch to a non-graphical (text-mode) session

sudo init 3

Test the CUDA samples
Enter the following at the command line (replace cuda-8.0 with the directory that matches your own CUDA version):

cd /usr/local/cuda-8.0/samples/1_Utilities/deviceQuery

make

sudo ./deviceQuery

It returns the following output:

(anaconda3-4.4.0) feel@feel /usr/local/cuda-8.0/samples/1_Utilities/deviceQuery $ sudo ./deviceQuery
./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX 1070"
CUDA Driver Version / Runtime Version 8.0 / 8.0
CUDA Capability Major/Minor version number: 6.1
Total amount of global memory: 8114 MBytes (8507752448 bytes)
(15) Multiprocessors, (128) CUDA Cores/MP: 1920 CUDA Cores
GPU Max Clock rate: 1835 MHz (1.84 GHz)
Memory Clock rate: 4004 Mhz
Memory Bus Width: 256-bit
L2 Cache Size: 2097152 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 8.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = GeForce GTX 1070
Result = PASS
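
The lines that matter most in this output are the device name, the compute capability and the final `Result = PASS`. As a rough sketch, they can be extracted from captured deviceQuery output like this. `summarize_device_query` is a hypothetical helper, and the patterns only assume the field labels shown above.

```python
# Sketch: summarize the key fields of deviceQuery's text output.
# `summarize_device_query` is an illustrative name, not part of the CUDA samples.
import re


def summarize_device_query(output):
    """Return pass/fail, device names and compute capability from the output."""
    result = re.search(r"Result = (\w+)", output)
    capability = re.search(
        r"CUDA Capability Major/Minor version number:\s*([\d.]+)", output
    )
    return {
        "passed": bool(result) and result.group(1) == "PASS",
        "devices": re.findall(r'Device \d+: "([^"]+)"', output),
        "capability": capability.group(1) if capability else None,
    }


if __name__ == "__main__":
    sample = (
        'Device 0: "GeForce GTX 1070"\n'
        "CUDA Capability Major/Minor version number: 6.1\n"
        "Result = PASS\n"
    )
    print(summarize_device_query(sample))
```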

Installing cuDNN

tar -xvf cudnn-8.0-linux-x64-v7.tar
sudo cp cuda/include/cudnn.h /usr/local/cuda/include/
sudo cp cuda/lib64/* /usr/local/cuda/lib64/
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
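
To double-check which cuDNN version ended up under /usr/local/cuda, the copied header can be inspected: in cuDNN 7 the file `cudnn.h` defines the macros `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL`. The sketch below reads them. The function name `cudnn_version` is an assumption for illustration, and the header path is the copy destination used above.

```python
# Sketch: read the cuDNN version macros from the installed header.
# `cudnn_version` is a hypothetical helper, not part of cuDNN.
import re
from pathlib import Path


def cudnn_version(header_text):
    """Return (major, minor, patch) from cudnn.h text, or None if a macro is missing."""
    parts = []
    for macro in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(
            r"^\s*#define\s+" + macro + r"\s+(\d+)", header_text, re.MULTILINE
        )
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


if __name__ == "__main__":
    header = Path("/usr/local/cuda/include/cudnn.h")
    if header.exists():
        print(cudnn_version(header.read_text(errors="replace")))
    else:
        print(f"{header} not found")
```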

Verify TensorFlow in a Python session

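import os
# Hide TensorFlow's INFO and WARNING log messages; set this before importing tensorflow.
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
import tensorflow as tf

# TensorFlow 1.x API: build constants, then evaluate them in a session.
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))
a = tf.constant(10)
b = tf.constant(32)
print(sess.run(a + b))  # 42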

Without the log-level setting, creating the session prints output like this:
>>> sess = tf.Session()
2017-08-05 02:45:02.887841: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-05 02:45:02.887891: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-05 02:45:02.887909: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-08-05 02:45:02.887925: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-05 02:45:02.887941: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-08-05 02:45:03.014577: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:893] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-08-05 02:45:03.014837: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 0 with properties:
name: GeForce GTX 1070
major: 6 minor: 1 memoryClockRate (GHz) 1.835
pciBusID 0000:01:00.0
Total memory: 7.92GiB
Free memory: 7.59GiB
2017-08-05 02:45:03.014850: I tensorflow/core/common_runtime/gpu/gpu_device.cc:961] DMA: 0
2017-08-05 02:45:03.014853: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0: Y
2017-08-05 02:45:03.014862: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0)

The output above contains warning messages; we only need to switch these warnings off:

os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
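
The value of `TF_CPP_MIN_LOG_LEVEL` selects how much of the native log is hidden: 0 shows everything, 1 hides INFO, 2 hides INFO and WARNING, and 3 also hides ERROR. The sketch below wraps that in a small helper. The name `set_tf_log_level` is made up here, and the variable should be set before `import tensorflow` runs.

```python
# Sketch: set TensorFlow's native log filter through the environment.
# `set_tf_log_level` is a hypothetical convenience wrapper, not a TensorFlow API.
import os

LEVELS = {
    0: "show everything",
    1: "hide INFO",
    2: "hide INFO and WARNING",
    3: "hide INFO, WARNING and ERROR",
}


def set_tf_log_level(level):
    """Set TF_CPP_MIN_LOG_LEVEL and return a description of the chosen level."""
    if level not in LEVELS:
        raise ValueError("level must be 0, 1, 2 or 3")
    os.environ["TF_CPP_MIN_LOG_LEVEL"] = str(level)
    return LEVELS[level]


# Call this before importing tensorflow so the setting is picked up.
print(set_tf_log_level(2))
```

Level 2 is the setting used in this article: it silences the CPU instruction warnings while still letting real errors through.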