千家信息网

How to Install the GPU Driver on an Alibaba Cloud ECS GPU Instance (OS: CentOS 7.3, GPU: NVIDIA P100)

Published: 2025-12-02 Author: 千家信息网 editor
千家信息网, last updated 2025-12-02

I. Configure DNS and yum
1. Configure DNS

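[root@gpu-test-01 ~]# vim /etc/resolv.conf
nameserver 223.5.5.5
nameserver 114.114.114.114
options timeout:2 attempts:3 rotate single-request-reopen

Note: two external DNS servers are configured here, 223.5.5.5 and 114.114.114.114.

[root@gpu-test-01 ~]# chattr +i /etc/resolv.conf

The chattr +i flag makes the file immutable, so the resolver settings cannot be overwritten until the flag is cleared with chattr -i.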
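The same resolver settings can be written without opening an editor. The sketch below is only an illustration: write_resolv_conf is a hypothetical helper name, and the target path is passed as an argument so the function can be tried on a scratch file first.

```shell
# write_resolv_conf is a hypothetical helper (not part of the original steps).
# It writes the two nameservers and the options line from this section
# to the file given as the first argument.
write_resolv_conf() {
    target="$1"
    cat > "$target" <<'EOF'
nameserver 223.5.5.5
nameserver 114.114.114.114
options timeout:2 attempts:3 rotate single-request-reopen
EOF
}

# Example, run as root on the instance:
#   write_resolv_conf /etc/resolv.conf
#   chattr +i /etc/resolv.conf
#   lsattr /etc/resolv.conf    # the 'i' attribute should be listed
```

If the file has already been locked with chattr +i, the write fails until the flag is removed with chattr -i.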

2. Configure yum

[root@gpu-test-01 ~]# cd /etc/yum.repos.d/
[root@gpu-test-01 yum.repos.d]# mv /etc/yum.repos.d/* /tmp
[root@gpu-test-01 yum.repos.d]# wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
[root@gpu-test-01 yum.repos.d]# wget -O /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo
[root@gpu-test-01 yum.repos.d]# wget https://us.download.nvidia.cn/tesla/418.67/nvidia-diag-driver-local-repo-rhel7-418.67-1.0-1.x86_64.rpm
[root@gpu-test-01 yum.repos.d]# yum install nvidia-diag-driver-local-repo-rhel7-418.67-1.0-1.x86_64.rpm -y
[root@gpu-test-01 yum.repos.d]# mv nvidia-diag-driver-local-repo-rhel7-418.67-1.0-1.x86_64.rpm /tmp/

Note: the existing repo files are moved to /tmp as a backup before the Aliyun mirror definitions are downloaded.
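Moving the old repo definitions into a dedicated backup directory makes them easier to restore than leaving them loose in /tmp. The following is a minimal sketch of that idea; backup_repo_files and the backup directory name are assumptions made for illustration, not part of the original procedure.

```shell
# backup_repo_files is a hypothetical helper: move every *.repo file from
# the source directory into a backup directory instead of deleting it.
backup_repo_files() {
    src_dir="$1"
    backup_dir="$2"
    mkdir -p "$backup_dir" || return 1
    for f in "$src_dir"/*.repo; do
        # If the glob matched nothing, "$f" is the literal pattern; skip it.
        [ -e "$f" ] || continue
        mv "$f" "$backup_dir"/ || return 1
    done
}

# Example, run as root on the instance (the backup path is an assumption):
#   backup_repo_files /etc/yum.repos.d /root/yum.repos.d.bak
#   yum clean all && yum makecache   # after the new repo files are in place
```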

II. Download the driver packages

1. Download the P100/P4 driver:

[root@gpu-test-01 ~]# wget http://us.download.nvidia.com/tesla/396.44/NVIDIA-Linux-x86_64-396.44.run

2. Download the kernel development package:

[root@gpu-test-01 ~]# wget https://buildlogs.centos.org/c7.1611.u/kernel/20170620132051/3.10.0-514.21.2.el7.x86_64/kernel-devel-3.10.0-514.21.2.el7.x86_64.rpm

3. Download the CUDA package (this step can be skipped if you install cuda-drivers with yum):

[root@gpu-test-01 ~]# wget https://developer.nvidia.com/compute/cuda/9.1/Prod/local_installers/cuda_9.1.85_387.26_linux

III. Configuration

Download and install the kernel-devel and kernel-headers packages that match the running kernel version.

[root@gpu-test-01 ~]# rpm -ivh kernel-devel-3.10.0-514.21.2.el7.x86_64.rpm
Preparing...                          ################################# [100%]
Updating / installing...
   1:kernel-devel-3.10.0-514.21.2.el7 ################################# [100%]
[root@gpu-test-01 ~]# sudo rpm -qa | grep $(uname -r)
kernel-headers-3.10.0-514.21.2.el7.x86_64
kernel-3.10.0-514.21.2.el7.x86_64
kernel-devel-3.10.0-514.21.2.el7.x86_64

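Note: if the kernel-devel and kernel versions do not match, the driver fails to compile during installation. Run rpm -qa | grep kernel on the instance to check that the versions match, and install the driver again once they do.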
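The version check described above can be scripted so that it fails loudly before the installer is started. This is a sketch under the assumption that rpm -qa prints the kernel-devel package as kernel-devel-<release>, which is what the listing in this section shows; kernel_devel_matches is a hypothetical helper name.

```shell
# kernel_devel_matches is a hypothetical helper. It reads a package list
# (one name per line, as printed by `rpm -qa`) on stdin and succeeds only
# if kernel-devel for the given kernel release appears in that list.
kernel_devel_matches() {
    release="$1"
    # -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF "kernel-devel-$release"
}

# Example, on the instance:
#   if rpm -qa | kernel_devel_matches "$(uname -r)"; then
#       echo "kernel-devel matches the running kernel"
#   else
#       echo "kernel-devel does not match $(uname -r)" >&2
#   fi
```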

[root@gpu-test-01 ~]# sh NVIDIA-Linux-x86_64-396.44.run

Follow the installer prompts, accepting each step through to the end.
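Before launching the installer, it can help to confirm that both the installer file and the kernel source tree are present. The sketch below assumes that kernel-devel places its files under /usr/src/kernels/<release>, as it does on CentOS 7; preflight_driver_install is a hypothetical helper name.

```shell
# preflight_driver_install is a hypothetical helper. It checks that the
# installer file exists and that the kernel source directory the driver
# is built against is present. The second argument is the directory to
# check, normally /usr/src/kernels/$(uname -r).
preflight_driver_install() {
    installer="$1"
    kernel_src="$2"
    if [ ! -f "$installer" ]; then
        echo "installer not found: $installer" >&2
        return 1
    fi
    if [ ! -d "$kernel_src" ]; then
        echo "kernel source directory not found: $kernel_src" >&2
        return 1
    fi
    return 0
}

# Example, on the instance:
#   preflight_driver_install NVIDIA-Linux-x86_64-396.44.run \
#       "/usr/src/kernels/$(uname -r)" && sh NVIDIA-Linux-x86_64-396.44.run
```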

Verify that the installation succeeded:

[root@gpu-test-01 ~]# nvidia-smi
Sat Jun 22 18:39:14 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.44                 Driver Version: 396.44                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  Off  | 00000000:00:08.0 Off |                    0 |
| N/A   33C    P0    27W / 250W |      0MiB / 16280MiB |      4%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
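For a scripted check, the driver version can be pulled out of the nvidia-smi banner and compared with the version that was installed. This is a minimal sketch that assumes the banner contains a "Driver Version:" field as in the output above; driver_version_from_smi is a hypothetical helper name.

```shell
# driver_version_from_smi is a hypothetical helper: it reads `nvidia-smi`
# output on stdin and prints the value that follows "Driver Version:".
driver_version_from_smi() {
    sed -n 's/.*Driver Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Example, on the instance:
#   version=$(nvidia-smi | driver_version_from_smi)
#   [ "$version" = "396.44" ] && echo "driver 396.44 is loaded"
```

nvidia-smi can also print this field directly with --query-gpu=driver_version --format=csv,noheader, which avoids parsing the table.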

The driver installation is now complete.
