Trying out an AI accelerator card, and installing the NVIDIA driver on Debian 12: NVIDIA Corporation GP102GL [Tesla P40] [10de:1b38]

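I bought a P40 accelerator card second-hand on Xianyu for 810 RMB. It has no video output, so it cannot really be called a graphics card. Accelerator cards are meant for server chassis: they have only an air duct and no fan of their own, so in an ordinary desktop case you need to add a fan. The attachment is the OpenSCAD file for my fan bracket.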

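With the card installed in my 2015 motherboard, the PC would not boot at all; at power-on it reported that there were not enough PCI resources. After I found a 2018 BIOS and upgraded to it, the machine could boot into the system.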

On Debian 12, first edit the software sources to add the non-free components:

cat /etc/apt/sources.list

deb http://mirrors.aliyun.com/debian bookworm main non-free non-free-firmware contrib
deb-src http://mirrors.aliyun.com/debian bookworm main non-free non-free-firmware contrib
deb http://mirrors.aliyun.com/debian bookworm-backports main non-free non-free-firmware contrib
deb-src http://mirrors.aliyun.com/debian bookworm-backports main non-free non-free-firmware contrib
deb http://mirrors.aliyun.com/debian-security/ bookworm-security main contrib non-free non-free-firmware
deb-src http://mirrors.aliyun.com/debian-security/ bookworm-security main contrib non-free non-free-firmware
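
As a quick sanity check on the sources above, here is a minimal sketch that lists which of the extra components are missing from the active deb lines of a sources file. The function name missing_components is made up for this illustration, and it only understands the classic one-line format shown above, not deb822-style .sources files.

```shell
# missing_components FILE
# Print each of contrib / non-free / non-free-firmware that does not
# appear on any active (non-commented) "deb" line of FILE.
# Hypothetical helper, not part of apt.
missing_components() {
    for comp in contrib non-free non-free-firmware; do
        # Match the component as a whole word, so "non-free" is not
        # satisfied by "non-free-firmware".
        if ! grep -E '^deb[[:space:]]' "$1" \
            | grep -qE "[[:space:]]${comp}([[:space:]]|\$)"; then
            echo "$comp"
        fi
    done
}
```

Running missing_components /etc/apt/sources.list should print nothing when all three components are enabled.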
 
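Install the driver:
apt update
apt install nvidia-detect                 # a diagnostic tool that reports which driver package the card needs
apt install linux-headers-$(uname -r)     # kernel headers for the running kernel
apt install dkms nvidia-driver            # DKMS builds and installs the driver module automatically
apt install firmware-misc-nonfree         # this package contains the firmware for the GP102 chip
apt install nvidia-smi                    # tool for checking the status of NVIDIA cards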
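
DKMS can only build the module if the headers for the running kernel are installed. A small sketch for checking that, assuming the usual layout where /lib/modules/<release>/build points at the header tree; headers_present is a hypothetical helper name, and the optional second argument exists only so the check can be pointed at another directory.

```shell
# headers_present KERNEL_RELEASE [MODULES_ROOT]
# Succeed if the "build" directory for the given kernel release exists.
# -d follows symlinks, so a dangling "build" link counts as missing.
# Hypothetical helper for illustration.
headers_present() {
    kver=$1
    root=${2:-/lib/modules}
    [ -d "$root/$kver/build" ]
}
```

For example: headers_present "$(uname -r)" || echo "linux-headers-$(uname -r) is not installed"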
 
Running
modprobe nvidia-current
reported that no card could be found. The message in
dmesg
said that the PCI resource is invalid:
[ 37.010271] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid: NVRM: BAR1 is 0M @ 0x0 (PCI:0000:4f:00.0)
The fix is to enable the "Above 4GB decoding" option in the BIOS.
After that there were no further problems.
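
To spot this failure quickly, a minimal sketch that scans kernel log text for the NVRM message quoted above and prints the PCI address and BAR name. The pattern is modelled only on that one log line, so other driver versions may word it differently; find_invalid_bars is a made-up name.

```shell
# find_invalid_bars
# Read kernel log text on stdin and print "PCI_ADDRESS BARn" for every
# NVRM line reporting a BAR of size 0M at address 0x0.
# Hypothetical helper; the pattern follows the single message seen here.
find_invalid_bars() {
    sed -n 's/.*NVRM: \(BAR[0-9]*\) is 0M @ 0x0 (PCI:\([0-9a-fA-F:.]*\)).*/\2 \1/p'
}
```

Usage: dmesg | find_invalid_bars (reading dmesg may require root). No output means the message was not found in the log.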
 
bak1:/lib/firmware/nvidia/gp102# nvidia-smi 
Wed Mar  6 14:02:20 2024       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.147.05   Driver Version: 525.147.05   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla P40           On   | 00000000:01:00.0 Off |                  Off |
| N/A   58C    P0    53W / 250W |    280MiB / 24576MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2373      C   /usr/local/bin/ollama             244MiB |
|    0   N/A  N/A      2969      G   /usr/lib/xorg/Xorg                 34MiB |
+-----------------------------------------------------------------------------+
bak1:/lib/firmware/nvidia/gp102# 
bak1:/lib/firmware/nvidia/gp102# nvidia-detect 
Detected NVIDIA GPUs:
01:00.0 3D controller [0302]: NVIDIA Corporation GP102GL [Tesla P40] [10de:1b38] (rev a1)
 
Checking card:  NVIDIA Corporation GP102GL [Tesla P40] (rev a1)
Your card is supported by all driver versions.
Your card is also supported by the Tesla drivers series.
Your card is also supported by the Tesla 470 drivers series.
It is recommended to install the
    nvidia-driver
package.
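
Since the card is cooled by an add-on fan, it may be worth keeping an eye on the temperature column of nvidia-smi. A rough sketch, assuming one integer temperature per line on stdin, as produced by nvidia-smi's query mode; warn_hot and the 85 degree limit below are arbitrary choices for illustration, not values taken from NVIDIA documentation.

```shell
# warn_hot LIMIT_C
# Read one GPU temperature (degrees C, integer) per line on stdin and
# report every GPU at or above LIMIT_C. Lines are numbered from 0 in
# input order. Hypothetical helper; non-numeric input is not handled.
warn_hot() {
    limit=$1
    idx=0
    while read -r temp; do
        if [ "$temp" -ge "$limit" ]; then
            echo "GPU $idx: ${temp}C >= ${limit}C"
        fi
        idx=$((idx + 1))
    done
}
```

Usage: nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader,nounits | warn_hot 85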
 
 
Attachment       Size
p40_funa.scad    2.96 KB