Installing Docker and nvidia-container-toolkit

Introduction

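Older versions of Docker had no built-in GPU support, so running GPU-enabled Docker images required installing nvidia-docker separately. Since Docker gained native GPU support (the --gpus flag, introduced in Docker 19.03), nvidia-docker and its related tools have been consolidated into nvidia-container-toolkit. Installing Docker and nvidia-container-toolkit is now enough to run GPU-enabled Docker images.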
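
On a machine that already has Docker installed, it can be useful to check whether the existing version is recent enough for native GPU support before reinstalling anything. The following is a minimal sketch, not part of the official instructions: the helper name version_ge is hypothetical, the 19.03 threshold is assumed to be the release that introduced --gpus, and the comparison relies on GNU sort -V as shipped on Ubuntu.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds if dotted version $1 is >= dotted version $2.
# sort -V orders version strings, so $1 >= $2 exactly when $2 sorts first.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Assumed threshold: the Docker release that added the --gpus flag.
min_version="19.03"

if command -v docker >/dev/null 2>&1; then
  # Prints only the client version string, for example 24.0.7.
  client_version="$(docker version --format '{{.Client.Version}}' 2>/dev/null)"
  if [ -n "$client_version" ] && version_ge "$client_version" "$min_version"; then
    echo "Docker $client_version supports --gpus"
  else
    echo "Docker ${client_version:-unknown} is too old; install a current release"
  fi
else
  echo "Docker is not installed yet"
fi
```

If the script reports that Docker is missing or too old, continue with the installation steps in the next section.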

Installing Docker

Follow the official Docker installation guide for Ubuntu: https://docs.docker.com/engine/install/ubuntu/
The steps are:

# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg

# Add the repository to Apt sources:
echo \
  "deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
  "$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update

sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

# Verify that the installation succeeded
sudo docker run hello-world
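# Add your user to the docker group so that docker can be run without sudo
# (the change takes effect after you log out and back in)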
sudo usermod -aG docker $(whoami)
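
Group membership is read at login, so the usermod change above is not visible in a shell that was already open. A small sketch for checking the current session follows; the helper name has_group is hypothetical, and the check only inspects the group list reported by id -nG.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds if the space-separated group list in $1
# contains a group named exactly $2.
has_group() {
  case " $1 " in
    *" $2 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# id -nG prints the names of all groups of the current session.
if has_group "$(id -nG)" docker; then
  echo "This session is in the docker group; sudo is not needed for docker"
else
  echo "Not in the docker group yet; log out and back in, or run: newgrp docker"
fi
```

Padding both the list and the name with spaces keeps a group such as dockerroot from being mistaken for docker.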

Installing nvidia-container-toolkit

Follow the NVIDIA installation guide: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html

curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list \
  && \
    sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
# Configure Docker to use the NVIDIA container runtime
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
# Verify that the installation succeeded
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
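
If the nvidia-smi check fails, one thing worth ruling out is that the runtime configuration step did not take effect. The sketch below asks the Docker daemon which runtimes it knows about and looks for an entry named nvidia. The helper name lists_nvidia_runtime is hypothetical, and the check is a plain text match on the JSON output rather than a full JSON parse, so treat it as a quick diagnostic only. Prefix the docker call with sudo if your user is not in the docker group.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds if the JSON text in $1 has a key named "nvidia".
# This is a simple pattern match, not a real JSON parser.
lists_nvidia_runtime() {
  printf '%s' "$1" | grep -q '"nvidia"[[:space:]]*:'
}

if command -v docker >/dev/null 2>&1; then
  # docker info exposes the registered runtimes in its Runtimes field.
  runtimes="$(docker info --format '{{json .Runtimes}}' 2>/dev/null)"
  if lists_nvidia_runtime "$runtimes"; then
    echo "The nvidia runtime is registered with Docker"
  else
    echo "The nvidia runtime is missing; rerun nvidia-ctk runtime configure and restart docker"
  fi
else
  echo "Docker is not installed"
fi
```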