
Installing Nvidia drivers on Ubuntu for dockered ollama

For some tests with a large language model (LLM) I needed a test system with Docker and an Nvidia card (for faster AI processing). Here’s what it takes to turn a basic Ubuntu 24.04.1 installation into a Docker-based LLM test machine:

First let’s have a look at our hardware:

# Show the display hardware
linux > sudo lshw -numeric -C display
  *-display:1 UNCLAIMED
       description: 3D controller
       product: AD104GL [L4] [10DE:27B8]
       vendor: NVIDIA Corporation [10DE]
       physical id: 5
       bus info: pci@0000:00:05.0
       version: a1
       width: 64 bits
       clock: 33MHz
       capabilities: pm pciexpress msix cap_list
       configuration: latency=0
       resources: iomemory:80-7f iomemory:100-ff memory:fc000000-fcffffff memory:800000000-fffffffff memory:1000000000-1001ffffff

Next, install the basic Nvidia drivers:

linux > sudo apt-get purge 'nvidia*'
linux > sudo add-apt-repository ppa:graphics-drivers
linux > sudo apt-get update
linux > sudo apt upgrade
linux > sudo apt install ubuntu-drivers-common
linux > ubuntu-drivers list
nvidia-driver-535, (kernel modules provided by nvidia-dkms-535)
nvidia-driver-545, (kernel modules provided by nvidia-dkms-545)
nvidia-driver-560-open, (kernel modules provided by nvidia-dkms-560-open)
nvidia-driver-535-open, (kernel modules provided by nvidia-dkms-535-open)
nvidia-driver-550, (kernel modules provided by nvidia-dkms-550)
nvidia-driver-555-open, (kernel modules provided by nvidia-dkms-555-open)
nvidia-driver-550-open, (kernel modules provided by nvidia-dkms-550-open)
nvidia-driver-560, (kernel modules provided by nvidia-dkms-560)
nvidia-driver-565, (kernel modules provided by nvidia-dkms-565)
nvidia-driver-535-server-open, (kernel modules provided by nvidia-dkms-535-server-open)
nvidia-driver-565-open, (kernel modules provided by nvidia-dkms-565-open)
nvidia-driver-535-server, (kernel modules provided by nvidia-dkms-535-server)
nvidia-driver-555, (kernel modules provided by nvidia-dkms-555)
nvidia-driver-545-open, (kernel modules provided by nvidia-dkms-545-open)

linux > sudo apt install nvidia-driver-565
linux > reboot

After the installation, let’s have a closer look at our GPU:

linux > nvidia-smi
Fri Nov 15 15:11:50 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.57.01              Driver Version: 565.57.01      CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA L4                      Off |   00000000:00:05.0 Off |                    0 |
| N/A   56C    P0             32W /   72W |       1MiB /  23034MiB |      5%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
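For later load tests it can be handy to keep this view refreshing instead of running it once; nvidia-smi can loop by itself:

```shell
# Refresh the GPU status every second (Ctrl-C to stop)
nvidia-smi -l 1
```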

As everything looks as expected let’s continue with the CUDA installation:

# NVIDIA CUDA Toolkit/Compiler
linux > wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
linux > sudo dpkg -i cuda-keyring_1.1-1_all.deb
linux > sudo apt update
linux > sudo apt install cuda-toolkit
linux > sudo apt install vim
linux > vim ~/.bashrc
>> export PATH=/usr/local/cuda/bin:$PATH
>> export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
linux > source ~/.bashrc
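To check that the two exports took effect, the CUDA compiler should now be reachable (the exact release string depends on which cuda-toolkit version apt installed):

```shell
# Confirm the CUDA compiler is on the PATH after reloading ~/.bashrc
nvcc --version
# Confirm the CUDA runtime libraries are on the library search path
echo "$LD_LIBRARY_PATH" | tr ':' '\n' | grep cuda
```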

As running ollama in a Docker container makes things much easier, the next step is to install Docker itself.
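The Docker installation itself isn’t shown here; a minimal sketch using Ubuntu’s own package (docker.io — Docker’s upstream apt repository is an alternative) could look like this:

```shell
# Install Docker from the Ubuntu repositories and enable the service
sudo apt install docker.io
sudo systemctl enable --now docker
# Quick smoke test
sudo docker run --rm hello-world
```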

Once done we’ll pull and run the ollama docker image:

linux > sudo docker pull ollama/ollama
linux > sudo docker run --rm --name ollama ollama/ollama
<...>
level=INFO source=routes.go:1240 msg="Listening on [::]:11434 (version 0.4.1)"
level=INFO source=common.go:49 msg="Dynamic LLM libraries" runners="[cpu cpu_avx cpu_avx2 cuda_v11 cuda_v12]"
level=INFO source=gpu.go:221 msg="looking for compatible GPUs"
level=INFO source=gpu.go:386 msg="no compatible GPUs were discovered"
<...>

So ollama seems to run fine, but although CUDA support is built into the image, it complains that “no compatible GPUs were discovered”. So let’s quit it using Ctrl-C and see what’s required to use GPUs within Docker.

Docker and GPUs

Just a short search later you’ll find that Docker has a --gpus option (--gpus=all exposes all available GPUs to the specified container). So let’s try it:

linux > sudo docker run --name ollama --rm --gpus=all ollama/ollama
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

Looks like we’re missing a component, and here it is:

# NVIDIA Container Toolkit
linux > sudo apt install nvidia-container-toolkit
linux > sudo systemctl restart docker
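Should Docker still report the same error after the restart, NVIDIA’s Container Toolkit documentation has an extra step that registers the NVIDIA runtime with Docker explicitly (not needed on every setup, but harmless to run):

```shell
# Register the NVIDIA runtime in /etc/docker/daemon.json, then restart Docker
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```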

After restarting docker, let’s try again:

linux > sudo docker run --name ollama --rm --gpus=all ollama/ollama
<...>
level=INFO source=images.go:755 msg="total blobs: 0"
level=INFO source=images.go:762 msg="total unused blobs removed: 0"
level=INFO source=routes.go:1240 msg="Listening on [::]:11434 (version 0.4.1)"
level=INFO source=common.go:49 msg="Dynamic LLM libraries" runners="[cpu_avx2 cuda_v11 cuda_v12 cpu cpu_avx]"
level=INFO source=gpu.go:221 msg="looking for compatible GPUs"
level=INFO source=types.go:123 msg="inference compute" id=GPU-58ebc024-1121-104d-0609-79961d6f7a1c library=cuda variant=v12 compute=8.9 driver=12.7 name="NVIDIA L4" total="22.1 GiB" available="21.9 GiB"

And finally we’ve got a Docker container running with GPU support. Be sure to add further options to the ollama container (exported ports, bound volumes, etc.), otherwise it will not be of much use.
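A more complete invocation, along the lines of the ollama image’s documentation, might publish the API port and persist downloaded models in a named volume (the model name below is just an example):

```shell
# Run ollama detached with GPU access, a persistent model volume, and the API port published
sudo docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
# Pull and chat with a model inside the running container
sudo docker exec -it ollama ollama run llama3
```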

And finally some extras you might find useful:

# System Information
linux > sudo apt install screenfetch
linux > vim ~/.bashrc
>> screenfetch
linux > source ~/.bashrc

# Driver and CUDA versions
linux > nvidia-smi && nvcc -V
