Debarshi Ray: Ollama on Fedora Silverblue

For the past few months, I have found myself dealing with various rough edges and questions around running Ollama on Fedora Silverblue. These arise from the fact that there are a few different ways of installing Ollama, /usr is a read-only mount point on Silverblue, people have different kinds of GPUs or none at all, the program that’s using Ollama might be a graphical application in a Flatpak or part of the operating system image, and so on. So, I thought I’d document a few different use-cases in one place, for future reference and in case someone else finds it useful.
Different ways of installing Ollama
There are at least three different ways of installing Ollama on Fedora Silverblue. Each of them has its own nuances and trade-offs that we will explore later.
First, there’s the popular single command POSIX shell script installer:
$ curl -fsSL https://ollama.com/install.sh | sh

There is a manual step-by-step variant for those who are uncomfortable with running a script straight off the Internet. They both install Ollama in the operating system’s /usr/local or /usr or / prefix, depending on which one comes first in the PATH environment variable, and attempt to enable and activate a systemd service unit that runs ollama serve.
Second, there’s a docker.io/ollama/ollama OCI image that can be used to put Ollama in a container. The container runs ollama serve by default.
Finally, there’s Fedora’s ollama RPM.
Surprise
Astute readers might be wondering why I mentioned the shell script installer in the context of Fedora Silverblue, because /usr is a read-only mount point. Won’t it break the script? Not really; or rather, the script does break, but not in the way one might expect.
Even though /usr is read-only on Silverblue, /usr/local is not, because it’s a symbolic link to /var/usrlocal, and Fedora defaults to putting /usr/local/bin earlier in the PATH environment variable than the other prefixes that the installer attempts to use, as long as pkexec(1) isn’t being used. This happy coincidence allows the installer to place the Ollama binaries in their right places.
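Both halves of that coincidence can be checked from a shell. The following is only a minimal sketch: first_install_prefix is a made-up helper name, and the loop illustrates the ordering described above rather than reproducing the installer’s own logic.

```shell
# Print whichever of the installer's candidate directories appears first
# in a PATH-style string (defaults to the current PATH).
# NOTE: first_install_prefix is a hypothetical helper for illustration.
first_install_prefix() {
    path_value=${1:-$PATH}
    old_ifs=$IFS
    IFS=:
    for dir in $path_value; do
        case $dir in
            /usr/local/bin|/usr/bin|/bin)
                IFS=$old_ifs
                printf '%s\n' "$dir"
                return 0
                ;;
        esac
    done
    IFS=$old_ifs
    return 1
}

# Usage:
#   readlink /usr/local      # shows where the symbolic link points
#   first_install_prefix     # inspects the current PATH
```

Running first_install_prefix from a regular terminal and again from a pkexec(1) session should show whether the two environments disagree.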
The script does fail eventually when attempting to create the systemd service unit to run ollama serve, because it tries to create an ollama user with /usr/share/ollama as its home directory, which sits on the read-only /usr. However, this half-baked installation works surprisingly well as long as nobody is trying to use an AMD GPU.
NVIDIA GPUs work if the proprietary driver and nvidia-smi(1) are present in the operating system, as provided by the kmod-nvidia and xorg-x11-drv-nvidia-cuda packages from RPM Fusion, and so does the CPU fallback.
Unfortunately, the results are the same when the shell script installer is used inside a Toolbx container. It fails to create the systemd service unit because it can’t connect to the system-wide instance of systemd.
Using AMD GPUs with Ollama is an important use-case. So, let’s see if we can do better than trying to manually work around the hurdles faced by the script.
OCI image
The docker.io/ollama/ollama OCI image requires the user to know what processing hardware they have or want to use. To use it only with the CPU without any GPU acceleration:
$ podman run \
    --name ollama \
    --publish 11434:11434 \
    --rm \
    --security-opt label=disable \
    --volume ~/.ollama:/root/.ollama \
    docker.io/ollama/ollama:latest

This will be used as the baseline to enable different kinds of GPUs. Port 11434 is the default port on which the Ollama server listens, and ~/.ollama is the default directory where it stores its SSH keys and artificial intelligence models.
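Once the container is up, a quick way to confirm that the published port works is to query the server over HTTP. The sketch below assumes that the server is reachable on localhost, that curl is installed, and that the server’s /api/tags endpoint (which lists the locally available models) is a reasonable thing to poll. The helper names are made up for this example.

```shell
# Build the base URL of an Ollama server. The defaults match the baseline
# command: localhost and port 11434.
ollama_base_url() {
    printf 'http://%s:%s\n' "${1:-localhost}" "${2:-11434}"
}

# Poll the server until it answers, giving up after ten attempts.
wait_for_ollama() {
    url=$(ollama_base_url "$@")
    attempts=0
    while [ "$attempts" -lt 10 ]; do
        if curl --fail --silent --output /dev/null "$url/api/tags"; then
            return 0
        fi
        attempts=$((attempts + 1))
        sleep 1
    done
    return 1
}

# Usage:
#   wait_for_ollama && curl --silent "$(ollama_base_url)/api/tags"
```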
To enable NVIDIA GPUs, the proprietary driver and nvidia-smi(1) must be present on the host operating system, as provided by the kmod-nvidia and xorg-x11-drv-nvidia-cuda packages from RPM Fusion. The user space driver has to be injected into the container from the host using NVIDIA Container Toolkit, provided by the nvidia-container-toolkit package from Fedora, for Ollama to be able to use the GPUs.
The first step is to generate a Container Device Interface (or CDI) specification for the user space driver:
$ sudo nvidia-ctk cdi generate --output /etc/cdi/nvidia.yaml
…
…

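Before moving on, it can be worth confirming that a specification actually landed in the directory used above. This is a small sketch under the assumption that specifications are stored as .yaml or .json files; has_cdi_spec is a hypothetical helper, not part of any tool.

```shell
# Succeed if the given directory (default: /etc/cdi) holds at least one
# CDI specification file.
# NOTE: has_cdi_spec is a hypothetical helper for illustration.
has_cdi_spec() {
    dir=${1:-/etc/cdi}
    for spec in "$dir"/*.yaml "$dir"/*.json; do
        [ -f "$spec" ] && return 0
    done
    return 1
}

# Usage:
#   has_cdi_spec && nvidia-ctk cdi list
```

The nvidia-ctk cdi list command prints the device names found in the generated specifications, which is a handy sanity check that the GPUs were detected.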
Then the container needs to be run with access to the GPUs, by adding the --gpus option to the baseline command above:
$ podman run \
    --gpus all \
    --name ollama \
    --publish 11434:11434 \
    --rm \
    --security-opt label=disable \
    --volume ~/.ollama:/root/.ollama \
    docker.io/ollama/ollama:latest

AMD GPUs don’t need the driver to be injected into the container from the host, because it can be bundled with the OCI image. Therefore, instead of generating a CDI specification for them, an image that bundles the driver must be used. This is done by using the rocm tag for the docker.io/ollama/ollama image.
Then the container needs to be run with access to the GPUs. However, the --gpus option only works for NVIDIA GPUs. So, the specific devices need to be spelled out by adding the --device option to the baseline command above:
$ podman run \
    --device /dev/dri \
    --device /dev/kfd \
    --name ollama \
    --publish 11434:11434 \
    --rm \
    --security-opt label=disable \
    --volume ~/.ollama:/root/.ollama \
    docker.io/ollama/ollama:rocm

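Since the command passes /dev/dri and /dev/kfd through verbatim, it only makes sense to run it when both exist on the host. A minimal sketch of such a check follows; has_rocm_devices is a made-up helper, and its optional root argument exists only to make the function easy to try out against a scratch directory.

```shell
# Succeed if both device nodes passed to the container exist.
# With no argument this checks the real /dev/kfd and /dev/dri.
# NOTE: has_rocm_devices is a hypothetical helper for illustration.
has_rocm_devices() {
    root=${1:-}
    [ -e "$root/dev/kfd" ] && [ -d "$root/dev/dri" ]
}

# Usage:
#   has_rocm_devices || echo "no AMD compute devices found" >&2
```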
However, because of how AMD GPUs are programmed with ROCm, it’s possible that some decent GPUs might not be supported by the docker.io/ollama/ollama:rocm image. The ROCm compiler needs to explicitly support the GPU in question, and Ollama needs to be built with such a compiler. Unfortunately, the binaries in the image leave out support for some GPUs that would otherwise work. For example, my AMD Radeon RX 6700 XT isn’t supported.
This can be verified with nvtop(1) in a Toolbx container. If there’s no spike in the GPU’s utilization and memory usage while a model is running, then it’s not being used.
It would be good to support as many AMD GPUs as possible with Ollama. So, let’s see if we can do better.
Fedora’s ollama RPM
Fedora offers a very capable ollama RPM, as far as AMD GPUs are concerned, because Fedora’s ROCm stack supports a lot more GPUs than other builds out there. It’s possible to check if a GPU is supported either by using the RPM and keeping an eye on nvtop(1), or by comparing the name of the GPU shown by rocminfo with those listed in the rocm-rpm-macros RPM.
For example, according to rocminfo, the name for my AMD Radeon RX 6700 XT is gfx1031, which is listed in rocm-rpm-macros:
$ rocminfo
ROCk module is loaded
=====================
HSA System Attributes
=====================
Runtime Version: 1.1
Runtime Ext Version: 1.6
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE
Mwaitx: DISABLED
DMAbuf Support: YES

==========
HSA Agents
==========
*******
Agent 1
*******
Name: AMD Ryzen 7 5800X 8-Core Processor
Uuid: CPU-XX
Marketing Name: AMD Ryzen 7 5800X 8-Core Processor
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 0
Device Type: CPU
…
…
*******
Agent 2
*******
Name: gfx1031
Uuid: GPU-XX
Marketing Name: AMD Radeon RX 6700 XT
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 128(0x80)
Queue Min Size: 64(0x40)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 1
Device Type: GPU
…
…

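Picking the gfx name out of that wall of output by eye gets tedious. The sketch below filters it with sed(1); gfx_names is a made-up helper that relies only on the Name: lines having the shape shown in the output above.

```shell
# Read rocminfo output on standard input and print the unique gfx names,
# one per line. Lines such as "Marketing Name:" are deliberately skipped.
# NOTE: gfx_names is a hypothetical helper for illustration.
gfx_names() {
    sed -n 's/^[[:space:]]*Name:[[:space:]]*\(gfx[0-9a-z]*\)[[:space:]]*$/\1/p' | sort -u
}

# Usage:
#   rocminfo | gfx_names
```

The printed name can then be searched for in the files shipped by rocm-rpm-macros, for example with rpm -ql rocm-rpm-macros | xargs grep -ls gfx1031. That assumes the package is installed and that the name appears literally in one of its files.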
The ollama RPM can be installed inside a Toolbx container, or it can be layered on top of the base registry.fedoraproject.org/fedora image to replace the docker.io/ollama/ollama:rocm image:
FROM registry.fedoraproject.org/fedora:42
RUN dnf --assumeyes upgrade
RUN dnf --assumeyes install ollama
RUN dnf clean all
ENV OLLAMA_HOST=0.0.0.0:11434
EXPOSE 11434
ENTRYPOINT ["/usr/bin/ollama"]
CMD ["serve"]

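Building an image from those instructions could look like the sketch below. It assumes they were saved as a file called Containerfile in the current directory; the localhost/ollama-fedora tag and the helper name are made up for this example, and the PODMAN variable exists only so that the command can be previewed with echo.

```shell
# Build the image from the Containerfile in the given directory
# (default: the current one) and tag it.
# NOTE: the default tag localhost/ollama-fedora is a hypothetical name.
build_ollama_image() {
    "${PODMAN:-podman}" build --tag "${1:-localhost/ollama-fedora}" "${2:-.}"
}

# Usage:
#   build_ollama_image                 # builds localhost/ollama-fedora
#   PODMAN=echo build_ollama_image     # only prints the arguments
```

The resulting image can then take the place of docker.io/ollama/ollama:rocm in the podman run command shown earlier for AMD GPUs.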
Unfortunately, for obvious reasons, Fedora’s ollama RPM doesn’t support NVIDIA GPUs.
Conclusion
From the purist perspective of not touching the operating system’s OSTree image, and being able to easily remove or upgrade Ollama, using an OCI container is the best option for using Ollama on Fedora Silverblue. Tools like Podman offer a suite of features to manage OCI containers and images that are far beyond what the POSIX shell script installer can hope to offer.
It seems that the realities of GPUs from AMD and NVIDIA prevent the use of the same OCI image, if we want to maximize our hardware support, and force the use of slightly different Podman commands and associated set-up. We have to create our own image using Fedora’s ollama RPM for AMD, and the docker.io/ollama/ollama:latest image with NVIDIA Container Toolkit for NVIDIA.
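The differences between the variants are small enough to fold into one wrapper. What follows is a sketch rather than a finished tool: run_ollama is a made-up function, localhost/ollama-fedora is a hypothetical tag for a locally built image based on Fedora’s ollama RPM, and the PODMAN and OLLAMA_AMD_IMAGE variables are conveniences invented for this example.

```shell
# Run the Ollama container for one kind of hardware: cpu, nvidia or amd.
# NOTE: run_ollama and the default AMD image name are hypothetical.
run_ollama() {
    flavour=$1
    # Options shared by all three variants.
    set -- --name ollama --publish 11434:11434 --rm \
        --security-opt label=disable \
        --volume "$HOME/.ollama:/root/.ollama"
    case $flavour in
        cpu)
            set -- "$@" docker.io/ollama/ollama:latest
            ;;
        nvidia)
            set -- --gpus all "$@" docker.io/ollama/ollama:latest
            ;;
        amd)
            set -- --device /dev/dri --device /dev/kfd "$@" \
                "${OLLAMA_AMD_IMAGE:-localhost/ollama-fedora}"
            ;;
        *)
            echo "usage: run_ollama cpu|nvidia|amd" >&2
            return 2
            ;;
    esac
    "${PODMAN:-podman}" run "$@"
}

# Usage:
#   run_ollama amd
#   PODMAN=echo run_ollama nvidia     # only prints the arguments
```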