From 9ad3e53957322f1a4b06732c7cc3ecec722e5d6b Mon Sep 17 00:00:00 2001
From: Shane McDonald
Date: Fri, 28 Mar 2025 18:55:23 -0400
Subject: [PATCH] Add post on how to run ollama under rootless podman with
 systemd and quadlet

---
 .../03-ollama-rootless-podman-quadlet.md | 73 +++++++++++++++++++
 1 file changed, 73 insertions(+)
 create mode 100644 content/posts/03-ollama-rootless-podman-quadlet.md

diff --git a/content/posts/03-ollama-rootless-podman-quadlet.md b/content/posts/03-ollama-rootless-podman-quadlet.md
new file mode 100644
index 0000000..a377c23
--- /dev/null
+++ b/content/posts/03-ollama-rootless-podman-quadlet.md
@@ -0,0 +1,73 @@
---
title: Running Ollama under Rootless Podman with Quadlet
---