Langfuse + Ollama (Local LLM Evaluation Setup)

This repository provides a local, free, self-hosted setup for LLM evaluation using Langfuse and Ollama.

The goal is to:

  • Experiment with prompts
  • Compare multiple LLMs
  • Run evaluations (scores, LLM-as-judge, datasets)
  • Keep everything local and offline

This setup uses the official Langfuse OSS (v3.148.0) with a local Ollama instance, connected through Ollama's OpenAI-compatible API.

No Langfuse code changes, forks, or plugins are required.

This follows the recommended integration approach documented in: https://github.com/langfuse/langfuse


PREREQUISITES

  • Docker Desktop (WSL2 enabled on Windows)
  • 16 GB RAM recommended
  • Optional: NVIDIA GPU (for faster inference)

STARTING THE STACK

From the project root directory, run:

docker compose up -d

This starts:

  • Langfuse (UI + workers)
  • PostgreSQL, ClickHouse, Redis, MinIO
  • Ollama (CPU-only by default)
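To confirm all containers came up before continuing, check their status:

docker compose ps

Every service listed above should show a running (or healthy) state.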

ACCESS LANGFUSE

Open your browser:

http://localhost:3000

First-time setup:

  1. Create a user
  2. Create an organization
  3. Create a project

All Langfuse data is stored locally in Docker volumes.
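To see where that data lives, list the volumes (names are prefixed with the compose project name, so yours may differ):

docker volume ls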


OLLAMA (LOCAL LLM RUNTIME)

Ollama runs inside Docker and exposes an OpenAI-compatible API.

You can interact with Ollama in TWO ways:

  1. Simple host command (recommended for most users)
  2. Explicit Docker command (useful if you also have native Ollama)

Both are shown below.


LIST AVAILABLE MODELS

Simple command: ollama list

Docker-specific command: docker exec -it ollama ollama list

NOTE:

  • If you have native Ollama installed AND Docker Ollama running, the docker command guarantees you are talking to the container.
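You can also verify the OpenAI-compatible API directly from the host. This assumes the compose file publishes Ollama's port 11434 to the host (adjust the port if your mapping differs):

curl http://localhost:11434/v1/models

This returns a JSON list of the models you have pulled, via the same endpoint Langfuse talks to inside the Docker network.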

PULL A MODEL (EXAMPLES)

Recommended models for evaluation:

Gemma 3 (4B) – primary eval / judge: docker exec -it ollama ollama pull gemma3:4b

Mistral 7B – comparison model: docker exec -it ollama ollama pull mistral

LLaMA 3.2 (3B) – reasoning comparison: docker exec -it ollama ollama pull llama3.2:3b

After pulling, verify:

ollama list or docker exec -it ollama ollama list
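To smoke-test a model end-to-end before wiring it into Langfuse (this uses the gemma3:4b tag pulled above; substitute any model you have):

docker exec -it ollama ollama run gemma3:4b "Reply with one word: OK"

Passing a prompt as an argument makes ollama run print a single response and exit, which is handy for quick checks.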


CONNECT OLLAMA TO LANGFUSE (UI)

In Langfuse UI:

Settings → LLM Connections → Add LLM Connection

Fill in exactly:

  • LLM adapter: openai
  • Provider name: OpenAI
  • API Key: ollama (Ollama ignores the key; any non-empty value satisfies the form)
  • API Base URL: http://ollama:11434/v1
  • Custom models: add the model name exactly as it appears in ollama list (for example, gemma3:1b)

Save the connection.

You can now select Ollama models in:

  • Prompt Playground
  • Evaluations
  • Datasets
  • LLM-as-Judge

NOTE

  • Ollama must be running and reachable from Langfuse
  • The model must already be pulled in Ollama
  • Model names must match exactly (case-sensitive)
  • This works for both CPU-only and GPU-enabled Ollama
  • No external scripts or SDKs are required for UI usage
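To verify the exact request path Langfuse uses, you can send a chat completion to the OpenAI-compatible endpoint yourself. From the host, replace the internal ollama hostname with localhost (again assuming port 11434 is published):

curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gemma3:4b", "messages": [{"role": "user", "content": "Hello"}]}'

If this returns a completion, the Langfuse connection will work as long as the Base URL and model name match.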

PROMPT PLAYGROUND TRACING

By default, Prompt Playground runs are NOT traced.

To enable tracing:

  1. Go to Settings → Tracing
  2. Enable "Trace Prompt Playground runs"
  3. Save

This allows:

  • Viewing traces
  • Running evaluations
  • Adding runs to datasets

OPTIONAL: ENABLE GPU SUPPORT (NVIDIA ONLY)

GPU support is DISABLED by default.

Requirements:

  • NVIDIA GPU
  • Latest NVIDIA drivers
  • Docker Desktop (WSL2)
  • NVIDIA Container Toolkit installed

Steps:

  1. Open docker-compose.yml
  2. Find the ollama service
  3. Uncomment the GPU block
  4. Restart Docker services:

docker compose down
docker compose up -d
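For reference, the GPU block follows the standard Docker Compose device-reservation syntax. The sketch below shows what the commented section typically looks like; the exact block in this repo's docker-compose.yml may differ slightly:

  ollama:
    # ... image, ports, volumes unchanged ...
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia        # requires the NVIDIA Container Toolkit
              count: all            # or a specific number of GPUs
              capabilities: [gpu]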


VERIFY GPU USAGE

Run:

docker exec -it ollama ollama ps

If GPU is enabled, the PROCESSOR column shows GPU usage (e.g. 100% GPU) for loaded models. On CPU the command still works and shows 100% CPU. Note that ollama ps only lists models currently loaded in memory, so run a prompt first if the output is empty.
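If the NVIDIA Container Toolkit is set up, the driver utilities are usually injected into the container as well, so this is another quick check (it simply fails on CPU-only setups):

docker exec -it ollama nvidia-smi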


IMPORTANT NOTES

  • CPU-only mode is sufficient for learning and evaluations
  • GPU is recommended for larger models (7B+)
  • Models are NOT auto-downloaded
  • Each user chooses which models to pull
  • No prompts or data leave your machine

REPOSITORY PHILOSOPHY

  • Infrastructure is versioned in Git
  • Runtime data and models are local
  • No secrets are committed
  • Users have freedom to experiment
  • This is a learning & evaluation environment, not production

COMPATIBILITY

  • Langfuse OSS: v3.148.0
  • Ollama: OpenAI-compatible API
  • Deployment: Local Docker
  • Licensing: Fully open source

END
