A Prometheus exporter that enhances DCGM (Data Center GPU Manager) metrics with Docker container mapping information. This exporter is designed to work alongside DCGM Exporter, enriching its metrics with container-level information to provide better observability of GPU usage in containerized environments.
- Re-exports DCGM metrics with added container information (primary use case)
- Maps GPU metrics to Docker container names
- Adds container name information to DCGM metrics
- Real-time monitoring of GPU processes and their container associations
- Configurable update intervals and logging levels
- Go toolchain (any recent 1.x release; only needed when building from source)
- NVIDIA GPU(s)
- NVIDIA drivers installed
- DCGM Exporter running in your environment
- nvidia-smi command-line tool
- Docker runtime
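A quick way to sanity-check these prerequisites before installing (the last check assumes DCGM Exporter is on its default port 9400):

```bash
# GPUs and driver are visible
nvidia-smi

# Docker daemon is reachable
docker ps

# DCGM Exporter is already serving metrics (default port 9400 assumed)
curl -s http://localhost:9400/metrics | head -n 5
```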
You can download pre-built binaries for Linux (AMD64 and ARM64) from the releases page.
# Download the latest release for your architecture
# For AMD64:
curl -L -o dcgm-container-mapper "https://github.com/brtnshrdr/dcgm-container-mapper/releases/latest/download/dcgm-container-mapper-linux-amd64"
# For ARM64:
curl -L -o dcgm-container-mapper "https://github.com/brtnshrdr/dcgm-container-mapper/releases/latest/download/dcgm-container-mapper-linux-arm64"
# Make it executable
chmod +x dcgm-container-mapper
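Optionally, if you want the binary available on your PATH, something like the following works (the destination directory is just a suggestion):

```bash
# Move the downloaded binary onto the PATH
sudo mv dcgm-container-mapper /usr/local/bin/dcgm-container-mapper
```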
Alternatively, build from source:

# Clone the repository
git clone https://github.com/brtnshrdr/dcgm-container-mapper.git
cd dcgm-container-mapper
# Install dependencies
go mod tidy
# Build the binaries
./build.sh
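If you prefer not to use the build script, a plain Go build should also work; this sketch assumes the main package sits at the repository root:

```bash
# Build a binary for the current platform
go build -o dcgm-container-mapper .
```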
The most common usage is with DCGM re-export enabled:

./dcgm-container-mapper --reexport-dcgm --dcgm-port 9400

This mode is recommended because:
- It preserves all valuable DCGM metrics (GPU utilization, memory usage, temperature, etc.)
- Adds container context to these metrics (container name, pod name, namespace)
- Maintains compatibility with existing DCGM-based dashboards while adding container visibility
- Enables better correlation between GPU metrics and container performance
Available command-line options:

- --reexport-dcgm: Enable re-exporting of DCGM metrics [default: false] (recommended to enable)
- --dcgm-port: DCGM exporter port to read from [default: "9400"]
- --port: Port to listen on [default: "9100"]
- --listen-address: Address to listen on [default: "localhost"]
- --update-interval: Interval to update GPU information [default: 5s]
- --log-level: Set logging level (debug, info, warn, error) [default: "info"]
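For example, to re-export DCGM metrics, listen on all interfaces, and refresh the GPU/container mapping every 10 seconds (the values here are illustrative, not requirements):

```bash
./dcgm-container-mapper \
  --reexport-dcgm \
  --dcgm-port 9400 \
  --listen-address 0.0.0.0 \
  --port 9100 \
  --update-interval 10s \
  --log-level info
```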
The exporter provides metrics in two modes:
When running with --reexport-dcgm, all DCGM metrics are re-exported with additional container context labels:
- exported_pod
- exported_container
- exported_namespace
Note that exported_pod will always equal exported_container, and exported_namespace will always be "docker"; this is done to align with the "Kubernetes mode" (DCGM_EXPORTER_KUBERNETES=true) of the DCGM exporter.
This enriches the standard DCGM metrics with container information, making it easier to track GPU usage per container/pod.
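As an illustration, a re-exported series might look like the line below. DCGM_FI_DEV_GPU_UTIL is just a representative DCGM metric; the exact metric names, labels, and values depend on your DCGM Exporter configuration:

```
DCGM_FI_DEV_GPU_UTIL{gpu="0",UUID="GPU-xxx",modelName="Tesla V100",exported_container="my-training-job",exported_pod="my-training-job",exported_namespace="docker"} 87
```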
Without --reexport-dcgm, only basic GPU-to-container mapping is provided:
# HELP dcgm_container_mapping Mapping between GPU ID and container and process name
# TYPE dcgm_container_mapping gauge
Metric format:
dcgm_container_mapping{gpu="0",modelName="Tesla V100",UUID="GPU-xxx",container="container_name",process="process_name"} 0
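To list only these mapping series from a running instance, something like this works (9100 is the default listen port):

```bash
curl -s http://localhost:9100/metrics | grep '^dcgm_container_mapping'
```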
- Start the exporter:
./dcgm-container-mapper --port 9100 --log-level debug

- Access metrics:
curl http://localhost:9100/metrics

Add the following to your prometheus.yml:
scrape_configs:
  - job_name: 'dcgm-container-mapper'
    static_configs:
      - targets: ['localhost:9100']
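Once Prometheus has picked up the target, you can spot-check that the series are being ingested; this sketch uses the Prometheus HTTP API and assumes Prometheus itself is reachable at localhost:9090:

```bash
# Query the mapping metric through the Prometheus HTTP API
curl -sG 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=dcgm_container_mapping'
```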