Developer productivity analytics. Python CLI that visualizes organization-wide contribution patterns and churn metrics via GitHub API.

rgilks/github-org-metrics

GitHub Organization Metrics

A Python tool to fetch and analyze GitHub organization metrics, including developer activity, repository statistics, and DORA metrics for measuring software delivery performance.

Python 3.12+ License: MIT

Dashboard Screenshot

Features

Developer Metrics

  • Commit counts and code contribution (lines added/deleted)
  • Pull requests opened, reviewed, and commented on
  • Repository contribution breakdown
  • Smart Filtering: Automatically excludes bots and inactive users (0 lines changed)
  • Outlier Detection: Moves high-volume contributors (>100,000 lines added) into a separate report
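The filtering and outlier rules above can be sketched roughly as follows. This is an illustration, not the tool's actual code: the `DevStats` record, the `split_developers` name, and the `[bot]` login-suffix check are assumptions. Only the zero-lines rule and the 100,000-line threshold come from the feature list.

```python
from dataclasses import dataclass

OUTLIER_THRESHOLD = 100_000  # lines added, as stated in the feature list


@dataclass
class DevStats:  # hypothetical record type, for illustration only
    login: str
    lines_added: int
    lines_deleted: int


def split_developers(stats):
    """Return (regular, outliers) after dropping bots and inactive users."""
    regular, outliers = [], []
    for dev in stats:
        if dev.login.endswith("[bot]"):  # assumed bot heuristic
            continue
        if dev.lines_added + dev.lines_deleted == 0:  # inactive: 0 lines changed
            continue
        if dev.lines_added > OUTLIER_THRESHOLD:
            outliers.append(dev)
        else:
            regular.append(dev)
    return regular, outliers
```

Keeping outliers in their own list, rather than discarding them, matches the idea of a separate report: the data is still available but does not skew the main table.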

Repository Metrics

  • Activity levels and commit frequency
  • Branch and contributor counts
  • Primary programming language
  • Creation and last update dates

DORA Metrics

DORA (DevOps Research and Assessment) metrics help measure software delivery performance:

| Metric | Description |
| --- | --- |
| Lead Time | Time from first commit to merge (branch-to-merge time) |
| Deployment Frequency | How often code is deployed per repository |
| Change Failure Rate | Percentage of deployments that fail |
| Mean Time to Recover | Average recovery time after failures |

Additional Features

  • Caching: Save API responses locally for faster re-analysis
  • Configurable: Analyze specific repos or top N by activity
  • CSV Export: Export results for further analysis in spreadsheets

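Prerequisites

  • Python 3.12 or later
  • uv (used to install dependencies and run the script)
  • A GitHub personal access token (created in step 3 of Installation)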

Installation

  1. Clone the repository:

    git clone https://github.com/rgilks/github-org-metrics.git
    cd github-org-metrics
  2. Install dependencies:

    uv sync
  3. Create a GitHub Personal Access Token:

    Go to GitHub Settings → Developer Settings → Personal Access Tokens → Fine-grained tokens and create a token with these permissions:

    Repository permissions:

    | Permission | Access |
    | --- | --- |
    | Actions | Read-only |
    | Contents | Read-only |
    | Deployments | Read-only |
    | Issues | Read-only |
    | Metadata | Read-only |
    | Pull requests | Read-only |

    Organization permissions:

    | Permission | Access |
    | --- | --- |
    | Administration | Read-only |
    | Members | Read-only |

    For more details, see GitHub's permissions documentation.

  4. Set the token as an environment variable:

    export GITHUB_TOKEN=your_token_here

    Tip: Add this to your shell profile (~/.bashrc, ~/.zshrc, etc.) for persistence.
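As a rough sketch of how a script might pick up the token set in step 4, the helper below reads `GITHUB_TOKEN` and builds the headers GitHub's REST API accepts. The function name `github_headers` is hypothetical and this is not taken from the tool's source.

```python
import os


def github_headers(env=os.environ):
    """Build request headers from the GITHUB_TOKEN environment variable."""
    token = env.get("GITHUB_TOKEN")
    if not token:
        # Fail early with a clear message instead of sending anonymous requests
        raise RuntimeError("GITHUB_TOKEN is not set; export it before running")
    return {
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    }
```

Passing the environment as a parameter keeps the helper easy to exercise without touching the real shell environment.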

Usage

uv run github_metrics.py <organization> [options]

Options

| Option | Description | Default |
| --- | --- | --- |
| --months N | Number of months to analyze | 3 |
| --repos N | Limit number of repositories | all |
| --target-repos A B C | Analyze specific repositories only | - |
| --use-cache | Use cached data if available | - |
| --update-cache | Refresh the cache with new data | - |
| --fast | Skip PR reviews/comments (faster) | - |
| --anonymize | Anonymize names in console (for screenshots) | - |
| -v, --verbose | Enable debug logging | - |
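For readers who want to see how such an interface could be declared, here is a minimal `argparse` sketch that mirrors the options table. It is an assumption about structure, not the script's real parser; `build_parser` is a hypothetical name, and `None` stands in for the "all" default of `--repos`.

```python
import argparse


def build_parser():
    """Declare a CLI that mirrors the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="github_metrics.py")
    parser.add_argument("organization")
    parser.add_argument("--months", type=int, default=3)
    parser.add_argument("--repos", type=int, default=None)  # None means all
    parser.add_argument("--target-repos", nargs="+", default=None)
    parser.add_argument("--use-cache", action="store_true")
    parser.add_argument("--update-cache", action="store_true")
    parser.add_argument("--fast", action="store_true")
    parser.add_argument("--anonymize", action="store_true")
    parser.add_argument("-v", "--verbose", action="store_true")
    return parser


args = build_parser().parse_args(
    ["my-organization", "--target-repos", "api-service", "web-app", "--fast"]
)
```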

Examples

# Analyze all repos from the last 3 months
uv run github_metrics.py my-organization

# Fast mode (skip PR reviews/comments)
uv run github_metrics.py my-organization --fast

# Anonymize output for screenshots
uv run github_metrics.py my-organization --anonymize

# Analyze specific repositories
uv run github_metrics.py my-organization --target-repos api-service web-app

# Use cached data for faster re-analysis
uv run github_metrics.py my-organization --use-cache

# Refresh cache and re-analyze
uv run github_metrics.py my-organization --update-cache

# Enable verbose output for debugging
uv run github_metrics.py my-organization -v

Output

The script generates up to three CSV files:

<org>_github_developer_metrics.csv

| Column | Description |
| --- | --- |
| Developer | GitHub username |
| Commits | Number of commits in the period |
| Lines Added | Total lines of code added |
| Lines Deleted | Total lines of code deleted |
| PRs Opened | Pull requests created |
| PRs Reviewed | Pull requests reviewed |
| PR Comments | Comments on pull requests |
| Repositories | Top repositories contributed to |

<org>_github_outliers.csv

Contains the same columns as Developer Metrics but isolates accounts with >100,000 lines added (typically generated files or bulk imports).

<org>_github_repository_metrics.csv

| Column | Description |
| --- | --- |
| Repository | Repository name |
| Commits | Number of commits in the period |
| PRs | Pull requests in the period |
| Lead Time (h) | Average hours from branch to merge |
| Deploys | Number of CI/CD deployments |
| Fail % | Percentage of failed deployments |
| Deploy (m) | Average deployment time (minutes) |
| Created | Repository creation date |
| Updated | Last update date |
| Language | Primary programming language |
| Branches | Number of branches |
| Contributors | Number of contributors |
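The exported CSV can be post-processed with the standard library. The sketch below ranks repositories by lead time. It assumes the file's header row uses exactly the column names listed above and that empty cells mean "no data"; `slowest_repos` is a hypothetical helper, and the sample data is made up for the demonstration.

```python
import csv
import io


def slowest_repos(csv_file, top=5):
    """Return (repository, lead_time_hours) pairs, slowest first."""
    rows = []
    for row in csv.DictReader(csv_file):
        value = (row.get("Lead Time (h)") or "").strip()
        try:
            rows.append((row["Repository"], float(value)))
        except ValueError:
            continue  # skip repositories without a numeric lead time
    return sorted(rows, key=lambda r: r[1], reverse=True)[:top]


# In-memory sample standing in for <org>_github_repository_metrics.csv
sample = io.StringIO(
    "Repository,Lead Time (h)\napi-service,12.5\nweb-app,\ndocs,30\n"
)
ranking = slowest_repos(sample)
```

With a real export, pass an open file handle instead of the `io.StringIO` sample.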

Caching

Data is cached to <org>_github_data_cache.json. This allows:

  • Faster re-runs: Skip API calls when experimenting with analysis
  • Offline analysis: Work with previously fetched data
  • Historical snapshots: Keep records of your metrics over time

Note: The cache stores raw API data. Use --update-cache to refresh with the latest data.
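The interaction between --use-cache and --update-cache might look roughly like this. It is a sketch under assumptions: `load_or_fetch`, `cache_path`, and the `fetch` callback are hypothetical names, and whether the real tool writes the cache on every fetch is not stated here. Only the cache file name pattern comes from the text above.

```python
import json
from pathlib import Path


def cache_path(org, directory="."):
    """Location of the cache file, following the documented naming pattern."""
    return Path(directory) / f"{org}_github_data_cache.json"


def load_or_fetch(org, fetch, use_cache=False, update_cache=False, directory="."):
    """Return cached data when allowed, otherwise fetch and store it."""
    path = cache_path(org, directory)
    if use_cache and not update_cache and path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    data = fetch(org)  # stands in for the GitHub API calls
    path.write_text(json.dumps(data), encoding="utf-8")
    return data
```

Copying the cache file elsewhere before refreshing it is one simple way to keep the historical snapshots mentioned above.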

Understanding DORA Metrics

This tool calculates DORA metrics based on your GitHub data:

Lead Time for Changes

Measured as the time from the first commit on a branch to the moment that branch is merged. Lower is better; elite performers typically achieve less than 1 hour.
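As an illustration of the branch-to-merge calculation, the sketch below works on ISO 8601 timestamps of the kind the GitHub REST API returns. The function names are hypothetical, and how the tool selects the first commit of a branch is not covered here.

```python
from datetime import datetime, timezone

GITHUB_TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # e.g. 2024-01-01T10:00:00Z


def lead_time_hours(first_commit_at, merged_at):
    """Hours between the first commit on a branch and the merge."""
    start = datetime.strptime(first_commit_at, GITHUB_TIME_FORMAT).replace(tzinfo=timezone.utc)
    end = datetime.strptime(merged_at, GITHUB_TIME_FORMAT).replace(tzinfo=timezone.utc)
    return (end - start).total_seconds() / 3600


def average_lead_time(pairs):
    """Average over (first_commit_at, merged_at) pairs; None if there are none."""
    hours = [lead_time_hours(a, b) for a, b in pairs]
    return sum(hours) / len(hours) if hours else None
```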

Deployment Frequency

Calculated from GitHub Actions workflow runs. Tracks how often your CI/CD pipeline successfully deploys. Elite performers deploy on demand (multiple times per day).
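A frequency figure can be derived from deployment dates in a simple way. The sketch below assumes each successful deployment workflow run has already been reduced to a calendar date; `deployments_per_week` is a hypothetical helper, and the tool may report frequency in a different unit.

```python
from datetime import date


def deployments_per_week(deploy_dates, start, end):
    """Average number of deployments per week in the window [start, end)."""
    days = (end - start).days
    if days <= 0:
        raise ValueError("end must be after start")
    in_window = [d for d in deploy_dates if start <= d < end]
    return len(in_window) * 7 / days
```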

Change Failure Rate

The percentage of deployments that fail, based on workflow run conclusions. Elite performers have a failure rate below 15%.
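One way to compute this from workflow run conclusions is sketched below. Treating only `success` and `failure` as counted outcomes (and ignoring values such as `cancelled` or `skipped`) is an assumption for the example, not a statement about how the tool classifies runs.

```python
def change_failure_rate(conclusions):
    """Percentage of counted deployment runs whose conclusion is 'failure'."""
    counted = [c for c in conclusions if c in ("success", "failure")]
    if not counted:
        return 0.0  # no completed deployments in the period
    return 100 * counted.count("failure") / len(counted)
```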

Mean Time to Recover

The average time to recover from a failed deployment. Elite performers recover in less than 1 hour.
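Recovery time can be approximated from the same run history. The sketch below assumes recovery means the time from the first failed run in a streak to the next successful run; that definition and the `mean_time_to_recover` name are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def mean_time_to_recover(runs):
    """runs: iterable of (finished_at, conclusion) pairs in any order."""
    failed_at = None
    recoveries = []
    for finished_at, conclusion in sorted(runs):
        if conclusion == "failure" and failed_at is None:
            failed_at = finished_at  # first failure starts the outage
        elif conclusion == "success" and failed_at is not None:
            recoveries.append(finished_at - failed_at)
            failed_at = None
    if not recoveries:
        return None  # nothing failed, or nothing has recovered yet
    return sum(recoveries, timedelta()) / len(recoveries)
```

A failure that has not yet been followed by a success is left out of the average rather than counted as an open-ended outage.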

Limitations

  • Rate limits: The GitHub API has rate limits. Use --use-cache to minimize API calls.
  • Large organizations: May take a while to fetch data for organizations with many active repositories.
  • Permissions: Some metrics require specific token permissions. Ensure your token has every permission listed in the Installation section.
  • DORA accuracy: Metrics are approximated from available GitHub data. For example, deployment frequency relies on GitHub Actions workflows.

Troubleshooting

"Rate limit exceeded"

The script automatically waits and retries when rate limited. For faster runs, use --use-cache after the initial fetch.
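The wait-and-retry behavior can be pictured with the rate-limit headers GitHub sends on API responses. This sketch is not the script's implementation: `seconds_until_reset` is a hypothetical helper and it assumes header names have been lowercased into a plain dict.

```python
import time


def seconds_until_reset(headers, now=None):
    """How long to sleep before retrying, based on GitHub rate-limit headers."""
    now = time.time() if now is None else now
    if "retry-after" in headers:
        # Sent with secondary rate limits: wait this many seconds
        return max(0.0, float(headers["retry-after"]))
    if headers.get("x-ratelimit-remaining") == "0":
        # x-ratelimit-reset is a UTC epoch timestamp in seconds
        return max(0.0, float(headers["x-ratelimit-reset"]) - now)
    return 0.0
```

A caller would sleep for the returned number of seconds and then repeat the request.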

"Permission error"

Ensure your GitHub token has all the required permissions listed in the Installation section.

"Repository not found"

Check that:

  1. The repository exists and is accessible to your token
  2. You're using the correct organization name
  3. Your token has access to the organization

Dependency issues

Re-sync the environment to reinstall the project's dependencies:

uv sync

Development

This project uses modern Python tooling:

  • uv: Package management
  • ruff: Linting and formatting

# Lint code
uv run ruff check .

# Format code
uv run ruff format .

License

This project is open-source and available under the MIT License.
