
feat(python): support PEP 770 (SBOM metadata in Python packages) #10021

@knqyf263


Summary

Add support for PEP 770 (Software Bill-of-Materials in Python packages). This PEP was accepted in April 2025 and allows Python packages to include SBOM documents in the .dist-info/sboms/ directory.

Motivation

PEP 770 addresses the "Phantom Dependency Problem" - compiled Python packages (wheels) often bundle native libraries (.so files), but these bundled components are invisible to traditional SCA tools.

For example, xgboost ships with a copy of libgomp bundled by auditwheel, but scanners have had no way to detect this dependency.

Starting with auditwheel 6.5.0 (November 2025), repaired wheels include CycloneDX SBOMs documenting the bundled shared libraries:

xgboost-3.1.2.dist-info/
├── METADATA
├── WHEEL
├── RECORD
└── sboms/
    └── auditwheel.cdx.json   # New SBOM file
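
To make the layout concrete, here is a minimal sketch of how a scanner could discover these files. Trivy itself is written in Go, so this Python is only an illustration; the function name `find_pep770_sboms` is hypothetical and not part of Trivy or auditwheel.

```python
from pathlib import Path


def find_pep770_sboms(site_packages: Path) -> dict[str, list[Path]]:
    """Map each *.dist-info directory name to the files under its sboms/ subdirectory.

    Hypothetical helper for illustration; directories without sboms/ are skipped.
    """
    found: dict[str, list[Path]] = {}
    for dist_info in sorted(site_packages.glob("*.dist-info")):
        sbom_dir = dist_info / "sboms"
        if not sbom_dir.is_dir():
            continue
        # Collect every regular file below sboms/, whatever its format.
        files = sorted(p for p in sbom_dir.rglob("*") if p.is_file())
        if files:
            found[dist_info.name] = files
    return found
```

Called on a site-packages directory containing the tree above, it would return a single entry for `xgboost-3.1.2.dist-info` pointing at `auditwheel.cdx.json`.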

Current Problem

Trivy's SBOM Analyzer (pkg/fanal/analyzer/sbom/sbom.go) detects .cdx.json files and processes them, including the ones under .dist-info/sboms/. This causes:

  1. Duplicate packages: The same package appears twice (from METADATA and from SBOM)
  2. Missing FilePath: SBOM-derived entries lack FilePath
  3. Inconsistent PURLs: SBOM entries have file_name= qualifier that METADATA entries don't have

Example scan output for xgboost:

// From METADATA (Python packaging analyzer)
{
  "Name": "xgboost",
  "Version": "3.1.2",
  "Identifier": {
    "PURL": "pkg:pypi/xgboost@3.1.2"
  },
  "Licenses": ["Apache-2.0"],
  "FilePath": "usr/local/lib/python3.13/site-packages/xgboost-3.1.2.dist-info/METADATA"
}

// From auditwheel SBOM (SBOM analyzer) - duplicated without FilePath
{
  "ID": "xgboost@3.1.2",
  "Name": "xgboost",
  "Version": "3.1.2",
  "Identifier": {
    "PURL": "pkg:pypi/xgboost@3.1.2?file_name=xgboost-3.1.2-py3-none-manylinux_2_28_aarch64.whl"
  },
  "DependsOn": ["libgomp@8.5.0-28.el8_10.alma.1"]
  // No FilePath, No Licenses
}
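
The two entries differ only by the `file_name=` qualifier in the PURL. A rough sketch of the normalization that would make them comparable is shown below; `canonical_purl` is a hypothetical helper written with plain string handling, not an existing Trivy or packageurl function.

```python
def canonical_purl(purl: str) -> str:
    """Drop the file_name qualifier from a PURL; keep other qualifiers and any subpath.

    Illustrative only: qualifiers are split on '&' and not re-encoded.
    """
    base, sep, subpath = purl.partition("#")
    head, _, query = base.partition("?")
    kept = [q for q in query.split("&") if q and q.split("=", 1)[0] != "file_name"]
    if kept:
        head += "?" + "&".join(kept)
    return head + sep + subpath
```

With this, the SBOM-derived PURL above reduces to `pkg:pypi/xgboost@3.1.2`, the same value the METADATA entry carries.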

Example SBOM (auditwheel.cdx.json)

{
  "bomFormat": "CycloneDX",
  "specVersion": "1.4",
  "version": 1,
  "metadata": {
    "component": {
      "type": "library",
      "bom-ref": "pkg:pypi/xgboost@3.1.2?file_name=xgboost-3.1.2-py3-none-manylinux_2_28_aarch64.whl",
      "name": "xgboost",
      "version": "3.1.2",
      "purl": "pkg:pypi/xgboost@3.1.2?file_name=xgboost-3.1.2-py3-none-manylinux_2_28_aarch64.whl"
    },
    "tools": [
      {
        "name": "auditwheel",
        "version": "6.5.0"
      }
    ]
  },
  "components": [
    {
      "type": "library",
      "bom-ref": "pkg:pypi/xgboost@3.1.2?file_name=xgboost-3.1.2-py3-none-manylinux_2_28_aarch64.whl",
      "name": "xgboost",
      "version": "3.1.2",
      "purl": "pkg:pypi/xgboost@3.1.2?file_name=xgboost-3.1.2-py3-none-manylinux_2_28_aarch64.whl"
    },
    {
      "type": "library",
      "bom-ref": "pkg:rpm/almalinux/libgomp@8.5.0-28.el8_10.alma.1#c61017c9a24eb6e1e1a3cdc9becd004a6419cbda3d54b4848b98f240a4829571",
      "name": "libgomp",
      "version": "8.5.0-28.el8_10.alma.1",
      "purl": "pkg:rpm/almalinux/libgomp@8.5.0-28.el8_10.alma.1"
    }
  ],
  "dependencies": [
    {
      "ref": "pkg:pypi/xgboost@3.1.2?file_name=xgboost-3.1.2-py3-none-manylinux_2_28_aarch64.whl",
      "dependsOn": [
        "pkg:rpm/almalinux/libgomp@8.5.0-28.el8_10.alma.1#c61017c9a24eb6e1e1a3cdc9becd004a6419cbda3d54b4848b98f240a4829571"
      ]
    },
    {
      "ref": "pkg:rpm/almalinux/libgomp@8.5.0-28.el8_10.alma.1#c61017c9a24eb6e1e1a3cdc9becd004a6419cbda3d54b4848b98f240a4829571"
    }
  ]
}
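
Note that the `dependencies` section refers to components by `bom-ref`, which for libgomp includes a `#...` hash suffix, while the scan output uses `name@version` IDs. The following sketch shows one way to resolve the references; `dependency_map` is a hypothetical name, and the `name@version` ID format is assumed from the scan output shown earlier.

```python
def dependency_map(bom: dict) -> dict[str, list[str]]:
    """Resolve CycloneDX dependency refs to 'name@version' IDs.

    Takes an already-parsed CycloneDX document (e.g. from json.load).
    Refs that match no known component are ignored.
    """
    components = list(bom.get("components", []))
    root = bom.get("metadata", {}).get("component")
    if root:
        components.append(root)

    # Index every component by its bom-ref.
    ids = {c["bom-ref"]: f'{c["name"]}@{c["version"]}' for c in components}

    result: dict[str, list[str]] = {}
    for dep in bom.get("dependencies", []):
        ref_id = ids.get(dep.get("ref"))
        if ref_id is None:
            continue
        result[ref_id] = [ids[d] for d in dep.get("dependsOn", []) if d in ids]
    return result
```

For the document above, this would map `xgboost@3.1.2` to a one-element list containing `libgomp@8.5.0-28.el8_10.alma.1`, and libgomp to an empty list.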

Proposed Solution

Handle PEP 770 SBOMs in the Python packaging analyzer rather than the generic SBOM analyzer:

  1. Exclude .dist-info/sboms/ from SBOM Analyzer: Prevent duplicate processing
  2. Merge SBOM data in Python packaging analyzer: When parsing METADATA, check for an sboms/ directory next to it and merge:
    • Keep FilePath and Licenses from METADATA
    • Add dependency information from SBOM
    • Add bundled native libraries (e.g., libgomp) as separate packages
    • Use canonical PURL without file_name= qualifier
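
Step 1 comes down to a path check. A minimal sketch of such a predicate follows, assuming slash-separated paths like the `FilePath` values above; `is_pep770_sbom` is a hypothetical name, and the real change would live in Trivy's Go code.

```python
from pathlib import PurePosixPath


def is_pep770_sbom(file_path: str) -> bool:
    """Return True if file_path is a file below a '<name>.dist-info/sboms/' directory."""
    parts = PurePosixPath(file_path).parts
    # Stop two short of the end so that at least one path segment follows 'sboms'.
    for i in range(len(parts) - 2):
        if parts[i].endswith(".dist-info") and parts[i + 1] == "sboms":
            return True
    return False
```

The generic SBOM analyzer could skip any path for which this returns True, while SBOM files elsewhere in the filesystem would still be processed as before.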

Expected output after fix:

{
  "Name": "xgboost",
  "Version": "3.1.2",
  "Identifier": {
    "PURL": "pkg:pypi/xgboost@3.1.2"
  },
  "Licenses": ["Apache-2.0"],
  "FilePath": "usr/local/lib/python3.13/site-packages/xgboost-3.1.2.dist-info/METADATA",
  "DependsOn": ["libgomp@8.5.0-28.el8_10.alma.1"]
}
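
The merge rule behind this output can be sketched as follows. This is an assumption-laden illustration rather than Trivy's implementation: packages are plain dicts using the field names from the JSON above, `merge_packages` is a hypothetical name, and matching is done on case-insensitive name plus exact version.

```python
def merge_packages(metadata_pkg: dict, sbom_pkgs: list[dict]) -> list[dict]:
    """Merge SBOM-derived entries into the METADATA entry.

    The METADATA entry keeps its PURL, Licenses and FilePath and gains
    DependsOn from the matching SBOM entry. Non-matching SBOM entries
    (bundled libraries) are returned as separate packages.
    """
    merged = dict(metadata_pkg)
    bundled = []
    for pkg in sbom_pkgs:
        same_package = (
            pkg.get("Name", "").lower() == metadata_pkg["Name"].lower()
            and pkg.get("Version") == metadata_pkg["Version"]
        )
        if same_package:
            if pkg.get("DependsOn"):
                merged["DependsOn"] = list(pkg["DependsOn"])
        else:
            bundled.append(pkg)
    return [merged, *bundled]
```

Given the METADATA entry and the SBOM entries for xgboost and libgomp, the first element of the result would match the expected output above, followed by libgomp as its own package.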
