
Python package for calculating weighted Matthews Correlation Coefficient (MCC) and its multiclass extensions (ECC, MPC1, MPC2) with theoretical robustness guarantees.


kuslavicek/weighted_mcc


Weighted MCC: Robust Multiclass Metrics


Weighted MCC is a Python package that implements robust performance metrics for binary and multiclass classification tasks where individual observations have different importance weights.

Based on the paper that introduces the weighted Matthews Correlation Coefficient (MCC) (see References), this package provides a principled way to evaluate classifiers in high-stakes domains such as medical image segmentation and autonomous driving, where some errors are costlier than others.

Features

  • Weighted Binary MCC: Calculate MCC for binary tasks with per-sample weights.
  • Multiclass Extensions:
    • ECC (Extended Correlation Coefficient): A robust multiclass generalization of MCC.
    • MPC (Multivariate Pearson Correlation): Variants (MPC1, MPC2) derived from covariance matrix theory.
  • Robustness Analysis:
    • Compute theoretical upper bounds on how much a metric can change when the weights are perturbed by up to $\epsilon$.
    • These proven bounds mean that small changes in the weights produce only small changes in the metric.
  • Efficient Implementation: Vectorized operations using NumPy for high performance on large datasets.

Installation

pip install weighted_mcc

Usage

Binary Classification

import numpy as np
from weighted_mcc import weighted_mcc

y_true = np.array([1, 0, 1, 1, 0])
y_pred = np.array([1, 0, 0, 1, 1])
weights = np.array([2.0, 1.0, 5.0, 1.0, 1.0]) # 3rd sample is critical

# Calculate Weighted MCC
score = weighted_mcc(y_true, y_pred, weights)
print(f"Weighted MCC: {score:.4f}")
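To make the effect of the weights concrete, here is a minimal NumPy sketch of one common way to weight binary MCC: each confusion-matrix count is replaced by the sum of the weights of the samples in that cell. This is an illustration under that assumption, not the package's implementation. The helper names `weighted_confusion` and `mcc_from_confusion` are hypothetical, and returning 0.0 when the denominator is zero is an assumed convention.

```python
import numpy as np

def weighted_confusion(y_true, y_pred, weights):
    """Return weighted (tp, tn, fp, fn) for binary 0/1 labels.

    Assumption: each cell is the sum of the weights of its samples.
    """
    t = np.asarray(y_true).astype(bool)
    p = np.asarray(y_pred).astype(bool)
    w = np.asarray(weights, dtype=float)
    tp = w[t & p].sum()
    tn = w[~t & ~p].sum()
    fp = w[~t & p].sum()
    fn = w[t & ~p].sum()
    return tp, tn, fp, fn

def mcc_from_confusion(tp, tn, fp, fn):
    """Standard MCC formula applied to (possibly weighted) cell totals."""
    denom = np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # assumed convention for a degenerate confusion matrix
    return float((tp * tn - fp * fn) / denom)

y_true = np.array([1, 0, 1, 1, 0])
y_pred = np.array([1, 0, 0, 1, 1])
weights = np.array([2.0, 1.0, 5.0, 1.0, 1.0])

weighted = mcc_from_confusion(*weighted_confusion(y_true, y_pred, weights))
unweighted = mcc_from_confusion(*weighted_confusion(y_true, y_pred, np.ones(5)))
print(f"weighted: {weighted:.4f}, unweighted: {unweighted:.4f}")
```

Under this definition, the missed third sample (weight 5.0) dominates the false-negative cell, so the score drops from about 0.1667 unweighted to about -0.1021 weighted.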

Multiclass Classification

The multiclass functions generally expect one-hot encoded inputs. If your labels are integers, convert them first or check the documentation for helper utilities.

import numpy as np
from weighted_mcc import extended_corr_coef, mpc_trace_ratio

# Example: 3 classes, 4 samples
y_true = np.array([[1,0,0], [0,1,0], [0,0,1], [1,0,0]])
y_pred = np.array([[1,0,0], [0,0,1], [0,0,1], [0,1,0]])
weights = np.array([1.0, 1.0, 2.0, 1.0])

# Extended Correlation Coefficient
ecc = extended_corr_coef(y_true, y_pred, weights)
print(f"ECC: {ecc:.4f}")

# Multivariate Pearson Correlation (Trace Ratio)
mpc1 = mpc_trace_ratio(y_true, y_pred, weights)
print(f"MPC1: {mpc1:.4f}")
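If your labels are stored as integers, they can be converted to the one-hot layout used in the multiclass example above with a few lines of NumPy. The helper name `to_one_hot` is hypothetical and is not claimed to be part of the package. The sketch assumes labels are integers in the range 0 to `num_classes - 1`.

```python
import numpy as np

def to_one_hot(labels, num_classes=None):
    """Convert integer class labels to a (n_samples, num_classes) 0/1 array."""
    labels = np.asarray(labels, dtype=int)
    if num_classes is None:
        # Assumption: classes are numbered from 0 and the largest label appears.
        num_classes = int(labels.max()) + 1
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=int)[labels]

# Same data as the multiclass example, written as integer labels
y_true = to_one_hot([0, 1, 2, 0], num_classes=3)
y_pred = to_one_hot([0, 2, 2, 1], num_classes=3)
print(y_true)
```

Passing `num_classes` explicitly is the safer choice when a class might be absent from a given batch, since otherwise the inferred width would be too small.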

Robustness Check

Check whether your metric score is stable under potential weight noise (for example, when the weights are assigned subjectively).

from weighted_mcc import calculate_multiclass_stability_bound

# Reuses y_true, y_pred, and weights from the multiclass example above
epsilon = 0.01  # Max potential deviation in weights
bound = calculate_multiclass_stability_bound(y_true, y_pred, weights, epsilon, metric_type='ECC')

print(f"Score could vary by at most ±{bound:.4f} given epsilon={epsilon}")
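A theoretical bound can be complemented by an empirical sanity check: perturb the weights at random within ±epsilon and record the largest change in the score. The sketch below does this for a binary weighted MCC and is self-contained, so it does not call the package. The names `weighted_mcc_sketch` and `empirical_spread` are hypothetical. It assumes that epsilon is an absolute per-weight deviation, that weights stay non-negative, and that the weighted MCC is computed from weight-summed confusion cells. Random sampling only gives a lower estimate of the worst case, so it does not replace the bound.

```python
import numpy as np

def weighted_mcc_sketch(y_true, y_pred, weights):
    """Binary MCC with confusion cells replaced by sums of weights (assumed definition)."""
    t = np.asarray(y_true).astype(bool)
    p = np.asarray(y_pred).astype(bool)
    w = np.asarray(weights, dtype=float)
    tp, tn = w[t & p].sum(), w[~t & ~p].sum()
    fp, fn = w[~t & p].sum(), w[t & ~p].sum()
    denom = np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else float((tp * tn - fp * fn) / denom)

def empirical_spread(y_true, y_pred, weights, epsilon, n_trials=1000, seed=0):
    """Return (baseline score, largest observed deviation) under random weight noise."""
    rng = np.random.default_rng(seed)
    w = np.asarray(weights, dtype=float)
    base = weighted_mcc_sketch(y_true, y_pred, w)
    worst = 0.0
    for _ in range(n_trials):
        # Assumption: each weight moves independently by at most epsilon.
        noise = rng.uniform(-epsilon, epsilon, size=w.shape)
        perturbed = np.clip(w + noise, 0.0, None)  # keep weights non-negative
        score = weighted_mcc_sketch(y_true, y_pred, perturbed)
        worst = max(worst, abs(score - base))
    return base, worst

base, worst = empirical_spread(
    y_true=[1, 0, 1, 1, 0],
    y_pred=[1, 0, 0, 1, 1],
    weights=[2.0, 1.0, 5.0, 1.0, 1.0],
    epsilon=0.01,
)
print(f"baseline: {base:.4f}, largest observed deviation: {worst:.4f}")
```

If the observed deviation ever exceeded a theoretical bound computed for the same metric and the same notion of epsilon, that would point to a mismatch in assumptions worth investigating.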

References

This project incorporates research from the following paper:

  • "Weighted MCC: A Robust Measure of Multiclass Classifier Performance for Observations with Individual Weights." Rommel Cortez, Bala Krishnamoorthy. arXiv:2512.20811
