feat(skore): Detect high variance of metrics in cross-validation #1989

@rouk1

Is your feature request related to a problem? Please describe.

It would be nice to warn users when a metric shows high variance across the splits of a cross-validation, for example when the score on a specific split deviates strongly from the others.

Describe the solution you'd like

Sample detection code by @glemaitre.

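import numpy as np


def detect_outliers_mad(scores, threshold=3.5):
    # Flag scores whose modified z-score (based on the median absolute deviation) exceeds threshold.
    scores = np.asarray(scores, dtype=float)
    median = np.median(scores)
    mad = np.median(np.abs(scores - median))
    if mad == 0:  # at least half the scores equal the median: report no outliers instead of dividing by zero
        return np.array([], dtype=int), np.zeros_like(scores)
    modified_z_scores = 0.6745 * (scores - median) / mad  # 0.6745 makes the MAD comparable to a standard deviation for normal data

    outliers = np.where(np.abs(modified_z_scores) > threshold)[0]
    return outliers, modified_z_scores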
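
To illustrate how such a check could turn into a user-facing warning, here is a minimal sketch. The helper name `warn_on_outlier_splits`, the warning message, and the scores are all made up for the example; none of this is existing skore API, and treating a zero MAD as "no outliers" is an assumption.

```python
import warnings

import numpy as np


def modified_z_scores(scores):
    """Return MAD-based modified z-scores (all zeros if the MAD is zero)."""
    scores = np.asarray(scores, dtype=float)
    median = np.median(scores)
    mad = np.median(np.abs(scores - median))
    if mad == 0:
        # Assumption: with a zero MAD we report no outliers rather than divide by zero.
        return np.zeros_like(scores)
    return 0.6745 * (scores - median) / mad


def warn_on_outlier_splits(metric_name, scores, threshold=3.5):
    """Hypothetical helper: warn when some splits score far from the median."""
    z = modified_z_scores(scores)
    outliers = np.flatnonzero(np.abs(z) > threshold)
    if outliers.size:
        warnings.warn(
            f"{metric_name}: splits {outliers.tolist()} deviate strongly "
            f"from the median score (|modified z-score| > {threshold}).",
            UserWarning,
            stacklevel=2,
        )
    return outliers


# Made-up accuracy scores for a 5-split cross-validation; split 3 is much lower.
fold_scores = [0.91, 0.90, 0.92, 0.55, 0.89]

# Capture the warning so the example can display it explicitly.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    flagged = warn_on_outlier_splits("accuracy", fold_scores)

for w in caught:
    print(w.message)
```

With these scores only the split at index 3 is flagged. The other splits stay well below the default threshold of 3.5, which is why a MAD-based rule seems more robust here than one based on the mean and standard deviation.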

This new metric should also be serialized by CrossValidationReportPayload.

Describe alternatives you've considered, if relevant

No response

Additional context

No response

Metadata

Assignees

No one assigned

    Labels

    needs API design 🎨: Requires public/private API design before implementation
    needs Investigation 🔎: Requires investigating the issue to know if we should go further with the idea
    rfc ❓: Request for comments
