[BACK] fix(scoring): update subsidy transparency score thresholds #489

Merged
jb-delafosse merged 2 commits into main from back/update-subsidy-score-thresholds on Jan 21, 2026

@jb-delafosse (Collaborator)

Summary

  • Adjust the subsidy transparency scoring scale so that it better discriminates between communities at the low end
  • Previously, too many communities ended up with score E even though they had published some data

New scoring thresholds

| Score | New threshold |
|-------|---------------|
| E | 0% (no data) or >105% (suspicious over-declaration) |
| D | ]0%, 25%] |
| C | ]25%, 50%] |
| B | ]50%, 95%] |
| A | ]95%, 105%] |
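
A minimal sketch of how the thresholds above could map to a letter score. The function name `get_score_from_tp` comes from this PR, but the body below is an assumption written for illustration, not the code in `bareme_enricher.py`. It assumes `tp` is a float expressed in percent.

```python
import math


def get_score_from_tp(tp: float) -> str:
    """Map a subsidy transparency percentage to a letter score (illustrative)."""
    # Invalid values (NaN, negative) and 0% (no usable data) all score E.
    if math.isnan(tp) or tp <= 0:
        return "E"
    # Upper bounds are inclusive, matching the ]low, high] intervals.
    if tp <= 25:
        return "D"
    if tp <= 50:
        return "C"
    if tp <= 95:
        return "B"
    if tp <= 105:
        return "A"
    # Above 105%: suspicious over-declaration.
    return "E"
```

Checking the upper bounds in ascending order keeps each interval right-closed without restating its lower bound.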

Changes

  • Updated get_score_from_tp() function in bareme_enricher.py
  • Added explicit handling for invalid values (NaN, negative)
  • Added comprehensive unit tests

Test plan

  • Backend unit tests pass (9/9)
  • Run full pipeline to verify score recalculation
  • Visual verification on staging environment

Comment on lines 268 to 271
    if math.isnan(tp) or tp < 0:
        return "E"  # Invalid data
    if tp == 0:
        return "E"  # No usable data

👋

Suggested change
-    if math.isnan(tp) or tp < 0:
-        return "E"  # Invalid data
-    if tp == 0:
-        return "E"  # No usable data
+    if math.isnan(tp) or tp <= 0:
+        return "E"  # Invalid or unusable data
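
The suggestion merges two guards that return the same score. As a quick sanity check, the sketch below compares both forms on a handful of inputs, including NaN, signed zero, and infinities. The helper names `is_e_two_guards` and `is_e_single_guard` are hypothetical and exist only for this comparison.

```python
import math


def is_e_two_guards(tp: float) -> bool:
    # Original form: invalid data first, then the 0% case.
    if math.isnan(tp) or tp < 0:
        return True
    if tp == 0:
        return True
    return False


def is_e_single_guard(tp: float) -> bool:
    # Suggested form: one combined condition.
    return math.isnan(tp) or tp <= 0


samples = [
    float("nan"), float("-inf"), -1.0, -0.0, 0.0,
    0.01, 25.0, 105.0, float("inf"),
]
# Both forms should agree on every sample.
assert all(is_e_two_guards(x) == is_e_single_guard(x) for x in samples)
```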

jb-delafosse commented Jan 21, 2026

Current score distribution in production

Here is the current distribution, before the new thresholds are applied:

| Score | Subsidies | Public procurement | Global |
|-------|-----------|--------------------|--------|
| A | 7 (0.002%) | 5 028 (1.4%) | 7 (0.002%) |
| B | 16 (0.004%) | 1 636 (0.5%) | 24 (0.007%) |
| C | 25 (0.007%) | 13 806 (3.8%) | 6 661 (1.8%) |
| D | 35 (0.01%) | 883 (0.2%) | 14 699 (4.1%) |
| E | 361 947 (99.98%) | 340 677 (94.1%) | 340 639 (94.1%) |
| Total | 362 030 | 362 030 | 362 030 |

This confirms the need to adjust the thresholds: 99.98% of (community, year) pairs have score E for subsidies, which shows that the scale discriminates poorly among low scores.

jb-delafosse commented Jan 21, 2026

Update applied in production ✅

Before/after comparison

| Score | Before | After | Change |
|-------|--------|-------|--------|
| A | 7 | 8 | +1 |
| B | 16 | 51 | +35 (+219%) |
| C | 25 | 43 | +18 (+72%) |
| D | 35 | 45 | +10 |
| E | 361 947 | 398 075 | +36 128* |

*The total number of (community, year) pairs grew from 362 030 to 398 222 (new (community, year) pairs added to the dataset with 2026)

Key improvement

Communities that have published subsidies are now better discriminated:

  • Non-E scores: 83 → 147 (a 77% increase)
  • Communities that publish something (even a small amount) now get a D, C, or B score instead of E

The new thresholds work as expected.
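
The figures in the comparison can be re-derived from the per-score counts. The sketch below recomputes the non-E totals, the dataset totals, and the relative changes; the counts are copied from the table, while the helper names `non_e_total` and `pct_change` are illustrative.

```python
before = {"A": 7, "B": 16, "C": 25, "D": 35, "E": 361_947}
after = {"A": 8, "B": 51, "C": 43, "D": 45, "E": 398_075}


def non_e_total(counts: dict[str, int]) -> int:
    # Number of (community, year) pairs with any score other than E.
    return sum(n for score, n in counts.items() if score != "E")


def pct_change(old: int, new: int) -> float:
    # Relative change from old to new, in percent.
    return (new - old) / old * 100


non_e_before = non_e_total(before)
non_e_after = non_e_total(after)
total_before = sum(before.values())
total_after = sum(after.values())

print(f"Non-E: {non_e_before} -> {non_e_after} "
      f"({pct_change(non_e_before, non_e_after):+.0f}%)")
print(f"Total: {total_before} -> {total_after}")
for score in "ABCDE":
    delta = after[score] - before[score]
    print(f"{score}: {delta:+d} ({pct_change(before[score], after[score]):+.0f}%)")
```

Because the dataset itself grew, the absolute increase in E scores mostly reflects the added pairs rather than a regression.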

Adjust the scoring scale to better discriminate communities at the low end:
- E: 0% only (no usable data) or >105% (suspicious over-declaration)
- D: ]0%, 25%] (minimal effort)
- C: ]25%, 50%] (significant under-declaration)
- B: ]50%, 95%] (partial to good declaration)
- A: ]95%, 105%] (optimal declaration)

Also adds explicit handling for invalid values (NaN, negative) returning E.

no_jira
Add comprehensive tests for the subsidy transparency scoring function:
- Test all score levels (A, B, C, D, E)
- Test boundary values (0, 25, 50, 95, 105)
- Test invalid values (negative, NaN)
- Test over-declaration cases (>105%)
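
A table-driven check in the spirit of the tests listed above could look like the following sketch. It is not the PR's test suite: the `get_score_from_tp` body here is a compact stand-in built on `bisect_left`, and the case list is an assumption derived from the documented thresholds.

```python
import math
from bisect import bisect_left

# Inclusive upper bounds of the D, C, B, A intervals; anything above is E.
UPPER_BOUNDS = [25, 50, 95, 105]
SCORES = "DCBAE"


def get_score_from_tp(tp: float) -> str:
    # Stand-in implementation, assumed for this example only.
    if math.isnan(tp) or tp <= 0:
        return "E"
    # bisect_left keeps each boundary value in the lower interval.
    return SCORES[bisect_left(UPPER_BOUNDS, tp)]


CASES = [
    (float("nan"), "E"), (-5, "E"), (0, "E"),           # invalid or no data
    (0.01, "D"), (25, "D"),                               # ]0%, 25%]
    (25.01, "C"), (50, "C"),                              # ]25%, 50%]
    (50.01, "B"), (95, "B"),                              # ]50%, 95%]
    (95.01, "A"), (105, "A"),                             # ]95%, 105%]
    (105.01, "E"), (200, "E"),                            # over-declaration
]


def find_failures() -> list[tuple[float, str, str]]:
    # Return (input, expected, actual) for every mismatching case.
    results = ((tp, expected, get_score_from_tp(tp)) for tp, expected in CASES)
    return [r for r in results if r[1] != r[2]]


failures = find_failures()
assert not failures, failures
```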

no_jira
jb-delafosse force-pushed the back/update-subsidy-score-thresholds branch from 6d76e59 to a27bee0 on January 21, 2026 21:53
jb-delafosse merged commit e3d46ba into main on Jan 21, 2026
2 checks passed