Add Precision-Recall curve in probscores (PR_curve)#531
Add Precision-Recall curve in probscores (PR_curve)#531DANIELV95 wants to merge 7 commits intopySTEPS:masterfrom
Conversation
…better for imbalanced distribution
dnerini
left a comment
There was a problem hiding this comment.
hi @DANIELV95 , thanks for this useful addition (PR curves are often more informative than ROC curves for rare-event verification). The structure (init/accum/compute) matches the ROC approach, which is good for API consistency.
This said, there are a few details worth correcting before merge:
- Missing visualization: I suggest that this PR be complemented with the relative plotting function
plot_PRin https://github.com/pySTEPS/pysteps/blob/master/pysteps/verification/plots.py (similar toplot_ROC). - Missing converage: we need basic units test along side the new method, see examples in https://github.com/pySTEPS/pysteps/blob/master/pysteps/tests/test_verification_probscores.py
| PR["corr_neg"][i] += np.sum(np.logical_and(~forecast_yes, ~obs_yes)) | ||
|
|
||
|
|
||
| def PR_curve_compute(PR, X_o, X_min, compute_area=False): |
There was a problem hiding this comment.
PR_curve_computedoes not use X_o or X_min anywhere, maybe a copy/paste error? please remove them
| false_alarms = PR["false_alarms"][i] | ||
|
|
||
| recall = hits / (hits + misses) if (hits + misses) > 0 else 0.0 | ||
| precision = hits / (hits + false_alarms) if (hits + false_alarms) > 0 else 1.0 |
There was a problem hiding this comment.
the convention using if (hits + false_alarms) > 0 else 1.0 should be documented in the docstrings as it can lead to some surprising results when thresholds yield zero predicted positives
There was a problem hiding this comment.
could you comment on how this is handled in scikit-learn for example?
| X_o = X_o[mask] | ||
| for i, p in enumerate(PR["prob_thrs"]): | ||
| forecast_yes = P_f >= p | ||
| obs_yes = X_o >= PR["X_min"] |
There was a problem hiding this comment.
this line computing obs_yes is constant and can be moved outside of the loop
Added Precision-Recall curve in probscores. Similar to ROC curve but better metric for imbalanced distribution.