Problem description
When computing average precision (AP) with sklearn, the result is NaN if every label is negative. For example:
>>> import numpy as np
>>> from sklearn.metrics import average_precision_score
>>> average_precision_score(np.array([0, 0, 0, 0, 0]), np.array([0.1, 0.1, 0.1, 0.1, 0.1]))
xxx/lib/python3.7/site-packages/sklearn/metrics/_ranking.py:864: RuntimeWarning: invalid value encountered in true_divide
recall = tps / tps[-1]
nan
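The warning points at the division `recall = tps / tps[-1]`. Here is a minimal sketch of why that yields NaN, assuming `tps` holds cumulative true-positive counts (a simplification for illustration, not the actual sklearn implementation). With no positive labels every count is 0, so the division is 0 / 0:

```python
import numpy as np

# All labels are negative, so the cumulative true-positive count never rises above 0.
y_true = np.array([0, 0, 0, 0, 0])
tps = np.cumsum(y_true).astype(float)

# Silence NumPy's "invalid value" warning so that only the result is shown.
with np.errstate(invalid="ignore"):
    recall = tps / tps[-1]  # 0.0 / 0.0 for every element

print(np.isnan(recall).all())  # True
```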
Solution
According to the issue "average_precision_score does not return correct AP when all negative ground truth labels", this bug is fixed as of scikit-learn 1.1.0, so upgrading sklearn resolves it.
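To decide whether an environment needs the upgrade, the version string can be compared against 1.1. This is a minimal sketch; `has_all_negative_fix` is a hypothetical helper name, and it assumes the version string starts with numeric major and minor parts:

```python
def has_all_negative_fix(version):
    """Return True if `version` is at least 1.1 (hypothetical helper)."""
    # Only major and minor are compared; the patch part and any suffix are ignored.
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) >= (1, 1)

print(has_all_negative_fix("1.0.2"))  # False
print(has_all_negative_fix("1.1.0"))  # True
```

With scikit-learn installed, the installed version string is available as `sklearn.__version__` and can be passed to this helper.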
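Specifically, in a Python (>= 3.8) environment, which scikit-learn 1.1.0 requires, run the following command to install the newer version: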
pip install scikit-learn==1.1.0
The result after upgrading:
>>> import numpy as np
>>> from sklearn.metrics import average_precision_score
>>> average_precision_score(np.array([0, 0, 0, 0, 0]), np.array([0.1, 0.1, 0.1, 0.1, 0.1]))
xxx/lib/python3.8/site-packages/sklearn/metrics/_ranking.py:874: UserWarning: No positive class found in y_true, recall is set to one for all thresholds.
warnings.warn(
-0.0
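If upgrading is not possible, or the warning and the `-0.0` result are unwanted, one option is to guard the call. This is a minimal sketch, assuming binary labels where 1 marks the positive class. `safe_average_precision` and its `default` parameter are hypothetical names, not part of sklearn, and which value to report for an all-negative input is a judgment call:

```python
import numpy as np
from sklearn.metrics import average_precision_score

def safe_average_precision(y_true, y_score, default=0.0):
    """Hypothetical wrapper: return `default` when y_true has no positives."""
    y_true = np.asarray(y_true)
    # AP is undefined without at least one positive label, so skip the
    # sklearn call and return the caller-chosen placeholder instead.
    if not np.any(y_true == 1):
        return default
    return average_precision_score(y_true, y_score)

print(safe_average_precision([0, 0, 0, 0, 0], [0.1, 0.1, 0.1, 0.1, 0.1]))  # 0.0
print(safe_average_precision([0, 1], [0.1, 0.9]))
```

The guard never reaches the division when there are no positives, so it behaves the same way on versions before and after the fix.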
References
1. average_precision_score does not return correct AP when all negative ground truth labels
2. FIX Fix recall in multilabel classification when true labels are all negative
3. scikit-learn 1.1.0