ABSTRACT
The Precision-Recall (PR) curve is a widely used visual tool for evaluating how well a scoring function discriminates between two populations. The purpose of this paper is to examine both theoretical and practical issues related to the statistical estimation of PR curves based on classification data. Consistency and asymptotic normality of the empirical counterpart of the PR curve in sup norm are rigorously established. Finally, the issue of building confidence bands in the PR space is considered, and a specific resampling procedure based on a smoothed and truncated version of the empirical distribution of the data is promoted. Theoretical and computational arguments are presented to explain why such a bootstrap is preferable to a "naive" bootstrap in this setup.
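To make the objects named in the abstract concrete, the following is a minimal sketch of an empirical PR curve and a sup-norm band obtained by smoothed resampling. It is an illustration under stated assumptions, not the procedure of the paper: the function names `empirical_pr` and `smoothed_bootstrap_band` are hypothetical, the Gaussian kernel and normal-reference bandwidth are assumed choices, the two classes are resampled separately with their sizes held fixed, the deviations are centered on the empirical curve, and the truncation step is replaced by simply restricting the recall grid away from 0.

```python
import numpy as np


def empirical_pr(pos, neg, recalls):
    """Empirical precision at each target recall level in (0, 1].

    pos, neg: scores of the positive and negative instances.
    """
    pos = np.asarray(pos, dtype=float)
    neg = np.asarray(neg, dtype=float)
    recalls = np.asarray(recalls, dtype=float)
    # Threshold for recall r: the (1 - r) quantile of the positive scores.
    thresholds = np.quantile(pos, 1.0 - recalls)
    # True and false positives among instances scored at or above each threshold.
    tp = (pos[None, :] >= thresholds[:, None]).sum(axis=1)
    fp = (neg[None, :] >= thresholds[:, None]).sum(axis=1)
    return tp / np.maximum(tp + fp, 1)


def smoothed_bootstrap_band(pos, neg, recalls, n_boot=200, alpha=0.05, seed=0):
    """Sup-norm band around the empirical PR curve (illustrative sketch only)."""
    rng = np.random.default_rng(seed)
    pos = np.asarray(pos, dtype=float)
    neg = np.asarray(neg, dtype=float)
    center = empirical_pr(pos, neg, recalls)

    def bandwidth(x):
        # Normal-reference rule of thumb for a Gaussian kernel (an assumption).
        return 1.06 * x.std(ddof=1) * len(x) ** (-1.0 / 5.0)

    h_pos, h_neg = bandwidth(pos), bandwidth(neg)
    sup_dev = np.empty(n_boot)
    for b in range(n_boot):
        # Smoothed bootstrap draw: resample with replacement, then add kernel noise.
        pos_b = rng.choice(pos, size=len(pos)) + h_pos * rng.standard_normal(len(pos))
        neg_b = rng.choice(neg, size=len(neg)) + h_neg * rng.standard_normal(len(neg))
        sup_dev[b] = np.max(np.abs(empirical_pr(pos_b, neg_b, recalls) - center))
    # Half-width of the band: the (1 - alpha) quantile of the sup-norm deviations.
    width = np.quantile(sup_dev, 1.0 - alpha)
    return center, np.clip(center - width, 0.0, 1.0), np.clip(center + width, 0.0, 1.0)
```

A call such as `smoothed_bootstrap_band(pos_scores, neg_scores, np.linspace(0.1, 0.9, 17))` returns the empirical curve with lower and upper band limits on the chosen recall grid. Dropping the two noise terms gives the "naive" bootstrap that the abstract compares against.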
Index Terms
- Nonparametric estimation of the precision-recall curve