Precision-Recall (PR) curves are useful for evaluating the performance of a binary classifier on highly skewed datasets, where one class is much more prevalent than the other. This situation is common in biology, where most things have little to no effect and only a small subset has a large effect. In contrast to ROC curves, PR curves do not overestimate performance in these cases (Davis and Goadrich 2006). ROC curves are prone to this problem because they rely on the false positive rate (FPR), defined as $$\frac{FP}{FP + TN}$$, where $$FP$$ and $$TN$$ are the numbers of false positives and true negatives, respectively. Since $$TN \gg FP$$ for skewed datasets, even a large change in the number of false positives barely moves the FPR, so ROC curves are insensitive to false positives and look overly optimistic.
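
As a hypothetical illustration of that insensitivity (the counts are made up for the example), take a dataset with 10,000 negatives and compare two classifiers, A and B, that differ tenfold in their number of false positives:

$$
\mathrm{FPR}_A = \frac{10}{10 + 9990} = 0.001, \qquad \mathrm{FPR}_B = \frac{100}{100 + 9900} = 0.01
$$

Classifier B makes ten times as many false calls, yet the two points sit less than 0.01 apart on the x-axis of a ROC plot.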

PR curves, on the other hand, avoid this problem because they do not use $$TN$$: precision and recall are defined as $$\frac{TP}{TP+FP}$$ and $$\frac{TP}{TP+FN}$$, respectively, where $$TP$$ and $$FN$$ are the numbers of true positives and false negatives. Intuitively, precision measures what fraction of the called positives are correct, and recall measures what fraction of the actual positives the algorithm called. Generating the curves is all very nice, but when scanning through a large parameter space it is desirable to collapse each curve into a single value, most often the area under the curve (AUC). However, unlike with ROC curves, there isn’t a single accepted way of computing the AUC of a PR curve (AUPRC).
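
To make these definitions concrete, here is a minimal Julia sketch that tallies the four confusion counts and derives precision, recall, and FPR from them. The helper names (`confusion_counts`, `precision_of`, `recall_of`, `fpr_of`) and the toy counts are my own assumptions for illustration, not part of any library.

```julia
# Hypothetical helper: tally confusion counts from true labels and predicted calls.
function confusion_counts(truth, called)
    length(truth) == length(called) || throw(ArgumentError("length mismatch"))
    tp = fp = fn = tn = 0
    for (t, c) in zip(truth, called)
        if t && c
            tp += 1      # positive, called positive
        elseif !t && c
            fp += 1      # negative, called positive
        elseif t && !c
            fn += 1      # positive, missed
        else
            tn += 1      # negative, correctly left alone
        end
    end
    return (tp = tp, fp = fp, fn = fn, tn = tn)
end

# Metrics as defined in the text; note that only fpr_of touches tn.
precision_of(c) = c.tp / (c.tp + c.fp)
recall_of(c)    = c.tp / (c.tp + c.fn)
fpr_of(c)       = c.fp / (c.fp + c.tn)

# Made-up skewed example: 10 positives, 990 negatives.
truth  = vcat(trues(10), falses(990))
called = vcat(trues(8), falses(2), trues(20), falses(970))

c = confusion_counts(truth, called)
println("precision = ", precision_of(c))
println("recall    = ", recall_of(c))
println("FPR       = ", fpr_of(c))
```

With these made-up counts the recall is 0.8 and the FPR is about 0.02, which looks excellent, while the precision is only about 0.29: most of the calls are wrong, and only the precision shows it.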

I recently found an interesting paper by Boyd, Eng, and Page (2013) that explored different ways of computing the AUPRC. They showed that there are some good and some very bad ways of computing this value, and they generated some really nice figures in R. I much prefer Julia, so I decided to recreate some of the results of the paper using it. My implementation is pretty fast, but I would gladly accept any pull requests to improve it.

## Precision, recall, and AUC calculation
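
A minimal sketch of one way to do this calculation in plain Julia, not necessarily the implementation described above. It builds one (recall, precision) point per distinct score threshold and then summarizes the curve with two estimators: average precision, and a trapezoidal rule that extends the first precision value flat back to recall 0. The function names and the flat extension are my assumptions; keep in mind that linear interpolation between PR points can be overly optimistic (Davis and Goadrich 2006).

```julia
# One (recall, precision) point per distinct threshold, highest score first.
function pr_points(scores::AbstractVector{<:Real}, labels::AbstractVector{Bool})
    length(scores) == length(labels) || throw(ArgumentError("scores and labels differ in length"))
    npos = count(labels)
    npos > 0 || throw(ArgumentError("need at least one positive example"))
    order = sortperm(scores; rev = true)
    recalls, precisions = Float64[], Float64[]
    tp = 0
    fp = 0
    for (k, i) in enumerate(order)
        if labels[i]
            tp += 1
        else
            fp += 1
        end
        # Emit a point only after the last item in a run of tied scores.
        if k == length(order) || scores[order[k + 1]] != scores[i]
            push!(recalls, tp / npos)
            push!(precisions, tp / (tp + fp))
        end
    end
    return recalls, precisions
end

# Average precision: step-wise sum of precision weighted by the gain in recall.
function average_precision(recalls, precisions)
    ap = 0.0
    prev_r = 0.0
    for (r, p) in zip(recalls, precisions)
        ap += (r - prev_r) * p
        prev_r = r
    end
    return ap
end

# Trapezoidal rule over the observed points (assumption: the first precision
# value is extended flat back to recall 0).
function trapezoid_auprc(recalls, precisions)
    area = 0.0
    prev_r, prev_p = 0.0, first(precisions)
    for (r, p) in zip(recalls, precisions)
        area += (r - prev_r) * (p + prev_p) / 2
        prev_r, prev_p = r, p
    end
    return area
end

# Toy example with two positives among five items.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
toy_labels = [true, false, true, false, false]
toy_r, toy_p = pr_points(toy_scores, toy_labels)
println("average precision = ", average_precision(toy_r, toy_p))
println("trapezoid AUPRC   = ", trapezoid_auprc(toy_r, toy_p))
```

The two estimators disagree even on this tiny example, which is exactly why the choice of AUPRC estimator matters.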

## Plotting

Then to recreate Figure 2: The artificial datasets with the positive distributions in blue and the negative ones in grey
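
For readers who want to play along, here is a rough sketch of how one such skewed artificial dataset could be simulated. It assumes a binormal setup (standard-normal scores for negatives, unit-variance scores shifted by 1 for positives) with 10% positives; the function name `binormal_sample`, the shift, and the seed are my assumptions rather than values taken from the figure.

```julia
using Random

# Hypothetical generator: draw scores for a skewed binormal dataset.
# frac_pos and shift are illustrative defaults, not values from the paper.
function binormal_sample(rng::AbstractRNG, n::Integer; frac_pos = 0.1, shift = 1.0)
    npos = round(Int, frac_pos * n)
    nneg = n - npos
    # Positives come first, then negatives; labels follow the same order.
    scores = vcat(randn(rng, npos) .+ shift, randn(rng, nneg))
    labels = vcat(trues(npos), falses(nneg))
    return scores, labels
end

rng = MersenneTwister(42)
scores, labels = binormal_sample(rng, 1000)
println("positives: ", count(labels), " of ", length(labels))
```

The returned `scores` and `labels` vectors are in the shape that a PR-curve routine would typically consume: one real-valued score and one Boolean label per example.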

And now we’re ready for Figure 3:

Voilà, we have Figure 3: Precision-recall curves on highly skewed artificial datasets, where 90% of the data is negative.

Boyd, Kendrick, Kevin H. Eng, and C. David Page. 2013. “Area Under the Precision-Recall Curve: Point Estimates and Confidence Intervals.” In Machine Learning and Knowledge Discovery in Databases, edited by Hendrik Blockeel, Kristian Kersting, Siegfried Nijssen, and Filip Železný, 451–66. Lecture Notes in Computer Science 8190. Springer Berlin Heidelberg. http://link.springer.com/chapter/10.1007/978-3-642-40994-3_29.

Davis, Jesse, and Mark Goadrich. 2006. “The Relationship Between Precision-Recall and ROC Curves.” In Proceedings of the 23rd International Conference on Machine Learning, 233–40. ICML ’06. New York, NY, USA: ACM. https://doi.org/10.1145/1143844.1143874.