Precision, Recall and Friends for Dummies
Qualitative descriptions of popular machine learning and statistical analysis jargon
2 min read · Dec 18, 2018
All of the following metrics are on a scale from 0 to 1, where a better classifier scores closer to 1 and a bad classifier scores closer to 0.
Precision
- The quality of each positive prediction. Given a positive prediction, what are the odds that it was correct?
- Σ True positive / Σ Predicted condition positive
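A minimal sketch of the formula in Python, using made-up counts (the `tp` and `fp` numbers are hypothetical):

```python
tp = 40  # true positives: predicted positive and actually positive
fp = 10  # false positives: predicted positive but actually negative

# Σ True positive / Σ Predicted condition positive
precision = tp / (tp + fp)
print(precision)  # 0.8
```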
Recall = Sensitivity = True Positive Rate
- Probability of marking a positive as such. What percentage of the positives were correctly identified? How good the classifier is at avoiding false negatives.
- Σ True positive / Σ Condition positive
- 1 - (Σ False negatives / Σ Condition positive)
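The same kind of sketch for recall, with a hypothetical `fn` count; both forms above give the same number:

```python
tp = 40  # true positives
fn = 20  # false negatives: predicted negative but actually positive

condition_positive = tp + fn  # Σ Condition positive

# Σ True positive / Σ Condition positive
recall = tp / condition_positive
# 1 - (Σ False negatives / Σ Condition positive)
recall_alt = 1 - fn / condition_positive
print(recall, recall_alt)  # 0.666..., 0.666...
```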
Specificity = Selectivity = True Negative Rate
- Probability of marking a negative as such. What percentage of the negatives were correctly identified?
- Extra important for medical diagnostics because you don’t want to tell someone they have cancer when they don’t. Practically useless when evaluating a search engine because it’s easy to have billions of true negatives and a 99.999% specificity in that case.
- Σ True negative / Σ Condition negative
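A sketch that also illustrates the search-engine caveat: with a hypothetical mountain of true negatives, specificity stays near 1 no matter how many false positives the classifier makes.

```python
tn = 1_000_000_000  # true negatives: a search engine correctly ignores almost everything
fp = 10_000         # false positives

# Σ True negative / Σ Condition negative
specificity = tn / (tn + fp)
print(specificity)  # 0.99999
```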
F1 Score = Sørensen–Dice coefficient
- How good the classifier is overall, while ignoring the number of true negatives. How much overlap there is between the classified-positive group and the actually-positive group. The harmonic mean of precision and recall.
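A sketch showing both views of the same score on hypothetical counts: the harmonic mean of precision and recall, and the Sørensen–Dice overlap computed directly from the confusion-matrix cells (true negatives never appear):

```python
tp, fp, fn = 40, 10, 20  # hypothetical counts; tn is deliberately absent

precision = tp / (tp + fp)
recall = tp / (tp + fn)

# Harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)
# Sørensen–Dice coefficient: overlap between classified-positive and actually-positive
dice = 2 * tp / (2 * tp + fp + fn)
print(f1, dice)  # both ~0.727
```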
In the next post I’ll prove that F1 = DSC and give some intuition for it, alongside a visualization.
Identity Equations
You can infer these from the confusion matrix rows and columns. They’re useful, for example, for proving F1 = DSC.
- Σ True positive + Σ False negative = Σ Condition positive
- Σ True negative + Σ False positive = Σ Condition negative
- Σ True negative + Σ True positive + Σ False negative + Σ False positive = Σ All samples
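As a quick check, here is a confusion matrix with hypothetical counts and the three identities asserted directly:

```python
# Hypothetical 2x2 confusion matrix:
#                     predicted positive   predicted negative
# actually positive        tp = 40              fn = 20
# actually negative        fp = 10              tn = 930
tp, fn, fp, tn = 40, 20, 10, 930

assert tp + fn == 60              # Σ Condition positive (first row)
assert tn + fp == 940             # Σ Condition negative (second row)
assert tp + fn + fp + tn == 1000  # Σ All samples (the whole matrix)
```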