Terminology and derivations from a confusion matrix
condition positive (P)
The number of real positive cases in the data
condition negative (N)
The number of real negative cases in the data
true positive (TP)
A test result that correctly indicates the presence of a condition or characteristic
true negative (TN)
A test result that correctly indicates the absence of a condition or characteristic
false positive (FP)
A test result that wrongly indicates the presence of a particular condition or attribute
false negative (FN)
A test result that wrongly indicates the absence of a particular condition or attribute
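The four cell counts can be tallied directly from paired ground-truth and predicted labels. A minimal Python sketch (the labels and variable names here are illustrative, not from the source):

```python
# Tally the four confusion-matrix cells from paired binary labels,
# with 1 meaning "condition present". Illustrative data only.
y_true = [1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 1]

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))

P, N = tp + fn, fp + tn  # condition positive / condition negative
print(tp, fn, fp, tn)    # 3 1 2 2
```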
sensitivity, recall, hit rate, or true positive rate (TPR)
$\mathrm{TPR} = \frac{\mathrm{TP}}{\mathrm{P}} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}} = 1 - \mathrm{FNR}$
specificity, selectivity, or true negative rate (TNR)
$\mathrm{TNR} = \frac{\mathrm{TN}}{\mathrm{N}} = \frac{\mathrm{TN}}{\mathrm{TN} + \mathrm{FP}} = 1 - \mathrm{FPR}$
precision or positive predictive value (PPV)
$\mathrm{PPV} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}} = 1 - \mathrm{FDR}$
negative predictive value (NPV)
$\mathrm{NPV} = \frac{\mathrm{TN}}{\mathrm{TN} + \mathrm{FN}} = 1 - \mathrm{FOR}$
miss rate or false negative rate (FNR)
$\mathrm{FNR} = \frac{\mathrm{FN}}{\mathrm{P}} = \frac{\mathrm{FN}}{\mathrm{FN} + \mathrm{TP}} = 1 - \mathrm{TPR}$
fall-out or false positive rate (FPR)
$\mathrm{FPR} = \frac{\mathrm{FP}}{\mathrm{N}} = \frac{\mathrm{FP}}{\mathrm{FP} + \mathrm{TN}} = 1 - \mathrm{TNR}$
false discovery rate (FDR)
$\mathrm{FDR} = \frac{\mathrm{FP}}{\mathrm{FP} + \mathrm{TP}} = 1 - \mathrm{PPV}$
false omission rate (FOR)
$\mathrm{FOR} = \frac{\mathrm{FN}}{\mathrm{FN} + \mathrm{TN}} = 1 - \mathrm{NPV}$
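A sketch computing the eight rates above from the four cell counts; the counts reuse the illustrative example, and real code should guard against zero denominators:

```python
tp, fn, fp, tn = 3, 1, 2, 2  # illustrative counts from the sketch above

tpr = tp / (tp + fn)   # sensitivity / recall / hit rate
tnr = tn / (tn + fp)   # specificity / selectivity
ppv = tp / (tp + fp)   # precision
npv = tn / (tn + fn)   # negative predictive value
fnr = fn / (fn + tp)   # miss rate; equals 1 - tpr
fpr = fp / (fp + tn)   # fall-out; equals 1 - tnr
fdr = fp / (fp + tp)   # false discovery rate; equals 1 - ppv
fomr = fn / (fn + tn)  # false omission rate; equals 1 - npv ("for" is reserved)
print(tpr, tnr, ppv, npv)  # 0.75 0.5 0.6 0.666...
```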
Positive likelihood ratio (LR+)
$\mathrm{LR+} = \frac{\mathrm{TPR}}{\mathrm{FPR}}$
Negative likelihood ratio (LR-)
$\mathrm{LR-} = \frac{\mathrm{FNR}}{\mathrm{TNR}}$
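The likelihood ratios follow directly from the rates; continuing the illustrative values:

```python
tpr, fpr, fnr, tnr = 0.75, 0.5, 0.25, 0.5  # from the example above

lr_plus = tpr / fpr    # LR+: how much a positive result raises the pre-test odds
lr_minus = fnr / tnr   # LR-: how much a negative result lowers the pre-test odds
print(lr_plus, lr_minus)  # 1.5 0.5
```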
prevalence threshold (PT)
$\mathrm{PT} = \frac{\sqrt{\mathrm{FPR}}}{\sqrt{\mathrm{TPR}} + \sqrt{\mathrm{FPR}}}$
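A sketch of the prevalence threshold on the same illustrative rates:

```python
from math import sqrt

tpr, fpr = 0.75, 0.5  # illustrative rates from above
pt = sqrt(fpr) / (sqrt(tpr) + sqrt(fpr))
print(round(pt, 4))  # 0.4495
```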
threat score (TS) or critical success index (CSI)
$\mathrm{TS} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN} + \mathrm{FP}}$
Prevalence
$\frac{\mathrm{P}}{\mathrm{P} + \mathrm{N}}$
accuracy (ACC)
$\mathrm{ACC} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{P} + \mathrm{N}} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{TN} + \mathrm{FP} + \mathrm{FN}}$
balanced accuracy (BA)
$\mathrm{BA} = \frac{\mathrm{TPR} + \mathrm{TNR}}{2}$
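Threat score, prevalence, accuracy, and balanced accuracy on the same illustrative counts:

```python
tp, fn, fp, tn = 3, 1, 2, 2
p, n = tp + fn, fp + tn

ts = tp / (tp + fn + fp)        # threat score / critical success index
prevalence = p / (p + n)
acc = (tp + tn) / (p + n)
ba = (tp / p + tn / n) / 2      # (TPR + TNR) / 2
print(ts, prevalence, acc, ba)  # 0.5 0.5 0.625 0.625
```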
F1 score
is the harmonic mean of precision and sensitivity: $\mathrm{F}_1 = 2 \times \frac{\mathrm{PPV} \times \mathrm{TPR}}{\mathrm{PPV} + \mathrm{TPR}} = \frac{2\,\mathrm{TP}}{2\,\mathrm{TP} + \mathrm{FP} + \mathrm{FN}}$
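The two F1 forms are algebraically identical; a quick numeric check on the illustrative counts:

```python
tp, fn, fp = 3, 1, 2
ppv = tp / (tp + fp)
tpr = tp / (tp + fn)

f1_harmonic = 2 * ppv * tpr / (ppv + tpr)
f1_counts = 2 * tp / (2 * tp + fp + fn)
assert abs(f1_harmonic - f1_counts) < 1e-9
print(f1_counts)  # 0.666...
```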
phi coefficient (φ or r_φ) or Matthews correlation coefficient (MCC)
$\mathrm{MCC} = \frac{\mathrm{TP} \times \mathrm{TN} - \mathrm{FP} \times \mathrm{FN}}{\sqrt{(\mathrm{TP} + \mathrm{FP})(\mathrm{TP} + \mathrm{FN})(\mathrm{TN} + \mathrm{FP})(\mathrm{TN} + \mathrm{FN})}}$
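A sketch of the MCC, with the common convention of returning 0 when a marginal total is empty (that guard is an assumption here, not from the source):

```python
from math import sqrt

tp, fn, fp, tn = 3, 1, 2, 2
num = tp * tn - fp * fn
den = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
mcc = num / den if den else 0.0  # any empty marginal zeroes the denominator
print(round(mcc, 4))  # 0.2582
```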
Fowlkes–Mallows index (FM)
$\mathrm{FM} = \sqrt{\frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}} \times \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}} = \sqrt{\mathrm{PPV} \times \mathrm{TPR}}$
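The Fowlkes–Mallows index is the geometric mean of precision and recall; on the illustrative counts:

```python
from math import sqrt

tp, fn, fp = 3, 1, 2
fm = sqrt((tp / (tp + fp)) * (tp / (tp + fn)))  # sqrt(PPV * TPR)
print(round(fm, 4))  # 0.6708
```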
informedness or bookmaker informedness (BM)
$\mathrm{BM} = \mathrm{TPR} + \mathrm{TNR} - 1$
markedness (MK) or deltaP (Δp)
$\mathrm{MK} = \mathrm{PPV} + \mathrm{NPV} - 1$
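Informedness and markedness on the illustrative counts:

```python
tp, fn, fp, tn = 3, 1, 2, 2
bm = tp / (tp + fn) + tn / (tn + fp) - 1  # TPR + TNR - 1
mk = tp / (tp + fp) + tn / (tn + fn) - 1  # PPV + NPV - 1
print(round(bm, 4), round(mk, 4))  # 0.25 0.2667
```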
Diagnostic odds ratio (DOR)
$\mathrm{DOR} = \frac{\mathrm{LR+}}{\mathrm{LR-}}$
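The diagnostic odds ratio equals LR+/LR−, which simplifies to (TP × TN)/(FP × FN); a numeric check on the illustrative values:

```python
tpr, fpr, fnr, tnr = 0.75, 0.5, 0.25, 0.5
tp, fn, fp, tn = 3, 1, 2, 2

dor = (tpr / fpr) / (fnr / tnr)        # LR+ / LR-
assert abs(dor - tp * tn / (fp * fn)) < 1e-9
print(dor)  # 3.0
```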
Sources: Fawcett (2006),[1] Piryonesi and El-Diraby (2020),[2] Powers (2011),[3] Ting (2011),[4] CAWCR,[5] Chicco and Jurman (2020, 2021, 2023),[6][7][8] Tharwat (2018),[9] and Balayla (2020).[10]
1. Fawcett, Tom (2006). "An Introduction to ROC Analysis" (PDF). Pattern Recognition Letters. 27 (8): 861–874. doi:10.1016/j.patrec.2005.10.010.
2. Piryonesi, S. Madeh; El-Diraby, Tamer E. (2020-03-01). "Data Analytics in Asset Management: Cost-Effective Prediction of the Pavement Condition Index". Journal of Infrastructure Systems. 26 (1): 04019036. doi:10.1061/(ASCE)IS.1943-555X.0000512.
3. Powers, David M. W. (2011). "Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness & Correlation". Journal of Machine Learning Technologies. 2 (1): 37–63.
4. Ting, Kai Ming (2011). Sammut, Claude; Webb, Geoffrey I. (eds.). Encyclopedia of Machine Learning. Springer. doi:10.1007/978-0-387-30164-8. ISBN 978-0-387-30164-8.
5. Brooks, Harold; Brown, Barb; Ebert, Beth; Ferro, Chris; Jolliffe, Ian; Koh, Tieh-Yong; Roebber, Paul; Stephenson, David (2015-01-26). "WWRP/WGNE Joint Working Group on Forecast Verification Research". Collaboration for Australian Weather and Climate Research. World Meteorological Organisation. Retrieved 2019-07-17.
6. Chicco, D.; Jurman, G. (January 2020). "The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation". BMC Genomics. 21 (1): 6-1–6-13. doi:10.1186/s12864-019-6413-7. PMC 6941312. PMID 31898477.
7. Chicco, D.; Toetsch, N.; Jurman, G. (February 2021). "The Matthews correlation coefficient (MCC) is more reliable than balanced accuracy, bookmaker informedness, and markedness in two-class confusion matrix evaluation". BioData Mining. 14 (13): 1–22. doi:10.1186/s13040-021-00244-z. PMC 7863449. PMID 33541410.
8. Chicco, D.; Jurman, G. (2023). "The Matthews correlation coefficient (MCC) should replace the ROC AUC as the standard metric for assessing binary classification". BioData Mining. 16 (1). doi:10.1186/s13040-023-00322-4. PMC 9938573.
9. Tharwat, A. (August 2018). "Classification assessment methods". Applied Computing and Informatics. doi:10.1016/j.aci.2018.08.003.
10. Balayla, Jacques (2020). "Prevalence threshold (ϕe) and the geometry of screening curves". PLoS One. 15 (10). doi:10.1371/journal.pone.0240215.