F-Score

Table of contents
  1. Confusion matrix
  2. Precision and Recall
    1. Precision
    2. Recall
  3. F1-score

Confusion matrix

| Actual \ Prediction | Positive       | Negative       |
| ------------------- | -------------- | -------------- |
| Positive            | True Positive  | False Negative |
| Negative            | False Positive | True Negative  |
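
As a minimal Python sketch (assuming binary labels encoded as 0/1), the four cells can be counted directly:

```python
def confusion_counts(y_true, y_pred):
    """Count the four confusion-matrix cells for binary 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn
```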

Precision and Recall

[Figure: precision and recall illustrated. Walber, CC BY-SA 4.0, via Wikimedia Commons]

Precision

Also called the positive predictive value (PPV),

\[precision = \frac{TP}{TP + FP}\]
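
A minimal sketch in Python, given the raw counts (returning $0.0$ when nothing was predicted positive):

```python
def precision(tp: int, fp: int) -> float:
    """PPV: fraction of predicted positives that are actually positive."""
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0
```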

Recall

Also called the sensitivity or true positive rate (TPR),

\[recall = \frac{TP}{TP + FN}\]
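
The corresponding sketch (returning $0.0$ when there are no actual positives):

```python
def recall(tp: int, fn: int) -> float:
    """Sensitivity: fraction of actual positives that were predicted positive."""
    return tp / (tp + fn) if (tp + fn) > 0 else 0.0
```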

F1-score

An F-score is a measure of a binary classifier's accuracy that combines precision and recall.

There exists a generalized F-score, $F_\beta$, where $\beta$ weights the importance of recall relative to precision.
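
For reference, $F_\beta$ is the weighted harmonic mean of precision and recall, with recall considered $\beta$ times as important as precision:

\[F_\beta = (1 + \beta^2) \cdot \frac{precision \cdot recall}{\beta^2 \cdot precision + recall}\]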

However, the balanced F-score (or $F_1$ score), where precision and recall are considered equally, is the harmonic mean of precision and recall:

\[F_1 = \frac{2}{recall^{-1} + precision^{-1}} = 2 \cdot \frac{precision \cdot recall}{precision + recall} = \frac{TP}{TP + \frac{1}{2}(FP + FN)}\]

Where $0 \le F_1 \le 1$.

An $F_1$ of $1.0$ means perfect precision and recall, while $0$ means that either precision or recall was $0$.
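
A minimal Python sketch using the count form above (defined as $0.0$ when all three counts are zero):

```python
def f1(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall, via the count form."""
    return tp / (tp + 0.5 * (fp + fn)) if (tp + fp + fn) > 0 else 0.0
```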

Always remember that the F1 score (and the underlying confusion-matrix counts: TP, FP, FN, TN) depends on which classification threshold was selected (by examining the ROC curve, etc.).
A low F1 score does not represent the performance of the entire model, only its performance at a particular classification threshold.
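
As an illustration with hypothetical scores and labels, sweeping the threshold changes the F1 score (here using scikit-learn's f1_score):

```python
from sklearn.metrics import f1_score

# Hypothetical model probabilities and ground-truth labels for illustration.
scores = [0.1, 0.4, 0.35, 0.8, 0.65, 0.2]
y_true = [0, 0, 1, 1, 1, 0]

for threshold in (0.3, 0.5, 0.7):
    y_pred = [1 if s >= threshold else 0 for s in scores]
    print(f"threshold={threshold}: F1={f1_score(y_true, y_pred):.3f}")
# The same model yields a different F1 at each threshold.
```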