sklearn.metrics.f1_score

sklearn.metrics.f1_score(y_true, y_pred, *, labels=None, pos_label=1, average='binary', sample_weight=None, zero_division='warn')

Compute the F1 score, also known as balanced F-score or F-measure.

The F1 score can be interpreted as a harmonic mean of the precision and recall, where an F1 score reaches its best value at 1 and worst score at 0. The relative contributions of precision and recall to the F1 score are equal. The formula for the F1 score is:

\[\text{F1} = \frac{2 * \text{TP}}{2 * \text{TP} + \text{FP} + \text{FN}}\]

Where \(\text{TP}\) is the number of true positives, \(\text{FN}\) is the number of false negatives, and \(\text{FP}\) is the number of false positives. F1 is by default calculated as 0.0 when there are no true positives, false negatives, or false positives.
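As a sanity check, the formula can be reproduced directly from confusion-matrix counts. A minimal sketch (the toy labels are made up for illustration):

import numpy as np
from sklearn.metrics import confusion_matrix, f1_score

y_true = [0, 1, 1, 0, 1, 1]
y_pred = [0, 1, 0, 0, 1, 1]

# For binary labels, confusion_matrix is laid out as [[TN, FP], [FN, TP]]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

manual = 2 * tp / (2 * tp + fp + fn)          # 6 / 7 ≈ 0.857
assert np.isclose(manual, f1_score(y_true, y_pred))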

Support beyond binary targets is achieved by treating multiclass and multilabel data as a collection of binary problems, one for each label. For the binary case, setting average='binary' will return the F1 score for pos_label. If average is not 'binary', pos_label is ignored and the F1 scores for both classes are computed, then averaged or both returned (when average=None). Similarly, for multiclass and multilabel targets, the F1 scores for all labels are either returned or averaged depending on the average parameter. Use labels to specify the set of labels to calculate the F1 score for.
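For instance, on a small binary problem (data made up for illustration), pos_label selects which class is scored, while average=None returns both per-class scores in sorted label order:

from sklearn.metrics import f1_score

y_true = [0, 1, 1, 0]
y_pred = [0, 1, 0, 0]

f1_score(y_true, y_pred, pos_label=1)   # 0.666..., F1 of class 1 (the default)
f1_score(y_true, y_pred, pos_label=0)   # 0.8, F1 of class 0
f1_score(y_true, y_pred, average=None)  # array([0.8, 0.66666667]); pos_label ignored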

Read more in the User Guide.

Parameters:
y_true : 1d array-like, or label indicator array / sparse matrix

Ground truth (correct) target values.

y_pred : 1d array-like, or label indicator array / sparse matrix

Estimated targets as returned by a classifier.

labels : array-like, default=None

The set of labels to include when average != 'binary', and their order if average is None. Labels present in the data can be excluded, for example in multiclass classification to exclude a “negative class”. Labels not present in the data can be included and will be “assigned” 0 samples. For multilabel targets, labels are column indices. By default, all labels in y_true and y_pred are used in sorted order.

Changed in version 0.17: Parameter labels improved for multiclass problems.
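A brief illustration of labels, reusing the multiclass data from the Examples section below: restricting the average to a subset of classes, and including a label absent from the data:

from sklearn.metrics import f1_score

y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 2, 1, 0, 0, 1]

# Macro-average over classes 1 and 2 only, excluding class 0
f1_score(y_true, y_pred, labels=[1, 2], average='macro')   # 0.0
# Label 3 never occurs, so it is "assigned" 0 samples and yields a zero division
f1_score(y_true, y_pred, labels=[0, 1, 2, 3], average=None, zero_division=0.0)
# array([0.8, 0. , 0. , 0. ])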

pos_label : int, float, bool or str, default=1

The class to report if average='binary' and the data is binary, otherwise this parameter is ignored. For multiclass or multilabel targets, set labels=[pos_label] and average != 'binary' to report metrics for one label only.
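The documented workaround looks like this on the same multiclass data, with class 0 playing the role of pos_label:

from sklearn.metrics import f1_score

y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 2, 1, 0, 0, 1]

# pos_label itself is ignored for multiclass targets; restrict via labels instead
f1_score(y_true, y_pred, labels=[0], average='macro')   # 0.8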

average : {'micro', 'macro', 'samples', 'weighted', 'binary'} or None, default='binary'

This parameter is required for multiclass/multilabel targets. If None, the scores for each class are returned. Otherwise, this determines the type of averaging performed on the data:

'binary':

Only report results for the class specified by pos_label. This is applicable only if targets (y_{true,pred}) are binary.

'micro':

Calculate metrics globally by counting the total true positives, false negatives and false positives.

'macro':

Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.

'weighted':

Calculate metrics for each label, and find their average weighted by support (the number of true instances for each label). This alters 'macro' to account for label imbalance; it can result in an F-score that is not between precision and recall.

'samples':

Calculate metrics for each instance, and find their average (only meaningful for multilabel classification where this differs from accuracy_score); see the sketch below.
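To make the 'samples' option concrete, a small multilabel sketch (indicator data made up for illustration): each row is scored on its own, then the row scores are averaged, whereas accuracy_score demands an exact row match:

import numpy as np
from sklearn.metrics import f1_score, accuracy_score

y_true = np.array([[1, 1, 0],
                   [1, 0, 1]])
y_pred = np.array([[1, 0, 0],
                   [1, 0, 1]])

# Row 0: F1 = 2*1/(2*1+0+1) = 0.667; row 1: F1 = 1.0; mean = 0.833
f1_score(y_true, y_pred, average='samples')   # 0.8333...
# Subset accuracy counts row 0 as simply wrong
accuracy_score(y_true, y_pred)                # 0.5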

sample_weight : array-like of shape (n_samples,), default=None

Sample weights.
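Weights rescale each sample's contribution to the true-positive, false-positive, and false-negative counts. A quick sketch with made-up weights:

from sklearn.metrics import f1_score

y_true = [0, 1, 1, 0]
y_pred = [0, 1, 0, 0]

f1_score(y_true, y_pred)                              # 0.666...
# Up-weighting the one missed positive (index 2) lowers recall, and thus F1
f1_score(y_true, y_pred, sample_weight=[1, 1, 3, 1])  # 2*1/(2*1+0+3) = 0.4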

zero_division : {"warn", 0.0, 1.0, np.nan}, default="warn"

Sets the value to return when there is a zero division, i.e. when all predictions and labels are negative.

Notes:

- If set to "warn", this acts like 0, but a warning is also raised.
- If set to np.nan, such values will be excluded from the average.

New in version 1.3: np.nan option was added.
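One consequence worth seeing (assumes scikit-learn ≥ 1.3 for the np.nan option): an undefined per-class score drops out of a macro average instead of dragging it down. Illustrative data:

import numpy as np
from sklearn.metrics import f1_score

y_true = [0, 0, 1]
y_pred = [0, 0, 1]

# Class 2 appears in neither y_true nor y_pred, so its F1 is undefined
f1_score(y_true, y_pred, labels=[0, 1, 2], average='macro', zero_division=0.0)
# 0.666...  (the undefined class counts as 0)
f1_score(y_true, y_pred, labels=[0, 1, 2], average='macro', zero_division=np.nan)
# 1.0       (the undefined class is excluded from the mean)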

Returns:
f1_score : float or array of float, shape = [n_unique_labels]

F1 score of the positive class in binary classification or weighted average of the F1 scores of each class for the multiclass task.

See also

fbeta_score

Compute the F-beta score.

precision_recall_fscore_support

Compute the precision, recall, F-score, and support.

jaccard_score

Compute the Jaccard similarity coefficient score.

multilabel_confusion_matrix

Compute a confusion matrix for each class or sample.
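As a cross-check on these relationships: f1_score is the beta=1 special case of fbeta_score, and precision_recall_fscore_support returns the same per-class F-scores alongside precision, recall, and support:

import numpy as np
from sklearn.metrics import f1_score, fbeta_score, precision_recall_fscore_support

y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 2, 1, 0, 0, 1]

assert np.isclose(f1_score(y_true, y_pred, average='macro'),
                  fbeta_score(y_true, y_pred, beta=1.0, average='macro'))

precision, recall, fscore, support = precision_recall_fscore_support(y_true, y_pred)
# fscore -> array([0.8, 0., 0.]), matching f1_score(y_true, y_pred, average=None)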

Notes

When true positive + false positive + false negative == 0 (i.e. a class is completely absent from both y_true and y_pred), the F-score is undefined. In such cases, by default the F-score will be set to 0.0 and UndefinedMetricWarning will be raised. This behavior can be modified by setting the zero_division parameter.
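If you prefer the undefined case to fail loudly during development, one option (standard warnings machinery, not specific to this function) is to escalate the warning to an error:

import warnings
from sklearn.exceptions import UndefinedMetricWarning
from sklearn.metrics import f1_score

with warnings.catch_warnings():
    warnings.simplefilter('error', category=UndefinedMetricWarning)
    try:
        f1_score([0, 0, 0], [0, 0, 0])   # no positives anywhere -> undefined
    except UndefinedMetricWarning as exc:
        print(f'caught: {exc}')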


Examples

>>> import numpy as np
>>> from sklearn.metrics import f1_score
>>> y_true = [0, 1, 2, 0, 1, 2]
>>> y_pred = [0, 2, 1, 0, 0, 1]
>>> f1_score(y_true, y_pred, average='macro')
0.26...
>>> f1_score(y_true, y_pred, average='micro')
0.33...
>>> f1_score(y_true, y_pred, average='weighted')
0.26...
>>> f1_score(y_true, y_pred, average=None)
array([0.8, 0. , 0. ])

>>> # binary classification
>>> y_true_empty = [0, 0, 0, 0, 0, 0]
>>> y_pred_empty = [0, 0, 0, 0, 0, 0]
>>> f1_score(y_true_empty, y_pred_empty)
0.0...
>>> f1_score(y_true_empty, y_pred_empty, zero_division=1.0)
1.0...
>>> f1_score(y_true_empty, y_pred_empty, zero_division=np.nan)
nan...

>>> # multilabel classification
>>> y_true = [[0, 0, 0], [1, 1, 1], [0, 1, 1]]
>>> y_pred = [[0, 0, 0], [1, 1, 1], [1, 1, 0]]
>>> f1_score(y_true, y_pred, average=None)
array([0.66666667, 1. , 0.66666667])

Examples using sklearn.metrics.f1_score

Probability Calibration curves

Precision-Recall

Semi-supervised Classification on a Text Dataset
