Classification evaluation.

Copyright 2017-2025, Voxel51, Inc.

Class BinaryClassificationResults Class that stores the results of a binary classification evaluation.
Class BinaryEvaluation Binary classification evaluation.
Class BinaryEvaluationConfig Binary evaluation config.
Class ClassificationEvaluation Base class for classification evaluation methods.
Class ClassificationEvaluationConfig Base class for configuring ClassificationEvaluation instances.
Class ClassificationResults Class that stores the results of a classification evaluation.
Class SimpleEvaluation Standard classification evaluation.
Class SimpleEvaluationConfig Simple classification evaluation config.
Class TopKEvaluation Top-k classification evaluation.
Class TopKEvaluationConfig Top-k classification evaluation config.
Function evaluate_classifications Evaluates the classification predictions in the given collection with respect to the specified ground truth labels.
Function _evaluate_top_k Undocumented
Function _parse_config Undocumented
Function _to_binary_scores Undocumented
def evaluate_classifications(samples, pred_field, gt_field='ground_truth', eval_key=None, classes=None, missing=None, method=None, custom_metrics=None, progress=None, **kwargs): (source)

Evaluates the classification predictions in the given collection with respect to the specified ground truth labels.

By default, this method simply compares the ground truth and prediction for each sample, but other strategies such as binary evaluation and top-k matching can be configured via the method parameter.
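The default strategy can be pictured as a direct per-sample label comparison. The following is an illustrative sketch only, not FiftyOne's implementation; the `simple_eval` helper and its `missing` default are hypothetical:

```python
# Illustrative sketch of the default "simple" strategy: each prediction is
# compared directly against its ground truth label. This is NOT the library's
# code; it only demonstrates the per-sample comparison described above.
def simple_eval(gt_labels, pred_labels, missing="(none)"):
    results = []
    for gt, pred in zip(gt_labels, pred_labels):
        # None-valued labels are mapped to the `missing` string, mirroring
        # the `missing` parameter of evaluate_classifications()
        gt = missing if gt is None else gt
        pred = missing if pred is None else pred
        results.append(gt == pred)
    return results

matches = simple_eval(["cat", "dog", None], ["cat", "cow", "cat"])
accuracy = sum(matches) / len(matches)
```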

You can customize the evaluation method by passing additional parameters for the method's config class as kwargs.

The natively provided method values and their associated configs are:

  • "simple" (default): SimpleEvaluationConfig
  • "top-k": TopKEvaluationConfig
  • "binary": BinaryEvaluationConfig

If an eval_key is specified, then this method will record some statistics on each sample:

  • When evaluating sample-level fields, an eval_key field will be populated on each sample recording whether that sample's prediction is correct.
  • When evaluating frame-level fields, an eval_key field will be populated on each frame recording whether that frame's prediction is correct. In addition, an eval_key field will be populated on each sample that records the average accuracy of the frame predictions of the sample.
Parameters
  samples: a fiftyone.core.collections.SampleCollection
  pred_field: the name of the field containing the predicted fiftyone.core.labels.Classification instances
  gt_field ("ground_truth"): the name of the field containing the ground truth fiftyone.core.labels.Classification instances
  eval_key (None): an evaluation key to use to refer to this evaluation
  classes (None): the list of possible classes. If not provided, the observed ground truth/predicted labels are used
  missing (None): a missing label string. Any None-valued labels are given this label for results purposes
  method (None): a string specifying the evaluation method to use. The supported values are fo.evaluation_config.classification_backends.keys(), and the default is fo.evaluation_config.default_classification_backend
  custom_metrics (None): an optional list of custom metrics to compute, or a dict mapping metric names to kwargs dicts
  progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
  **kwargs: optional keyword arguments for the constructor of the ClassificationEvaluationConfig being used
Returns
  a ClassificationResults
def _evaluate_top_k(ytrue, ypred, logits, k, targets_map): (source)

Undocumented
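Top-k matching, as a concept, counts a prediction as correct when the ground truth class is among the k classes with the highest logits. The following is a hedged sketch of that idea, not the body of `_evaluate_top_k`; the `top_k_correct` helper is hypothetical:

```python
import numpy as np

# Hypothetical sketch of top-k matching (not the library's code): a
# prediction counts as correct when the ground truth class index is among
# the k classes with the highest logits.
def top_k_correct(ytrue_idx, logits, k):
    logits = np.asarray(logits)
    # indices of the k highest-scoring classes, in no particular order
    top_k = np.argpartition(logits, -k)[-k:]
    return ytrue_idx in top_k

logits = [0.1, 0.5, 0.2, 0.9]  # class 3 is top-1; classes 1 and 3 are top-2
```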

def _parse_config(pred_field, gt_field, method, **kwargs): (source)

Undocumented

def _to_binary_scores(y, confs, pos_label): (source)

Undocumented
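A common convention for binary evaluation is to convert (label, confidence) pairs into positive-class scores: the score is the confidence when the prediction is the positive class, and 1 - confidence otherwise. The sketch below illustrates that convention; it is an assumption about what a helper like `_to_binary_scores(y, confs, pos_label)` plausibly computes, not FiftyOne's actual code:

```python
# Hedged sketch (assumption, not the library's implementation): map each
# (label, confidence) pair to a positive-class score. Predictions of the
# positive class keep their confidence; other predictions score 1 - conf.
def to_binary_scores(labels, confs, pos_label):
    scores = []
    for label, conf in zip(labels, confs):
        conf = 0.0 if conf is None else conf
        scores.append(conf if label == pos_label else 1.0 - conf)
    return scores

scores = to_binary_scores(["pos", "neg"], [0.8, 0.7], "pos")
```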