class documentation

Abstract class representing an ordered collection of fiftyone.core.sample.Sample instances in a fiftyone.core.dataset.Dataset.

Class Method list_aggregations Returns a list of all available methods on this collection that apply fiftyone.core.aggregations.Aggregation operations to this collection.
Class Method list_view_stages Returns a list of all available methods on this collection that apply fiftyone.core.stages.ViewStage operations to this collection.
Method __add__ Undocumented
Method __bool__ Undocumented
Method __contains__ Undocumented
Method __getitem__ Undocumented
Method __iter__ Undocumented
Method __len__ Undocumented
Method __repr__ Undocumented
Method __str__ Undocumented
Method add_stage Applies the given fiftyone.core.stages.ViewStage to the collection.
Method aggregate Aggregates one or more fiftyone.core.aggregations.Aggregation instances.
Method annotate Exports the samples and optional label field(s) in this collection to the given annotation backend.
Method apply_model Applies the model to the samples in the collection.
Method bounds Computes the bounds of a numeric field of the collection.
Method compute_embeddings Computes embeddings for the samples in the collection using the given model.
Method compute_metadata Populates the metadata field of all samples in the collection.
Method compute_patch_embeddings Computes embeddings for the image patches defined by patches_field of the samples in the collection using the given model.
Method concat Concatenates the contents of the given SampleCollection to this collection.
Method count Counts the number of field values in the collection.
Method count_label_tags Counts the occurrences of all label tags in the specified label field(s) of this collection.
Method count_sample_tags Counts the occurrences of sample tags in this collection.
Method count_values Counts the occurrences of field values in the collection.
Method create_index Creates an index on the given field or with the given specification, if necessary.
Method delete_annotation_run Deletes the annotation run with the given key from this collection.
Method delete_annotation_runs Deletes all annotation runs from this collection.
Method delete_brain_run Deletes the brain method run with the given key from this collection.
Method delete_brain_runs Deletes all brain method runs from this collection.
Method delete_evaluation Deletes the evaluation results associated with the given evaluation key from this collection.
Method delete_evaluations Deletes all evaluation results from this collection.
Method delete_run Deletes the run with the given key from this collection.
Method delete_runs Deletes all runs from this collection.
Method description.setter Undocumented
Method distinct Computes the distinct values of a field in the collection.
Method draw_labels Renders annotated versions of the media in the collection with the specified label data overlaid to the given directory.
Method drop_index Drops the index for the given field or name, if necessary.
Method evaluate_classifications Evaluates the classification predictions in this collection with respect to the specified ground truth labels.
Method evaluate_detections Evaluates the specified predicted detections in this collection with respect to the specified ground truth detections.
Method evaluate_regressions Evaluates the regression predictions in this collection with respect to the specified ground truth values.
Method evaluate_segmentations Evaluates the specified semantic segmentation masks in this collection with respect to the specified ground truth masks.
Method exclude Excludes the samples with the given IDs from the collection.
Method exclude_by Excludes the samples with the given field values from the collection.
Method exclude_fields Excludes the fields with the given names from the samples in the collection.
Method exclude_frames Excludes the frames with the given IDs from the video collection.
Method exclude_groups Excludes the groups with the given IDs from the grouped collection.
Method exclude_labels Excludes the specified labels from the collection.
Method exists Returns a view containing the samples in the collection that have (or do not have) a non-None value for the given field or embedded field.
Method export Exports the samples in the collection to disk.
Method filter_field Filters the values of a field or embedded field of each sample in the collection.
Method filter_keypoints Filters the individual fiftyone.core.labels.Keypoint.points elements in the specified keypoints field of each sample in the collection.
Method filter_labels Filters the fiftyone.core.labels.Label field of each sample in the collection.
Method first Returns the first sample in the collection.
Method flatten Returns a flattened view that contains all samples in the dynamic grouped collection.
Method geo_near Sorts the samples in the collection by their proximity to a specified geolocation.
Method geo_within Filters the samples in this collection to only include samples whose geolocation is within a specified boundary.
Method get_annotation_info Returns information about the annotation run with the given key on this collection.
Method get_brain_info Returns information about the brain method run with the given key on this collection.
Method get_classes Gets the classes list for the given field, or None if no classes are available.
Method get_dynamic_field_schema Returns a schema dictionary describing the dynamic fields of the samples in the collection.
Method get_dynamic_frame_field_schema Returns a schema dictionary describing the dynamic fields of the frames in the collection.
Method get_evaluation_info Returns information about the evaluation with the given key on this collection.
Method get_field Returns the field instance of the provided path, or None if one does not exist.
Method get_field_schema Returns a schema dictionary describing the fields of the samples in the collection.
Method get_frame_field_schema Returns a schema dictionary describing the fields of the frames in the collection.
Method get_group Returns a dict containing the samples for the given group ID.
Method get_index_information Returns a dictionary of information about the indexes on this collection.
Method get_mask_targets Gets the mask targets for the given field, or None if no mask targets are available.
Method get_run_info Returns information about the run with the given key on this collection.
Method get_skeleton Gets the keypoint skeleton for the given field, or None if no skeleton is available.
Method group_by Creates a view that groups the samples in the collection by a specified field or expression.
Method has_annotation_run Whether this collection has an annotation run with the given key.
Method has_brain_run Whether this collection has a brain method run with the given key.
Method has_classes Determines whether this collection has a classes list for the given field.
Method has_evaluation Whether this collection has an evaluation with the given key.
Method has_field Determines whether the collection has a field with the given name.
Method has_frame_field Determines whether the collection has a frame-level field with the given name.
Method has_mask_targets Determines whether this collection has mask targets for the given field.
Method has_run Whether this collection has a run with the given key.
Method has_sample_field Determines whether the collection has a sample field with the given name.
Method has_skeleton Determines whether this collection has a keypoint skeleton for the given field.
Method head Returns a list of the first few samples in the collection.
Method histogram_values Computes a histogram of the field values in the collection.
Method init_run Initializes a config instance for a new run.
Method init_run_results Initializes a results instance for the run with the given key.
Method iter_groups Returns an iterator over the groups in the collection.
Method iter_samples Returns an iterator over the samples in the collection.
Method last Returns the last sample in the collection.
Method limit Returns a view with at most the given number of samples.
Method limit_labels Limits the number of fiftyone.core.labels.Label instances in the specified labels list field of each sample in the collection.
Method list_annotation_runs Returns a list of annotation keys on this collection.
Method list_brain_runs Returns a list of brain keys on this collection.
Method list_evaluations Returns a list of evaluation keys on this collection.
Method list_indexes Returns the list of index names on this collection.
Method list_runs Returns a list of run keys on this collection.
Method list_schema Extracts the value type(s) in a specified list field across all samples in the collection.
Method load_annotation_results Loads the results for the annotation run with the given key on this collection.
Method load_annotation_view Loads the fiftyone.core.view.DatasetView on which the specified annotation run was performed on this collection.
Method load_annotations Downloads the labels from the given annotation run from the annotation backend and merges them into this collection.
Method load_brain_results Loads the results for the brain method run with the given key on this collection.
Method load_brain_view Loads the fiftyone.core.view.DatasetView on which the specified brain method run was performed on this collection.
Method load_evaluation_results Loads the results for the evaluation with the given key on this collection.
Method load_evaluation_view Loads the fiftyone.core.view.DatasetView on which the specified evaluation was performed on this collection.
Method load_run_results Loads the results for the run with the given key on this collection.
Method load_run_view Loads the fiftyone.core.view.DatasetView on which the specified run was performed on this collection.
Method make_unique_field_name Makes a unique field name with the given root name for the collection.
Method map_labels Maps the label values of a fiftyone.core.labels.Label field to new values for each sample in the collection.
Method match Filters the samples in the collection by the given filter.
Method match_frames Filters the frames in the video collection by the given filter.
Method match_labels Selects the samples from the collection that contain (or do not contain) at least one label that matches the specified criteria.
Method match_tags Returns a view containing the samples in the collection that have or don't have any/all of the given tag(s).
Method max Computes the maximum of a numeric field of the collection.
Method mean Computes the arithmetic mean of the field values of the collection.
Method merge_labels Merges the labels from the given input field into the given output field of the collection.
Method min Computes the minimum of a numeric field of the collection.
Method mongo Adds a view stage defined by a raw MongoDB aggregation pipeline.
Method one Returns a single sample in this collection matching the expression.
Method quantiles Computes the quantile(s) of the field values of a collection.
Method register_run Registers a run under the given key on this collection.
Method reload Reloads the collection from the database.
Method rename_annotation_run Replaces the key for the given annotation run with a new key.
Method rename_brain_run Replaces the key for the given brain run with a new key.
Method rename_evaluation Replaces the key for the given evaluation with a new key.
Method rename_run Replaces the key for the given run with a new key.
Method save_context Returns a context that can be used to save samples from this collection according to a configurable batching strategy.
Method save_run_results Saves run results for the run with the given key.
Method schema Extracts the names and types of the attributes of a specified embedded document field across all samples in the collection.
Method select Selects the samples with the given IDs from the collection.
Method select_by Selects the samples with the given field values from the collection.
Method select_fields Selects only the fields with the given names from the samples in the collection. All other fields are excluded.
Method select_frames Selects the frames with the given IDs from the video collection.
Method select_group_slices Selects the samples in the group collection from the given slice(s).
Method select_groups Selects the groups with the given IDs from the grouped collection.
Method select_labels Selects only the specified labels from the collection.
Method set_field Sets a field or embedded field on each sample in a collection by evaluating the given expression.
Method set_label_values Sets the fields of the specified labels in the collection to the given values.
Method set_values Sets the field or embedded field on each sample or frame in the collection to the given values.
Method shuffle Randomly shuffles the samples in the collection.
Method skip Omits the given number of samples from the head of the collection.
Method sort_by Sorts the samples in the collection by the given field(s) or expression(s).
Method sort_by_similarity Sorts the collection by similarity to a specified query.
Method split_labels Splits the labels from the given input field into the given output field of the collection.
Method stats Returns stats about the collection on disk.
Method std Computes the standard deviation of the field values of the collection.
Method sum Computes the sum of the field values of the collection.
Method summary Returns a string summary of the collection.
Method sync_last_modified_at Syncs the last_modified_at properties of the dataset.
Method tag_labels Adds the tag(s) to all labels in the specified label field(s) of this collection, if necessary.
Method tag_samples Adds the tag(s) to all samples in this collection, if necessary.
Method tags.setter Undocumented
Method tail Returns a list of the last few samples in the collection.
Method take Randomly samples the given number of samples from the collection.
Method to_clips Creates a view that contains one sample per clip defined by the given field or expression in the video collection.
Method to_dict Returns a JSON dictionary representation of the collection.
Method to_evaluation_patches Creates a view based on the results of the evaluation with the given key that contains one sample for each true positive, false positive, and false negative example in the collection, respectively.
Method to_frames Creates a view that contains one sample per frame in the video collection.
Method to_json Returns a JSON string representation of the collection.
Method to_patches Creates a view that contains one sample per object patch in the specified field of the collection.
Method to_trajectories Creates a view that contains one clip for each unique object trajectory defined by their (label, index) in a frame-level field of a video collection.
Method untag_labels Removes the tag from all labels in the specified label field(s) of this collection, if necessary.
Method untag_samples Removes the tag(s) from all samples in this collection, if necessary.
Method update_run_config Updates the run config for the run with the given key.
Method validate_field_type Validates that the collection has a field of the given type.
Method validate_fields_exist Validates that the collection has field(s) with the given name(s).
Method values Extracts the values of a field from all samples in the collection.
Method view Returns a fiftyone.core.view.DatasetView containing the collection.
Method write_json Writes the collection to disk in JSON format.
Class Variable __slots__ Undocumented
Property app_config Dataset-specific settings that customize how this collection is visualized in the :ref:`FiftyOne App <fiftyone-app>`.
Property classes The classes of the underlying dataset.
Property default_classes The default classes of the underlying dataset.
Property default_group_slice The default group slice of the collection, or None if the collection is not grouped.
Property default_mask_targets The default mask targets of the underlying dataset.
Property default_skeleton The default keypoint skeleton of the underlying dataset.
Property description A description of the underlying dataset.
Property group_field The group field of the collection, or None if the collection is not grouped.
Property group_media_types A dict mapping group slices to media types, or None if the collection is not grouped.
Property group_slice The current group slice of the collection, or None if the collection is not grouped.
Property group_slices The list of group slices of the collection, or None if the collection is not grouped.
Property has_annotation_runs Whether this collection has any annotation runs.
Property has_brain_runs Whether this collection has any brain runs.
Property has_evaluations Whether this collection has any evaluation results.
Property has_runs Whether this collection has any runs.
Property info The info dict of the underlying dataset.
Property mask_targets The mask targets of the underlying dataset.
Property media_type The media type of the collection.
Property name The name of the collection.
Property skeletons The keypoint skeletons of the underlying dataset.
Property tags The list of tags of the underlying dataset.
Method _add_view_stage Returns a fiftyone.core.view.DatasetView containing the contents of the collection with the given fiftyone.core.stages.ViewStage appended to its aggregation pipeline.
Method _aggregate Runs the MongoDB aggregation pipeline on the collection and returns the result.
Async Method _async_aggregate Undocumented
Method _build_aggregation Undocumented
Method _build_batch_pipeline Undocumented
Method _build_big_pipeline Undocumented
Method _build_facets Undocumented
Method _contains_media_type Undocumented
Method _contains_videos Undocumented
Method _delete_labels Undocumented
Method _do_get_dynamic_field_schema Undocumented
Method _edit_label_tags Undocumented
Method _edit_sample_tags Undocumented
Method _expand_schema_from_values Undocumented
Method _get_db_fields_map Undocumented
Method _get_default_field Undocumented
Method _get_default_frame_fields Undocumented
Method _get_default_indexes Undocumented
Method _get_default_sample_fields Undocumented
Method _get_dynamic_field_schema Undocumented
Method _get_extremum Undocumented
Method _get_frame_label_field_schema Undocumented
Method _get_frames_bytes Computes the total size of the frame documents in the collection.
Method _get_geo_location_field Undocumented
Method _get_group_media_types Undocumented
Method _get_group_slices Undocumented
Method _get_label_attributes_schema Undocumented
Method _get_label_field_path Undocumented
Method _get_label_field_root Undocumented
Method _get_label_field_schema Undocumented
Method _get_label_field_type Undocumented
Method _get_label_fields Undocumented
Method _get_label_ids Undocumented
Method _get_media_fields Undocumented
Method _get_per_frame_bytes Returns a dictionary mapping frame IDs to document sizes (in bytes) for each frame in the video collection.
Method _get_per_sample_bytes Returns a dictionary mapping sample IDs to document sizes (in bytes) for each sample in the collection.
Method _get_per_sample_frames_bytes Returns a dictionary mapping sample IDs to total frame document sizes (in bytes) for each sample in the video collection.
Method _get_root_field_type Undocumented
Method _get_root_fields Undocumented
Method _get_samples_bytes Computes the total size of the sample documents in the collection.
Method _get_selected_labels Undocumented
Method _get_store Undocumented
Method _get_values_by_id Undocumented
Method _handle_db_field Undocumented
Method _handle_db_fields Undocumented
Method _handle_frame_field Undocumented
Method _handle_group_field Undocumented
Method _handle_id_fields Undocumented
Method _has_field Undocumented
Method _has_frame_fields Undocumented
Method _has_stores Undocumented
Method _is_default_field Undocumented
Method _is_frame_field Undocumented
Method _is_full_collection Undocumented
Method _is_group_field Undocumented
Method _is_label_field Undocumented
Method _is_read_only_field Undocumented
Method _list_stores Undocumented
Method _make_and_aggregate Undocumented
Method _make_set_field_pipeline Undocumented
Method _max Undocumented
Method _min Undocumented
Method _parse_aggregations Undocumented
Method _parse_big_result Undocumented
Method _parse_default_mask_targets Undocumented
Method _parse_default_skeleton Undocumented
Method _parse_faceted_result Undocumented
Method _parse_field Undocumented
Method _parse_field_name Undocumented
Method _parse_frame_labels_field Undocumented
Method _parse_label_field Undocumented
Method _parse_mask_targets Undocumented
Method _parse_media_field Undocumented
Method _parse_skeletons Undocumented
Method _pipeline Returns the MongoDB aggregation pipeline for the collection.
Method _process_aggregations Undocumented
Method _serialize Undocumented
Method _serialize_default_mask_targets Undocumented
Method _serialize_default_skeleton Undocumented
Method _serialize_field_schema Undocumented
Method _serialize_frame_field_schema Undocumented
Method _serialize_mask_targets Undocumented
Method _serialize_schema Undocumented
Method _serialize_skeletons Undocumented
Method _set_doc_values Undocumented
Method _set_frame_values Undocumented
Method _set_label_list_values Undocumented
Method _set_labels Undocumented
Method _set_list_values_by_id Undocumented
Method _set_sample_values Undocumented
Method _set_values Undocumented
Method _split_frame_fields Undocumented
Method _sync_dataset_last_modified_at Undocumented
Method _sync_samples_last_modified_at Undocumented
Method _tag_labels Undocumented
Method _to_fields_str Undocumented
Method _untag_labels Undocumented
Method _unwind_values Undocumented
Method _validate_root_field Undocumented
Constant _FRAMES_PREFIX Undocumented
Constant _GROUPS_PREFIX Undocumented
Property _dataset The fiftyone.core.dataset.Dataset that serves the samples in this collection.
Property _element_str Undocumented
Property _elements_str Undocumented
Property _is_clips Whether this collection contains clips.
Property _is_dynamic_groups Whether this collection contains dynamic groups.
Property _is_frames Whether this collection contains frames of a video dataset.
Property _is_generated Whether this collection's contents are generated from another collection.
Property _is_patches Whether this collection contains patches.
Property _root_dataset The root fiftyone.core.dataset.Dataset from which this collection is derived.
@classmethod
def list_aggregations(cls):

Returns a list of all available methods on this collection that apply fiftyone.core.aggregations.Aggregation operations to this collection.

Returns
a list of SampleCollection method names
@classmethod
def list_view_stages(cls):

Returns a list of all available methods on this collection that apply fiftyone.core.stages.ViewStage operations to this collection.

Returns
a list of SampleCollection method names
def __add__(self, samples):

Undocumented

def __bool__(self):

Undocumented

def __contains__(self, sample_id):

Undocumented

def __getitem__(self, id_filepath_slice):
def __iter__(self):

Undocumented

def __repr__(self):

Undocumented

def __str__(self):

Undocumented

def add_stage(self, stage):

Applies the given fiftyone.core.stages.ViewStage to the collection.

Parameters
stage: a fiftyone.core.stages.ViewStage
Returns
a fiftyone.core.view.DatasetView
def aggregate(self, aggregations):

Aggregates one or more fiftyone.core.aggregations.Aggregation instances.

Note that it is best practice to group aggregations into a single call to aggregate, as this will be more efficient than performing multiple aggregations in series.

Parameters
aggregations: a fiftyone.core.aggregations.Aggregation or iterable of fiftyone.core.aggregations.Aggregation instances
Returns
an aggregation result or list of aggregation results corresponding to the input aggregation(s)
def annotate(self, anno_key, label_schema=None, label_field=None, label_type=None, classes=None, attributes=True, mask_targets=None, allow_additions=True, allow_deletions=True, allow_label_edits=True, allow_index_edits=True, allow_spatial_edits=True, media_field='filepath', backend=None, launch_editor=False, **kwargs):

Exports the samples and optional label field(s) in this collection to the given annotation backend.

The backend parameter controls which annotation backend to use. Depending on the backend you use, you may want/need to provide extra keyword arguments to this function for the constructor of the backend's fiftyone.utils.annotations.AnnotationBackendConfig class.

The natively provided backends and their associated config classes are:

See :ref:`this page <requesting-annotations>` for more information about using this method, including how to define label schemas and how to configure login credentials for your annotation provider.

Parameters
anno_key: a string key to use to refer to this annotation run
label_schema (None): a dictionary defining the label schema to use. If this argument is provided, it takes precedence over the other schema-related arguments
label_field (None): a string indicating a new or existing label field to annotate
label_type (None):

a string indicating the type of labels to annotate. The possible values are:

All new label fields must have their type specified via this argument or in label_schema. Note that annotation backends may not support all label types

classes (None): a list of strings indicating the class options for label_field or all fields in label_schema without classes specified. All new label fields must have a class list provided via one of the supported methods. For existing label fields, if classes are provided by neither this argument nor label_schema, they are retrieved from get_classes if possible, or else the observed labels on your dataset are used
attributes (True):

specifies the label attributes of each label field to include (other than their label, which is always included) in the annotation export. Can be any of the following:

  • True: export all label attributes
  • False: don't export any custom label attributes
  • a list of label attributes to export
  • a dict mapping attribute names to dicts specifying the type, values, and default for each attribute

If a label_schema is also provided, this parameter determines which attributes are included for all fields that do not explicitly define their per-field attributes (in addition to any per-class attributes)

mask_targets (None): a dict mapping pixel values to semantic label strings. Only applicable when annotating semantic segmentations
allow_additions (True): whether to allow new labels to be added. Only applicable when editing existing label fields
allow_deletions (True): whether to allow labels to be deleted. Only applicable when editing existing label fields
allow_label_edits (True): whether to allow the label attribute of existing labels to be modified. Only applicable when editing existing fields with label attributes
allow_index_edits (True): whether to allow the index attribute of existing video tracks to be modified. Only applicable when editing existing frame fields with index attributes
allow_spatial_edits (True): whether to allow edits to the spatial properties (bounding boxes, vertices, keypoints, masks, etc.) of labels. Only applicable when editing existing spatial label fields
media_field ("filepath"): the field containing the paths to the media files to upload
backend (None): the annotation backend to use. The supported values are fiftyone.annotation_config.backends.keys() and the default is fiftyone.annotation_config.default_backend
launch_editor (False): whether to launch the annotation backend's editor after uploading the samples
**kwargs: keyword arguments for the fiftyone.utils.annotations.AnnotationBackendConfig
Returns
a fiftyone.utils.annotations.AnnotationResults
def apply_model(self, model, label_field='predictions', confidence_thresh=None, store_logits=False, batch_size=None, num_workers=None, skip_failures=True, output_dir=None, rel_dir=None, progress=None, **kwargs):

Applies the model to the samples in the collection.

This method supports all of the following cases:

  • Applying an image model to an image collection
  • Applying an image model to the frames of a video collection
  • Applying a video model to a video collection
Parameters
model: a fiftyone.core.models.Model, Hugging Face transformers model, Ultralytics model, SuperGradients model, or Lightning Flash model
label_field ("predictions"): the name of the field in which to store the model predictions. When performing inference on video frames, the "frames." prefix is optional
confidence_thresh (None): an optional confidence threshold to apply to any applicable labels generated by the model
store_logits (False): whether to store logits for the model predictions. This is only supported when the provided model has logits, model.has_logits == True
batch_size (None): an optional batch size to use, if the model supports batching
num_workers (None): the number of workers for the torch.utils.data.DataLoader to use. Only applicable for Torch-based models
skip_failures (True): whether to gracefully continue without raising an error if predictions cannot be generated for a sample. Only applicable to fiftyone.core.models.Model instances
output_dir (None): an optional output directory in which to write segmentation images. Only applicable if the model generates segmentations. If none is provided, the segmentations are stored in the database
rel_dir (None): an optional relative directory to strip from each input filepath to generate a unique identifier that is joined with output_dir to generate an output path for each segmentation image. This argument allows for populating nested subdirectories in output_dir that match the shape of the input paths. The path is converted to an absolute path (if necessary) via fiftyone.core.storage.normalize_path
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
**kwargs: optional model-specific keyword arguments passed through to the underlying inference implementation
@aggregation
def bounds(self, field_or_expr, expr=None, safe=False):

Computes the bounds of a numeric field of the collection.

None-valued fields are ignored.

This aggregation is typically applied to numeric or date field types (or lists of such types):

Examples:

import fiftyone as fo
from fiftyone import ViewField as F

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            numeric_field=1.0,
            numeric_list_field=[1, 2, 3],
        ),
        fo.Sample(
            filepath="/path/to/image2.png",
            numeric_field=4.0,
            numeric_list_field=[1, 2],
        ),
        fo.Sample(
            filepath="/path/to/image3.png",
            numeric_field=None,
            numeric_list_field=None,
        ),
    ]
)

#
# Compute the bounds of a numeric field
#

bounds = dataset.bounds("numeric_field")
print(bounds)  # (min, max)

#
# Compute the bounds of a numeric list field
#

bounds = dataset.bounds("numeric_list_field")
print(bounds)  # (min, max)

#
# Compute the bounds of a transformation of a numeric field
#

bounds = dataset.bounds(2 * (F("numeric_field") + 1))
print(bounds)  # (min, max)
Parameters
field_or_expr: a field name, embedded.field.name, fiftyone.core.expressions.ViewExpression, or MongoDB expression defining the field or expression to aggregate. This can also be a list or tuple of such arguments, in which case a tuple of corresponding aggregation results (each receiving the same additional keyword arguments, if any) will be returned
expr (None): a fiftyone.core.expressions.ViewExpression or MongoDB expression to apply to field_or_expr (which must be a field) before aggregating
safe (False): whether to ignore nan/inf values when dealing with floating point values
Returns
the (min, max) bounds
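The None-ignoring semantics above can be sketched in plain Python. This is only an illustration of the documented behavior (the real aggregation runs as a MongoDB pipeline); the helper name is hypothetical:

```python
import math

def bounds_sketch(values, safe=False):
    # flatten list fields and drop None values, mirroring the documented behavior
    flat = []
    for v in values:
        if v is None:
            continue
        if isinstance(v, list):
            flat.extend(v)
        else:
            flat.append(v)
    if safe:
        # safe=True ignores nan/inf floating point values
        flat = [v for v in flat if not (isinstance(v, float) and not math.isfinite(v))]
    return (min(flat), max(flat)) if flat else (None, None)

print(bounds_sketch([1.0, 4.0, None]))           # (1.0, 4.0)
print(bounds_sketch([[1, 2, 3], [1, 2], None]))  # (1, 3)
```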
def compute_embeddings(self, model, embeddings_field=None, batch_size=None, num_workers=None, skip_failures=True, progress=None, **kwargs): (source)

Computes embeddings for the samples in the collection using the given model.

This method supports all the following cases:

  • Using an image model to compute embeddings for an image collection
  • Using an image model to compute frame embeddings for a video collection
  • Using a video model to compute embeddings for a video collection

The model must expose embeddings, i.e., fiftyone.core.models.Model.has_embeddings must return True.

If an embeddings_field is provided, the embeddings are saved to the samples; otherwise, the embeddings are returned in-memory.

Parameters
model: a fiftyone.core.models.Model, Hugging Face Transformers model, Ultralytics model, SuperGradients model, or Lightning Flash model
embeddings_field (None): the name of a field in which to store the embeddings. When computing video frame embeddings, the "frames." prefix is optional
batch_size (None): an optional batch size to use, if the model supports batching
num_workers (None): the number of workers for the torch:torch.utils.data.DataLoader to use. Only applicable for Torch-based models
skip_failures (True): whether to gracefully continue without raising an error if embeddings cannot be generated for a sample. Only applicable to fiftyone.core.models.Model instances
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
**kwargs: optional model-specific keyword arguments passed through to the underlying inference implementation
Returns
one of the following:
  • None, if an embeddings_field is provided
  • a num_samples x num_dim array of embeddings, when computing embeddings for image/video collections with image/video models, respectively, and no embeddings_field is provided. If skip_failures is True and any errors are detected, a list of length num_samples is returned instead containing all successfully computed embedding vectors along with None entries for samples for which embeddings could not be computed
  • a dictionary mapping sample IDs to num_frames x num_dim arrays of embeddings, when computing frame embeddings for video collections using an image model. If skip_failures is True and any errors are detected, the values of this dictionary will contain arrays of embeddings for all frames 1, 2, ... until the error occurred, or None if no embeddings were computed at all
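The array-vs-list return convention can be illustrated with a small sketch (hypothetical data; in practice the vectors come from the model):

```python
import numpy as np

# hypothetical per-sample embedding vectors; None marks a sample that failed
results = [np.array([0.1, 0.2]), np.array([0.3, 0.4]), None]

if any(r is None for r in results):
    # with skip_failures=True and errors present, a list with None entries
    # is returned instead of a stacked array
    embeddings = results
else:
    # otherwise the vectors stack into a num_samples x num_dim array
    embeddings = np.stack(results)

print(type(embeddings))  # list, since one sample failed
```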
def compute_metadata(self, overwrite=False, num_workers=None, skip_failures=True, warn_failures=False, progress=None): (source)

Populates the metadata field of all samples in the collection.

Any samples with existing metadata are skipped, unless overwrite == True.

Parameters
overwrite (False): whether to overwrite existing metadata
num_workers (None): a suggested number of threads to use
skip_failures (True): whether to gracefully continue without raising an error if metadata cannot be computed for a sample
warn_failures (False): whether to log a warning if metadata cannot be computed for a sample
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
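The skip-unless-overwrite rule can be stated as a one-liner (a sketch of the documented behavior with a hypothetical helper name, not the actual implementation):

```python
def should_compute_metadata(metadata, overwrite=False):
    # recompute only when metadata is missing or overwrite is requested
    return metadata is None or overwrite

print(should_compute_metadata(None))                          # True
print(should_compute_metadata({"size_bytes": 123}))           # False
print(should_compute_metadata({"size_bytes": 123}, overwrite=True))  # True
```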
def compute_patch_embeddings(self, model, patches_field, embeddings_field=None, force_square=False, alpha=None, handle_missing='skip', batch_size=None, num_workers=None, skip_failures=True, progress=None): (source)

Computes embeddings for the image patches defined by patches_field of the samples in the collection using the given model.

This method supports all the following cases:

  • Using an image model to compute patch embeddings for an image collection
  • Using an image model to compute frame patch embeddings for a video collection

The model must expose embeddings, i.e., fiftyone.core.models.Model.has_embeddings must return True.

If an embeddings_field is provided, the embeddings are saved to the samples; otherwise, the embeddings are returned in-memory.

Parameters
model: a fiftyone.core.models.Model, Hugging Face Transformers model, Ultralytics model, SuperGradients model, or Lightning Flash model
patches_field: the name of the field defining the image patches in each sample to embed. Must be of type fiftyone.core.labels.Detection, fiftyone.core.labels.Detections, fiftyone.core.labels.Polyline, or fiftyone.core.labels.Polylines. When computing video frame embeddings, the "frames." prefix is optional
embeddings_field (None): the name of a label attribute in which to store the embeddings
force_square (False): whether to minimally manipulate the patch bounding boxes into squares prior to extraction
alpha (None): an optional expansion/contraction to apply to the patches before extracting them, in [-1, inf). If provided, the length and width of the box are expanded (or contracted, when alpha < 0) by (100 * alpha)%. For example, set alpha = 0.1 to expand the boxes by 10%, and set alpha = -0.1 to contract the boxes by 10%
handle_missing ("skip"): how to handle images with no patches. Supported values are:

  • "skip": skip the image and assign its embedding as None
  • "image": use the whole image as a single patch
  • "error": raise an error
batch_size (None): an optional batch size to use, if the model supports batching
num_workers (None): the number of workers for the torch:torch.utils.data.DataLoader to use. Only applicable for Torch-based models
skip_failures (True): whether to gracefully continue without raising an error if embeddings cannot be generated for a sample
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
Returns
one of the following:
  • None, if an embeddings_field is provided
  • a dict mapping sample IDs to num_patches x num_dim arrays of patch embeddings, when computing patch embeddings for image collections and no embeddings_field is provided. If skip_failures is True and any errors are detected, this dictionary will contain None values for any samples for which embeddings could not be computed
  • a dict of dicts mapping sample IDs to frame numbers to num_patches x num_dim arrays of patch embeddings, when computing patch embeddings for the frames of video collections and no embeddings_field is provided. If skip_failures is True and any errors are detected, this nested dict will contain missing or None values to indicate uncomputable embeddings
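The alpha expansion/contraction described above can be sketched as follows. This is a hypothetical helper illustrating the documented (100 * alpha)% rule, with boxes in [x, y, width, height] form as in fiftyone.core.labels.Detection:

```python
def expand_box(box, alpha):
    # scale width and height by (1 + alpha) about the box center,
    # per the documented (100 * alpha)% expansion/contraction
    x, y, w, h = box
    new_w, new_h = w * (1 + alpha), h * (1 + alpha)
    return [x - (new_w - w) / 2, y - (new_h - h) / 2, new_w, new_h]

box = [0.4, 0.4, 0.2, 0.2]
print(expand_box(box, 0.1))   # expand the box by 10%
print(expand_box(box, -0.1))  # contract the box by 10%
```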
@view_stage
def concat(self, samples): (source)

Concatenates the contents of the given SampleCollection to this collection.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("quickstart")

#
# Concatenate two views
#

view1 = dataset.match(F("uniqueness") < 0.2)
view2 = dataset.match(F("uniqueness") > 0.7)

view = view1.concat(view2)

print(view1)
print(view2)
print(view)

#
# Concatenate two patches views
#

gt_objects = dataset.to_patches("ground_truth")

patches1 = gt_objects[:50]
patches2 = gt_objects[-50:]
patches = patches1.concat(patches2)

print(patches1)
print(patches2)
print(patches)
Parameters
samples: a SampleCollection whose contents to append to this collection
Returns
a fiftyone.core.view.DatasetView
@aggregation
def count(self, field_or_expr=None, expr=None, safe=False): (source)

Counts the number of field values in the collection.

None-valued fields are ignored.

If no field is provided, the samples themselves are counted.

Examples:

import fiftyone as fo
from fiftyone import ViewField as F

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            predictions=fo.Detections(
                detections=[
                    fo.Detection(label="cat"),
                    fo.Detection(label="dog"),
                ]
            ),
        ),
        fo.Sample(
            filepath="/path/to/image2.png",
            predictions=fo.Detections(
                detections=[
                    fo.Detection(label="cat"),
                    fo.Detection(label="rabbit"),
                    fo.Detection(label="squirrel"),
                ]
            ),
        ),
        fo.Sample(
            filepath="/path/to/image3.png",
            predictions=None,
        ),
    ]
)

#
# Count the number of samples in the dataset
#

count = dataset.count()
print(count)  # the count

#
# Count the number of samples with `predictions`
#

count = dataset.count("predictions")
print(count)  # the count

#
# Count the number of objects in the `predictions` field
#

count = dataset.count("predictions.detections")
print(count)  # the count

#
# Count the number of objects in samples with > 2 predictions
#

count = dataset.count(
    (F("predictions.detections").length() > 2).if_else(
        F("predictions.detections"), None
    )
)
print(count)  # the count
Parameters
field_or_expr (None): a field name, embedded.field.name, fiftyone.core.expressions.ViewExpression, or MongoDB expression defining the field or expression to aggregate. If neither field_or_expr nor expr is provided, the samples themselves are counted. This can also be a list or tuple of such arguments, in which case a tuple of corresponding aggregation results (each receiving the same additional keyword arguments, if any) will be returned
expr (None): a fiftyone.core.expressions.ViewExpression or MongoDB expression to apply to field_or_expr (which must be a field) before aggregating
safe (False): whether to ignore nan/inf values when dealing with floating point values
Returns
the count
def count_label_tags(self, label_fields=None): (source)

Counts the occurrences of all label tags in the specified label field(s) of this collection.

Parameters
label_fields (None): an optional name or iterable of names of fiftyone.core.labels.Label fields. By default, all label fields are used
Returns
a dict mapping tags to counts
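Conceptually, the tally works like a Counter over the tag lists of all labels in the specified field(s). A plain-Python illustration with hypothetical tags:

```python
from collections import Counter

# hypothetical tags attached to individual labels across the collection
label_tags = [["hard"], ["hard", "review"], []]

counts = dict(Counter(tag for tags in label_tags for tag in tags))
print(counts)  # {'hard': 2, 'review': 1}
```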
def count_sample_tags(self): (source)

Counts the occurrences of sample tags in this collection.

Returns
a dict mapping tags to counts
@aggregation
def count_values(self, field_or_expr, expr=None, safe=False): (source)

Counts the occurrences of field values in the collection.

This aggregation is typically applied to countable field types (or lists of such types):

Examples:

import fiftyone as fo
from fiftyone import ViewField as F

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            tags=["sunny"],
            predictions=fo.Detections(
                detections=[
                    fo.Detection(label="cat"),
                    fo.Detection(label="dog"),
                ]
            ),
        ),
        fo.Sample(
            filepath="/path/to/image2.png",
            tags=["cloudy"],
            predictions=fo.Detections(
                detections=[
                    fo.Detection(label="cat"),
                    fo.Detection(label="rabbit"),
                ]
            ),
        ),
        fo.Sample(
            filepath="/path/to/image3.png",
            predictions=None,
        ),
    ]
)

#
# Compute the tag counts in the dataset
#

counts = dataset.count_values("tags")
print(counts)  # dict mapping values to counts

#
# Compute the predicted label counts in the dataset
#

counts = dataset.count_values("predictions.detections.label")
print(counts)  # dict mapping values to counts

#
# Compute the predicted label counts after some normalization
#

counts = dataset.count_values(
    F("predictions.detections.label").map_values(
        {"cat": "pet", "dog": "pet"}
    ).upper()
)
print(counts)  # dict mapping values to counts
Parameters
field_or_expr: a field name, embedded.field.name, fiftyone.core.expressions.ViewExpression, or MongoDB expression defining the field or expression to aggregate. This can also be a list or tuple of such arguments, in which case a tuple of corresponding aggregation results (each receiving the same additional keyword arguments, if any) will be returned
expr (None): a fiftyone.core.expressions.ViewExpression or MongoDB expression to apply to field_or_expr (which must be a field) before aggregating
safe (False): whether to treat nan/inf values as None when dealing with floating point values
Returns
a dict mapping values to counts
def create_index(self, field_or_spec, unique=False, wait=True, **kwargs): (source)

Creates an index on the given field or with the given specification, if necessary.

Indexes enable efficient sorting, merging, and other such operations.

Frame-level fields can be indexed by prepending "frames." to the field name.

Note

If an index with the same field(s) but different order(s) already exists, no new index will be created.

Use drop_index to drop an existing index first if you wish to replace an existing index with new properties.

Note

If you are indexing a single field and it already has a unique constraint, it will be retained regardless of the unique value you specify. Conversely, if the given field already has a non-unique index but you requested a unique index, the existing index will be replaced with a unique index.

Use drop_index to drop an existing index first if you wish to replace an existing index with new properties.

Parameters
field_or_spec: the field name, embedded.field.name, or index specification list. See pymongo:pymongo.collection.Collection.create_index for supported values
unique (False): whether to add a uniqueness constraint to the index
wait (True): whether to wait for index creation to finish
**kwargs: optional keyword arguments for pymongo:pymongo.collection.Collection.create_index
Returns
the name of the index
def delete_annotation_run(self, anno_key): (source)

Deletes the annotation run with the given key from this collection.

Calling this method only deletes the record of the annotation run from the collection; it will not delete any annotations loaded onto your dataset via load_annotations, nor will it delete any associated information from the annotation backend.

Use load_annotation_results to programmatically manage/delete a run from the annotation backend.

Parameters
anno_key: an annotation key
def delete_annotation_runs(self): (source)

Deletes all annotation runs from this collection.

Calling this method only deletes the records of the annotation runs from this collection; it will not delete any annotations loaded onto your dataset via load_annotations, nor will it delete any associated information from the annotation backend.

Use load_annotation_results to programmatically manage/delete runs in the annotation backend.

def delete_brain_run(self, brain_key): (source)

Deletes the brain method run with the given key from this collection.

Parameters
brain_key: a brain key
def delete_brain_runs(self): (source)

Deletes all brain method runs from this collection.

def delete_evaluation(self, eval_key): (source)

Deletes the evaluation results associated with the given evaluation key from this collection.

Parameters
eval_key: an evaluation key
def delete_evaluations(self): (source)

Deletes all evaluation results from this collection.

def delete_run(self, run_key): (source)

Deletes the run with the given key from this collection.

Parameters
run_key: a run key
def delete_runs(self): (source)

Deletes all runs from this collection.

@description.setter
def description(self, description): (source)
@aggregation
def distinct(self, field_or_expr, expr=None, safe=False): (source)

Computes the distinct values of a field in the collection.

None-valued fields are ignored.

This aggregation is typically applied to countable field types (or lists of such types):

Examples:

import fiftyone as fo
from fiftyone import ViewField as F

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            tags=["sunny"],
            predictions=fo.Detections(
                detections=[
                    fo.Detection(label="cat"),
                    fo.Detection(label="dog"),
                ]
            ),
        ),
        fo.Sample(
            filepath="/path/to/image2.png",
            tags=["sunny", "cloudy"],
            predictions=fo.Detections(
                detections=[
                    fo.Detection(label="cat"),
                    fo.Detection(label="rabbit"),
                ]
            ),
        ),
        fo.Sample(
            filepath="/path/to/image3.png",
            predictions=None,
        ),
    ]
)

#
# Get the distinct tags in a dataset
#

values = dataset.distinct("tags")
print(values)  # list of distinct values

#
# Get the distinct predicted labels in a dataset
#

values = dataset.distinct("predictions.detections.label")
print(values)  # list of distinct values

#
# Get the distinct predicted labels after some normalization
#

values = dataset.distinct(
    F("predictions.detections.label").map_values(
        {"cat": "pet", "dog": "pet"}
    ).upper()
)
print(values)  # list of distinct values
Parameters
field_or_expr: a field name, embedded.field.name, fiftyone.core.expressions.ViewExpression, or MongoDB expression defining the field or expression to aggregate. This can also be a list or tuple of such arguments, in which case a tuple of corresponding aggregation results (each receiving the same additional keyword arguments, if any) will be returned
expr (None): a fiftyone.core.expressions.ViewExpression or MongoDB expression to apply to field_or_expr (which must be a field) before aggregating
safe (False): whether to ignore nan/inf values when dealing with floating point values
Returns
a sorted list of distinct values
def draw_labels(self, output_dir, rel_dir=None, label_fields=None, overwrite=False, config=None, progress=None, **kwargs): (source)

Renders annotated versions of the media in the collection with the specified label data overlaid to the given directory.

The filenames of the sample media are maintained, unless a name conflict would occur in output_dir, in which case an index of the form "-%d" % count is appended to the base filename.

Images are written in format fo.config.default_image_ext, and videos are written in format fo.config.default_video_ext.

Parameters
output_dir: the directory to write the annotated media
rel_dir (None): an optional relative directory to strip from each input filepath to generate a unique identifier that is joined with output_dir to generate an output path for each annotated media. This argument allows for populating nested subdirectories in output_dir that match the shape of the input paths. The path is converted to an absolute path (if necessary) via fiftyone.core.storage.normalize_path
label_fields (None): a label field or list of label fields to render. By default, all fiftyone.core.labels.Label fields are drawn
overwrite (False): whether to delete output_dir if it exists before rendering
config (None): an optional fiftyone.utils.annotations.DrawConfig configuring how to draw the labels
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
**kwargs: optional keyword arguments specifying parameters of the default fiftyone.utils.annotations.DrawConfig to override
Returns
the list of paths to the rendered media
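The "-%d" % count de-duplication rule mentioned above can be sketched as follows (hypothetical helper, not FiftyOne's actual implementation):

```python
import os

def unique_output_name(filename, existing):
    # on a name conflict, append an index of the form "-%d" % count
    # to the base filename until the name is unique
    base, ext = os.path.splitext(filename)
    name, count = filename, 1
    while name in existing:
        name = "%s-%d%s" % (base, count, ext)
        count += 1
    return name

print(unique_output_name("image.png", set()))          # image.png
print(unique_output_name("image.png", {"image.png"}))  # image-1.png
```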
def drop_index(self, field_or_name): (source)

Drops the index for the given field or name, if necessary.

Parameters
field_or_name: a field name, embedded.field.name, or compound index name. Use list_indexes to see the available indexes
def evaluate_classifications(self, pred_field, gt_field='ground_truth', eval_key=None, classes=None, missing=None, method=None, progress=None, **kwargs): (source)

Evaluates the classification predictions in this collection with respect to the specified ground truth labels.

By default, this method simply compares the ground truth and prediction for each sample, but other strategies such as binary evaluation and top-k matching can be configured via the method parameter.

You can customize the evaluation method by passing additional parameters for the method's config class as kwargs.

The natively provided method values and their associated configs are:

If an eval_key is specified, then this method will record some statistics on each sample:

  • When evaluating sample-level fields, an eval_key field will be populated on each sample recording whether that sample's prediction is correct.
  • When evaluating frame-level fields, an eval_key field will be populated on each frame recording whether that frame's prediction is correct. In addition, an eval_key field will be populated on each sample that records the average accuracy of the frame predictions of the sample.
Parameters
pred_field: the name of the field containing the predicted fiftyone.core.labels.Classification instances
gt_field ("ground_truth"): the name of the field containing the ground truth fiftyone.core.labels.Classification instances
eval_key (None): a string key to use to refer to this evaluation
classes (None): the list of possible classes. If not provided, the observed ground truth/predicted labels are used
missing (None): a missing label string. Any None-valued labels are given this label for results purposes
method (None): a string specifying the evaluation method to use. The supported values are fo.evaluation_config.classification_backends.keys() and the default is fo.evaluation_config.classification_default_backend
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
**kwargs: optional keyword arguments for the constructor of the fiftyone.utils.eval.classification.ClassificationEvaluationConfig being used
Returns
a fiftyone.utils.eval.classification.ClassificationResults
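For frame-level fields, the sample-level eval_key value is the average accuracy of the per-frame results, as described above. A plain-Python sketch with hypothetical correctness flags:

```python
# hypothetical per-frame correctness, as recorded in frame-level eval_key fields
frame_correct = [True, True, False, True]

# the sample-level field records the average accuracy of the frame predictions
sample_accuracy = sum(frame_correct) / len(frame_correct)
print(sample_accuracy)  # 0.75
```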
def evaluate_detections(self, pred_field, gt_field='ground_truth', eval_key=None, classes=None, missing=None, method=None, iou=0.5, use_masks=False, use_boxes=False, classwise=True, dynamic=True, progress=None, **kwargs): (source)

Evaluates the specified predicted detections in this collection with respect to the specified ground truth detections.

This method supports evaluating the following spatial data types:

For spatial object detection evaluation, this method uses COCO-style evaluation by default.

When evaluating keypoints, "IoUs" are computed via object keypoint similarity.

For temporal segment detection, this method uses ActivityNet-style evaluation by default.

You can use the method parameter to select a different method, and you can optionally customize the method by passing additional parameters for the method's config class as kwargs.

The natively provided method values and their associated configs are:

If an eval_key is provided, a number of fields are populated at the object- and sample-level recording the results of the evaluation:

  • True positive (TP), false positive (FP), and false negative (FN) counts for each sample are saved in top-level fields of each sample:

    TP: sample.<eval_key>_tp
    FP: sample.<eval_key>_fp
    FN: sample.<eval_key>_fn
    

    In addition, when evaluating frame-level objects, TP/FP/FN counts are recorded for each frame:

    TP: frame.<eval_key>_tp
    FP: frame.<eval_key>_fp
    FN: frame.<eval_key>_fn
    
  • The fields listed below are populated on each individual object; these fields tabulate the TP/FP/FN status of the object, the ID of the matching object (if any), and the matching IoU:

    TP/FP/FN: object.<eval_key>
          ID: object.<eval_key>_id
         IoU: object.<eval_key>_iou
    
Parameters
pred_field: the name of the field containing the predicted fiftyone.core.labels.Detections, fiftyone.core.labels.Polylines, fiftyone.core.labels.Keypoints, or fiftyone.core.labels.TemporalDetections
gt_field ("ground_truth"): the name of the field containing the ground truth fiftyone.core.labels.Detections, fiftyone.core.labels.Polylines, fiftyone.core.labels.Keypoints, or fiftyone.core.labels.TemporalDetections
eval_key (None): a string key to use to refer to this evaluation
classes (None): the list of possible classes. If not provided, the observed ground truth/predicted labels are used
missing (None): a missing label string. Any unmatched objects are given this label for results purposes
method (None): a string specifying the evaluation method to use. The supported values are fo.evaluation_config.detection_backends.keys() and the default is fo.evaluation_config.detection_default_backend
iou (0.5): the IoU threshold to use to determine matches
use_masks (False): whether to compute IoUs using the instance masks in the mask attribute of the provided objects, which must be fiftyone.core.labels.Detection instances
use_boxes (False): whether to compute IoUs using the bounding boxes of the provided fiftyone.core.labels.Polyline instances rather than using their actual geometries
classwise (True): whether to only match objects with the same class label (True) or allow matches between classes (False)
dynamic (True): whether to declare the dynamic object-level attributes that are populated on the dataset's schema
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
**kwargs: optional keyword arguments for the constructor of the fiftyone.utils.eval.detection.DetectionEvaluationConfig being used
Returns
a fiftyone.utils.eval.detection.DetectionResults
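The per-sample <eval_key>_tp/_fp/_fn counts recorded above support the standard detection metrics. A plain-Python sketch using the standard precision/recall definitions (not FiftyOne internals):

```python
def precision_recall(tp, fp, fn):
    # precision = TP / (TP + FP); recall = TP / (TP + FN)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

print(precision_recall(8, 2, 2))  # (0.8, 0.8)
```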
def evaluate_regressions(self, pred_field, gt_field='ground_truth', eval_key=None, missing=None, method=None, progress=None, **kwargs): (source)

Evaluates the regression predictions in this collection with respect to the specified ground truth values.

You can customize the evaluation method by passing additional parameters for the method's config class as kwargs.

The natively provided method values and their associated configs are:

If an eval_key is specified, then this method will record some statistics on each sample:

  • When evaluating sample-level fields, an eval_key field will be populated on each sample recording the error of that sample's prediction.
  • When evaluating frame-level fields, an eval_key field will be populated on each frame recording the error of that frame's prediction. In addition, an eval_key field will be populated on each sample that records the average error of the frame predictions of the sample.
Parameters
pred_field: the name of the field containing the predicted fiftyone.core.labels.Regression instances
gt_field ("ground_truth"): the name of the field containing the ground truth fiftyone.core.labels.Regression instances
eval_key (None): a string key to use to refer to this evaluation
missing (None): a missing value. Any None-valued regressions are given this value for results purposes
method (None): a string specifying the evaluation method to use. The supported values are fo.evaluation_config.regression_backends.keys() and the default is fo.evaluation_config.regression_default_backend
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
**kwargs: optional keyword arguments for the constructor of the fiftyone.utils.eval.regression.RegressionEvaluationConfig being used
Returns
a fiftyone.utils.eval.regression.RegressionResults
def evaluate_segmentations(self, pred_field, gt_field='ground_truth', eval_key=None, mask_targets=None, method=None, progress=None, **kwargs): (source)

Evaluates the specified semantic segmentation masks in this collection with respect to the specified ground truth masks.

If the size of a predicted mask does not match the ground truth mask, it is resized to match the ground truth.

By default, this method simply performs pixelwise evaluation of the full masks, but other strategies such as boundary-only evaluation can be configured by passing additional parameters for the method's config class as kwargs.

The natively provided method values and their associated configs are:

If an eval_key is provided, the accuracy, precision, and recall of each sample are recorded in top-level fields of each sample:

 Accuracy: sample.<eval_key>_accuracy
Precision: sample.<eval_key>_precision
   Recall: sample.<eval_key>_recall

In addition, when evaluating frame-level masks, the accuracy, precision, and recall of each frame are recorded in the following frame-level fields:

 Accuracy: frame.<eval_key>_accuracy
Precision: frame.<eval_key>_precision
   Recall: frame.<eval_key>_recall

Note

The mask values 0 and #000000 are treated as a background class for the purposes of computing evaluation metrics like precision and recall.

Parameters
pred_field: the name of the field containing the predicted fiftyone.core.labels.Segmentation instances
gt_field ("ground_truth"): the name of the field containing the ground truth fiftyone.core.labels.Segmentation instances
eval_key (None): a string key to use to refer to this evaluation
mask_targets (None): a dict mapping pixel values or RGB hex strings to labels. If not provided, the observed values are used as labels
method (None): a string specifying the evaluation method to use. The supported values are fo.evaluation_config.segmentation_backends.keys() and the default is fo.evaluation_config.segmentation_default_backend
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
**kwargs: optional keyword arguments for the constructor of the fiftyone.utils.eval.segmentation.SegmentationEvaluationConfig being used
Returns
a fiftyone.utils.eval.segmentation.SegmentationResults
@view_stage
def exclude(self, sample_ids): (source)

Excludes the samples with the given IDs from the collection.

Examples:

import fiftyone as fo

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(filepath="/path/to/image1.png"),
        fo.Sample(filepath="/path/to/image2.png"),
        fo.Sample(filepath="/path/to/image3.png"),
    ]
)

#
# Exclude the first sample from the dataset
#

sample_id = dataset.first().id
view = dataset.exclude(sample_id)

#
# Exclude the first and last samples from the dataset
#

sample_ids = [dataset.first().id, dataset.last().id]
view = dataset.exclude(sample_ids)
Parameters
sample_ids: the samples to exclude, e.g., a sample ID or an iterable of sample IDs

Returns
a fiftyone.core.view.DatasetView
@view_stage
def exclude_by(self, field, values): (source)

Excludes the samples with the given field values from the collection.

This stage is typically used to work with categorical fields (strings, ints, and bools). If you want to exclude samples based on floating point fields, use match.

Examples:

import fiftyone as fo

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(filepath="image%d.jpg" % i, int=i, str=str(i))
        for i in range(10)
    ]
)

#
# Create a view excluding samples whose `int` field have the given
# values
#

view = dataset.exclude_by("int", [1, 9, 3, 7, 5])
print(view.head(5))

#
# Create a view excluding samples whose `str` field have the given
# values
#

view = dataset.exclude_by("str", ["1", "9", "3", "7", "5"])
print(view.head(5))
Parameters
field: a field name or embedded.field.name
values: a value or iterable of values to exclude by
Returns
a fiftyone.core.view.DatasetView
@view_stage
def exclude_fields(self, field_names=None, meta_filter=None, _allow_missing=False): (source)

Excludes the fields with the given names from the samples in the collection.

Note that default fields cannot be excluded.

Examples:

import fiftyone as fo

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            ground_truth=fo.Classification(label="cat"),
            predictions=fo.Classification(
                label="cat",
                confidence=0.9,
                mood="surly",
            ),
        ),
        fo.Sample(
            filepath="/path/to/image2.png",
            ground_truth=fo.Classification(label="dog"),
            predictions=fo.Classification(
                label="dog",
                confidence=0.8,
                mood="happy",
            ),
        ),
        fo.Sample(
            filepath="/path/to/image3.png",
        ),
    ]
)

#
# Exclude the `predictions` field from all samples
#

view = dataset.exclude_fields("predictions")

#
# Exclude the `mood` attribute from all classifications in the
# `predictions` field
#

view = dataset.exclude_fields("predictions.mood")
Parameters
field_names (None): a field name or iterable of field names to exclude. May contain embedded.field.name as well
meta_filter (None):

a filter that dynamically excludes fields in the collection's schema according to the specified rule, which can be matched against the field's name, type, description, and/or info. For example:

  • Use meta_filter="2023" or meta_filter={"any": "2023"} to exclude fields that have the string "2023" anywhere in their name, type, description, or info
  • Use meta_filter={"type": "StringField"} or meta_filter={"type": "Classification"} to exclude all string or classification fields, respectively
  • Use meta_filter={"description": "my description"} to exclude fields whose description contains the string "my description"
  • Use meta_filter={"info": "2023"} to exclude fields that have the string "2023" anywhere in their info
  • Use meta_filter={"info.key": "value"} to exclude fields that have a specific key/value pair in their info
  • Include meta_filter={"include_nested_fields": True, ...} in your meta filter to include all nested fields in the filter
_allow_missing: Undocumented
Returns
a fiftyone.core.view.DatasetView
@view_stage
def exclude_frames(self, frame_ids, omit_empty=True): (source)

Excludes the frames with the given IDs from the video collection.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart-video")

#
# Exclude some specific frames
#

frame_ids = [
    dataset.first().frames.first().id,
    dataset.last().frames.last().id,
]

view = dataset.exclude_frames(frame_ids)

print(dataset.count("frames"))
print(view.count("frames"))
Parameters
frame_ids

the frames to exclude. Can be any of the following:

omit_empty (True): whether to omit samples that have no frames after excluding the specified frames
Returns
a fiftyone.core.view.DatasetView
@view_stage
def exclude_groups(self, group_ids): (source)

Excludes the groups with the given IDs from the grouped collection.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart-groups")

#
# Exclude some specific groups by ID
#

view = dataset.take(2)
group_ids = view.values("group.id")
other_groups = dataset.exclude_groups(group_ids)

assert len(set(group_ids) & set(other_groups.values("group.id"))) == 0
Parameters
group_ids

the groups to exclude. Can be any of the following:

Returns
a fiftyone.core.view.DatasetView
@view_stage
def exclude_labels(self, labels=None, ids=None, tags=None, fields=None, omit_empty=True): (source)

Excludes the specified labels from the collection.

The returned view will omit samples, sample fields, and individual labels that match the specified exclusion criteria.

You can perform an exclusion via one or more of the following methods:

  • Provide the labels argument, which should contain a list of dicts in the format returned by fiftyone.core.session.Session.selected_labels, to exclude specific labels
  • Provide the ids argument to exclude labels with specific IDs
  • Provide the tags argument to exclude labels with specific tags

If multiple criteria are specified, labels must match all of them in order to be excluded.

By default, the exclusion is applied to all fiftyone.core.labels.Label fields, but you can provide the fields argument to explicitly define the field(s) from which to exclude.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

#
# Exclude the labels currently selected in the App
#

session = fo.launch_app(dataset)

# Select some labels in the App...

view = dataset.exclude_labels(labels=session.selected_labels)

#
# Exclude labels with the specified IDs
#

# Grab some label IDs
ids = [
    dataset.first().ground_truth.detections[0].id,
    dataset.last().predictions.detections[0].id,
]

view = dataset.exclude_labels(ids=ids)

print(dataset.count("ground_truth.detections"))
print(view.count("ground_truth.detections"))

print(dataset.count("predictions.detections"))
print(view.count("predictions.detections"))

#
# Exclude labels with the specified tags
#

# Grab some label IDs
ids = [
    dataset.first().ground_truth.detections[0].id,
    dataset.last().predictions.detections[0].id,
]

# Give the labels a "test" tag
dataset = dataset.clone()  # create copy since we're modifying data
dataset.select_labels(ids=ids).tag_labels("test")

print(dataset.count_values("ground_truth.detections.tags"))
print(dataset.count_values("predictions.detections.tags"))

# Exclude the labels via their tag
view = dataset.exclude_labels(tags="test")

print(dataset.count("ground_truth.detections"))
print(view.count("ground_truth.detections"))

print(dataset.count("predictions.detections"))
print(view.count("predictions.detections"))
Parameters
labels (None): a list of dicts specifying the labels to exclude in the format returned by fiftyone.core.session.Session.selected_labels
ids (None): an ID or iterable of IDs of the labels to exclude
tags (None): a tag or iterable of tags of labels to exclude
fields (None): a field or iterable of fields from which to exclude
omit_empty (True): whether to omit samples that have no labels after filtering
Returns
a fiftyone.core.view.DatasetView
@view_stage
def exists(self, field, bool=None): (source)

Returns a view containing the samples in the collection that have (or do not have) a non-None value for the given field or embedded field.

Examples:

import fiftyone as fo

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            ground_truth=fo.Classification(label="cat"),
            predictions=fo.Classification(label="cat", confidence=0.9),
        ),
        fo.Sample(
            filepath="/path/to/image2.png",
            ground_truth=fo.Classification(label="dog"),
            predictions=fo.Classification(label="dog", confidence=0.8),
        ),
        fo.Sample(
            filepath="/path/to/image3.png",
            ground_truth=fo.Classification(label="dog"),
            predictions=fo.Classification(label="dog"),
        ),
        fo.Sample(
            filepath="/path/to/image4.png",
            ground_truth=None,
            predictions=None,
        ),
        fo.Sample(filepath="/path/to/image5.png"),
    ]
)

#
# Only include samples that have a value in their `predictions`
# field
#

view = dataset.exists("predictions")

#
# Only include samples that do NOT have a value in their
# `predictions` field
#

view = dataset.exists("predictions", False)

#
# Only include samples that have prediction confidences
#

view = dataset.exists("predictions.confidence")
Parameters
field: the field name or embedded.field.name
bool (None): whether to check if the field exists (None or True) or does not exist (False)
Returns
a fiftyone.core.view.DatasetView
def export(self, export_dir=None, dataset_type=None, data_path=None, labels_path=None, export_media=None, rel_dir=None, dataset_exporter=None, label_field=None, frame_labels_field=None, overwrite=False, progress=None, **kwargs): (source)

Exports the samples in the collection to disk.

You can perform exports with this method via the following basic patterns:

  1. Provide export_dir and dataset_type to export the content to a directory in the default layout for the specified format, as documented in :ref:`this page <exporting-datasets>`
  2. Provide dataset_type along with data_path, labels_path, and/or export_media to directly specify where to export the source media and/or labels (if applicable) in your desired format. This syntax provides the flexibility to, for example, perform workflows like labels-only exports
  3. Provide a dataset_exporter to which to feed samples to perform a fully-customized export

In all workflows, the remaining parameters of this method can be provided to further configure the export.

See :ref:`this page <exporting-datasets>` for more information about the available export formats and examples of using this method.

See :ref:`this guide <custom-dataset-exporter>` for more details about exporting datasets in custom formats by defining your own fiftyone.utils.data.exporters.DatasetExporter.

This method will automatically coerce the data to match the requested export in the following cases:

Parameters
export_dir (None):

the directory to which to export the samples in format dataset_type. This parameter may be omitted if you have provided appropriate values for the data_path and/or labels_path parameters. Alternatively, this can also be an archive path with one of the following extensions:

.zip, .tar, .tar.gz, .tgz, .tar.bz, .tbz

If an archive path is specified, the export is performed in a directory of the same name (minus extension), which is then automatically archived, and the directory is deleted

dataset_type (None): the fiftyone.types.Dataset type to write. If not specified, the default type for label_field is used
data_path (None):

an optional parameter that enables explicit control over the location of the exported media for certain export formats. Can be any of the following:

  • a folder name like "data" or "data/" specifying a subfolder of export_dir in which to export the media
  • an absolute directory path in which to export the media. In this case, the export_dir has no effect on the location of the data
  • a filename like "data.json" specifying the filename of a JSON manifest file in export_dir generated when export_media is "manifest"
  • an absolute filepath specifying the location to write the JSON manifest file when export_media is "manifest". In this case, export_dir has no effect on the location of the data

If None, a default value of this parameter will be chosen based on the value of the export_media parameter. Note that this parameter is not applicable to certain export formats such as binary types like TF records

labels_path (None):

an optional parameter that enables explicit control over the location of the exported labels. Only applicable when exporting in certain labeled dataset formats. Can be any of the following:

  • a type-specific folder name like "labels" or "labels/" or a filename like "labels.json" or "labels.xml" specifying the location in export_dir in which to export the labels
  • an absolute directory or filepath in which to export the labels. In this case, the export_dir has no effect on the location of the labels

For labeled datasets, the default value of this parameter will be chosen based on the export format so that the labels will be exported into export_dir

export_media (None):

controls how to export the raw media. The supported values are:

  • True: copy all media files into the output directory
  • False: don't export media. This option is only useful when exporting labeled datasets whose label format stores sufficient information to locate the associated media
  • "move": move all media files into the output directory
  • "symlink": create symlinks to the media files in the output directory
  • "manifest": create a data.json in the output directory that maps UUIDs used in the labels files to the filepaths of the source media, rather than exporting the actual media

If None, an appropriate default value of this parameter will be chosen based on the value of the data_path parameter. Note that some dataset formats may not support certain values for this parameter (e.g., when exporting in binary formats such as TF records, "symlink" is not an option)

rel_dir (None): an optional relative directory to strip from each input filepath to generate a unique identifier for each media. When exporting media, this identifier is joined with data_path to generate an output path for each exported media. This argument allows for populating nested subdirectories that match the shape of the input paths. The path is converted to an absolute path (if necessary) via fiftyone.core.storage.normalize_path
dataset_exporter (None): a fiftyone.utils.data.exporters.DatasetExporter to use to export the samples. When provided, parameters such as export_dir, dataset_type, data_path, and labels_path have no effect
label_field (None):

controls the label field(s) to export. Only applicable to labeled datasets. Can be any of the following:

  • the name of a label field to export
  • a glob pattern of label field(s) to export
  • a list or tuple of label field(s) to export
  • a dictionary mapping label field names to keys to use when constructing the label dictionaries to pass to the exporter

Note that multiple fields can only be specified when the exporter used can handle dictionaries of labels. By default, the first field of compatible type for the exporter is used. When exporting labeled video datasets, this argument may contain frame fields prefixed by "frames."

frame_labels_field (None):

controls the frame label field(s) to export. The "frames." prefix is optional. Only applicable to labeled video datasets. Can be any of the following:

  • the name of a frame label field to export
  • a glob pattern of frame label field(s) to export
  • a list or tuple of frame label field(s) to export
  • a dictionary mapping frame label field names to keys to use when constructing the frame label dictionaries to pass to the exporter

Note that multiple fields can only be specified when the exporter used can handle dictionaries of frame labels. By default, the first field of compatible type for the exporter is used

overwrite (False): whether to delete existing directories before performing the export (True) or to merge the export with existing files and directories (False)
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
**kwargs: optional keyword arguments to pass to the dataset exporter's constructor. If you are exporting image patches, this can also contain keyword arguments for fiftyone.utils.patches.ImagePatchesExtractor
@view_stage
def filter_field(self, field, filter, only_matches=True): (source)

Filters the values of a field or embedded field of each sample in the collection.

Values of field for which filter returns False are replaced with None.

Examples:

import fiftyone as fo
from fiftyone import ViewField as F

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            ground_truth=fo.Classification(label="cat"),
            predictions=fo.Classification(label="cat", confidence=0.9),
            numeric_field=1.0,
        ),
        fo.Sample(
            filepath="/path/to/image2.png",
            ground_truth=fo.Classification(label="dog"),
            predictions=fo.Classification(label="dog", confidence=0.8),
            numeric_field=-1.0,
        ),
        fo.Sample(
            filepath="/path/to/image3.png",
            ground_truth=None,
            predictions=None,
            numeric_field=None,
        ),
    ]
)

#
# Only include classifications in the `predictions` field
# whose `label` is "cat"
#

view = dataset.filter_field("predictions", F("label") == "cat")

#
# Only include samples whose `numeric_field` value is positive
#

view = dataset.filter_field("numeric_field", F() > 0)
Parameters
field: the field name or embedded.field.name
filter: a fiftyone.core.expressions.ViewExpression or MongoDB expression that returns a boolean describing the filter to apply
only_matches (True): whether to only include samples that match the filter (True) or include all samples (False)
Returns
a fiftyone.core.view.DatasetView
@view_stage
def filter_keypoints(self, field, filter=None, labels=None, only_matches=True): (source)

Filters the individual fiftyone.core.labels.Keypoint.points elements in the specified keypoints field of each sample in the collection.

Note

Use filter_labels if you simply want to filter entire fiftyone.core.labels.Keypoint objects in a field.

Examples:

import fiftyone as fo
from fiftyone import ViewField as F

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            predictions=fo.Keypoints(
                keypoints=[
                    fo.Keypoint(
                        label="person",
                        points=[(0.1, 0.1), (0.1, 0.9), (0.9, 0.9), (0.9, 0.1)],
                        confidence=[0.7, 0.8, 0.95, 0.99],
                    )
                ]
            )
        ),
        fo.Sample(filepath="/path/to/image2.png"),
    ]
)

dataset.default_skeleton = fo.KeypointSkeleton(
    labels=["nose", "left eye", "right eye", "left ear", "right ear"],
    edges=[[0, 1, 2, 0], [0, 3], [0, 4]],
)

#
# Only include keypoints in the `predictions` field whose
# `confidence` is greater than 0.9
#

view = dataset.filter_keypoints(
    "predictions", filter=F("confidence") > 0.9
)

#
# Only include the `left eye` and `right eye` points in the
# `predictions` field
#

view = dataset.filter_keypoints(
    "predictions", labels=["left eye", "right eye"]
)
Parameters
field: the fiftyone.core.labels.Keypoint or fiftyone.core.labels.Keypoints field to filter
filter (None): a fiftyone.core.expressions.ViewExpression or MongoDB expression that returns a boolean, like F("confidence") > 0.5 or F("occluded") == False, to apply elementwise to the specified field, which must be a list of the same length as fiftyone.core.labels.Keypoint.points
labels (None): a label or iterable of keypoint skeleton labels to keep
only_matches (True): whether to only include keypoints/samples with at least one point after filtering (True) or include all keypoints/samples (False)
Returns
a fiftyone.core.view.DatasetView
@view_stage
def filter_labels(self, field, filter, only_matches=True, trajectories=False): (source)

Filters the fiftyone.core.labels.Label field of each sample in the collection.

If the specified field is a single fiftyone.core.labels.Label type, fields for which filter returns False are replaced with None.

If the specified field is a fiftyone.core.labels.Label list type, the label elements for which filter returns False are omitted from the view.

Classifications Examples:

import fiftyone as fo
from fiftyone import ViewField as F

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            predictions=fo.Classification(label="cat", confidence=0.9),
        ),
        fo.Sample(
            filepath="/path/to/image2.png",
            predictions=fo.Classification(label="dog", confidence=0.8),
        ),
        fo.Sample(
            filepath="/path/to/image3.png",
            predictions=fo.Classification(label="rabbit"),
        ),
        fo.Sample(
            filepath="/path/to/image4.png",
            predictions=None,
        ),
    ]
)

#
# Only include classifications in the `predictions` field whose
# `confidence` is greater than 0.8
#

view = dataset.filter_labels("predictions", F("confidence") > 0.8)

#
# Only include classifications in the `predictions` field whose
# `label` is "cat" or "dog"
#

view = dataset.filter_labels(
    "predictions", F("label").is_in(["cat", "dog"])
)

Detections Examples:

import fiftyone as fo
from fiftyone import ViewField as F

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            predictions=fo.Detections(
                detections=[
                    fo.Detection(
                        label="cat",
                        bounding_box=[0.1, 0.1, 0.5, 0.5],
                        confidence=0.9,
                    ),
                    fo.Detection(
                        label="dog",
                        bounding_box=[0.2, 0.2, 0.3, 0.3],
                        confidence=0.8,
                    ),
                ]
            ),
        ),
        fo.Sample(
            filepath="/path/to/image2.png",
            predictions=fo.Detections(
                detections=[
                    fo.Detection(
                        label="cat",
                        bounding_box=[0.5, 0.5, 0.4, 0.4],
                        confidence=0.95,
                    ),
                    fo.Detection(label="rabbit"),
                ]
            ),
        ),
        fo.Sample(
            filepath="/path/to/image3.png",
            predictions=fo.Detections(
                detections=[
                    fo.Detection(
                        label="squirrel",
                        bounding_box=[0.25, 0.25, 0.5, 0.5],
                        confidence=0.5,
                    ),
                ]
            ),
        ),
        fo.Sample(
            filepath="/path/to/image4.png",
            predictions=None,
        ),
    ]
)

#
# Only include detections in the `predictions` field whose
# `confidence` is greater than 0.8
#

view = dataset.filter_labels("predictions", F("confidence") > 0.8)

#
# Only include detections in the `predictions` field whose `label`
# is "cat" or "dog"
#

view = dataset.filter_labels(
    "predictions", F("label").is_in(["cat", "dog"])
)

#
# Only include detections in the `predictions` field whose bounding
# box area is smaller than 0.2
#

# Bboxes are in [top-left-x, top-left-y, width, height] format
bbox_area = F("bounding_box")[2] * F("bounding_box")[3]

view = dataset.filter_labels("predictions", bbox_area < 0.2)

Polylines Examples:

import fiftyone as fo
from fiftyone import ViewField as F

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            predictions=fo.Polylines(
                polylines=[
                    fo.Polyline(
                        label="lane",
                        points=[[(0.1, 0.1), (0.1, 0.6)]],
                        filled=False,
                    ),
                    fo.Polyline(
                        label="road",
                        points=[[(0.2, 0.2), (0.5, 0.5), (0.2, 0.5)]],
                        filled=True,
                    ),
                ]
            ),
        ),
        fo.Sample(
            filepath="/path/to/image2.png",
            predictions=fo.Polylines(
                polylines=[
                    fo.Polyline(
                        label="lane",
                        points=[[(0.4, 0.4), (0.9, 0.4)]],
                        filled=False,
                    ),
                    fo.Polyline(
                        label="road",
                        points=[[(0.6, 0.6), (0.9, 0.9), (0.6, 0.9)]],
                        filled=True,
                    ),
                ]
            ),
        ),
        fo.Sample(
            filepath="/path/to/image3.png",
            predictions=None,
        ),
    ]
)

#
# Only include polylines in the `predictions` field that are filled
#

view = dataset.filter_labels("predictions", F("filled") == True)

#
# Only include polylines in the `predictions` field whose `label`
# is "lane"
#

view = dataset.filter_labels("predictions", F("label") == "lane")

#
# Only include polylines in the `predictions` field with at least
# 3 vertices
#

num_vertices = F("points").map(F().length()).sum()
view = dataset.filter_labels("predictions", num_vertices >= 3)

Keypoints Examples:

import fiftyone as fo
from fiftyone import ViewField as F

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            predictions=fo.Keypoint(
                label="house",
                points=[(0.1, 0.1), (0.1, 0.9), (0.9, 0.9), (0.9, 0.1)],
            ),
        ),
        fo.Sample(
            filepath="/path/to/image2.png",
            predictions=fo.Keypoint(
                label="window",
                points=[(0.4, 0.4), (0.5, 0.5), (0.6, 0.6)],
            ),
        ),
        fo.Sample(
            filepath="/path/to/image3.png",
            predictions=None,
        ),
    ]
)

#
# Only include keypoints in the `predictions` field whose `label`
# is "house"
#

view = dataset.filter_labels("predictions", F("label") == "house")

#
# Only include keypoints in the `predictions` field with less than
# four points
#

view = dataset.filter_labels("predictions", F("points").length() < 4)
Parameters
field: the label field to filter
filter: a fiftyone.core.expressions.ViewExpression or MongoDB expression that returns a boolean describing the filter to apply
only_matches (True): whether to only include samples with at least one label after filtering (True) or include all samples (False)
trajectories (False): whether to match entire object trajectories for which the object matches the given filter on at least one frame. Only applicable to datasets that contain videos and frame-level label fields whose objects have their index attributes populated
Returns
a fiftyone.core.view.DatasetView
def first(self): (source)

Returns the first sample in the collection.

Returns
a fiftyone.core.sample.Sample or fiftyone.core.sample.SampleView
@view_stage
def flatten(self, stages=None): (source)

Returns a flattened view that contains all samples in the dynamic grouped collection.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("cifar10", split="test")

# Group samples by ground truth label
grouped_view = dataset.take(1000).group_by("ground_truth.label")
print(len(grouped_view))  # 10

# Return a flat view that contains 10 samples from each class
flat_view = grouped_view.flatten(fo.Limit(10))
print(len(flat_view))  # 100
Parameters
stages (None): a fiftyone.core.stages.ViewStage or list of fiftyone.core.stages.ViewStage instances to apply to each group's samples while flattening
Returns
a fiftyone.core.view.DatasetView
@view_stage
def geo_near(self, point, location_field=None, min_distance=None, max_distance=None, query=None, create_index=True): (source)

Sorts the samples in the collection by their proximity to a specified geolocation.

Note

This stage must be the first stage in any fiftyone.core.view.DatasetView in which it appears.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz

TIMES_SQUARE = [-73.9855, 40.7580]

dataset = foz.load_zoo_dataset("quickstart-geo")

#
# Sort the samples by their proximity to Times Square
#

view = dataset.geo_near(TIMES_SQUARE)

#
# Sort the samples by their proximity to Times Square, and only
# include samples within 5km
#

view = dataset.geo_near(TIMES_SQUARE, max_distance=5000)

#
# Sort the samples by their proximity to Times Square, and only
# include samples that are in Manhattan
#

import fiftyone.utils.geojson as foug

in_manhattan = foug.geo_within(
    "location.point",
    [
        [
            [-73.949701, 40.834487],
            [-73.896611, 40.815076],
            [-73.998083, 40.696534],
            [-74.031751, 40.715273],
            [-73.949701, 40.834487],
        ]
    ]
)

view = dataset.geo_near(
    TIMES_SQUARE, location_field="location", query=in_manhattan
)
Parameters
point

the reference point to compute distances to. Can be any of the following:

location_field (None):

the location data of each sample to use. Can be any of the following:

  • The name of a fiftyone.core.fields.GeoLocation field whose point attribute to use as location data
  • An embedded.field.name containing GeoJSON data to use as location data
  • None, in which case there must be a single fiftyone.core.fields.GeoLocation field on the samples, which is used by default
min_distance (None): filter samples that are less than this distance (in meters) from point
max_distance (None): filter samples that are greater than this distance (in meters) from point
query (None): an optional dict defining a MongoDB read query that samples must match in order to be included in this view
create_index (True): whether to create the required spherical index, if necessary
Returns
a fiftyone.core.view.DatasetView
@view_stage
def geo_within(self, boundary, location_field=None, strict=True, create_index=True): (source)

Filters the samples in this collection to only include samples whose geolocation is within a specified boundary.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz

MANHATTAN = [
    [
        [-73.949701, 40.834487],
        [-73.896611, 40.815076],
        [-73.998083, 40.696534],
        [-74.031751, 40.715273],
        [-73.949701, 40.834487],
    ]
]

dataset = foz.load_zoo_dataset("quickstart-geo")

#
# Create a view that only contains samples in Manhattan
#

view = dataset.geo_within(MANHATTAN)
Parameters
boundary: a fiftyone.core.labels.GeoLocation, fiftyone.core.labels.GeoLocations, GeoJSON dict, or list of coordinates that define a Polygon or MultiPolygon to search within
location_field (None):

the location data of each sample to use. Can be any of the following:

  • The name of a fiftyone.core.fields.GeoLocation field whose point attribute to use as location data
  • An embedded.field.name that directly contains the GeoJSON location data to use
  • None, in which case there must be a single fiftyone.core.fields.GeoLocation field on the samples, which is used by default
strict (True): whether a sample's location data must strictly fall within boundary (True) in order to match, or whether any intersection suffices (False)
create_index (True): whether to create the required spherical index, if necessary
Returns
a fiftyone.core.view.DatasetView
def get_annotation_info(self, anno_key): (source)

Returns information about the annotation run with the given key on this collection.

Parameters
anno_key: an annotation key
Returns
a fiftyone.core.annotation.AnnotationInfo
def get_brain_info(self, brain_key): (source)

Returns information about the brain method run with the given key on this collection.

Parameters
brain_key: a brain key
Returns
a fiftyone.core.brain.BrainInfo
def get_classes(self, field): (source)

Gets the classes list for the given field, or None if no classes are available.

Classes are first retrieved from classes if they exist, otherwise from default_classes.

Parameters
field: a field name
Returns
a list of classes, or None
def get_dynamic_field_schema(self, fields=None, recursive=True): (source)

Returns a schema dictionary describing the dynamic fields of the samples in the collection.

Dynamic fields are embedded document fields with at least one non-None value that have not been declared on the dataset's schema.

Parameters
fields (None): an optional field or iterable of fields for which to return dynamic fields. By default, all fields are considered
recursive (True): whether to recursively inspect nested lists and embedded documents
Returns
a dict mapping field paths to fiftyone.core.fields.Field instances or lists of them
def get_dynamic_frame_field_schema(self, fields=None, recursive=True): (source)

Returns a schema dictionary describing the dynamic fields of the frames in the collection.

Dynamic fields are embedded document fields with at least one non-None value that have not been declared on the dataset's schema.

Parameters
fields (None): an optional field or iterable of fields for which to return dynamic fields. By default, all fields are considered
recursive (True): whether to recursively inspect nested lists and embedded documents
Returns
a dict mapping field paths to fiftyone.core.fields.Field instances or lists of them, or None if the collection does not contain videos
def get_evaluation_info(self, eval_key): (source)

Returns information about the evaluation with the given key on this collection.

Parameters
eval_key: an evaluation key
Returns
a fiftyone.core.evaluation.EvaluationInfo
def get_field(self, path, ftype=None, embedded_doc_type=None, read_only=None, include_private=False, leaf=False): (source)

Returns the field instance for the provided path, or None if one does not exist.

Parameters
path: a field path
ftype (None): an optional field type to enforce. Must be a subclass of fiftyone.core.fields.Field
embedded_doc_type (None): an optional embedded document type to enforce. Must be a subclass of fiftyone.core.odm.BaseEmbeddedDocument
read_only (None): whether to optionally enforce that the field is read-only (True) or not read-only (False)
include_private (False): whether to include fields that start with _ in the returned schema
leaf (False): whether to return the subfield of list fields
Returns
a fiftyone.core.fields.Field instance or None
Raises
ValueError: if the field does not match the provided constraints
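Embedded paths such as embedded.field.name resolve one component at a time against the nested schema. A plain-Python sketch of that traversal (illustrative only; the real method operates on fiftyone.core.fields.Field instances, not strings):

```python
def get_field_sketch(schema, path):
    """Resolve an ``embedded.field.name`` path against a nested schema
    dict, returning None if any component is missing."""
    node = schema
    for part in path.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node

schema = {"ground_truth": {"label": "StringField", "confidence": "FloatField"}}
print(get_field_sketch(schema, "ground_truth.label"))    # StringField
print(get_field_sketch(schema, "ground_truth.missing"))  # None
```
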
def get_field_schema(self, ftype=None, embedded_doc_type=None, read_only=None, info_keys=None, created_after=None, include_private=False, flat=False, mode=None): (source)

Returns a schema dictionary describing the fields of the samples in the collection.

Parameters
ftype (None): an optional field type or iterable of types to which to restrict the returned schema. Must be subclass(es) of fiftyone.core.fields.Field
embedded_doc_type (None): an optional embedded document type or iterable of types to which to restrict the returned schema. Must be subclass(es) of fiftyone.core.odm.BaseEmbeddedDocument
read_only (None): whether to restrict to (True) or exclude (False) read-only fields. By default, all fields are included
info_keys (None): an optional key or list of keys that must be in the field's info dict
created_after (None): an optional datetime specifying a minimum creation date
include_private (False): whether to include fields that start with _ in the returned schema
flat (False): whether to return a flattened schema where all embedded document fields are included as top-level keys
mode (None): whether to apply the above constraints before and/or after flattening the schema. Only applicable when flat is True. Supported values are ("before", "after", "both"). The default is "after"
Returns
a dict mapping field names to fiftyone.core.fields.Field instances
def get_frame_field_schema(self, ftype=None, embedded_doc_type=None, read_only=None, info_keys=None, created_after=None, include_private=False, flat=False, mode=None): (source)

Returns a schema dictionary describing the fields of the frames in the collection.

Only applicable for collections that contain videos.

Parameters
ftype (None): an optional field type to which to restrict the returned schema. Must be a subclass of fiftyone.core.fields.Field
embedded_doc_type (None): an optional embedded document type to which to restrict the returned schema. Must be a subclass of fiftyone.core.odm.BaseEmbeddedDocument
read_only (None): whether to restrict to (True) or exclude (False) read-only fields. By default, all fields are included
info_keys (None): an optional key or list of keys that must be in the field's info dict
created_after (None): an optional datetime specifying a minimum creation date
include_private (False): whether to include fields that start with _ in the returned schema
flat (False): whether to return a flattened schema where all embedded document fields are included as top-level keys
mode (None): whether to apply the above constraints before and/or after flattening the schema. Only applicable when flat is True. Supported values are ("before", "after", "both"). The default is "after"
Returns
a dict mapping field names to fiftyone.core.fields.Field instances, or None if the collection does not contain videos
def get_group(self, group_id, group_slices=None): (source)

Returns a dict containing the samples for the given group ID.

Parameters
group_id: a group ID
group_slices (None): an optional subset of group slices to load
Returns
a dict mapping group names to fiftyone.core.sample.Sample or fiftyone.core.sample.SampleView instances
Raises
KeyError: if the group ID is not found
def get_index_information(self, include_stats=False): (source)

Returns a dictionary of information about the indexes on this collection.

See pymongo:pymongo.collection.Collection.index_information for details on the structure of this dictionary.

Parameters
include_stats (False): whether to include the size, usage, and build status of each index
Returns
a dict mapping index names to info dicts
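The returned dicts follow pymongo's index_information format, where each entry's "key" lists the indexed (field, direction) pairs. A hand-written representative value (the exact entries depend on the indexes that exist on your collection):

```python
# A representative (hand-written) index_information-style dict
index_info = {
    "_id_": {"v": 2, "key": [("_id", 1)]},
    "filepath_1": {"v": 2, "key": [("filepath", 1)]},
}

# The "key" entry lists the indexed (field, direction) pairs
for name, info in index_info.items():
    fields = [field for field, _ in info["key"]]
    print(name, "->", fields)
```
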
def get_mask_targets(self, field): (source)

Gets the mask targets for the given field, or None if no mask targets are available.

Mask targets are first retrieved from mask_targets if they exist, otherwise from default_mask_targets.

Parameters
field: a field name
Returns
the mask targets for the field, or None
def get_run_info(self, run_key): (source)

Returns information about the run with the given key on this collection.

Parameters
run_key: a run key
Returns
a fiftyone.core.runs.RunInfo
def get_skeleton(self, field): (source)

Gets the keypoint skeleton for the given field, or None if no skeleton is available.

Skeletons are first retrieved from skeletons if they exist, otherwise from default_skeleton.

Parameters
field: a field name
Returns
the keypoint skeleton for the field, or None
@view_stage
def group_by(self, field_or_expr, order_by=None, reverse=False, flat=False, match_expr=None, sort_expr=None, create_index=True): (source)

Creates a view that groups the samples in the collection by a specified field or expression.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("cifar10", split="test")

#
# Take 1000 samples at random and group them by ground truth label
#

view = dataset.take(1000).group_by("ground_truth.label")

for group in view.iter_dynamic_groups():
    group_value = group.first().ground_truth.label
    print("%s: %d" % (group_value, len(group)))

#
# Variation of above operation that arranges the groups in
# decreasing order of size and immediately flattens them
#

from itertools import groupby

view = dataset.take(1000).group_by(
    "ground_truth.label",
    flat=True,
    sort_expr=F().length(),
    reverse=True,
)

rle = lambda v: [(k, len(list(g))) for k, g in groupby(v)]
for label, count in rle(view.values("ground_truth.label")):
    print("%s: %d" % (label, count))
Parameters
field_or_expr: the field or embedded.field.name to group by, or a list of field names defining a compound group key, or a fiftyone.core.expressions.ViewExpression or MongoDB aggregation expression that defines the value to group by
order_by (None): an optional field by which to order the samples in each group
reverse (False): whether to return the results in descending order. Applies to both order_by and sort_expr
flat (False): whether to return a grouped collection (False) or a flattened collection (True)
match_expr (None): an optional fiftyone.core.expressions.ViewExpression or MongoDB aggregation expression that defines which groups to include in the output view. If provided, this expression will be evaluated on the list of samples in each group. Only applicable when flat=True
sort_expr (None): an optional fiftyone.core.expressions.ViewExpression or MongoDB aggregation expression that defines how to sort the groups in the output view. If provided, this expression will be evaluated on the list of samples in each group. Only applicable when flat=True
create_index (True): whether to create an index, if necessary, to optimize the grouping. Only applicable when grouping by field(s), not expressions
Returns
a fiftyone.core.view.DatasetView
def has_annotation_run(self, anno_key): (source)

Whether this collection has an annotation run with the given key.

Parameters
anno_key: an annotation key
Returns
True/False
def has_brain_run(self, brain_key): (source)

Whether this collection has a brain method run with the given key.

Parameters
brain_key: a brain key
Returns
True/False
def has_classes(self, field): (source)

Determines whether this collection has a classes list for the given field.

Classes may be defined either in classes or default_classes.

Parameters
field: a field name
Returns
True/False
def has_evaluation(self, eval_key): (source)

Whether this collection has an evaluation with the given key.

Parameters
eval_key: an evaluation key
Returns
True/False
def has_field(self, path): (source)

Determines whether the collection has a field with the given name.

Parameters
path: the field name or embedded.field.name
Returns
True/False
def has_frame_field(self, path): (source)

Determines whether the collection has a frame-level field with the given name.

Parameters
path: the field name or embedded.field.name
Returns
True/False
def has_mask_targets(self, field): (source)

Determines whether this collection has mask targets for the given field.

Mask targets may be defined either in mask_targets or default_mask_targets.

Parameters
field: a field name
Returns
True/False
def has_run(self, run_key): (source)

Whether this collection has a run with the given key.

Parameters
run_key: a run key
Returns
True/False
def has_sample_field(self, path): (source)

Determines whether the collection has a sample field with the given name.

Parameters
path: the field name or embedded.field.name
Returns
True/False
def has_skeleton(self, field): (source)

Determines whether this collection has a keypoint skeleton for the given field.

Keypoint skeletons may be defined either in skeletons or default_skeleton.

Parameters
field: a field name
Returns
True/False
def head(self, num_samples=3): (source)

Returns a list of the first few samples in the collection.

If fewer than num_samples samples are in the collection, only the available samples are returned.

Parameters
num_samples (3): the number of samples
Returns
a list of fiftyone.core.sample.Sample objects
@aggregation
def histogram_values(self, field_or_expr, expr=None, bins=None, range=None, auto=False): (source)

Computes a histogram of the field values in the collection.

This aggregation is typically applied to numeric field types (or lists of such types):

Examples:

import numpy as np
import matplotlib.pyplot as plt

import fiftyone as fo
from fiftyone import ViewField as F

samples = []
for idx in range(100):
    samples.append(
        fo.Sample(
            filepath="/path/to/image%d.png" % idx,
            numeric_field=np.random.randn(),
            numeric_list_field=list(np.random.randn(10)),
        )
    )

dataset = fo.Dataset()
dataset.add_samples(samples)

def plot_hist(counts, edges):
    counts = np.asarray(counts)
    edges = np.asarray(edges)
    left_edges = edges[:-1]
    widths = edges[1:] - edges[:-1]
    plt.bar(left_edges, counts, width=widths, align="edge")

#
# Compute a histogram of a numeric field
#

counts, edges, other = dataset.histogram_values(
    "numeric_field", bins=50, range=(-4, 4)
)

plot_hist(counts, edges)
plt.show(block=False)

#
# Compute the histogram of a numeric list field
#

counts, edges, other = dataset.histogram_values(
    "numeric_list_field", bins=50
)

plot_hist(counts, edges)
plt.show(block=False)

#
# Compute the histogram of a transformation of a numeric field
#

counts, edges, other = dataset.histogram_values(
    2 * (F("numeric_field") + 1), bins=50
)

plot_hist(counts, edges)
plt.show(block=False)
Parameters
field_or_expr: a field name, embedded.field.name, fiftyone.core.expressions.ViewExpression, or MongoDB expression defining the field or expression to aggregate. This can also be a list or tuple of such arguments, in which case a tuple of corresponding aggregation results (each receiving the same additional keyword arguments, if any) will be returned
expr (None): a fiftyone.core.expressions.ViewExpression or MongoDB expression to apply to field_or_expr (which must be a field) before aggregating
bins (None): either an integer number of bins to generate or a monotonically increasing sequence specifying the bin edges to use. By default, 10 bins are created. If bins is an integer and no range is specified, bin edges are automatically distributed in an attempt to evenly distribute the counts in each bin
range (None): a (lower, upper) tuple specifying a range in which to generate equal-width bins. Only applicable when bins is an integer
auto (False): whether to automatically choose bin edges in an attempt to evenly distribute the counts in each bin. If this option is chosen, bins will only be used if it is an integer, and the range parameter is ignored
Returns
a tuple of
  • counts: a list of counts in each bin
  • edges: an increasing list of bin edges of length len(counts) + 1. Note that each bin is treated as having an inclusive lower boundary and exclusive upper boundary, [lower, upper), including the rightmost bin
  • other: the number of items outside the bins
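The binning semantics above (equal-width bins over a range, each bin a half-open [lower, upper) interval, with out-of-range values tallied in other) can be sketched in plain Python. This is an illustrative model, not FiftyOne's aggregation pipeline:

```python
def histogram_sketch(values, bins, value_range):
    """Count ``values`` into ``bins`` equal-width [lower, upper) bins
    over ``value_range``; values outside all bins land in ``other``."""
    lower, upper = value_range
    width = (upper - lower) / bins
    edges = [lower + i * width for i in range(bins + 1)]
    counts = [0] * bins
    other = 0
    for v in values:
        if lower <= v < upper:
            counts[int((v - lower) // width)] += 1
        else:
            # includes v == upper, since even the rightmost bin is [lower, upper)
            other += 1
    return counts, edges, other

counts, edges, other = histogram_sketch(
    [0.5, 1.5, 2.5, 5.0], bins=4, value_range=(0, 4)
)
print(counts)  # [1, 1, 1, 0]
print(other)   # 1
```
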
def init_run(self, **kwargs): (source)

Initializes a config instance for a new run.

Parameters
**kwargs: JSON serializable config parameters
Returns
a fiftyone.core.runs.RunConfig
def init_run_results(self, run_key, **kwargs): (source)

Initializes a results instance for the run with the given key.

Parameters
run_key: a run key
**kwargs: JSON serializable data
Returns
a fiftyone.core.runs.RunResults
def iter_groups(self, group_slices=None, progress=False, autosave=False, batch_size=None, batching_strategy=None): (source)

Returns an iterator over the groups in the collection.

Parameters
group_slices (None): an optional subset of group slices to load
progress (False): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
autosave (False): whether to automatically save changes to samples emitted by this iterator
batch_size (None): the batch size to use when autosaving samples. If a batching_strategy is provided, this parameter configures the strategy as described below. If no batching_strategy is provided, this can either be an integer specifying the number of samples to save in a batch (in which case batching_strategy is implicitly set to "static") or a float number of seconds between batched saves (in which case batching_strategy is implicitly set to "latency")
batching_strategy (None):

the batching strategy to use for each save operation when autosaving samples. Supported values are:

  • "static": a fixed sample batch size for each save
  • "size": a target batch size, in bytes, for each save
  • "latency": a target latency, in seconds, between saves

By default, fo.config.default_batcher is used

Returns
an iterator that emits dicts mapping group slice names to fiftyone.core.sample.Sample or fiftyone.core.sample.SampleView instances, one per group
def iter_samples(self, progress=False, autosave=False, batch_size=None, batching_strategy=None): (source)

Returns an iterator over the samples in the collection.

Parameters
progress (False): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
autosave (False): whether to automatically save changes to samples emitted by this iterator
batch_size (None): the batch size to use when autosaving samples. If a batching_strategy is provided, this parameter configures the strategy as described below. If no batching_strategy is provided, this can either be an integer specifying the number of samples to save in a batch (in which case batching_strategy is implicitly set to "static") or a float number of seconds between batched saves (in which case batching_strategy is implicitly set to "latency")
batching_strategy (None):

the batching strategy to use for each save operation when autosaving samples. Supported values are:

  • "static": a fixed sample batch size for each save
  • "size": a target batch size, in bytes, for each save
  • "latency": a target latency, in seconds, between saves

By default, fo.config.default_batcher is used

Returns
an iterator over fiftyone.core.sample.Sample or fiftyone.core.sample.SampleView instances
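The three batching strategies differ only in the condition that triggers a flush of pending saves. A plain-Python sketch of those conditions (illustrative only, not FiftyOne's actual batcher; the byte-size estimate here is a hypothetical stand-in):

```python
import time

def should_flush(strategy, batch_size, pending, started_at, now=None):
    """Decide whether an autosave batch should be flushed under the
    given strategy (illustrative sketch, not FiftyOne's implementation)."""
    if strategy == "static":
        # flush after a fixed number of pending samples
        return len(pending) >= batch_size
    if strategy == "size":
        # flush once the pending batch reaches a target size in bytes
        # (here crudely estimated from the string form of each sample)
        return sum(len(str(s)) for s in pending) >= batch_size
    if strategy == "latency":
        # flush after a target number of seconds since the last save
        now = time.monotonic() if now is None else now
        return now - started_at >= batch_size
    raise ValueError("unsupported strategy: %s" % strategy)

print(should_flush("static", 3, ["a", "b", "c"], 0))        # True
print(should_flush("latency", 0.5, ["a"], 10.0, now=10.6))  # True
```
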
def last(self): (source)

Returns the last sample in the collection.

Returns
a fiftyone.core.sample.Sample or fiftyone.core.sample.SampleView
@view_stage
def limit(self, limit): (source)

Returns a view with at most the given number of samples.

Examples:

import fiftyone as fo

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            ground_truth=fo.Classification(label="cat"),
        ),
        fo.Sample(
            filepath="/path/to/image2.png",
            ground_truth=fo.Classification(label="dog"),
        ),
        fo.Sample(
            filepath="/path/to/image3.png",
            ground_truth=None,
        ),
    ]
)

#
# Only include the first 2 samples in the view
#

view = dataset.limit(2)
Parameters
limit: the maximum number of samples to return. If a non-positive number is provided, an empty view is returned
Returns
a fiftyone.core.view.DatasetView
@view_stage
def limit_labels(self, field, limit): (source)

Limits the number of fiftyone.core.labels.Label instances in the specified labels list field of each sample in the collection.

The specified field must be a labels list field, e.g., fiftyone.core.labels.Classifications, fiftyone.core.labels.Detections, fiftyone.core.labels.Keypoints, or fiftyone.core.labels.Polylines.

Examples:

import fiftyone as fo
from fiftyone import ViewField as F

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            predictions=fo.Detections(
                detections=[
                    fo.Detection(
                        label="cat",
                        bounding_box=[0.1, 0.1, 0.5, 0.5],
                        confidence=0.9,
                    ),
                    fo.Detection(
                        label="dog",
                        bounding_box=[0.2, 0.2, 0.3, 0.3],
                        confidence=0.8,
                    ),
                ]
            ),
        ),
        fo.Sample(
            filepath="/path/to/image2.png",
            predictions=fo.Detections(
                detections=[
                    fo.Detection(
                        label="cat",
                        bounding_box=[0.5, 0.5, 0.4, 0.4],
                        confidence=0.95,
                    ),
                    fo.Detection(label="rabbit"),
                ]
            ),
        ),
        fo.Sample(
            filepath="/path/to/image4.png",
            predictions=None,
        ),
    ]
)

#
# Only include the first detection in the `predictions` field of
# each sample
#

view = dataset.limit_labels("predictions", 1)
Parameters
field: the labels list field to filter
limit: the maximum number of labels to include in each labels list. If a non-positive number is provided, all lists will be empty
Returns
a fiftyone.core.view.DatasetView
def list_annotation_runs(self, type=None, method=None, **kwargs): (source)

Returns a list of annotation keys on this collection.

Parameters
type (None): a specific annotation run type to match, which can be:

  • a string fiftyone.core.annotations.AnnotationMethodConfig.type
  • a fiftyone.core.annotations.AnnotationMethod class or its fully-qualified class name string

method (None): a specific fiftyone.core.annotations.AnnotationMethodConfig.method string to match
**kwargs: optional config parameters to match
Returns
a list of annotation keys
def list_brain_runs(self, type=None, method=None, **kwargs): (source)

Returns a list of brain keys on this collection.

Parameters
type (None): a specific brain run type to match, which can be:

  • a string fiftyone.core.brain.BrainMethodConfig.type
  • a fiftyone.core.brain.BrainMethod class or its fully-qualified class name string

method (None): a specific fiftyone.core.brain.BrainMethodConfig.method string to match
**kwargs: optional config parameters to match
Returns
a list of brain keys
def list_evaluations(self, type=None, method=None, **kwargs): (source)

Returns a list of evaluation keys on this collection.

Parameters
type (None): a specific evaluation type to match, which can be:

  • a string fiftyone.core.evaluations.EvaluationMethodConfig.type
  • a fiftyone.core.evaluations.EvaluationMethod class or its fully-qualified class name string

method (None): a specific fiftyone.core.evaluations.EvaluationMethodConfig.method string to match
**kwargs: optional config parameters to match
Returns
a list of evaluation keys
def list_indexes(self): (source)

Returns the list of index names on this collection.

Single-field indexes are referenced by their field name, while compound indexes are referenced by more complicated strings. See pymongo:pymongo.collection.Collection.index_information for details on the compound format.

Returns
the list of index names
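For reference, pymongo's default name for an index joins each (field, direction) pair of the specification with underscores, which is why compound index names look more complicated than single-field ones. A sketch of that convention:

```python
def default_index_name(spec):
    """Build pymongo's default name for an index specification, e.g.
    [("ground_truth.label", 1), ("filepath", -1)]."""
    return "_".join("%s_%s" % (field, direction) for field, direction in spec)

print(default_index_name([("filepath", 1)]))
# filepath_1
print(default_index_name([("ground_truth.label", 1), ("_id", -1)]))
# ground_truth.label_1__id_-1
```
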
def list_runs(self, **kwargs): (source)

Returns a list of run keys on this collection.

Parameters
**kwargs: optional config parameters to match
Returns
a list of run keys
@aggregation
def list_schema(self, field_or_expr, expr=None): (source)

Extracts the value type(s) in a specified list field across all samples in the collection.

Examples:

from datetime import datetime
import fiftyone as fo

dataset = fo.Dataset()

sample1 = fo.Sample(
    filepath="image1.png",
    ground_truth=fo.Classification(
        label="cat",
        info=[
            fo.DynamicEmbeddedDocument(
                task="initial_annotation",
                author="Alice",
                timestamp=datetime(1970, 1, 1),
                notes=["foo", "bar"],
            ),
            fo.DynamicEmbeddedDocument(
                task="editing_pass",
                author="Bob",
                timestamp=datetime.utcnow(),
            ),
        ],
    ),
)

sample2 = fo.Sample(
    filepath="image2.png",
    ground_truth=fo.Classification(
        label="dog",
        info=[
            fo.DynamicEmbeddedDocument(
                task="initial_annotation",
                author="Bob",
                timestamp=datetime(2018, 10, 18),
                notes=["spam", "eggs"],
            ),
        ],
    ),
)

dataset.add_samples([sample1, sample2])

# Determine that `ground_truth.info` contains embedded documents
print(dataset.list_schema("ground_truth.info"))
# fo.EmbeddedDocumentField

# Determine the fields of the embedded documents in the list
print(dataset.schema("ground_truth.info[]"))
# {'task': StringField, ..., 'notes': ListField}

# Determine the type of the values in the nested `notes` list field
# Since `ground_truth.info` is not yet declared on the dataset's
# schema, we must manually include `[]` to unwind the info lists
print(dataset.list_schema("ground_truth.info[].notes"))
# fo.StringField

# Declare the `ground_truth.info` field
dataset.add_sample_field(
    "ground_truth.info",
    fo.ListField,
    subfield=fo.EmbeddedDocumentField,
    embedded_doc_type=fo.DynamicEmbeddedDocument,
)

# Now we can inspect the nested `notes` field without unwinding
print(dataset.list_schema("ground_truth.info.notes"))
# fo.StringField
Parameters
field_or_expr: a field name, embedded.field.name, fiftyone.core.expressions.ViewExpression, or MongoDB expression defining the field or expression to aggregate
expr (None): a fiftyone.core.expressions.ViewExpression or MongoDB expression to apply to field_or_expr (which must be a field) before aggregating
Returns
a fiftyone.core.fields.Field or list of fiftyone.core.fields.Field instances describing the value type(s) in the list
def load_annotation_results(self, anno_key, cache=True, **kwargs): (source)

Loads the results for the annotation run with the given key on this collection.

The fiftyone.utils.annotations.AnnotationResults object returned by this method will provide a variety of backend-specific methods allowing you to perform actions such as checking the status and deleting this run from the annotation backend.

Use load_annotations to load the labels from an annotation run onto your FiftyOne dataset.

Parameters
anno_key: an annotation key
cache (True): whether to cache the results on the collection
**kwargs: keyword arguments for the run's fiftyone.core.annotation.AnnotationMethodConfig.load_credentials method
Returns
a fiftyone.utils.annotations.AnnotationResults
def load_annotation_view(self, anno_key, select_fields=False): (source)

Loads the fiftyone.core.view.DatasetView on which the specified annotation run was performed on this collection.

Parameters
anno_key: an annotation key
select_fields (False): whether to exclude fields involved in other annotation runs
Returns
a fiftyone.core.view.DatasetView
def load_annotations(self, anno_key, dest_field=None, unexpected='prompt', cleanup=False, progress=None, **kwargs): (source)

Downloads the labels from the given annotation run from the annotation backend and merges them into this collection.

See the annotation loading documentation for more information about using this method to import annotations that you have scheduled by calling annotate.

Parameters
anno_key: an annotation key
dest_field (None): an optional name of a new destination field into which to load the annotations, or a dict mapping field names in the run's label schema to new destination field names
unexpected ("prompt"): how to deal with any unexpected labels that don't match the run's label schema when importing. The supported values are:

  • "prompt": present an interactive prompt to direct/discard unexpected labels
  • "ignore": automatically ignore any unexpected labels
  • "keep": automatically keep all unexpected labels in a field whose name matches the label type
  • "return": return a dict containing all unexpected labels, or None if there aren't any

cleanup (False): whether to delete any information regarding this run from the annotation backend after loading the annotations
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
**kwargs: keyword arguments for the run's fiftyone.core.annotation.AnnotationMethodConfig.load_credentials method
Returns
None, unless unexpected=="return" and unexpected labels are found, in which case a dict containing the extra labels is returned
def load_brain_results(self, brain_key, cache=True, load_view=True, **kwargs): (source)

Loads the results for the brain method run with the given key on this collection.

Parameters
brain_key: a brain key
cache (True): whether to cache the results on the collection
load_view (True): whether to load the view on which the results were computed (True) or the full dataset (False)
**kwargs: keyword arguments for the run's fiftyone.core.brain.BrainMethodConfig.load_credentials method
Returns
a fiftyone.core.brain.BrainResults
def load_brain_view(self, brain_key, select_fields=False): (source)

Loads the fiftyone.core.view.DatasetView on which the specified brain method run was performed on this collection.

Parameters
brain_key: a brain key
select_fields (False): whether to exclude fields involved in other brain method runs
Returns
a fiftyone.core.view.DatasetView
def load_evaluation_results(self, eval_key, cache=True, **kwargs): (source)

Loads the results for the evaluation with the given key on this collection.

Parameters
eval_key: an evaluation key
cache (True): whether to cache the results on the collection
**kwargs: keyword arguments for the run's fiftyone.core.evaluation.EvaluationMethodConfig.load_credentials method
Returns
a fiftyone.core.evaluation.EvaluationResults
def load_evaluation_view(self, eval_key, select_fields=False): (source)

Loads the fiftyone.core.view.DatasetView on which the specified evaluation was performed on this collection.

Parameters
eval_key: an evaluation key
select_fields (False): whether to exclude fields involved in other evaluations
Returns
a fiftyone.core.view.DatasetView
def load_run_results(self, run_key, cache=True, load_view=True, **kwargs): (source)

Loads the results for the run with the given key on this collection.

Parameters
run_key: a run key
cache (True): whether to cache the results on the collection
load_view (True): whether to load the view on which the results were computed (True) or the full dataset (False)
**kwargs: keyword arguments for the run's fiftyone.core.runs.RunConfig.load_credentials method
Returns
a fiftyone.core.runs.RunResults
def load_run_view(self, run_key, select_fields=False): (source)

Loads the fiftyone.core.view.DatasetView on which the specified run was performed on this collection.

Parameters
run_key: a run key
select_fields (False): whether to exclude fields involved in other runs
Returns
a fiftyone.core.view.DatasetView
def make_unique_field_name(self, root=''): (source)

Makes a unique field name with the given root name for the collection.

Parameters
root (""): an optional root for the output field name
Returns
the field name
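Uniqueness here simply means avoiding collisions with the collection's existing field names. A plain-Python sketch of one way to do that (illustrative; the suffix scheme is a hypothetical choice, not necessarily the one FiftyOne uses):

```python
def make_unique_name(root, existing):
    """Generate a field name based on ``root`` that does not collide
    with any name in ``existing`` (illustrative sketch)."""
    name = root or "field"
    idx = 1
    while name in existing:
        name = "%s_%d" % (root or "field", idx)
        idx += 1
    return name

print(make_unique_name("tmp", {"filepath", "tags"}))   # tmp
print(make_unique_name("tags", {"filepath", "tags"}))  # tags_1
```
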
@view_stage
def map_labels(self, field, map): (source)

Maps the label values of a fiftyone.core.labels.Label field to new values for each sample in the collection.

Examples:

import fiftyone as fo
from fiftyone import ViewField as F

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            weather=fo.Classification(label="sunny"),
            predictions=fo.Detections(
                detections=[
                    fo.Detection(
                        label="cat",
                        bounding_box=[0.1, 0.1, 0.5, 0.5],
                        confidence=0.9,
                    ),
                    fo.Detection(
                        label="dog",
                        bounding_box=[0.2, 0.2, 0.3, 0.3],
                        confidence=0.8,
                    ),
                ]
            ),
        ),
        fo.Sample(
            filepath="/path/to/image2.png",
            weather=fo.Classification(label="cloudy"),
            predictions=fo.Detections(
                detections=[
                    fo.Detection(
                        label="cat",
                        bounding_box=[0.5, 0.5, 0.4, 0.4],
                        confidence=0.95,
                    ),
                    fo.Detection(label="rabbit"),
                ]
            ),
        ),
        fo.Sample(
            filepath="/path/to/image3.png",
            weather=fo.Classification(label="partly cloudy"),
            predictions=fo.Detections(
                detections=[
                    fo.Detection(
                        label="squirrel",
                        bounding_box=[0.25, 0.25, 0.5, 0.5],
                        confidence=0.5,
                    ),
                ]
            ),
        ),
        fo.Sample(
            filepath="/path/to/image4.png",
            predictions=None,
        ),
    ]
)

#
# Map the "partly cloudy" weather label to "cloudy"
#

view = dataset.map_labels("weather", {"partly cloudy": "cloudy"})

#
# Map "rabbit" and "squirrel" predictions to "other"
#

view = dataset.map_labels(
    "predictions", {"rabbit": "other", "squirrel": "other"}
)
Parameters
field: the labels field to map
map: a dict mapping label values to new label values
Returns
a fiftyone.core.view.DatasetView
@view_stage
def match(self, filter): (source)

Filters the samples in the collection by the given filter.

Examples:

import fiftyone as fo
from fiftyone import ViewField as F

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            weather=fo.Classification(label="sunny"),
            predictions=fo.Detections(
                detections=[
                    fo.Detection(
                        label="cat",
                        bounding_box=[0.1, 0.1, 0.5, 0.5],
                        confidence=0.9,
                    ),
                    fo.Detection(
                        label="dog",
                        bounding_box=[0.2, 0.2, 0.3, 0.3],
                        confidence=0.8,
                    ),
                ]
            ),
        ),
        fo.Sample(
            filepath="/path/to/image2.jpg",
            weather=fo.Classification(label="cloudy"),
            predictions=fo.Detections(
                detections=[
                    fo.Detection(
                        label="cat",
                        bounding_box=[0.5, 0.5, 0.4, 0.4],
                        confidence=0.95,
                    ),
                    fo.Detection(label="rabbit"),
                ]
            ),
        ),
        fo.Sample(
            filepath="/path/to/image3.png",
            weather=fo.Classification(label="partly cloudy"),
            predictions=fo.Detections(
                detections=[
                    fo.Detection(
                        label="squirrel",
                        bounding_box=[0.25, 0.25, 0.5, 0.5],
                        confidence=0.5,
                    ),
                ]
            ),
        ),
        fo.Sample(
            filepath="/path/to/image4.jpg",
            predictions=None,
        ),
    ]
)

#
# Only include samples whose `filepath` ends with ".jpg"
#

view = dataset.match(F("filepath").ends_with(".jpg"))

#
# Only include samples whose `weather` field is "sunny"
#

view = dataset.match(F("weather").label == "sunny")

#
# Only include samples with at least 2 objects in their
# `predictions` field
#

view = dataset.match(F("predictions").detections.length() >= 2)

#
# Only include samples whose `predictions` field contains at least
# one object with area smaller than 0.2
#

# Bboxes are in [top-left-x, top-left-y, width, height] format
bbox = F("bounding_box")
bbox_area = bbox[2] * bbox[3]

small_boxes = F("predictions.detections").filter(bbox_area < 0.2)
view = dataset.match(small_boxes.length() > 0)
Parameters
filter: a fiftyone.core.expressions.ViewExpression or MongoDB expression that returns a boolean describing the filter to apply
Returns
a fiftyone.core.view.DatasetView
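The bounding box expression in the last example above can be mirrored in plain Python to see what the stage is computing. This is an illustrative sketch of the semantics only (hypothetical helper names, no FiftyOne required), not the actual server-side implementation:

```python
# Boxes are [top-left-x, top-left-y, width, height] in relative coordinates,
# so the relative area is simply width * height

def bbox_area(bounding_box):
    """Relative area of an [x, y, w, h] box."""
    _, _, w, h = bounding_box
    return w * h

def has_small_box(boxes, thresh=0.2):
    """Mirrors `small_boxes.length() > 0` for a list of boxes."""
    return any(bbox_area(b) < thresh for b in boxes)

samples = {
    "image1": [[0.1, 0.1, 0.5, 0.5], [0.2, 0.2, 0.3, 0.3]],  # areas 0.25, 0.09
    "image3": [[0.25, 0.25, 0.5, 0.5]],                      # area 0.25
}
matched = [name for name, boxes in samples.items() if has_small_box(boxes)]
print(matched)  # only image1 has a box with area < 0.2
```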
@view_stage
def match_frames(self, filter, omit_empty=True): (source)

Filters the frames in the video collection by the given filter.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("quickstart-video")

#
# Match frames with at least 10 detections
#

num_objects = F("detections.detections").length()
view = dataset.match_frames(num_objects > 10)

print(dataset.count())
print(view.count())

print(dataset.count("frames"))
print(view.count("frames"))
Parameters
filter: a fiftyone.core.expressions.ViewExpression or MongoDB aggregation expression that returns a boolean describing the filter to apply
omit_empty (True): whether to omit samples with no frame labels after filtering
Returns
a fiftyone.core.view.DatasetView
@view_stage
def match_labels(self, labels=None, ids=None, tags=None, filter=None, fields=None, bool=None): (source)

Selects the samples from the collection that contain (or do not contain) at least one label that matches the specified criteria.

Note that, unlike select_labels and filter_labels, this stage will not filter the labels themselves; it only selects the corresponding samples.

You can perform a selection via one or more of the following methods:

  • Provide the labels argument, which should contain a list of dicts in the format returned by fiftyone.core.session.Session.selected_labels, to match specific labels
  • Provide the ids argument to match labels with specific IDs
  • Provide the tags argument to match labels with specific tags
  • Provide the filter argument to match labels based on a boolean fiftyone.core.expressions.ViewExpression that is applied to each individual fiftyone.core.labels.Label element
  • Pass bool=False to negate the operation and instead match samples that do not contain at least one label matching the specified criteria

If multiple criteria are specified, labels must match all of them in order to trigger a sample match.

By default, the selection is applied to all fiftyone.core.labels.Label fields, but you can provide the fields argument to explicitly define the field(s) in which to search.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("quickstart")

#
# Only show samples whose labels are currently selected in the App
#

session = fo.launch_app(dataset)

# Select some labels in the App...

view = dataset.match_labels(labels=session.selected_labels)

#
# Only include samples that contain labels with the specified IDs
#

# Grab some label IDs
ids = [
    dataset.first().ground_truth.detections[0].id,
    dataset.last().predictions.detections[0].id,
]

view = dataset.match_labels(ids=ids)

print(len(view))
print(view.count("ground_truth.detections"))
print(view.count("predictions.detections"))

#
# Only include samples that contain labels with the specified tags
#

# Grab some label IDs
ids = [
    dataset.first().ground_truth.detections[0].id,
    dataset.last().predictions.detections[0].id,
]

# Give the labels a "test" tag
dataset = dataset.clone()  # create copy since we're modifying data
dataset.select_labels(ids=ids).tag_labels("test")

print(dataset.count_values("ground_truth.detections.tags"))
print(dataset.count_values("predictions.detections.tags"))

# Retrieve the labels via their tag
view = dataset.match_labels(tags="test")

print(len(view))
print(view.count("ground_truth.detections"))
print(view.count("predictions.detections"))

#
# Only include samples that contain labels matching a filter
#

filter = F("confidence") > 0.99
view = dataset.match_labels(filter=filter, fields="predictions")

print(len(view))
print(view.count("ground_truth.detections"))
print(view.count("predictions.detections"))
Parameters
labels (None): a list of dicts specifying the labels to select in the format returned by fiftyone.core.session.Session.selected_labels
ids (None): an ID or iterable of IDs of the labels to select
tags (None): a tag or iterable of tags of labels to select
filter (None): a fiftyone.core.expressions.ViewExpression or MongoDB aggregation expression that returns a boolean describing whether to select a given label. In the case of list fields like fiftyone.core.labels.Detections, the filter is applied to the list elements, not the root field
fields (None): a field or iterable of fields from which to select
bool (None): whether to match samples that have (None or True) or do not have (False) at least one label that matches the specified criteria
Returns
a fiftyone.core.view.DatasetView
@view_stage
def match_tags(self, tags, bool=None, all=False): (source)

Returns a view containing the samples in the collection that have or don't have any/all of the given tag(s).

Examples:

import fiftyone as fo

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(filepath="image1.png", tags=["train"]),
        fo.Sample(filepath="image2.png", tags=["test"]),
        fo.Sample(filepath="image3.png", tags=["train", "test"]),
        fo.Sample(filepath="image4.png"),
    ]
)

#
# Only include samples that have the "test" tag
#

view = dataset.match_tags("test")

#
# Only include samples that do not have the "test" tag
#

view = dataset.match_tags("test", bool=False)

#
# Only include samples that have the "test" or "train" tags
#

view = dataset.match_tags(["test", "train"])

#
# Only include samples that have the "test" and "train" tags
#

view = dataset.match_tags(["test", "train"], all=True)

#
# Only include samples that do not have the "test" or "train" tags
#

view = dataset.match_tags(["test", "train"], bool=False)

#
# Only include samples that do not have the "test" and "train" tags
#

view = dataset.match_tags(["test", "train"], bool=False, all=True)
Parameters
tags: the tag or iterable of tags to match
bool (None): whether to match samples that have (None or True) or do not have (False) the given tags
all (False): whether to match samples that have (or don't have) all (True) or any (False) of the given tags
Returns
a fiftyone.core.view.DatasetView
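How the bool and all arguments combine can be sketched with plain Python sets. This is a hypothetical illustration of the keep/drop decision described above, not FiftyOne's actual implementation:

```python
def matches_tags(sample_tags, tags, bool=None, all=False):
    """Sketch of the keep/drop decision made by match_tags()."""
    required, have = set(tags), set(sample_tags)
    # all=True requires every tag; all=False requires any tag
    hit = required <= have if all else len(required & have) > 0
    # bool=False negates the match
    return hit if bool in (None, True) else not hit

samples = {
    "image1": ["train"],
    "image2": ["test"],
    "image3": ["train", "test"],
    "image4": [],
}
keep = [s for s, t in samples.items() if matches_tags(t, ["test", "train"], all=True)]
print(keep)  # only image3 has both tags
```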
@aggregation
def max(self, field_or_expr, expr=None, safe=False): (source)

Computes the maximum of a numeric field of the collection.

None-valued fields are ignored.

This aggregation is typically applied to numeric or date field types (or lists of such types):

Examples:

import fiftyone as fo
from fiftyone import ViewField as F

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            numeric_field=1.0,
            numeric_list_field=[1, 2, 3],
        ),
        fo.Sample(
            filepath="/path/to/image2.png",
            numeric_field=4.0,
            numeric_list_field=[1, 2],
        ),
        fo.Sample(
            filepath="/path/to/image3.png",
            numeric_field=None,
            numeric_list_field=None,
        ),
    ]
)

#
# Compute the maximum of a numeric field
#

max = dataset.max("numeric_field")
print(max)  # the max

#
# Compute the maximum of a numeric list field
#

max = dataset.max("numeric_list_field")
print(max)  # the max

#
# Compute the maximum of a transformation of a numeric field
#

max = dataset.max(2 * (F("numeric_field") + 1))
print(max)  # the max
Parameters
field_or_expr: a field name, embedded.field.name, fiftyone.core.expressions.ViewExpression, or MongoDB expression defining the field or expression to aggregate. This can also be a list or tuple of such arguments, in which case a tuple of corresponding aggregation results (each receiving the same additional keyword arguments, if any) will be returned
expr (None): a fiftyone.core.expressions.ViewExpression or MongoDB expression to apply to field_or_expr (which must be a field) before aggregating
safe (False): whether to ignore nan/inf values when dealing with floating point values
Returns
the maximum value
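The None-skipping and safe-flag behaviors described above can be sketched in plain Python. This is an illustrative approximation (hypothetical helper, not FiftyOne's server-side aggregation): None values are always ignored, and safe=True additionally drops non-finite floats:

```python
import math

def safe_max(values, safe=False):
    """Sketch of max() semantics: skip None; skip nan/inf when safe=True."""
    vals = [v for v in values if v is not None]
    if safe:
        vals = [v for v in vals if math.isfinite(v)]
    return max(vals) if vals else None

print(safe_max([1.0, 4.0, None]))                      # 4.0
print(safe_max([1.0, float("inf"), None], safe=True))  # 1.0
```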
@aggregation
def mean(self, field_or_expr, expr=None, safe=False): (source)

Computes the arithmetic mean of the field values of the collection.

None-valued fields are ignored.

This aggregation is typically applied to numeric field types (or lists of such types):

Examples:

import fiftyone as fo
from fiftyone import ViewField as F

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            numeric_field=1.0,
            numeric_list_field=[1, 2, 3],
        ),
        fo.Sample(
            filepath="/path/to/image2.png",
            numeric_field=4.0,
            numeric_list_field=[1, 2],
        ),
        fo.Sample(
            filepath="/path/to/image3.png",
            numeric_field=None,
            numeric_list_field=None,
        ),
    ]
)

#
# Compute the mean of a numeric field
#

mean = dataset.mean("numeric_field")
print(mean)  # the mean

#
# Compute the mean of a numeric list field
#

mean = dataset.mean("numeric_list_field")
print(mean)  # the mean

#
# Compute the mean of a transformation of a numeric field
#

mean = dataset.mean(2 * (F("numeric_field") + 1))
print(mean)  # the mean
Parameters
field_or_expr: a field name, embedded.field.name, fiftyone.core.expressions.ViewExpression, or MongoDB expression defining the field or expression to aggregate. This can also be a list or tuple of such arguments, in which case a tuple of corresponding aggregation results (each receiving the same additional keyword arguments, if any) will be returned
expr (None): a fiftyone.core.expressions.ViewExpression or MongoDB expression to apply to field_or_expr (which must be a field) before aggregating
safe (False): whether to ignore nan/inf values when dealing with floating point values
Returns
the mean
def merge_labels(self, in_field, out_field): (source)

Merges the labels from the given input field into the given output field of the collection.

If this collection is a dataset, the input field is deleted after the merge.

If this collection is a view, the input field will still exist on the underlying dataset but will only contain the labels not present in this view.

Parameters
in_field: the name of the input label field
out_field: the name of the output label field, which will be created if necessary
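The dataset-case semantics described above can be sketched on plain dicts: the input field's labels are appended to the output field, and the input field is then removed. This is a hypothetical illustration only, not how FiftyOne stores labels internally:

```python
def merge_labels(sample, in_field, out_field):
    """Sketch: move labels from in_field into out_field, deleting in_field."""
    labels = sample.pop(in_field, None) or []
    sample.setdefault(out_field, []).extend(labels)

sample = {"predictions": ["cat", "dog"], "ground_truth": ["cat"]}
merge_labels(sample, "predictions", "ground_truth")
print(sample)  # {'ground_truth': ['cat', 'cat', 'dog']}
```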
@aggregation
def min(self, field_or_expr, expr=None, safe=False): (source)

Computes the minimum of a numeric field of the collection.

None-valued fields are ignored.

This aggregation is typically applied to numeric or date field types (or lists of such types):

Examples:

import fiftyone as fo
from fiftyone import ViewField as F

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            numeric_field=1.0,
            numeric_list_field=[1, 2, 3],
        ),
        fo.Sample(
            filepath="/path/to/image2.png",
            numeric_field=4.0,
            numeric_list_field=[1, 2],
        ),
        fo.Sample(
            filepath="/path/to/image3.png",
            numeric_field=None,
            numeric_list_field=None,
        ),
    ]
)

#
# Compute the minimum of a numeric field
#

min = dataset.min("numeric_field")
print(min)  # the min

#
# Compute the minimum of a numeric list field
#

min = dataset.min("numeric_list_field")
print(min)  # the min

#
# Compute the minimum of a transformation of a numeric field
#

min = dataset.min(2 * (F("numeric_field") + 1))
print(min)  # the min
Parameters
field_or_expr: a field name, embedded.field.name, fiftyone.core.expressions.ViewExpression, or MongoDB expression defining the field or expression to aggregate. This can also be a list or tuple of such arguments, in which case a tuple of corresponding aggregation results (each receiving the same additional keyword arguments, if any) will be returned
expr (None): a fiftyone.core.expressions.ViewExpression or MongoDB expression to apply to field_or_expr (which must be a field) before aggregating
safe (False): whether to ignore nan/inf values when dealing with floating point values
Returns
the minimum value
@view_stage
def mongo(self, pipeline, _needs_frames=None, _group_slices=None): (source)

Adds a view stage defined by a raw MongoDB aggregation pipeline.

See MongoDB aggregation pipelines for more details.

Examples:

import fiftyone as fo

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            predictions=fo.Detections(
                detections=[
                    fo.Detection(
                        label="cat",
                        bounding_box=[0.1, 0.1, 0.5, 0.5],
                        confidence=0.9,
                    ),
                    fo.Detection(
                        label="dog",
                        bounding_box=[0.2, 0.2, 0.3, 0.3],
                        confidence=0.8,
                    ),
                ]
            ),
        ),
        fo.Sample(
            filepath="/path/to/image2.png",
            predictions=fo.Detections(
                detections=[
                    fo.Detection(
                        label="cat",
                        bounding_box=[0.5, 0.5, 0.4, 0.4],
                        confidence=0.95,
                    ),
                    fo.Detection(label="rabbit"),
                ]
            ),
        ),
        fo.Sample(
            filepath="/path/to/image3.png",
            predictions=fo.Detections(
                detections=[
                    fo.Detection(
                        label="squirrel",
                        bounding_box=[0.25, 0.25, 0.5, 0.5],
                        confidence=0.5,
                    ),
                ]
            ),
        ),
        fo.Sample(
            filepath="/path/to/image4.png",
            predictions=None,
        ),
    ]
)

#
# Extract a view containing the second and third samples in the
# dataset
#

view = dataset.mongo([{"$skip": 1}, {"$limit": 2}])

#
# Sort by the number of objects in the `predictions` field
#

view = dataset.mongo([
    {
        "$addFields": {
            "_sort_field": {
                "$size": {"$ifNull": ["$predictions.detections", []]}
            }
        }
    },
    {"$sort": {"_sort_field": -1}},
    {"$project": {"_sort_field": False}},
])
Parameters
pipeline: a MongoDB aggregation pipeline (list of dicts)
_needs_frames: Undocumented
_group_slices: Undocumented
Returns
a fiftyone.core.view.DatasetView
def one(self, expr, exact=False): (source)

Returns a single sample in this collection matching the expression.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("quickstart")

#
# Get a sample by filepath
#

# A random filepath in the dataset
filepath = dataset.take(1).first().filepath

# Get sample by filepath
sample = dataset.one(F("filepath") == filepath)

#
# Dealing with multiple matches
#

# Get a sample whose image is JPEG
sample = dataset.one(F("filepath").ends_with(".jpg"))

# Raises an error since there are multiple JPEGs
dataset.one(F("filepath").ends_with(".jpg"), exact=True)
Parameters
expr: a fiftyone.core.expressions.ViewExpression or MongoDB expression that evaluates to True for the sample to match
exact (False): whether to raise an error if multiple samples match the expression
Returns
a fiftyone.core.sample.SampleView
Raises
ValueError: if no samples match the expression, or if exact=True and multiple samples match the expression
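The first-match and exact=True behaviors can be sketched in plain Python over a list of filepaths. This is an illustrative approximation (hypothetical helper, not FiftyOne's implementation, which matches against the database):

```python
def one(samples, pred, exact=False):
    """Sketch: return the first match; with exact=True, require uniqueness."""
    matches = [s for s in samples if pred(s)]
    if not matches or (exact and len(matches) > 1):
        raise ValueError("expected exactly one matching sample")
    return matches[0]

paths = ["a.jpg", "b.png", "c.jpg"]
print(one(paths, lambda p: p.endswith(".png")))  # 'b.png'
```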
@aggregation
def quantiles(self, field_or_expr, quantiles, expr=None, safe=False): (source)

Computes the quantile(s) of the field values of a collection.

None-valued fields are ignored.

This aggregation is typically applied to numeric field types (or lists of such types):

Examples:

import fiftyone as fo
from fiftyone import ViewField as F

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            numeric_field=1.0,
            numeric_list_field=[1, 2, 3],
        ),
        fo.Sample(
            filepath="/path/to/image2.png",
            numeric_field=4.0,
            numeric_list_field=[1, 2],
        ),
        fo.Sample(
            filepath="/path/to/image3.png",
            numeric_field=None,
            numeric_list_field=None,
        ),
    ]
)

#
# Compute the quantiles of a numeric field
#

quantiles = dataset.quantiles("numeric_field", [0.1, 0.5, 0.9])
print(quantiles)  # the quantiles

#
# Compute the quantiles of a numeric list field
#

quantiles = dataset.quantiles("numeric_list_field", [0.1, 0.5, 0.9])
print(quantiles)  # the quantiles

#
# Compute the quantiles of a transformation of a numeric field
#

quantiles = dataset.quantiles(2 * (F("numeric_field") + 1), [0.1, 0.5, 0.9])
print(quantiles)  # the quantiles
Parameters
field_or_expr: a field name, embedded.field.name, fiftyone.core.expressions.ViewExpression, or MongoDB expression defining the field or expression to aggregate
quantiles: the quantile or iterable of quantiles to compute. Each quantile must be a numeric value in [0, 1]
expr (None): a fiftyone.core.expressions.ViewExpression or MongoDB expression to apply to field_or_expr (which must be a field) before aggregating
safe (False): whether to ignore nan/inf values when dealing with floating point values
Returns
the quantile or list of quantiles
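A quantile over the non-None values can be sketched with linear interpolation between order statistics. This is one common convention, shown for illustration; the aggregation's exact interpolation method may differ:

```python
def quantile(values, q):
    """Sketch: q-quantile of the non-None values via linear interpolation."""
    vals = sorted(v for v in values if v is not None)
    if not vals:
        return None
    idx = q * (len(vals) - 1)           # fractional position in sorted order
    lo = int(idx)
    hi = min(lo + 1, len(vals) - 1)
    frac = idx - lo
    return vals[lo] * (1 - frac) + vals[hi] * frac

data = [1.0, 4.0, None]
print([quantile(data, q) for q in [0.0, 0.5, 1.0]])  # [1.0, 2.5, 4.0]
```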
def register_run(self, run_key, config, results=None, overwrite=False, cleanup=True, cache=True): (source)

Registers a run under the given key on this collection.

Parameters
run_key: a run key
config: a fiftyone.core.runs.RunConfig
results (None): an optional fiftyone.core.runs.RunResults
overwrite (False): whether to allow overwriting an existing run of the same type
cleanup (True): whether to execute an existing run's fiftyone.core.runs.Run.cleanup method when overwriting it
cache (True): whether to cache the results on the collection
def reload(self): (source)

Reloads the collection from the database.

def rename_annotation_run(self, anno_key, new_anno_key): (source)

Replaces the key for the given annotation run with a new key.

Parameters
anno_key: an annotation key
new_anno_key: a new annotation key
def rename_brain_run(self, brain_key, new_brain_key): (source)

Replaces the key for the given brain run with a new key.

Parameters
brain_key: a brain key
new_brain_key: a new brain key
def rename_evaluation(self, eval_key, new_eval_key): (source)

Replaces the key for the given evaluation with a new key.

Parameters
eval_key: an evaluation key
new_eval_key: a new evaluation key
def rename_run(self, run_key, new_run_key): (source)

Replaces the key for the given run with a new key.

Parameters
run_key: a run key
new_run_key: a new run key
def save_context(self, batch_size=None, batching_strategy=None): (source)

Returns a context that can be used to save samples from this collection according to a configurable batching strategy.

Examples:

import random as r
import string as s

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("cifar10", split="test")

def make_label():
    return "".join(r.choice(s.ascii_letters) for i in range(10))

# No save context
for sample in dataset.iter_samples(progress=True):
    sample.ground_truth.label = make_label()
    sample.save()

# Save using default batching strategy
with dataset.save_context() as context:
    for sample in dataset.iter_samples(progress=True):
        sample.ground_truth.label = make_label()
        context.save(sample)

# Save in batches of 10
with dataset.save_context(batch_size=10) as context:
    for sample in dataset.iter_samples(progress=True):
        sample.ground_truth.label = make_label()
        context.save(sample)

# Save every 0.5 seconds
with dataset.save_context(batch_size=0.5) as context:
    for sample in dataset.iter_samples(progress=True):
        sample.ground_truth.label = make_label()
        context.save(sample)
Parameters
batch_size (None): the batch size to use. If a batching_strategy is provided, this parameter configures the strategy as described below. If no batching_strategy is provided, this can either be an integer specifying the number of samples to save in a batch (in which case batching_strategy is implicitly set to "static") or a float number of seconds between batched saves (in which case batching_strategy is implicitly set to "latency")
batching_strategy (None):

the batching strategy to use for each save operation. Supported values are:

  • "static": a fixed sample batch size for each save
  • "size": a target batch size, in bytes, for each save
  • "latency": a target latency, in seconds, between saves

By default, fo.config.default_batcher is used

Returns
a SaveContext
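The "static" and "latency" strategies named above can be sketched with a small buffering class. This is a hypothetical illustration of the flushing logic only (the class name and fields are invented; FiftyOne's SaveContext performs bulk database writes):

```python
import time

class SketchSaveContext:
    """Sketch: buffer saves and flush per a static or latency strategy."""

    def __init__(self, batch_size, strategy="static"):
        self.batch_size, self.strategy = batch_size, strategy
        self.pending, self.flushed = [], []
        self.last_flush = time.monotonic()

    def save(self, sample):
        self.pending.append(sample)
        if self.strategy == "static":
            due = len(self.pending) >= self.batch_size       # every N saves
        else:  # "latency"
            due = time.monotonic() - self.last_flush >= self.batch_size
        if due:
            self.flush()

    def flush(self):
        self.flushed.extend(self.pending)  # one bulk write in the real thing
        self.pending.clear()
        self.last_flush = time.monotonic()

ctx = SketchSaveContext(batch_size=2, strategy="static")
for s in ["a", "b", "c"]:
    ctx.save(s)
print(ctx.flushed, ctx.pending)  # ['a', 'b'] ['c']
```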
def save_run_results(self, run_key, results, overwrite=True, cache=True): (source)

Saves run results for the run with the given key.

Parameters
run_key: a run key
results: a fiftyone.core.runs.RunResults
overwrite (True): whether to overwrite an existing result with the same key
cache (True): whether to cache the results on the collection
@aggregation
def schema(self, field_or_expr, expr=None, dynamic_only=False, _doc_type=None, _include_private=False): (source)

Extracts the names and types of the attributes of a specified embedded document field across all samples in the collection.

Schema aggregations are useful for detecting the presence and types of dynamic attributes of fiftyone.core.labels.Label fields across a collection.

Examples:

import fiftyone as fo

dataset = fo.Dataset()

sample1 = fo.Sample(
    filepath="image1.png",
    ground_truth=fo.Detections(
        detections=[
            fo.Detection(
                label="cat",
                bounding_box=[0.1, 0.1, 0.4, 0.4],
                foo="bar",
                hello=True,
            ),
            fo.Detection(
                label="dog",
                bounding_box=[0.5, 0.5, 0.4, 0.4],
                hello=None,
            )
        ]
    )
)

sample2 = fo.Sample(
    filepath="image2.png",
    ground_truth=fo.Detections(
        detections=[
            fo.Detection(
                label="rabbit",
                bounding_box=[0.1, 0.1, 0.4, 0.4],
                foo=None,
            ),
            fo.Detection(
                label="squirrel",
                bounding_box=[0.5, 0.5, 0.4, 0.4],
                hello="there",
            ),
        ]
    )
)

dataset.add_samples([sample1, sample2])

#
# Get schema of all dynamic attributes on the detections in a
# `Detections` field
#

print(dataset.schema("ground_truth.detections", dynamic_only=True))
# {'foo': StringField, 'hello': [BooleanField, StringField]}
Parameters
field_or_expr: a field name, embedded.field.name, fiftyone.core.expressions.ViewExpression, or MongoDB expression defining the field or expression to aggregate
expr (None): a fiftyone.core.expressions.ViewExpression or MongoDB expression to apply to field_or_expr (which must be a field) before aggregating
dynamic_only (False): whether to only include dynamically added attributes
_doc_type: Undocumented
_include_private: Undocumented
Returns
a dict mapping field names to fiftyone.core.fields.Field instances. If a field's values takes multiple non-None types, the list of observed types will be returned
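The type-inference behavior described above can be sketched over plain dicts: collect the set of non-None types observed per dynamic attribute, listing multiple types when values disagree. This is an illustration only (FiftyOne returns fiftyone.core.fields.Field instances; here we use Python type names):

```python
def infer_schema(records, skip=("label", "bounding_box")):
    """Sketch: map each dynamic attribute to its observed non-None types."""
    schema = {}
    for record in records:
        for name, value in record.items():
            if name in skip or value is None:
                continue
            schema.setdefault(name, set()).add(type(value).__name__)
    return {k: sorted(v) for k, v in schema.items()}

detections = [
    {"label": "cat", "foo": "bar", "hello": True},
    {"label": "dog", "hello": None},
    {"label": "squirrel", "hello": "there"},
]
print(infer_schema(detections))  # {'foo': ['str'], 'hello': ['bool', 'str']}
```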
@view_stage
def select(self, sample_ids, ordered=False): (source)

Selects the samples with the given IDs from the collection.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

#
# Create a view containing the currently selected samples in the App
#

session = fo.launch_app(dataset)

# Select samples in the App...

view = dataset.select(session.selected)
Parameters
sample_ids

the samples to select. Can be any of the following:

ordered (False): whether to sort the samples in the returned view to match the order of the provided IDs
Returns
a fiftyone.core.view.DatasetView
@view_stage
def select_by(self, field, values, ordered=False): (source)

Selects the samples with the given field values from the collection.

This stage is typically used to work with categorical fields (strings, ints, and bools). If you want to select samples based on floating point fields, use match.

Examples:

import fiftyone as fo

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(filepath="image%d.jpg" % i, int=i, str=str(i))
        for i in range(100)
    ]
)

#
# Create a view containing samples whose `int` field have the given
# values
#

view = dataset.select_by("int", [1, 51, 11, 41, 21, 31])
print(view.head(6))

#
# Create a view containing samples whose `str` field have the given
# values, in order
#

view = dataset.select_by(
    "str", ["1", "51", "11", "41", "21", "31"], ordered=True
)
print(view.head(6))
Parameters
field: a field or embedded.field.name
values: a value or iterable of values to select by
ordered (False): whether to sort the samples in the returned view to match the order of the provided values
Returns
a fiftyone.core.view.DatasetView
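The ordered flag's effect can be sketched in plain Python: without it, matches keep the collection's order; with ordered=True, matches are reordered to follow the provided values. An illustrative sketch only (hypothetical helper, not the actual stage):

```python
def select_by(samples, field, values, ordered=False):
    """Sketch: keep samples whose field value is in values."""
    allowed = set(values)
    matches = [s for s in samples if s[field] in allowed]
    if ordered:
        order = {v: i for i, v in enumerate(values)}
        matches.sort(key=lambda s: order[s[field]])
    return matches

samples = [{"str": str(i)} for i in range(100)]
view = select_by(samples, "str", ["51", "11"], ordered=True)
print([s["str"] for s in view])  # ['51', '11'], following the given values
```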
@view_stage
def select_fields(self, field_names=None, meta_filter=None, _allow_missing=False): (source)

Selects only the fields with the given names from the samples in the collection. All other fields are excluded.

Note that default sample fields are always selected.

Examples:

import fiftyone as fo

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            uniqueness=1.0,
            ground_truth=fo.Detections(
                detections=[
                    fo.Detection(
                        label="cat",
                        bounding_box=[0.1, 0.1, 0.5, 0.5],
                        mood="surly",
                        age=51,
                    ),
                    fo.Detection(
                        label="dog",
                        bounding_box=[0.2, 0.2, 0.3, 0.3],
                        mood="happy",
                        age=52,
                    ),
                ]
            )
        ),
        fo.Sample(
            filepath="/path/to/image2.png",
            uniqueness=0.0,
        ),
        fo.Sample(
            filepath="/path/to/image3.png",
        ),
    ]
)

#
# Include only the default fields on each sample
#

view = dataset.select_fields()

#
# Include only the `uniqueness` field (and the default fields) on
# each sample
#

view = dataset.select_fields("uniqueness")

#
# Include only the `mood` attribute (and the default attributes) of
# each `Detection` in the `ground_truth` field
#

view = dataset.select_fields("ground_truth.detections.mood")
Parameters
field_names (None): a field name or iterable of field names to select. May contain embedded.field.name as well
meta_filter (None):

a filter that dynamically selects fields in the collection's schema according to the specified rule, which can be matched against the field's name, type, description, and/or info. For example:

  • Use meta_filter="2023" or meta_filter={"any": "2023"} to select fields that have the string "2023" anywhere in their name, type, description, or info
  • Use meta_filter={"type": "StringField"} or meta_filter={"type": "Classification"} to select all string or classification fields, respectively
  • Use meta_filter={"description": "my description"} to select fields whose description contains the string "my description"
  • Use meta_filter={"info": "2023"} to select fields that have the string "2023" anywhere in their info
  • Use meta_filter={"info.key": "value"} to select fields that have a specific key/value pair in their info
  • Include meta_filter={"include_nested_fields": True, ...} in your meta filter to include all nested fields in the filter
_allow_missing: Undocumented
Returns
a fiftyone.core.view.DatasetView
@view_stage
def select_frames(self, frame_ids, omit_empty=True): (source)

Selects the frames with the given IDs from the video collection.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart-video")

#
# Select some specific frames
#

frame_ids = [
    dataset.first().frames.first().id,
    dataset.last().frames.last().id,
]

view = dataset.select_frames(frame_ids)

print(dataset.count())
print(view.count())

print(dataset.count("frames"))
print(view.count("frames"))
Parameters
frame_ids

the frames to select. Can be any of the following:

omit_empty (True): whether to omit samples that have no frames after selecting the specified frames
Returns
a fiftyone.core.view.DatasetView
@view_stage
def select_group_slices(self, slices=None, media_type=None, _allow_mixed=False, _force_mixed=False): (source)

Selects the samples in the group collection from the given slice(s).

The returned view is a flattened non-grouped view containing only the slice(s) of interest.

Note

This stage performs a $lookup that pulls the requested slice(s) for each sample in the input collection from the source dataset. As a result, this stage always emits unfiltered samples.

Examples:

import fiftyone as fo

dataset = fo.Dataset()
dataset.add_group_field("group", default="ego")

group1 = fo.Group()
group2 = fo.Group()

dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/left-image1.jpg",
            group=group1.element("left"),
        ),
        fo.Sample(
            filepath="/path/to/video1.mp4",
            group=group1.element("ego"),
        ),
        fo.Sample(
            filepath="/path/to/right-image1.jpg",
            group=group1.element("right"),
        ),
        fo.Sample(
            filepath="/path/to/left-image2.jpg",
            group=group2.element("left"),
        ),
        fo.Sample(
            filepath="/path/to/video2.mp4",
            group=group2.element("ego"),
        ),
        fo.Sample(
            filepath="/path/to/right-image2.jpg",
            group=group2.element("right"),
        ),
    ]
)

#
# Retrieve the samples from the "ego" group slice
#

view = dataset.select_group_slices("ego")

#
# Retrieve the samples from the "left" or "right" group slices
#

view = dataset.select_group_slices(["left", "right"])

#
# Retrieve all image samples
#

view = dataset.select_group_slices(media_type="image")
Parameters
slices (None): a group slice or iterable of group slices to select. If neither argument is provided, a flattened list of all samples is returned
media_type (None): a media type whose slice(s) to select
_allow_mixed: Undocumented
_force_mixed: Undocumented
Returns
a fiftyone.core.view.DatasetView
@view_stage
def select_groups(self, group_ids, ordered=False): (source)

Selects the groups with the given IDs from the grouped collection.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart-groups")

#
# Select some specific groups by ID
#

group_ids = dataset.take(10).values("group.id")

view = dataset.select_groups(group_ids)

assert set(view.values("group.id")) == set(group_ids)

view = dataset.select_groups(group_ids, ordered=True)

assert view.values("group.id") == group_ids
Parameters
group_ids: the groups to select (e.g., an ID or iterable of IDs)
ordered (False): whether to sort the groups in the returned view to match the order of the provided IDs
Returns
a fiftyone.core.view.DatasetView
@view_stage
def select_labels(self, labels=None, ids=None, tags=None, fields=None, omit_empty=True): (source)

Selects only the specified labels from the collection.

The returned view will omit samples, sample fields, and individual labels that do not match the specified selection criteria.

You can perform a selection via one or more of the following methods:

  • Provide the labels argument, which should contain a list of dicts in the format returned by fiftyone.core.session.Session.selected_labels, to select specific labels
  • Provide the ids argument to select labels with specific IDs
  • Provide the tags argument to select labels with specific tags

If multiple criteria are specified, labels must match all of them in order to be selected.

By default, the selection is applied to all fiftyone.core.labels.Label fields, but you can provide the fields argument to explicitly define the field(s) in which to select.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

#
# Only include the labels currently selected in the App
#

session = fo.launch_app(dataset)

# Select some labels in the App...

view = dataset.select_labels(labels=session.selected_labels)

#
# Only include labels with the specified IDs
#

# Grab some label IDs
ids = [
    dataset.first().ground_truth.detections[0].id,
    dataset.last().predictions.detections[0].id,
]

view = dataset.select_labels(ids=ids)

print(view.count("ground_truth.detections"))
print(view.count("predictions.detections"))

#
# Only include labels with the specified tags
#

# Grab some label IDs
ids = [
    dataset.first().ground_truth.detections[0].id,
    dataset.last().predictions.detections[0].id,
]

# Give the labels a "test" tag
dataset = dataset.clone()  # create copy since we're modifying data
dataset.select_labels(ids=ids).tag_labels("test")

print(dataset.count_label_tags())

# Retrieve the labels via their tag
view = dataset.select_labels(tags="test")

print(view.count("ground_truth.detections"))
print(view.count("predictions.detections"))
Parameters
labels (None): a list of dicts specifying the labels to select in the format returned by fiftyone.core.session.Session.selected_labels
ids (None): an ID or iterable of IDs of the labels to select
tags (None): a tag or iterable of tags of labels to select
fields (None): a field or iterable of fields from which to select
omit_empty (True): whether to omit samples that have no labels after filtering
Returns
a fiftyone.core.view.DatasetView
@view_stage
def set_field(self, field, expr, _allow_missing=False): (source)

Sets a field or embedded field on each sample in a collection by evaluating the given expression.

This method can process embedded list fields. To do so, simply append [] to any list component(s) of the field path.

Note

There are two cases where FiftyOne will automatically unwind array fields without requiring you to explicitly specify this via the [] syntax:

Top-level lists: when you specify a field path that refers to a top-level list field of a dataset; i.e., list_field is automatically coerced to list_field[], if necessary.

List fields: when you specify a field path that refers to the list field of a fiftyone.core.labels.Label class, such as the Detections.detections attribute; i.e., ground_truth.detections.label is automatically coerced to ground_truth.detections[].label, if necessary.

See the examples below for demonstrations of this behavior.

The provided expr is interpreted relative to the document on which the embedded field is being set. For example, if you are setting a nested field field="embedded.document.field", then the expression expr you provide will be applied to the embedded.document document. Note that you can override this behavior by defining an expression that is bound to the root document by prepending "$" to any field name(s) in the expression.

See the examples below for more information.

Note

You cannot set a non-existent top-level field using this stage, since doing so would violate the dataset's schema. You can, however, first declare a new field via fiftyone.core.dataset.Dataset.add_sample_field and then populate it in a view via this stage.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("quickstart")

#
# Replace all values of the `uniqueness` field that are less than
# 0.5 with `None`
#

view = dataset.set_field(
    "uniqueness",
    (F("uniqueness") >= 0.5).if_else(F("uniqueness"), None)
)
print(view.bounds("uniqueness"))

#
# Lower bound all object confidences in the `predictions` field at
# 0.5
#

view = dataset.set_field(
    "predictions.detections.confidence", F("confidence").max(0.5)
)
print(view.bounds("predictions.detections.confidence"))

#
# Add a `num_predictions` property to the `predictions` field that
# contains the number of objects in the field
#

view = dataset.set_field(
    "predictions.num_predictions",
    F("$predictions.detections").length(),
)
print(view.bounds("predictions.num_predictions"))

#
# Set an `is_animal` field on each object in the `predictions` field
# that indicates whether the object is an animal
#

ANIMALS = [
    "bear", "bird", "cat", "cow", "dog", "elephant", "giraffe",
    "horse", "sheep", "zebra"
]

view = dataset.set_field(
    "predictions.detections.is_animal", F("label").is_in(ANIMALS)
)
print(view.count_values("predictions.detections.is_animal"))
Parameters
field: the field or embedded.field.name to set
expr: a fiftyone.core.expressions.ViewExpression or MongoDB expression that defines the field value to set
_allow_missing: Undocumented
Returns
a fiftyone.core.view.DatasetView
def set_label_values(self, field_name, values, dynamic=False, skip_none=False, validate=True, progress=False): (source)

Sets the fields of the specified labels in the collection to the given values.

Note

This method is appropriate when you have the IDs of the labels you wish to modify. See set_values and set_field if your updates are not keyed by label ID.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("quickstart")

#
# Populate a new boolean attribute on all high confidence labels
#

view = dataset.filter_labels("predictions", F("confidence") > 0.99)

label_ids = view.values("predictions.detections.id", unwind=True)
values = {_id: True for _id in label_ids}

dataset.set_label_values("predictions.detections.high_conf", values)

print(dataset.count("predictions.detections"))
print(len(label_ids))
print(dataset.count_values("predictions.detections.high_conf"))
Parameters
field_name: a field or embedded.field.name
values: a dict mapping label IDs to values
dynamic (False): whether to declare dynamic attributes of embedded document fields that are encountered
skip_none (False): whether to treat None data in values as missing data that should not be set
validate (True): whether to validate that the values are compliant with the dataset schema before adding them
progress (False): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
def set_values(self, field_name, values, key_field=None, skip_none=False, expand_schema=True, dynamic=False, validate=True, progress=False, _allow_missing=False, _sample_ids=None, _frame_ids=None): (source)

Sets the field or embedded field on each sample or frame in the collection to the given values.

When setting a sample field embedded.field.name, this function is an efficient implementation of the following loop:

for sample, value in zip(sample_collection, values):
    sample.embedded.field.name = value
    sample.save()

When setting an embedded field that contains an array, say embedded.array.field.name, this function is an efficient implementation of the following loop:

for sample, array_values in zip(sample_collection, values):
    for doc, value in zip(sample.embedded.array, array_values):
        doc.field.name = value

    sample.save()

When setting a frame field frames.embedded.field.name, this function is an efficient implementation of the following loop:

for sample, frame_values in zip(sample_collection, values):
    for frame, value in zip(sample.frames.values(), frame_values):
        frame.embedded.field.name = value

    sample.save()

When setting an embedded frame field that contains an array, say frames.embedded.array.field.name, this function is an efficient implementation of the following loop:

for sample, frame_values in zip(sample_collection, values):
    for frame, array_values in zip(sample.frames.values(), frame_values):
        for doc, value in zip(frame.embedded.array, array_values):
            doc.field.name = value

    sample.save()

When values is a dict mapping keys in key_field to values, then this function is an efficient implementation of the following loop:

for key, value in values.items():
    sample = sample_collection.one(F(key_field) == key)
    sample.embedded.field.name = value
    sample.save()

When setting frame fields using the dict values syntax, each value in values may either be a list corresponding to the frames of the sample matching the given key, or each value may itself be a dict mapping frame numbers to values. In the latter case, this function is an efficient implementation of the following loop:

for key, frame_values in values.items():
    sample = sample_collection.one(F(key_field) == key)
    for frame_number, value in frame_values.items():
        frame = sample[frame_number]
        frame.embedded.field.name = value

    sample.save()

You can also update list fields using the dict values syntax, in which case this method is an efficient implementation of the natural nested list modifications of the above sample/frame loops.

The dual function of set_values is values, which can be used to efficiently extract the values of a field or embedded field of all samples in a collection as lists of values in the same structure expected by this method.

Note

If the values you are setting can be described by a fiftyone.core.expressions.ViewExpression applied to the existing dataset contents, then consider using set_field + save for an even more efficient alternative to explicitly iterating over the dataset or calling values + set_values to perform the update in-memory.

Examples:

import random

import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("quickstart")

#
# Create a new sample field
#

values = [random.random() for _ in range(len(dataset))]
dataset.set_values("random", values)

print(dataset.bounds("random"))

#
# Add a tag to all low confidence labels
#

view = dataset.filter_labels("predictions", F("confidence") < 0.06)

detections = view.values("predictions.detections")
for sample_detections in detections:
    for detection in sample_detections:
        detection.tags.append("low_confidence")

view.set_values("predictions.detections", detections)

print(dataset.count_label_tags())
Parameters
field_name: a field or embedded.field.name
values: an iterable of values, one for each sample in the collection. When setting frame fields, each element can either be an iterable of values (one for each existing frame of the sample) or a dict mapping frame numbers to values. If field_name contains array fields, the corresponding elements of values must be arrays of the same lengths. This argument can also be a dict mapping keys to values (each value as described previously), in which case the keys are used to match samples by their key_field
key_field (None): a key field to use when choosing which samples to update when values is a dict
skip_none (False): whether to treat None data in values as missing data that should not be set
expand_schema (True): whether to dynamically add new sample/frame fields encountered to the dataset schema. If False, an error is raised if the root field_name does not exist
dynamic (False): whether to declare dynamic attributes of embedded document fields that are encountered
validate (True): whether to validate that the values are compliant with the dataset schema before adding them
progress (False): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
_allow_missing: Undocumented
_sample_ids: Undocumented
_frame_ids: Undocumented
@view_stage
def shuffle(self, seed=None): (source)

Randomly shuffles the samples in the collection.

Examples:

import fiftyone as fo

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            ground_truth=fo.Classification(label="cat"),
        ),
        fo.Sample(
            filepath="/path/to/image2.png",
            ground_truth=fo.Classification(label="dog"),
        ),
        fo.Sample(
            filepath="/path/to/image3.png",
            ground_truth=None,
        ),
    ]
)

#
# Return a view that contains a randomly shuffled version of the
# samples in the dataset
#

view = dataset.shuffle()

#
# Shuffle the samples with a fixed random seed
#

view = dataset.shuffle(seed=51)
Parameters
seed (None): an optional random seed to use when shuffling the samples
Returns
a fiftyone.core.view.DatasetView
@view_stage
def skip(self, skip): (source)

Omits the given number of samples from the head of the collection.

Examples:

import fiftyone as fo

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            ground_truth=fo.Classification(label="cat"),
        ),
        fo.Sample(
            filepath="/path/to/image2.png",
            ground_truth=fo.Classification(label="dog"),
        ),
        fo.Sample(
            filepath="/path/to/image3.png",
            ground_truth=fo.Classification(label="rabbit"),
        ),
        fo.Sample(
            filepath="/path/to/image4.png",
            ground_truth=None,
        ),
    ]
)

#
# Omit the first two samples from the dataset
#

view = dataset.skip(2)
Parameters
skip: the number of samples to skip. If a non-positive number is provided, no samples are omitted
Returns
a fiftyone.core.view.DatasetView
@view_stage
def sort_by(self, field_or_expr, reverse=False, create_index=True): (source)

Sorts the samples in the collection by the given field(s) or expression(s).

Examples:

import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("quickstart")

#
# Sort the samples by their `uniqueness` field in ascending order
#

view = dataset.sort_by("uniqueness", reverse=False)

#
# Sort the samples in descending order by the number of detections
# in their `predictions` field whose bounding box area is less than
# 0.2
#

# Bboxes are in [top-left-x, top-left-y, width, height] format
bbox = F("bounding_box")
bbox_area = bbox[2] * bbox[3]

small_boxes = F("predictions.detections").filter(bbox_area < 0.2)
view = dataset.sort_by(small_boxes.length(), reverse=True)

#
# Perform a compound sort where samples are first sorted in
# descending order by number of detections and then in ascending
# order of uniqueness for samples with the same number of predictions
#

view = dataset.sort_by(
    [
        (F("predictions.detections").length(), -1),
        ("uniqueness", 1),
    ]
)

num_objects, uniqueness = view[:5].values(
    [F("predictions.detections").length(), "uniqueness"]
)
print(list(zip(num_objects, uniqueness)))
Parameters
field_or_expr

the field(s) or expression(s) to sort by. This can be any of the following:

  • a field to sort by
  • an embedded.field.name to sort by
  • a fiftyone.core.expressions.ViewExpression or a MongoDB aggregation expression that defines the quantity to sort by
  • a list of (field_or_expr, order) tuples defining a compound sort criteria, where field_or_expr is a field or expression as defined above, and order can be 1 or any string starting with "a" for ascending order, or -1 or any string starting with "d" for descending order
reverse (False): whether to return the results in descending order
create_index (True): whether to create an index, if necessary, to optimize the sort. Only applicable when sorting by field(s), not expressions
Returns
a fiftyone.core.view.DatasetView
@view_stage
def sort_by_similarity(self, query, k=None, reverse=False, dist_field=None, brain_key=None): (source)

Sorts the collection by similarity to a specified query.

In order to use this stage, you must first use fiftyone.brain.compute_similarity to index your dataset by similarity.

Examples:

import fiftyone as fo
import fiftyone.brain as fob
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

fob.compute_similarity(
    dataset, model="clip-vit-base32-torch", brain_key="clip"
)

#
# Sort samples by their similarity to a sample by its ID
#

query_id = dataset.first().id

view = dataset.sort_by_similarity(query_id, k=5)

#
# Sort samples by their similarity to a manually computed vector
#

model = foz.load_zoo_model("clip-vit-base32-torch")
embeddings = dataset.take(2, seed=51).compute_embeddings(model)
query = embeddings.mean(axis=0)

view = dataset.sort_by_similarity(query, k=5)

#
# Sort samples by their similarity to a text prompt
#

query = "kites high in the air"

view = dataset.sort_by_similarity(query, k=5)
Parameters
query

the query, which can be any of the following:

  • an ID or iterable of IDs
  • a num_dims vector or num_queries x num_dims array of vectors
  • a prompt or iterable of prompts (if supported by the index)
k (None): the number of matches to return. By default, the entire collection is sorted
reverse (False): whether to sort by least similarity (True) or greatest similarity (False). Some backends may not support least similarity
dist_field (None): the name of a float field in which to store the distance of each example to the specified query. The field is created if necessary
brain_key (None): the brain key of an existing fiftyone.brain.compute_similarity run on the dataset. If not specified, the dataset must have an applicable run, which will be used by default
Returns
a fiftyone.core.view.DatasetView
def split_labels(self, in_field, out_field, filter=None): (source)

Splits the labels from the given input field into the given output field of the collection.

This method is typically invoked on a view that has filtered the contents of the specified input field, so that the labels in the view are moved to the output field and the remaining labels are left in-place.

Alternatively, you can provide a filter expression that selects the labels of interest to move in this collection.

Parameters
in_field: the name of the input label field
out_field: the name of the output label field, which will be created if necessary
filter (None): a boolean fiftyone.core.expressions.ViewExpression to apply to each label in the input field to determine whether to move it (True) or leave it (False)
def stats(self, include_media=False, include_indexes=False, compressed=False): (source)

Returns stats about the collection on disk.

The samples keys refer to the sample documents stored in the database.

For video datasets, the frames keys refer to the frame documents stored in the database.

The media keys refer to the raw media associated with each sample on disk.

The index[es] keys refer to the indexes associated with the dataset.

Note that dataset-level metadata such as annotation runs are not included in this computation.

Parameters
include_media (False): whether to include stats about the size of the raw media in the collection
include_indexes (False): whether to include stats on the dataset's indexes
compressed (False): whether to return the sizes of collections in their compressed form on disk (True) or the logical uncompressed size of the collections (False). This option is only supported for datasets (not views)
Returns
a stats dict
@aggregation
def std(self, field_or_expr, expr=None, safe=False, sample=False): (source)

Computes the standard deviation of the field values of the collection.

None-valued fields are ignored.

This aggregation is typically applied to numeric field types (or lists of such types):

Examples:

import fiftyone as fo
from fiftyone import ViewField as F

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            numeric_field=1.0,
            numeric_list_field=[1, 2, 3],
        ),
        fo.Sample(
            filepath="/path/to/image2.png",
            numeric_field=4.0,
            numeric_list_field=[1, 2],
        ),
        fo.Sample(
            filepath="/path/to/image3.png",
            numeric_field=None,
            numeric_list_field=None,
        ),
    ]
)

#
# Compute the standard deviation of a numeric field
#

std = dataset.std("numeric_field")
print(std)  # the standard deviation

#
# Compute the standard deviation of a numeric list field
#

std = dataset.std("numeric_list_field")
print(std)  # the standard deviation

#
# Compute the standard deviation of a transformation of a numeric field
#

std = dataset.std(2 * (F("numeric_field") + 1))
print(std)  # the standard deviation
Parameters
field_or_expr: a field name, embedded.field.name, fiftyone.core.expressions.ViewExpression, or MongoDB expression defining the field or expression to aggregate. This can also be a list or tuple of such arguments, in which case a tuple of corresponding aggregation results (each receiving the same additional keyword arguments, if any) will be returned
expr (None): a fiftyone.core.expressions.ViewExpression or MongoDB expression to apply to field_or_expr (which must be a field) before aggregating
safe (False): whether to ignore nan/inf values when dealing with floating point values
sample (False): whether to compute the sample standard deviation rather than the population standard deviation
Returns
the standard deviation
@aggregation
def sum(self, field_or_expr, expr=None, safe=False): (source)

Computes the sum of the field values of the collection.

None-valued fields are ignored.

This aggregation is typically applied to numeric field types (or lists of such types):

Examples:

import fiftyone as fo
from fiftyone import ViewField as F

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            numeric_field=1.0,
            numeric_list_field=[1, 2, 3],
        ),
        fo.Sample(
            filepath="/path/to/image2.png",
            numeric_field=4.0,
            numeric_list_field=[1, 2],
        ),
        fo.Sample(
            filepath="/path/to/image3.png",
            numeric_field=None,
            numeric_list_field=None,
        ),
    ]
)

#
# Compute the sum of a numeric field
#

total = dataset.sum("numeric_field")
print(total)  # the sum

#
# Compute the sum of a numeric list field
#

total = dataset.sum("numeric_list_field")
print(total)  # the sum

#
# Compute the sum of a transformation of a numeric field
#

total = dataset.sum(2 * (F("numeric_field") + 1))
print(total)  # the sum
Parameters
field_or_expr: a field name, embedded.field.name, fiftyone.core.expressions.ViewExpression, or MongoDB expression defining the field or expression to aggregate. This can also be a list or tuple of such arguments, in which case a tuple of corresponding aggregation results (each receiving the same additional keyword arguments, if any) will be returned
expr (None): a fiftyone.core.expressions.ViewExpression or MongoDB expression to apply to field_or_expr (which must be a field) before aggregating
safe (False): whether to ignore nan/inf values when dealing with floating point values
Returns
the sum
def summary(self): (source)

Returns a string summary of the collection.

Returns
a string summary
def sync_last_modified_at(self, include_frames=True): (source)

Syncs the last_modified_at property(s) of the dataset.

Updates the last_modified_at property of the dataset if necessary to incorporate any modification timestamps to its samples.

If include_frames==True, the last_modified_at property of each video sample is first updated if necessary to incorporate any modification timestamps to its frames.

Parameters
include_frames (True): whether to update the last_modified_at property of video samples. Only applicable to datasets that contain videos
def tag_labels(self, tags, label_fields=None): (source)

Adds the tag(s) to all labels in the specified label field(s) of this collection, if necessary.

Parameters
tags: a tag or iterable of tags
label_fields (None): an optional name or iterable of names of fiftyone.core.labels.Label fields. By default, all label fields are used
def tag_samples(self, tags): (source)

Adds the tag(s) to all samples in this collection, if necessary.

Parameters
tags: a tag or iterable of tags
def tail(self, num_samples=3): (source)

Returns a list of the last few samples in the collection.

If fewer than num_samples samples are in the collection, only the available samples are returned.

Parameters
num_samples (3): the number of samples
Returns
a list of fiftyone.core.sample.Sample objects
@view_stage
def take(self, size, seed=None): (source)

Randomly samples the given number of samples from the collection.

Examples:

import fiftyone as fo

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            ground_truth=fo.Classification(label="cat"),
        ),
        fo.Sample(
            filepath="/path/to/image2.png",
            ground_truth=fo.Classification(label="dog"),
        ),
        fo.Sample(
            filepath="/path/to/image3.png",
            ground_truth=fo.Classification(label="rabbit"),
        ),
        fo.Sample(
            filepath="/path/to/image4.png",
            ground_truth=None,
        ),
    ]
)

#
# Take two random samples from the dataset
#

view = dataset.take(2)

#
# Take two random samples from the dataset with a fixed seed
#

view = dataset.take(2, seed=51)
Parameters
size: the number of samples to return. If a non-positive number is provided, an empty view is returned
seed (None): an optional random seed to use when selecting the samples
Returns
a fiftyone.core.view.DatasetView
@view_stage
def to_clips(self, field_or_expr, **kwargs): (source)

Creates a view that contains one sample per clip defined by the given field or expression in the video collection.

The returned view will contain:

  • A sample_id field that records the sample ID from which each clip was taken
  • A support field that records the [first, last] frame support of each clip
  • All frame-level information from the underlying dataset of the input collection

Refer to fiftyone.core.clips.make_clips_dataset to see the available configuration options for generating clips.

Note

The clip generation logic will respect any frame-level modifications defined in the input collection, but the output clips will always contain all frame-level labels.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("quickstart-video")

#
# Create a clips view that contains one clip for each contiguous
# segment that contains at least one road sign in every frame
#

clips = (
    dataset
    .filter_labels("frames.detections", F("label") == "road sign")
    .to_clips("frames.detections")
)
print(clips)

#
# Create a clips view that contains one clip for each contiguous
# segment that contains at least two road signs in every frame
#

signs = F("detections.detections").filter(F("label") == "road sign")
clips = dataset.to_clips(signs.length() >= 2)
print(clips)
Parameters
field_or_expr

can be any of the following:

other_fields (None):

controls whether sample fields other than the default sample fields are included. Can be any of the following:

  • a field or list of fields to include
  • True to include all other fields
  • None/False to include no other fields
tol (0): the maximum number of false frames that can be overlooked when generating clips. Only applicable when field_or_expr is a frame-level list field or expression
min_len (0): the minimum allowable length of a clip, in frames. Only applicable when field_or_expr is a frame-level list field or expression
trajectories (False): whether to create clips for each unique object trajectory defined by their (label, index). Only applicable when field_or_expr is a frame-level field
**kwargs: Undocumented
Returns
a fiftyone.core.clips.ClipsView
def to_dict(self, rel_dir=None, include_private=False, include_frames=False, frame_labels_dir=None, pretty_print=False, progress=None): (source)

Returns a JSON dictionary representation of the collection.

Parameters
rel_dir (None): a relative directory to remove from the filepath of each sample, if possible. The path is converted to an absolute path (if necessary) via fiftyone.core.storage.normalize_path. The typical use case for this argument is that your source data lives in a single directory and you wish to serialize relative, rather than absolute, paths to the data within that directory
include_private (False): whether to include private fields
include_frames (False): whether to include the frame labels for video samples
frame_labels_dir (None): a directory in which to write per-sample JSON files containing the frame labels for video samples. If omitted, frame labels will be included directly in the returned JSON dict (which can be quite large for video datasets containing many frames). Only applicable to datasets that contain videos when include_frames is True
pretty_print (False): whether to render frame labels JSON in human-readable format with newlines and indentations. Only applicable to datasets that contain videos when a frame_labels_dir is provided
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
Returns
a JSON dict
@view_stage
def to_evaluation_patches(self, eval_key, **kwargs): (source)

Creates a view based on the results of the evaluation with the given key that contains one sample for each true positive, false positive, and false negative example in the collection, respectively.

True positive examples will result in samples with both their ground truth and predicted fields populated, while false positive/negative examples will only have one of their corresponding predicted/ground truth fields populated, respectively.

If multiple predictions are matched to a ground truth object (e.g., if the evaluation protocol includes a crowd attribute), then all matched predictions will be stored in the single sample along with the ground truth object.

The returned dataset will also have top-level type and iou fields populated based on the evaluation results for that example, as well as a sample_id field recording the sample ID of the example, and a crowd field if the evaluation protocol defines a crowd attribute.

Note

The returned view will contain patches for the contents of this collection, which may differ from the view on which the eval_key evaluation was performed. This may exclude some labels that were evaluated and/or include labels that were not evaluated.

If you would like to see patches for the exact view on which an evaluation was performed, first call load_evaluation_view to load the view and then convert to patches.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")
dataset.evaluate_detections("predictions", eval_key="eval")

session = fo.launch_app(dataset)

#
# Create a patches view for the evaluation results
#

view = dataset.to_evaluation_patches("eval")
print(view)

session.view = view
Parameters
eval_keyan evaluation key that corresponds to the evaluation of ground truth/predicted fields that are of type fiftyone.core.labels.Detections, fiftyone.core.labels.Polylines, or fiftyone.core.labels.Keypoints
other_fields:None

controls whether fields other than the ground truth/predicted fields and the default sample fields are included. Can be any of the following:

  • a field or list of fields to include
  • True to include all other fields
  • None/False to include no other fields
**kwargsUndocumented
Returns
a fiftyone.core.patches.EvaluationPatchesView
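The true positive/false positive/false negative partitioning described above can be sketched in plain Python. This is a hypothetical helper for illustration only, not FiftyOne's implementation: each matched (ground truth, prediction) pair yields a "tp" example, while unmatched ground truth and predicted objects yield "fn" and "fp" examples, respectively.

```python
# Illustrative sketch of evaluation-patch partitioning (not FiftyOne internals).
def classify_matches(gt_ids, pred_ids, matches):
    """matches: list of (gt_id, pred_id) pairs from an evaluation.

    Returns a list of (type, gt_id, pred_id) examples, one per patch.
    """
    matched_gt = {g for g, _ in matches}
    matched_pred = {p for _, p in matches}

    # Matched pairs become true positives with both fields populated
    examples = [("tp", g, p) for g, p in matches]
    # Unmatched ground truth objects become false negatives
    examples += [("fn", g, None) for g in gt_ids if g not in matched_gt]
    # Unmatched predictions become false positives
    examples += [("fp", None, p) for p in pred_ids if p not in matched_pred]
    return examples
```

Note that, as documented above, a real evaluation may match multiple predictions to one ground truth object (e.g., crowd annotations), in which case they share a single sample.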
@view_stage
def to_frames(self, **kwargs): (source)

Creates a view that contains one sample per frame in the video collection.

The returned view will contain all frame-level fields and the tags of each video as sample-level fields, as well as a sample_id field that records the ID of the parent sample of each frame.

By default, sample_frames is False and this method assumes that the frames of the input collection have filepath fields populated pointing to each frame image. Any frames without a filepath populated will be omitted from the returned view.

When sample_frames is True, this method samples each video in the collection into a directory of per-frame images and stores the filepaths in the filepath frame field of the source dataset. By default, each folder of images is written using the same basename as the input video. For example, if frames_patt = "%%06d.jpg", then videos with the following paths:

/path/to/video1.mp4
/path/to/video2.mp4
...

would be sampled as follows:

/path/to/video1/
    000001.jpg
    000002.jpg
    ...
/path/to/video2/
    000001.jpg
    000002.jpg
    ...

However, you can use the optional output_dir and rel_dir parameters to customize the location and shape of the sampled frame folders. For example, if output_dir = "/tmp" and rel_dir = "/path/to", then videos with the following paths:

/path/to/folderA/video1.mp4
/path/to/folderA/video2.mp4
/path/to/folderB/video3.mp4
...

would be sampled as follows:

/tmp/folderA/
    video1/
        000001.jpg
        000002.jpg
        ...
    video2/
        000001.jpg
        000002.jpg
        ...
/tmp/folderB/
    video3/
        000001.jpg
        000002.jpg
        ...

By default, samples will be generated for every video frame at full resolution, but this method provides a variety of parameters that can be used to customize the sampling behavior.

Note

If this method is run multiple times with sample_frames set to True, existing frames will not be resampled unless you set force_sample to True.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("quickstart-video")

session = fo.launch_app(dataset)

#
# Create a frames view for an entire video dataset
#

frames = dataset.to_frames(sample_frames=True)
print(frames)

session.view = frames

#
# Create a frames view that only contains frames with at least 10
# objects, sampled at a maximum frame rate of 1fps
#

num_objects = F("detections.detections").length()
view = dataset.match_frames(num_objects > 10)

frames = view.to_frames(max_fps=1)
print(frames)

session.view = frames
Parameters
sample_frames:Falsewhether to assume that the frame images have already been sampled at locations stored in the filepath field of each frame (False), or whether to sample the video frames now according to the specified parameters (True)
fps:Nonean optional frame rate at which to sample each video's frames
max_fps:Nonean optional maximum frame rate at which to sample. Videos with frame rate exceeding this value are downsampled
size:Nonean optional (width, height) at which to sample frames. A dimension can be -1, in which case the aspect ratio is preserved. Only applicable when sample_frames=True
min_size:Nonean optional minimum (width, height) for each frame. A dimension can be -1 if no constraint should be applied. The frames are resized (aspect-preserving) if necessary to meet this constraint. Only applicable when sample_frames=True
max_size:Nonean optional maximum (width, height) for each frame. A dimension can be -1 if no constraint should be applied. The frames are resized (aspect-preserving) if necessary to meet this constraint. Only applicable when sample_frames=True
sparse:Falsewhether to only sample frame images for frame numbers for which fiftyone.core.frame.Frame instances exist in the input collection. This parameter has no effect when sample_frames=False, since frames must already exist in order to have filepath information
output_dir:Nonean optional output directory in which to write the sampled frames. By default, the frames are written in folders with the same basename of each video
rel_dir:Nonea relative directory to remove from the filepath of each video, if possible. The path is converted to an absolute path (if necessary) via fiftyone.core.storage.normalize_path. This argument can be used in conjunction with output_dir to cause the sampled frames to be written in a nested directory structure within output_dir matching the shape of the input video's folder structure
frames_patt:Nonea pattern specifying the filename/format to use to write or check for existing sampled frames, e.g., "%%06d.jpg". The default value is fiftyone.config.default_sequence_idx + fiftyone.config.default_image_ext
force_sample:Falsewhether to resample videos whose sampled frames already exist. Only applicable when sample_frames=True
skip_failures:Truewhether to gracefully continue without raising an error if a video cannot be sampled
verbose:Falsewhether to log information about the frames that will be sampled, if any
**kwargsUndocumented
Returns
a fiftyone.core.video.FramesView
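The interaction between fps and max_fps can be sketched in plain Python. This is an illustrative model of uniform downsampling under the stated assumptions, not FiftyOne's actual sampling code: the effective rate is the requested fps capped by max_fps, and frame numbers are selected at evenly spaced intervals.

```python
# Illustrative sketch (not FiftyOne internals): which 1-based frame numbers
# would be kept when downsampling a video to a target frame rate.
def sampled_frame_numbers(total_frames, native_fps, fps=None, max_fps=None):
    # Start from the native rate, apply the requested rate, then the cap
    target = native_fps if fps is None else fps
    if max_fps is not None:
        target = min(target, max_fps)

    # No downsampling needed: keep every frame
    if target >= native_fps:
        return list(range(1, total_frames + 1))

    # Uniformly spaced sampling at the effective rate
    step = native_fps / target
    frames = []
    n = 1.0
    while round(n) <= total_frames:
        frames.append(int(round(n)))
        n += step
    return frames
```

For example, a 30 fps video sampled with max_fps=1 keeps roughly one frame per 30, so a 90-frame clip yields frames 1, 31, and 61.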
def to_json(self, rel_dir=None, include_private=False, include_frames=False, frame_labels_dir=None, pretty_print=False): (source)

Returns a JSON string representation of the collection.

The samples will be written as a list in a top-level samples field of the returned dictionary.

Parameters
rel_dir:Nonea relative directory to remove from the filepath of each sample, if possible. The path is converted to an absolute path (if necessary) via fiftyone.core.storage.normalize_path. The typical use case for this argument is that your source data lives in a single directory and you wish to serialize relative, rather than absolute, paths to the data within that directory
include_private:Falsewhether to include private fields
include_frames:Falsewhether to include the frame labels for video samples
frame_labels_dir:Nonea directory in which to write per-sample JSON files containing the frame labels for video samples. If omitted, frame labels will be included directly in the returned JSON dict (which can be quite large for video datasets containing many frames). Only applicable to datasets that contain videos when include_frames is True
pretty_print:Falsewhether to render the JSON in human readable format with newlines and indentations
Returns
a JSON string
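The rel_dir behavior described above can be sketched with a small path helper. This is a simplified, hypothetical stand-in for the normalization performed by fiftyone.core.storage.normalize_path: paths under rel_dir are serialized relative to it, and paths outside it are left absolute.

```python
import os

# Simplified sketch of rel_dir stripping during serialization
# (the real logic lives in fiftyone.core.storage).
def strip_rel_dir(filepath, rel_dir):
    rel_dir = os.path.abspath(rel_dir)
    if not rel_dir.endswith(os.sep):
        rel_dir += os.sep

    if filepath.startswith(rel_dir):
        # Sample lives under rel_dir: serialize a relative path
        return filepath[len(rel_dir):]

    # Sample lies outside rel_dir: leave the absolute path unchanged
    return filepath
```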
@view_stage
def to_patches(self, field, **kwargs): (source)

Creates a view that contains one sample per object patch in the specified field of the collection.

Fields other than field and the default sample fields will not be included in the returned view. A sample_id field will be added that records the sample ID from which each patch was taken.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

session = fo.launch_app(dataset)

#
# Create a view containing the ground truth patches
#

view = dataset.to_patches("ground_truth")
print(view)

session.view = view
Parameters
fieldthe patches field, which must be of type fiftyone.core.labels.Detections, fiftyone.core.labels.Polylines, or fiftyone.core.labels.Keypoints
other_fields:None

controls whether fields other than field and the default sample fields are included. Can be any of the following:

  • a field or list of fields to include
  • True to include all other fields
  • None/False to include no other fields
keep_label_lists:Falsewhether to store the patches in label list fields of the same type as the input collection rather than using their single label variants
**kwargsUndocumented
Returns
a fiftyone.core.patches.PatchesView
@view_stage
def to_trajectories(self, field, **kwargs): (source)

Creates a view that contains one clip for each unique object trajectory defined by their (label, index) in a frame-level field of a video collection.

The returned view will contain:

  • A sample_id field that records the sample ID from which each clip was taken
  • A support field that records the [first, last] frame support of each clip
  • A sample-level label field that records the label and index of each trajectory

Examples:

import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("quickstart-video")

#
# Create a trajectories view for the vehicles in the dataset
#

trajectories = (
    dataset
    .filter_labels("frames.detections", F("label") == "vehicle")
    .to_trajectories("frames.detections")
)

print(trajectories)
Parameters
field

a frame-level label list field of any of the following types:

  • fiftyone.core.labels.Detections
  • fiftyone.core.labels.Polylines
  • fiftyone.core.labels.Keypoints

**kwargsoptional keyword arguments for fiftyone.core.clips.make_clips_dataset specifying how to perform the conversion
Returns
a fiftyone.core.clips.TrajectoriesView
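The grouping that to_trajectories() performs can be sketched in plain Python: each unique (label, index) pair observed across a video's frames defines one trajectory, whose support is its [first, last] frame number. This is a conceptual sketch of the documented behavior, not the library's implementation.

```python
# Sketch of trajectory grouping: one clip per unique (label, index) pair,
# with its [first, last] frame support.
def find_trajectories(frames):
    """frames: dict mapping frame number -> list of (label, index) pairs."""
    supports = {}
    for frame_number in sorted(frames):
        for label, index in frames[frame_number]:
            if index is None:
                continue  # objects without an index have no trajectory

            key = (label, index)
            if key not in supports:
                supports[key] = [frame_number, frame_number]
            else:
                supports[key][1] = frame_number

    return supports
```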
def untag_labels(self, tags, label_fields=None): (source)

Removes the tag(s) from all labels in the specified label field(s) of this collection, if necessary.

Parameters
tagsa tag or iterable of tags
label_fields:Nonean optional name or iterable of names of fiftyone.core.labels.Label fields. By default, all label fields are used
def untag_samples(self, tags): (source)

Removes the tag(s) from all samples in this collection, if necessary.

Parameters
tagsa tag or iterable of tags
def update_run_config(self, run_key, config): (source)

Updates the run config for the run with the given key.

Parameters
run_keya run key
configa fiftyone.core.runs.RunConfig
def validate_field_type(self, path, ftype=None, embedded_doc_type=None): (source)

Validates that the collection has a field of the given type.

Parameters
patha field name or embedded.field.name
ftype:Nonean optional field type to enforce. Must be a subclass of fiftyone.core.fields.Field
embedded_doc_type:Nonean optional embedded document type or iterable of types to enforce. Must be a subclass(es) of fiftyone.core.odm.BaseEmbeddedDocument
Raises
ValueErrorif the field does not exist or does not have the expected type
def validate_fields_exist(self, fields, include_private=False): (source)

Validates that the collection has field(s) with the given name(s).

If embedded field names are provided, only the root field is checked.

Parameters
fieldsa field name or iterable of field names
include_private:Falsewhether to include private fields when checking for existence
Raises
ValueErrorif one or more of the fields do not exist
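The "only the root field is checked" rule above can be sketched with a minimal helper. This is an illustrative sketch against a plain set of field names, not the library's schema machinery:

```python
# Minimal sketch of the documented behavior: embedded field names like
# "ground_truth.detections.label" are validated by their root component only.
def validate_fields_exist(schema, fields):
    if isinstance(fields, str):
        fields = [fields]

    for field in fields:
        root = field.split(".", 1)[0]
        if root not in schema:
            raise ValueError("Field '%s' does not exist" % field)
```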
@aggregation
def values(self, field_or_expr, expr=None, missing_value=None, unwind=False, _allow_missing=False, _big_result=True, _raw=False, _field=None): (source)

Extracts the values of a field from all samples in the collection.

Values aggregations are useful for efficiently extracting a slice of field or embedded field values across all samples in a collection. See the examples below for more details.

The dual function of values is set_values, which can be used to efficiently set a field or embedded field of all samples in a collection by providing lists of values with the same structure as those returned by this aggregation.

Note

Unlike other aggregations, values does not automatically unwind list fields, which ensures that the returned values match the potentially-nested structure of the documents.

You can opt-in to unwinding specific list fields using the [] syntax, or you can pass the optional unwind=True parameter to unwind all supported list fields. See :ref:`aggregations-list-fields` for more information.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            numeric_field=1.0,
            numeric_list_field=[1, 2, 3],
        ),
        fo.Sample(
            filepath="/path/to/image2.png",
            numeric_field=4.0,
            numeric_list_field=[1, 2],
        ),
        fo.Sample(
            filepath="/path/to/image3.png",
            numeric_field=None,
            numeric_list_field=None,
        ),
    ]
)

#
# Get all values of a field
#

values = dataset.values("numeric_field")
print(values)  # [1.0, 4.0, None]

#
# Get all values of a list field
#

values = dataset.values("numeric_list_field")
print(values)  # [[1, 2, 3], [1, 2], None]

#
# Get all values of transformed field
#

values = dataset.values(2 * (F("numeric_field") + 1))
print(values)  # [4.0, 10.0, None]

#
# Get values from a label list field
#

dataset = foz.load_zoo_dataset("quickstart")

# list of `Detections`
detections = dataset.values("ground_truth")

# list of lists of `Detection` instances
detections = dataset.values("ground_truth.detections")

# list of lists of detection labels
labels = dataset.values("ground_truth.detections.label")
Parameters
field_or_expra field name, embedded.field.name, fiftyone.core.expressions.ViewExpression, or MongoDB expression defining the field or expression to aggregate. This can also be a list or tuple of such arguments, in which case a tuple of corresponding aggregation results (each receiving the same additional keyword arguments, if any) will be returned
expr:Nonea fiftyone.core.expressions.ViewExpression or MongoDB expression to apply to field_or_expr (which must be a field) before aggregating
missing_value:Nonea value to insert for missing or None-valued fields
unwind:Falsewhether to automatically unwind all recognized list fields (True), or -1 to unwind all list fields except the top-level sample field
_allow_missingUndocumented
_big_resultUndocumented
_rawUndocumented
_fieldUndocumented
Returns
the list of values
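The missing_value and unwind semantics described above can be modeled in plain Python. This is a conceptual sketch of the documented behavior for flat fields, not the MongoDB-backed aggregation:

```python
# Sketch of values() semantics: nested list structure is preserved by
# default, missing/None leaves are replaced by missing_value, and
# unwind=True flattens list fields.
def extract_values(samples, field, missing_value=None, unwind=False):
    values = [s.get(field, missing_value) for s in samples]
    values = [missing_value if v is None else v for v in values]

    if unwind:
        flat = []
        for v in values:
            if isinstance(v, list):
                flat.extend(v)
            elif v is not None:
                flat.append(v)
        return flat

    return values
```

For example, with samples holding list field values [1, 2], None, and a missing field, the default output preserves that shape, while unwind=True yields the flat list [1, 2].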
def write_json(self, json_path, rel_dir=None, include_private=False, include_frames=False, frame_labels_dir=None, pretty_print=False): (source)

Writes the collection to disk in JSON format.

Parameters
json_paththe path to write the JSON
rel_dir:Nonea relative directory to remove from the filepath of each sample, if possible. The path is converted to an absolute path (if necessary) via fiftyone.core.storage.normalize_path. The typical use case for this argument is that your source data lives in a single directory and you wish to serialize relative, rather than absolute, paths to the data within that directory
include_private:Falsewhether to include private fields
include_frames:Falsewhether to include the frame labels for video samples
frame_labels_dir:Nonea directory in which to write per-sample JSON files containing the frame labels for video samples. If omitted, frame labels will be included directly in the returned JSON dict (which can be quite large for video datasets containing many frames). Only applicable to datasets that contain videos when include_frames is True
pretty_print:Falsewhether to render the JSON in human readable format with newlines and indentations

Dataset-specific settings that customize how this collection is visualized in the :ref:`FiftyOne App <fiftyone-app>`.

The classes of the underlying dataset.

See fiftyone.core.dataset.Dataset.classes for more information.

@property
default_classes = (source)

The default classes of the underlying dataset.

See fiftyone.core.dataset.Dataset.default_classes for more information.

@property
default_group_slice = (source)

The default group slice of the collection, or None if the collection is not grouped.

@property
default_mask_targets = (source)

The default mask targets of the underlying dataset.

See fiftyone.core.dataset.Dataset.default_mask_targets for more information.

@property
default_skeleton = (source)

The default keypoint skeleton of the underlying dataset.

See fiftyone.core.dataset.Dataset.default_skeleton for more information.

@property
description = (source)

A description of the underlying dataset.

See fiftyone.core.dataset.Dataset.description for more information.

@property
group_field = (source)

The group field of the collection, or None if the collection is not grouped.

@property
group_media_types = (source)

A dict mapping group slices to media types, or None if the collection is not grouped.

@property
group_slice = (source)

The current group slice of the collection, or None if the collection is not grouped.

@property
group_slices = (source)

The list of group slices of the collection, or None if the collection is not grouped.

@property
has_annotation_runs = (source)

Whether this collection has any annotation runs.

@property
has_brain_runs = (source)

Whether this collection has any brain runs.

@property
has_evaluations = (source)

Whether this collection has any evaluation results.

Whether this collection has any runs.

The info dict of the underlying dataset.

See fiftyone.core.dataset.Dataset.info for more information.

@property
mask_targets = (source)

The mask targets of the underlying dataset.

See fiftyone.core.dataset.Dataset.mask_targets for more information.

The media type of the collection.

The keypoint skeletons of the underlying dataset.

See fiftyone.core.dataset.Dataset.skeletons for more information.

The list of tags of the underlying dataset.

See fiftyone.core.dataset.Dataset.tags for more information.

def _add_view_stage(self, stage): (source)

Returns a fiftyone.core.view.DatasetView containing the contents of the collection with the given fiftyone.core.stages.ViewStage appended to its aggregation pipeline.

Subclasses are responsible for performing any validation on the view stage to ensure that it is a valid stage to add to this collection.

Parameters
stagea fiftyone.core.stages.ViewStage
Returns
a fiftyone.core.view.DatasetView
def _aggregate(self, pipeline=None, media_type=None, attach_frames=False, detach_frames=False, frames_only=False, support=None, group_slice=None, group_slices=None, detach_groups=False, groups_only=False, manual_group_select=False, post_pipeline=None): (source)

Runs the MongoDB aggregation pipeline on the collection and returns the result.

Parameters
pipeline:Nonea MongoDB aggregation pipeline (list of dicts) to append to the current pipeline
media_type:Nonethe media type of the collection, if different than the source dataset's media type
attach_frames:Falsewhether to attach the frame documents immediately prior to executing pipeline. Only applicable to datasets that contain videos
detach_frames:Falsewhether to detach the frame documents at the end of the pipeline. Only applicable to datasets that contain videos
frames_only:Falsewhether to generate a pipeline that contains only the frames in the collection
support:Nonean optional [first, last] range of frames to attach. Only applicable when attaching frames
group_slice:Nonethe current group slice of the collection, if different than the source dataset's group slice. Only applicable for grouped collections
group_slices:Nonean optional list of group slices to attach when groups_only is True
detach_groups:Falsewhether to detach the group documents at the end of the pipeline. Only applicable to grouped collections
groups_only:Falsewhether to generate a pipeline that contains only the flattened group documents for the collection
manual_group_select:Falsewhether the pipeline has manually handled the initial group selection. Only applicable to grouped collections
post_pipeline:Nonea MongoDB aggregation pipeline (list of dicts) to append to the very end of the pipeline, after all other arguments are applied
Returns
the aggregation result dict
async def _async_aggregate(self, aggregations): (source)

Undocumented

def _build_aggregation(self, aggregations): (source)

Undocumented

def _build_batch_pipeline(self, aggs_map): (source)

Undocumented

def _build_big_pipeline(self, aggregation): (source)

Undocumented

def _build_facets(self, aggs_map): (source)

Undocumented

def _contains_media_type(self, media_type, any_slice=False): (source)

Undocumented

def _contains_videos(self, any_slice=False): (source)

Undocumented

def _delete_labels(self, ids, fields=None): (source)
def _do_get_dynamic_field_schema(self, schema, unwind_cache, frames=False, fields=None, new=False): (source)

Undocumented

def _edit_label_tags(self, update_fcn, label_field, ids=None, label_ids=None): (source)

Undocumented

def _edit_sample_tags(self, update): (source)

Undocumented

def _expand_schema_from_values(self, field_name, values, dynamic=False, allow_missing=False, flat=False): (source)

Undocumented

def _get_db_fields_map(self, include_private=False, frames=False, reverse=False): (source)

Undocumented

def _get_default_field(self, path): (source)

Undocumented

def _get_default_frame_fields(self, path=None, include_private=False, use_db_fields=False): (source)

Undocumented

def _get_default_indexes(self, frames=False): (source)

Undocumented

def _get_default_sample_fields(self, path=None, include_private=False, use_db_fields=False, media_types=None): (source)

Undocumented

def _get_dynamic_field_schema(self, frames=False, fields=None, recursive=True): (source)

Undocumented

def _get_extremum(self, path, order): (source)

Undocumented

def _get_frame_label_field_schema(self): (source)

Undocumented

def _get_frames_bytes(self): (source)

Computes the total size of the frame documents in the collection.

def _get_geo_location_field(self): (source)

Undocumented

def _get_group_media_types(self): (source)

Undocumented

def _get_group_slices(self, field_names): (source)

Undocumented

def _get_label_attributes_schema(self, label_field): (source)

Undocumented

def _get_label_field_path(self, field_name, subfield=None): (source)

Undocumented

def _get_label_field_root(self, field_name): (source)

Undocumented

def _get_label_field_schema(self): (source)

Undocumented

def _get_label_field_type(self, field_name): (source)

Undocumented

def _get_label_fields(self): (source)

Undocumented

def _get_label_ids(self, tags=None, fields=None): (source)

Undocumented

def _get_media_fields(self, whitelist=None, blacklist=None, frames=False): (source)

Undocumented

def _get_per_frame_bytes(self): (source)

Returns a dictionary mapping frame IDs to document sizes (in bytes) for each frame in the video collection.

def _get_per_sample_bytes(self): (source)

Returns a dictionary mapping sample IDs to document sizes (in bytes) for each sample in the collection.

def _get_per_sample_frames_bytes(self): (source)

Returns a dictionary mapping sample IDs to total frame document sizes (in bytes) for each sample in the video collection.

def _get_root_field_type(self, field_name, include_private=False): (source)

Undocumented

def _get_root_fields(self, fields): (source)

Undocumented

def _get_samples_bytes(self): (source)

Computes the total size of the sample documents in the collection.

def _get_selected_labels(self, ids=None, tags=None, fields=None): (source)

Undocumented

def _get_store(self, store_name): (source)

Undocumented

def _get_values_by_id(self, path_or_expr, ids, link_field=None): (source)

Undocumented

def _handle_db_field(self, path, frames=False): (source)

Undocumented

def _handle_db_fields(self, paths, frames=False): (source)

Undocumented

def _handle_frame_field(self, field_name): (source)

Undocumented

def _handle_group_field(self, field_name): (source)

Undocumented

def _handle_id_fields(self, field_name): (source)

Undocumented

def _has_field(self, field_path): (source)

Undocumented

def _has_frame_fields(self): (source)

Undocumented

def _has_stores(self): (source)

Undocumented

def _is_default_field(self, path): (source)

Undocumented

def _is_frame_field(self, field_name): (source)

Undocumented

def _is_full_collection(self): (source)

Undocumented

def _is_group_field(self, field_name): (source)

Undocumented

def _is_label_field(self, field_name, label_type_or_types): (source)

Undocumented

def _is_read_only_field(self, path): (source)

Undocumented

def _list_stores(self): (source)

Undocumented

def _make_and_aggregate(self, make, args): (source)

Undocumented

def _make_set_field_pipeline(self, field, expr, embedded_root=False, allow_missing=False, new_field=None, context=None): (source)

Undocumented

def _max(self, path): (source)

Undocumented

def _min(self, path): (source)

Undocumented

def _parse_aggregations(self, aggregations, allow_big=True): (source)

Undocumented

def _parse_big_result(self, aggregation, result): (source)

Undocumented

def _parse_default_mask_targets(self, default_mask_targets): (source)

Undocumented

def _parse_default_skeleton(self, default_skeleton): (source)

Undocumented

def _parse_faceted_result(self, aggregation, result): (source)

Undocumented

def _parse_field(self, path, include_private=False, leaf=False): (source)

Undocumented

def _parse_field_name(self, field_name, auto_unwind=True, omit_terminal_lists=False, allow_missing=False, new_field=None): (source)

Undocumented

def _parse_frame_labels_field(self, frame_labels_field, dataset_exporter=None, allow_coercion=False, force_dict=False, required=False): (source)

Undocumented

def _parse_label_field(self, label_field, dataset_exporter=None, allow_coercion=False, force_dict=False, required=False): (source)

Undocumented

def _parse_mask_targets(self, mask_targets): (source)

Undocumented

def _parse_media_field(self, media_field): (source)

Undocumented

def _parse_skeletons(self, skeletons): (source)

Undocumented

def _pipeline(self, pipeline=None, media_type=None, attach_frames=False, detach_frames=False, frames_only=False, support=None, group_slice=None, group_slices=None, detach_groups=False, groups_only=False, manual_group_select=False, post_pipeline=None): (source)

Returns the MongoDB aggregation pipeline for the collection.

Parameters
pipeline:Nonea MongoDB aggregation pipeline (list of dicts) to append to the current pipeline
media_type:Nonethe media type of the collection, if different than the source dataset's media type
attach_frames:Falsewhether to attach the frame documents immediately prior to executing pipeline. Only applicable to datasets that contain videos
detach_frames:Falsewhether to detach the frame documents at the end of the pipeline. Only applicable to datasets that contain videos
frames_only:Falsewhether to generate a pipeline that contains only the frames in the collection
support:Nonean optional [first, last] range of frames to attach. Only applicable when attaching frames
group_slice:Nonethe current group slice of the collection, if different than the source dataset's group slice. Only applicable for grouped collections
group_slices:Nonean optional list of group slices to attach when groups_only is True
detach_groups:Falsewhether to detach the group documents at the end of the pipeline. Only applicable to grouped collections
groups_only:Falsewhether to generate a pipeline that contains only the flattened group documents for the collection
manual_group_select:Falsewhether the pipeline has manually handled the initial group selection. Only applicable to grouped collections
post_pipeline:Nonea MongoDB aggregation pipeline (list of dicts) to append to the very end of the pipeline, after all other arguments are applied
Returns
the aggregation pipeline
def _process_aggregations(self, aggregations, result, scalar_result): (source)

Undocumented

def _serialize_default_mask_targets(self): (source)

Undocumented

def _serialize_default_skeleton(self): (source)

Undocumented

def _serialize_field_schema(self): (source)

Undocumented

def _serialize_frame_field_schema(self): (source)

Undocumented

def _serialize_mask_targets(self): (source)

Undocumented

def _serialize_schema(self, schema): (source)

Undocumented

def _serialize_skeletons(self): (source)

Undocumented

def _set_doc_values(self, field_name, ids, values, field=None, skip_none=False, validate=True, frames=False, progress=False): (source)

Undocumented

def _set_frame_values(self, field_name, values, list_fields, sample_ids=None, frame_ids=None, field=None, skip_none=False, validate=True, progress=False): (source)

Undocumented

def _set_label_list_values(self, field_name, values, id_map, list_field, field=None, skip_none=None, validate=True, frames=False, progress=False): (source)

Undocumented

def _set_labels(self, field_name, sample_ids, label_docs, progress=False): (source)

Undocumented

def _set_list_values_by_id(self, field_name, ids, elem_ids, values, list_field, field=None, skip_none=False, validate=True, frames=False, progress=False): (source)

Undocumented

def _set_sample_values(self, field_name, values, list_fields, sample_ids=None, field=None, skip_none=False, validate=True, progress=False): (source)

Undocumented

def _set_values(self, field_name, values, key_field=None, skip_none=False, expand_schema=True, dynamic=False, validate=True, progress=False, _allow_missing=False, _sample_ids=None, _frame_ids=None): (source)

Undocumented

def _split_frame_fields(self, fields): (source)

Undocumented

def _sync_dataset_last_modified_at(self): (source)

Undocumented

def _sync_samples_last_modified_at(self): (source)

Undocumented

def _tag_labels(self, tags, label_field, ids=None, label_ids=None): (source)
def _to_fields_str(self, field_schema): (source)

Undocumented

def _untag_labels(self, tags, label_field, ids=None, label_ids=None): (source)
def _unwind_values(self, field_name, values, keep_top_level=False): (source)

Undocumented

def _validate_root_field(self, field_name, include_private=False): (source)

Undocumented

_FRAMES_PREFIX: str = (source)

Undocumented

Value
'frames.'
_GROUPS_PREFIX: str = (source)

Undocumented

Value
'groups.'

The fiftyone.core.dataset.Dataset that serves the samples in this collection.

@property
_element_str = (source)

Undocumented

@property
_elements_str = (source)

Undocumented

Whether this collection contains clips.

@property
_is_dynamic_groups = (source)

Whether this collection contains dynamic groups.

Whether this collection contains frames of a video dataset.

@property
_is_generated = (source)

Whether this collection's contents are generated from another collection.

@property
_is_patches = (source)

Whether this collection contains patches.

@property
_root_dataset = (source)

The root fiftyone.core.dataset.Dataset from which this collection is derived.

This is typically the same as _dataset but may differ in cases such as patches views.