class fiftyone.core.dataset.Dataset

A FiftyOne dataset.

Datasets represent an ordered collection of fiftyone.core.sample.Sample instances that describe a particular type of raw media (e.g., images or videos) together with a user-defined set of fields.

FiftyOne datasets ingest and store the labels for all samples internally; raw media is stored on disk and the dataset provides paths to the data.

See :ref:`this page <using-datasets>` for an overview of working with FiftyOne datasets.

Parameters
name: the name of the dataset. By default, get_default_dataset_name() is used
persistent: whether the dataset should persist in the database after the session terminates
overwrite: whether to overwrite an existing dataset of the same name
Class Method from_archive Creates a Dataset from the contents of the given archive.
Class Method from_dict Loads a Dataset from a JSON dictionary generated by fiftyone.core.collections.SampleCollection.to_dict.
Class Method from_dir Creates a Dataset from the contents of the given directory.
Class Method from_images Creates a Dataset from the given images.
Class Method from_images_dir Creates a Dataset from the given directory of images.
Class Method from_images_patt Creates a Dataset from the given glob pattern of images.
Class Method from_importer Creates a Dataset by importing the samples in the given fiftyone.utils.data.importers.DatasetImporter.
Class Method from_json Loads a Dataset from JSON generated by fiftyone.core.collections.SampleCollection.write_json or fiftyone.core.collections.SampleCollection.to_json.
Class Method from_labeled_images Creates a Dataset from the given labeled images.
Class Method from_labeled_videos Creates a Dataset from the given labeled videos.
Class Method from_videos Creates a Dataset from the given videos.
Class Method from_videos_dir Creates a Dataset from the given directory of videos.
Class Method from_videos_patt Creates a Dataset from the given glob pattern of videos.
Method __copy__ Undocumented
Method __deepcopy__ Undocumented
Method __delitem__ Undocumented
Method __eq__ Undocumented
Method __getattribute__ Undocumented
Method __getitem__ Undocumented
Method __init__ Undocumented
Method __len__ Undocumented
Method add_archive Adds the contents of the given archive to the dataset.
Method add_collection Adds the contents of the given collection to the dataset.
Method add_dir Adds the contents of the given directory to the dataset.
Method add_dynamic_frame_fields Adds all dynamic frame fields to the dataset's schema.
Method add_dynamic_sample_fields Adds all dynamic sample fields to the dataset's schema.
Method add_frame_field Adds a new frame-level field or embedded field to the dataset, if necessary.
Method add_group_field Adds a group field to the dataset, if necessary.
Method add_group_slice Adds a group slice with the given media type to the dataset, if necessary.
Method add_images Adds the given images to the dataset.
Method add_images_dir Adds the given directory of images to the dataset.
Method add_images_patt Adds the given glob pattern of images to the dataset.
Method add_importer Adds the samples from the given fiftyone.utils.data.importers.DatasetImporter to the dataset.
Method add_labeled_images Adds the given labeled images to the dataset.
Method add_labeled_videos Adds the given labeled videos to the dataset.
Method add_sample Adds the given sample to the dataset.
Method add_sample_field Adds a new sample field or embedded field to the dataset, if necessary.
Method add_samples Adds the given samples to the dataset.
Method add_videos Adds the given videos to the dataset.
Method add_videos_dir Adds the given directory of videos to the dataset.
Method add_videos_patt Adds the given glob pattern of videos to the dataset.
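The add_* methods above append content to an existing dataset. A minimal sketch of add_samples(), again with hypothetical filepaths:

```python
import fiftyone as fo

dataset = fo.Dataset("add-example", overwrite=True)

# `add_samples()` batches the inserts and returns the IDs of the new samples
samples = [
    fo.Sample(filepath="/data/img1.jpg", tags=["train"]),
    fo.Sample(filepath="/data/img2.jpg", tags=["test"]),
]
sample_ids = dataset.add_samples(samples)

print(len(sample_ids))  # 2
print(len(dataset))  # 2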
Method app_config.setter Undocumented
Method check_summary_fields Returns a list of summary fields that may need to be updated.
Method classes.setter Undocumented
Method clear Removes all samples from the dataset.
Method clear_cache Clears the dataset's in-memory cache.
Method clear_frame_field Clears the values of the frame-level field from all samples in the dataset.
Method clear_frame_fields Clears the values of the frame-level fields from all samples in the dataset.
Method clear_frames Removes all frame labels from the dataset.
Method clear_sample_field Clears the values of the field from all samples in the dataset.
Method clear_sample_fields Clears the values of the fields from all samples in the dataset.
Method clone Creates a copy of the dataset.
Method clone_frame_field Clones the frame-level field into a new field.
Method clone_frame_fields Clones the frame-level fields into new fields.
Method clone_sample_field Clones the given sample field into a new field of the dataset.
Method clone_sample_fields Clones the given sample fields into new fields of the dataset.
Method create_summary_field Populates a sample-level field that records the unique values or numeric ranges that appear in the specified field on each sample in the dataset.
Method default_classes.setter Undocumented
Method default_group_slice.setter Undocumented
Method default_mask_targets.setter Undocumented
Method default_skeleton.setter Undocumented
Method delete Deletes the dataset.
Method delete_frame_field Deletes the frame-level field from all samples in the dataset.
Method delete_frame_fields Deletes the frame-level fields from all samples in the dataset.
Method delete_frames Deletes the given frame(s) from the dataset.
Method delete_group_slice Deletes all samples in the given group slice from the dataset.
Method delete_groups Deletes the given group(s) from the dataset.
Method delete_labels Deletes the specified labels from the dataset.
Method delete_sample_field Deletes the field from all samples in the dataset.
Method delete_sample_fields Deletes the fields from all samples in the dataset.
Method delete_samples Deletes the given sample(s) from the dataset.
Method delete_saved_view Deletes the saved view with the given name.
Method delete_saved_views Deletes all saved views from this dataset.
Method delete_summary_field Deletes the summary field from all samples in the dataset.
Method delete_summary_fields Deletes the summary fields from all samples in the dataset.
Method delete_workspace Deletes the saved workspace with the given name.
Method delete_workspaces Deletes all saved workspaces from this dataset.
Method description.setter Undocumented
Method ensure_frames Ensures that the video dataset contains frame instances for every frame of each sample's source video.
Method first Returns the first sample in the dataset.
Method get_field_schema Returns a schema dictionary describing the fields of the samples in the dataset.
Method get_frame_field_schema Returns a schema dictionary describing the fields of the frames of the samples in the dataset.
Method get_group Returns a dict containing the samples for the given group ID.
Method get_saved_view_info Loads the editable information about the saved view with the given name.
Method get_workspace_info Gets the information about the workspace with the given name.
Method group_slice.setter Undocumented
Method has_saved_view Whether this dataset has a saved view with the given name.
Method has_workspace Whether this dataset has a saved workspace with the given name.
Method head Returns a list of the first few samples in the dataset.
Method info.setter Undocumented
Method ingest_images Ingests the given iterable of images into the dataset.
Method ingest_labeled_images Ingests the given iterable of labeled image samples into the dataset.
Method ingest_labeled_videos Ingests the given iterable of labeled video samples into the dataset.
Method ingest_videos Ingests the given iterable of videos into the dataset.
Method iter_groups Returns an iterator over the groups in the dataset.
Method iter_samples Returns an iterator over the samples in the dataset.
Method last Returns the last sample in the dataset.
Method list_saved_views Lists the saved views on this dataset.
Method list_summary_fields Lists the summary fields on the dataset.
Method list_workspaces Lists the saved workspaces on this dataset.
Method load_saved_view Loads the saved view with the given name.
Method load_workspace Loads the saved workspace with the given name.
Method mask_targets.setter Undocumented
Method media_type.setter Undocumented
Method merge_archive Merges the contents of the given archive into the dataset.
Method merge_dir Merges the contents of the given directory into the dataset.
Method merge_importer Merges the samples from the given fiftyone.utils.data.importers.DatasetImporter into the dataset.
Method merge_sample Merges the fields of the given sample into this dataset.
Method merge_samples Merges the given samples into this dataset.
Method name.setter Undocumented
Method one Returns a single sample in this dataset matching the expression.
Method persistent.setter Undocumented
Method reload Reloads the dataset and any in-memory samples from the database.
Method remove_dynamic_frame_field Removes the dynamic embedded frame field from the dataset's schema.
Method remove_dynamic_frame_fields Removes the dynamic embedded frame fields from the dataset's schema.
Method remove_dynamic_sample_field Removes the dynamic embedded sample field from the dataset's schema.
Method remove_dynamic_sample_fields Removes the dynamic embedded sample fields from the dataset's schema.
Method rename_frame_field Renames the frame-level field to the given new name.
Method rename_frame_fields Renames the frame-level fields to the given new names.
Method rename_group_slice Renames the group slice with the given name.
Method rename_sample_field Renames the sample field to the given new name.
Method rename_sample_fields Renames the sample fields to the given new names.
Method save Saves the dataset to the database.
Method save_view Saves the given view into this dataset under the given name so it can be loaded later via load_saved_view.
Method save_workspace Saves a workspace into this dataset under the given name so it can be loaded later via load_workspace.
Method skeletons.setter Undocumented
Method stats Returns stats about the dataset on disk.
Method summary Returns a string summary of the dataset.
Method tags.setter Undocumented
Method tail Returns a list of the last few samples in the dataset.
Method update_saved_view_info Updates the editable information for the saved view with the given name.
Method update_summary_field Updates the summary field based on the current values of its source field.
Method update_workspace_info Updates the editable information for the workspace with the given name.
Method view Returns a fiftyone.core.view.DatasetView containing the entire dataset.
Class Variable __slots__ Undocumented
Instance Variable group_slice The current group slice of the dataset, or None if the dataset is not grouped.
Instance Variable media_type The media type of the dataset.
Property app_config A fiftyone.core.odm.dataset.DatasetAppConfig that customizes how this dataset is visualized in the :ref:`FiftyOne App <fiftyone-app>`.
Property classes A dict mapping field names to lists of class label strings for the corresponding fields of the dataset.
Property created_at The datetime that the dataset was created.
Property default_classes A list of class label strings for all fiftyone.core.labels.Label fields of this dataset that do not have customized classes defined in classes.
Property default_group_slice The default group slice of the dataset, or None if the dataset is not grouped.
Property default_mask_targets A dict defining a default mapping between pixel values (2D masks) or RGB hex strings (3D masks) and label strings for the segmentation masks of all fiftyone.core.labels.Segmentation fields of this dataset that do not have customized mask targets defined in ...
Property default_skeleton A default fiftyone.core.odm.dataset.KeypointSkeleton defining the semantic labels and point connectivity for all fiftyone.core.labels.Keypoint fields of this dataset that do not have customized skeletons defined in ...
Property deleted Whether the dataset is deleted.
Property description A string description of the dataset.
Property group_field The group field of the dataset, or None if the dataset is not grouped.
Property group_media_types A dict mapping group slices to media types, or None if the dataset is not grouped.
Property group_slices The list of group slices of the dataset, or None if the dataset is not grouped.
Property has_saved_views Whether this dataset has any saved views.
Property has_workspaces Whether this dataset has any saved workspaces.
Property info A user-facing dictionary of information about the dataset.
Property last_loaded_at The datetime that the dataset was last loaded.
Property last_modified_at The datetime that the dataset was last modified.
Property mask_targets A dict mapping field names to mask target dicts, each of which defines a mapping between pixel values (2D masks) or RGB hex strings (3D masks) and label strings for the segmentation masks in the corresponding field of the dataset.
Property name The name of the dataset.
Property persistent Whether the dataset persists in the database after a session is terminated.
Property skeletons A dict mapping field names to fiftyone.core.odm.dataset.KeypointSkeleton instances, each of which defines the semantic labels and point connectivity for the fiftyone.core.labels.Keypoint instances in the corresponding field of the dataset.
Property slug The slug of the dataset.
Property tags A list of tags on the dataset.
Property version The version of the fiftyone package for which the dataset is formatted.
Method _add_group_field Undocumented
Method _add_implied_frame_field Undocumented
Method _add_implied_sample_field Undocumented
Method _add_samples_batch Undocumented
Method _add_view_stage Returns a fiftyone.core.view.DatasetView containing the contents of the collection with the given fiftyone.core.stages.ViewStage appended to its aggregation pipeline.
Method _aggregate Runs the MongoDB aggregation pipeline on the collection and returns the result.
Method _apply_frame_field_schema Undocumented
Method _apply_sample_field_schema Undocumented
Method _attach_frames_pipeline A pipeline that attaches the frame documents for each document.
Method _attach_groups_pipeline A pipeline that attaches the requested group slice(s) for each document and stores them under groups.<slice> keys.
Method _bulk_write Undocumented
Method _clear Undocumented
Method _clear_frame_fields Undocumented
Method _clear_frames Undocumented
Method _clear_groups Undocumented
Method _clear_sample_fields Undocumented
Method _clone Undocumented
Method _clone_frame_fields Undocumented
Method _clone_sample_fields Undocumented
Method _delete Undocumented
Method _delete_frame_fields Undocumented
Method _delete_labels Undocumented
Method _delete_sample_fields Undocumented
Method _delete_saved_view Undocumented
Method _delete_summary_fields Undocumented
Method _delete_workspace Undocumented
Method _ensure_frames Undocumented
Method _ensure_label_field Undocumented
Method _estimated_count Undocumented
Method _expand_frame_schema Undocumented
Method _expand_group_schema Undocumented
Method _expand_schema Undocumented
Method _frame_collstats Undocumented
Method _frame_dict_to_doc Undocumented
Method _get_default_summary_field_name Undocumented
Method _get_frame_collection Undocumented
Method _get_sample_collection Undocumented
Method _get_saved_view_doc Undocumented
Method _get_summarized_fields_map Undocumented
Method _get_workspace_doc Undocumented
Method _group_select_pipeline A pipeline that selects only the given slice's documents from the pipeline.
Method _groups_only_pipeline A pipeline that looks up the requested group slices for each document and returns (only) the unwound group slices.
Method _init_frames Undocumented
Method _iter_groups Undocumented
Method _iter_samples Undocumented
Method _keep Undocumented
Method _keep_fields Undocumented
Method _keep_frames Undocumented
Method _load_saved_view_from_doc Undocumented
Method _make_dict Undocumented
Method _make_frame Undocumented
Method _make_sample Undocumented
Method _merge_doc Undocumented
Method _merge_frame_field_schema Undocumented
Method _merge_sample_field_schema Undocumented
Method _pipeline Returns the MongoDB aggregation pipeline for the collection.
Method _populate_summary_field Undocumented
Method _reload Undocumented
Method _reload_docs Undocumented
Method _remove_dynamic_frame_fields Undocumented
Method _remove_dynamic_sample_fields Undocumented
Method _rename_frame_fields Undocumented
Method _rename_sample_fields Undocumented
Method _sample_collstats Undocumented
Method _sample_dict_to_doc Undocumented
Method _save Undocumented
Method _save_field Undocumented
Method _serialize Undocumented
Method _set_media_type Undocumented
Method _unwind_frames_pipeline A pipeline that returns (only) the unwound frames documents.
Method _unwind_groups_pipeline A pipeline that returns (only) the unwound groups documents.
Method _update_last_loaded_at Undocumented
Method _update_metadata_field Undocumented
Method _upsert_samples Undocumented
Method _upsert_samples_batch Undocumented
Method _validate_samples Undocumented
Method _validate_saved_view_name Undocumented
Method _validate_workspace_name Undocumented
Instance Variable _annotation_cache Undocumented
Instance Variable _brain_cache Undocumented
Instance Variable _deleted Undocumented
Instance Variable _doc Undocumented
Instance Variable _evaluation_cache Undocumented
Instance Variable _frame_doc_cls Undocumented
Instance Variable _group_slice Undocumented
Instance Variable _run_cache Undocumented
Instance Variable _sample_doc_cls Undocumented
Property _dataset The fiftyone.core.dataset.Dataset that serves the samples in this collection.
Property _frame_collection Undocumented
Property _frame_collection_name Undocumented
Property _is_clips Whether this collection contains clips.
Property _is_dynamic_groups Whether this collection contains dynamic groups.
Property _is_frames Whether this collection contains frames of a video dataset.
Property _is_generated Whether this collection's contents are generated from another collection.
Property _is_patches Whether this collection contains patches.
Property _root_dataset The root fiftyone.core.dataset.Dataset from which this collection is derived.
Property _sample_collection Undocumented
Property _sample_collection_name Undocumented

Inherited from SampleCollection:

Class Method list_aggregations Returns a list of all available methods on this collection that apply fiftyone.core.aggregations.Aggregation operations to this collection.
Class Method list_view_stages Returns a list of all available methods on this collection that apply fiftyone.core.stages.ViewStage operations to this collection.
Method __add__ Undocumented
Method __bool__ Undocumented
Method __contains__ Undocumented
Method __iter__ Undocumented
Method __repr__ Undocumented
Method __str__ Undocumented
Method add_stage Applies the given fiftyone.core.stages.ViewStage to the collection.
Method aggregate Aggregates one or more fiftyone.core.aggregations.Aggregation instances.
Method annotate Exports the samples and optional label field(s) in this collection to the given annotation backend.
Method apply_model Applies the model to the samples in the collection.
Method bounds Computes the bounds of a numeric field of the collection.
Method compute_embeddings Computes embeddings for the samples in the collection using the given model.
Method compute_metadata Populates the metadata field of all samples in the collection.
Method compute_patch_embeddings Computes embeddings for the image patches defined by patches_field of the samples in the collection using the given model.
Method concat Concatenates the contents of the given SampleCollection to this collection.
Method count Counts the number of field values in the collection.
Method count_label_tags Counts the occurrences of all label tags in the specified label field(s) of this collection.
Method count_sample_tags Counts the occurrences of sample tags in this collection.
Method count_values Counts the occurrences of field values in the collection.
Method create_index Creates an index on the given field or with the given specification, if necessary.
Method delete_annotation_run Deletes the annotation run with the given key from this collection.
Method delete_annotation_runs Deletes all annotation runs from this collection.
Method delete_brain_run Deletes the brain method run with the given key from this collection.
Method delete_brain_runs Deletes all brain method runs from this collection.
Method delete_evaluation Deletes the evaluation results associated with the given evaluation key from this collection.
Method delete_evaluations Deletes all evaluation results from this collection.
Method delete_run Deletes the run with the given key from this collection.
Method delete_runs Deletes all runs from this collection.
Method distinct Computes the distinct values of a field in the collection.
Method draw_labels Renders annotated versions of the media in the collection with the specified label data overlaid to the given directory.
Method drop_index Drops the index for the given field or name, if necessary.
Method evaluate_classifications Evaluates the classification predictions in this collection with respect to the specified ground truth labels.
Method evaluate_detections Evaluates the specified predicted detections in this collection with respect to the specified ground truth detections.
Method evaluate_regressions Evaluates the regression predictions in this collection with respect to the specified ground truth values.
Method evaluate_segmentations Evaluates the specified semantic segmentation masks in this collection with respect to the specified ground truth masks.
Method exclude Excludes the samples with the given IDs from the collection.
Method exclude_by Excludes the samples with the given field values from the collection.
Method exclude_fields Excludes the fields with the given names from the samples in the collection.
Method exclude_frames Excludes the frames with the given IDs from the video collection.
Method exclude_groups Excludes the groups with the given IDs from the grouped collection.
Method exclude_labels Excludes the specified labels from the collection.
Method exists Returns a view containing the samples in the collection that have (or do not have) a non-None value for the given field or embedded field.
Method export Exports the samples in the collection to disk.
Method filter_field Filters the values of a field or embedded field of each sample in the collection.
Method filter_keypoints Filters the individual fiftyone.core.labels.Keypoint.points elements in the specified keypoints field of each sample in the collection.
Method filter_labels Filters the fiftyone.core.labels.Label field of each sample in the collection.
Method flatten Returns a flattened view that contains all samples in the dynamic grouped collection.
Method geo_near Sorts the samples in the collection by their proximity to a specified geolocation.
Method geo_within Filters the samples in this collection to only include samples whose geolocation is within a specified boundary.
Method get_annotation_info Returns information about the annotation run with the given key on this collection.
Method get_brain_info Returns information about the brain method run with the given key on this collection.
Method get_classes Gets the classes list for the given field, or None if no classes are available.
Method get_dynamic_field_schema Returns a schema dictionary describing the dynamic fields of the samples in the collection.
Method get_dynamic_frame_field_schema Returns a schema dictionary describing the dynamic fields of the frames in the collection.
Method get_evaluation_info Returns information about the evaluation with the given key on this collection.
Method get_field Returns the field instance of the provided path, or None if one does not exist.
Method get_index_information Returns a dictionary of information about the indexes on this collection.
Method get_mask_targets Gets the mask targets for the given field, or None if no mask targets are available.
Method get_run_info Returns information about the run with the given key on this collection.
Method get_skeleton Gets the keypoint skeleton for the given field, or None if no skeleton is available.
Method group_by Creates a view that groups the samples in the collection by a specified field or expression.
Method has_annotation_run Whether this collection has an annotation run with the given key.
Method has_brain_run Whether this collection has a brain method run with the given key.
Method has_classes Determines whether this collection has a classes list for the given field.
Method has_evaluation Whether this collection has an evaluation with the given key.
Method has_field Determines whether the collection has a field with the given name.
Method has_frame_field Determines whether the collection has a frame-level field with the given name.
Method has_mask_targets Determines whether this collection has mask targets for the given field.
Method has_run Whether this collection has a run with the given key.
Method has_sample_field Determines whether the collection has a sample field with the given name.
Method has_skeleton Determines whether this collection has a keypoint skeleton for the given field.
Method histogram_values Computes a histogram of the field values in the collection.
Method init_run Initializes a config instance for a new run.
Method init_run_results Initializes a results instance for the run with the given key.
Method limit Returns a view with at most the given number of samples.
Method limit_labels Limits the number of fiftyone.core.labels.Label instances in the specified labels list field of each sample in the collection.
Method list_annotation_runs Returns a list of annotation keys on this collection.
Method list_brain_runs Returns a list of brain keys on this collection.
Method list_evaluations Returns a list of evaluation keys on this collection.
Method list_indexes Returns the list of index names on this collection.
Method list_runs Returns a list of run keys on this collection.
Method list_schema Extracts the value type(s) in a specified list field across all samples in the collection.
Method load_annotation_results Loads the results for the annotation run with the given key on this collection.
Method load_annotation_view Loads the fiftyone.core.view.DatasetView on which the specified annotation run was performed on this collection.
Method load_annotations Downloads the labels from the given annotation run from the annotation backend and merges them into this collection.
Method load_brain_results Loads the results for the brain method run with the given key on this collection.
Method load_brain_view Loads the fiftyone.core.view.DatasetView on which the specified brain method run was performed on this collection.
Method load_evaluation_results Loads the results for the evaluation with the given key on this collection.
Method load_evaluation_view Loads the fiftyone.core.view.DatasetView on which the specified evaluation was performed on this collection.
Method load_run_results Loads the results for the run with the given key on this collection.
Method load_run_view Loads the fiftyone.core.view.DatasetView on which the specified run was performed on this collection.
Method make_unique_field_name Makes a unique field name with the given root name for the collection.
Method map_labels Maps the label values of a fiftyone.core.labels.Label field to new values for each sample in the collection.
Method match Filters the samples in the collection by the given filter.
Method match_frames Filters the frames in the video collection by the given filter.
Method match_labels Selects the samples from the collection that contain (or do not contain) at least one label that matches the specified criteria.
Method match_tags Returns a view containing the samples in the collection that have or don't have any/all of the given tag(s).
Method max Computes the maximum of a numeric field of the collection.
Method mean Computes the arithmetic mean of the field values of the collection.
Method merge_labels Merges the labels from the given input field into the given output field of the collection.
Method min Computes the minimum of a numeric field of the collection.
Method mongo Adds a view stage defined by a raw MongoDB aggregation pipeline.
Method quantiles Computes the quantile(s) of the field values of a collection.
Method register_run Registers a run under the given key on this collection.
Method rename_annotation_run Replaces the key for the given annotation run with a new key.
Method rename_brain_run Replaces the key for the given brain run with a new key.
Method rename_evaluation Replaces the key for the given evaluation with a new key.
Method rename_run Replaces the key for the given run with a new key.
Method save_context Returns a context that can be used to save samples from this collection according to a configurable batching strategy.
Method save_run_results Saves run results for the run with the given key.
Method schema Extracts the names and types of the attributes of a specified embedded document field across all samples in the collection.
Method select Selects the samples with the given IDs from the collection.
Method select_by Selects the samples with the given field values from the collection.
Method select_fields Selects only the fields with the given names from the samples in the collection. All other fields are excluded.
Method select_frames Selects the frames with the given IDs from the video collection.
Method select_group_slices Selects the samples in the group collection from the given slice(s).
Method select_groups Selects the groups with the given IDs from the grouped collection.
Method select_labels Selects only the specified labels from the collection.
Method set_field Sets a field or embedded field on each sample in a collection by evaluating the given expression.
Method set_label_values Sets the fields of the specified labels in the collection to the given values.
Method set_values Sets the field or embedded field on each sample or frame in the collection to the given values.
Method shuffle Randomly shuffles the samples in the collection.
Method skip Omits the given number of samples from the head of the collection.
Method sort_by Sorts the samples in the collection by the given field(s) or expression(s).
Method sort_by_similarity Sorts the collection by similarity to a specified query.
Method split_labels Splits the labels from the given input field into the given output field of the collection.
Method std Computes the standard deviation of the field values of the collection.
Method sum Computes the sum of the field values of the collection.
Method sync_last_modified_at Syncs the last_modified_at properties of the dataset.
Method tag_labels Adds the tag(s) to all labels in the specified label field(s) of this collection, if necessary.
Method tag_samples Adds the tag(s) to all samples in this collection, if necessary.
Method take Randomly samples the given number of samples from the collection.
Method to_clips Creates a view that contains one sample per clip defined by the given field or expression in the video collection.
Method to_dict Returns a JSON dictionary representation of the collection.
Method to_evaluation_patches Creates a view based on the results of the evaluation with the given key that contains one sample for each true positive, false positive, and false negative example in the collection, respectively.
Method to_frames Creates a view that contains one sample per frame in the video collection.
Method to_json Returns a JSON string representation of the collection.
Method to_patches Creates a view that contains one sample per object patch in the specified field of the collection.
Method to_trajectories Creates a view that contains one clip for each unique object trajectory defined by their (label, index) in a frame-level field of a video collection.
Method untag_labels Removes the tag from all labels in the specified label field(s) of this collection, if necessary.
Method untag_samples Removes the tag(s) from all samples in this collection, if necessary.
Method update_run_config Updates the run config for the run with the given key.
Method validate_field_type Validates that the collection has a field of the given type.
Method validate_fields_exist Validates that the collection has field(s) with the given name(s).
Method values Extracts the values of a field from all samples in the collection.
Method write_json Writes the collection to disk in JSON format.
Property has_annotation_runs Whether this collection has any annotation runs.
Property has_brain_runs Whether this collection has any brain runs.
Property has_evaluations Whether this collection has any evaluation results.
Property has_runs Whether this collection has any runs.
Async Method _async_aggregate Undocumented
Method _build_aggregation Undocumented
Method _build_batch_pipeline Undocumented
Method _build_big_pipeline Undocumented
Method _build_facets Undocumented
Method _contains_media_type Undocumented
Method _contains_videos Undocumented
Method _do_get_dynamic_field_schema Undocumented
Method _edit_label_tags Undocumented
Method _edit_sample_tags Undocumented
Method _expand_schema_from_values Undocumented
Method _get_db_fields_map Undocumented
Method _get_default_field Undocumented
Method _get_default_frame_fields Undocumented
Method _get_default_indexes Undocumented
Method _get_default_sample_fields Undocumented
Method _get_dynamic_field_schema Undocumented
Method _get_extremum Undocumented
Method _get_frame_label_field_schema Undocumented
Method _get_frames_bytes Computes the total size of the frame documents in the collection.
Method _get_geo_location_field Undocumented
Method _get_group_media_types Undocumented
Method _get_group_slices Undocumented
Method _get_label_attributes_schema Undocumented
Method _get_label_field_path Undocumented
Method _get_label_field_root Undocumented
Method _get_label_field_schema Undocumented
Method _get_label_field_type Undocumented
Method _get_label_fields Undocumented
Method _get_label_ids Undocumented
Method _get_media_fields Undocumented
Method _get_per_frame_bytes Returns a dictionary mapping frame IDs to document sizes (in bytes) for each frame in the video collection.
Method _get_per_sample_bytes Returns a dictionary mapping sample IDs to document sizes (in bytes) for each sample in the collection.
Method _get_per_sample_frames_bytes Returns a dictionary mapping sample IDs to total frame document sizes (in bytes) for each sample in the video collection.
Method _get_root_field_type Undocumented
Method _get_root_fields Undocumented
Method _get_samples_bytes Computes the total size of the sample documents in the collection.
Method _get_selected_labels Undocumented
Method _get_store Undocumented
Method _get_values_by_id Undocumented
Method _handle_db_field Undocumented
Method _handle_db_fields Undocumented
Method _handle_frame_field Undocumented
Method _handle_group_field Undocumented
Method _handle_id_fields Undocumented
Method _has_field Undocumented
Method _has_frame_fields Undocumented
Method _has_stores Undocumented
Method _is_default_field Undocumented
Method _is_frame_field Undocumented
Method _is_full_collection Undocumented
Method _is_group_field Undocumented
Method _is_label_field Undocumented
Method _is_read_only_field Undocumented
Method _list_stores Undocumented
Method _make_and_aggregate Undocumented
Method _make_set_field_pipeline Undocumented
Method _max Undocumented
Method _min Undocumented
Method _parse_aggregations Undocumented
Method _parse_big_result Undocumented
Method _parse_default_mask_targets Undocumented
Method _parse_default_skeleton Undocumented
Method _parse_faceted_result Undocumented
Method _parse_field Undocumented
Method _parse_field_name Undocumented
Method _parse_frame_labels_field Undocumented
Method _parse_label_field Undocumented
Method _parse_mask_targets Undocumented
Method _parse_media_field Undocumented
Method _parse_skeletons Undocumented
Method _process_aggregations Undocumented
Method _serialize_default_mask_targets Undocumented
Method _serialize_default_skeleton Undocumented
Method _serialize_field_schema Undocumented
Method _serialize_frame_field_schema Undocumented
Method _serialize_mask_targets Undocumented
Method _serialize_schema Undocumented
Method _serialize_skeletons Undocumented
Method _set_doc_values Undocumented
Method _set_frame_values Undocumented
Method _set_label_list_values Undocumented
Method _set_labels Undocumented
Method _set_list_values_by_id Undocumented
Method _set_sample_values Undocumented
Method _set_values Undocumented
Method _split_frame_fields Undocumented
Method _sync_dataset_last_modified_at Undocumented
Method _sync_samples_last_modified_at Undocumented
Method _tag_labels Undocumented
Method _to_fields_str Undocumented
Method _untag_labels Undocumented
Method _unwind_values Undocumented
Method _validate_root_field Undocumented
Constant _FRAMES_PREFIX Undocumented
Constant _GROUPS_PREFIX Undocumented
Property _element_str Undocumented
Property _elements_str Undocumented
@classmethod
def from_archive(cls, archive_path, dataset_type=None, data_path=None, labels_path=None, name=None, persistent=False, overwrite=False, label_field=None, tags=None, dynamic=False, cleanup=True, progress=None, **kwargs): (source)

Creates a Dataset from the contents of the given archive.

If a directory with the same root name as archive_path exists, it is assumed that this directory contains the extracted contents of the archive, and thus the archive is not re-extracted.

See :ref:`this guide <loading-datasets-from-disk>` for example usages of this method and descriptions of the available dataset types.

Note

The following archive formats are explicitly supported:

.zip, .tar, .tar.gz, .tgz, .tar.bz, .tbz

If an archive not in the above list is found, extraction will be attempted via the patool package, which supports many formats but may require that additional system packages be installed.

Parameters
archive_path: the path to an archive of a dataset directory
dataset_type (None): the fiftyone.types.Dataset type of the dataset in archive_path
data_path (None): an optional parameter that enables explicit control over the location of the media for certain dataset types. Can be any of the following:

  • a folder name like "data" or "data/" specifying a subfolder of archive_path in which the media lies
  • an absolute directory path in which the media lies. In this case, archive_path has no effect on the location of the data
  • a filename like "data.json" specifying the filename of a JSON manifest file in archive_path that maps UUIDs to media filepaths. Files of this format are generated when passing the export_media="manifest" option to fiftyone.core.collections.SampleCollection.export
  • an absolute filepath to a JSON manifest file. In this case, archive_path has no effect on the location of the data
  • a dict mapping filenames to absolute filepaths

By default, it is assumed that the data can be located in the default location within archive_path for the dataset type

labels_path (None): an optional parameter that enables explicit control over the location of the labels. Only applicable when importing certain labeled dataset formats. Can be any of the following:

  • a type-specific folder name like "labels" or "labels/" or a filename like "labels.json" or "labels.xml" specifying the location in archive_path of the labels file(s)
  • an absolute directory or filepath containing the labels file(s). In this case, archive_path has no effect on the location of the labels

For labeled datasets, this parameter defaults to the location in archive_path of the labels for the default layout of the dataset type being imported

name (None): a name for the dataset. By default, get_default_dataset_name is used
persistent (False): whether the dataset should persist in the database after the session terminates
overwrite (False): whether to overwrite an existing dataset of the same name
label_field (None): controls the field(s) in which imported labels are stored. Only applicable if the dataset importer is a fiftyone.utils.data.importers.LabeledImageDatasetImporter or fiftyone.utils.data.importers.LabeledVideoDatasetImporter. If the importer produces a single fiftyone.core.labels.Label instance per sample/frame, this argument specifies the name of the field to use; the default is "ground_truth". If the importer produces a dictionary of labels per sample, this argument can be either a string prefix to prepend to each label key or a dict mapping label keys to field names; the default in this case is to directly use the keys of the imported label dictionaries as field names
tags (None): an optional tag or iterable of tags to attach to each sample
dynamic (False): whether to declare dynamic attributes of embedded document fields that are encountered
cleanup (True): whether to delete the archive after extracting it
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
**kwargs: optional keyword arguments to pass to the constructor of the fiftyone.utils.data.importers.DatasetImporter for the specified dataset_type
Returns
a Dataset
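The "extract only if needed" behavior described above (a directory with the archive's root name suppresses re-extraction) can be sketched with the standard library. This is an illustrative simplification, not FiftyOne's actual extraction code; the helper name is hypothetical.

```python
import os
import shutil
import tempfile
import zipfile

def ensure_extracted(archive_path):
    """Extract archive_path unless a directory with its root name exists."""
    root = archive_path
    # strip the archive suffix, checking compound extensions first
    for ext in (".tar.gz", ".tar.bz", ".zip", ".tgz", ".tbz", ".tar"):
        if root.endswith(ext):
            root = root[: -len(ext)]
            break
    if not os.path.isdir(root):
        shutil.unpack_archive(archive_path, root)
    return root

# demo: build a tiny zip, then "extract" it twice; the second call
# finds the existing directory and skips extraction
tmp = tempfile.mkdtemp()
staging = os.path.join(tmp, "staging")
os.makedirs(staging)
with open(os.path.join(staging, "sample.txt"), "w") as f:
    f.write("hello")
archive = os.path.join(tmp, "dataset.zip")
with zipfile.ZipFile(archive, "w") as z:
    z.write(os.path.join(staging, "sample.txt"), "sample.txt")

out = ensure_extracted(archive)    # extracts to .../dataset
again = ensure_extracted(archive)  # no-op: .../dataset already exists
```

Note that shutil.unpack_archive handles the common formats listed in the note above; exotic formats are where a tool like patool becomes necessary.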
@classmethod
def from_dict(cls, d, name=None, persistent=False, overwrite=False, rel_dir=None, frame_labels_dir=None, progress=None): (source)

Loads a Dataset from a JSON dictionary generated by fiftyone.core.collections.SampleCollection.to_dict.

The JSON dictionary can contain an export of any fiftyone.core.collections.SampleCollection, e.g., Dataset or fiftyone.core.view.DatasetView.

Parameters
d: a JSON dictionary
name (None): a name for the new dataset
persistent (False): whether the dataset should persist in the database after the session terminates
overwrite (False): whether to overwrite an existing dataset of the same name
rel_dir (None): a relative directory to prepend to the filepath of each sample if the filepath is not absolute (begins with a path separator). The path is converted to an absolute path (if necessary) via fiftyone.core.storage.normalize_path
frame_labels_dir (None): a directory of per-sample JSON files containing the frame labels for video samples. If omitted, it is assumed that the frame labels are included directly in the provided JSON dict. Only applicable to datasets that contain videos
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
Returns
a Dataset
@classmethod
def from_dir(cls, dataset_dir=None, dataset_type=None, data_path=None, labels_path=None, name=None, persistent=False, overwrite=False, label_field=None, tags=None, dynamic=False, progress=None, **kwargs): (source)

Creates a Dataset from the contents of the given directory.

You can create datasets with this method via the following basic patterns:

  1. Provide dataset_dir and dataset_type to import the contents of a directory that is organized in the default layout for the dataset type as documented in :ref:`this guide <loading-datasets-from-disk>`
  2. Provide dataset_type along with data_path, labels_path, or other type-specific parameters to perform a customized import. This syntax provides the flexibility to, for example, perform labels-only imports or imports where the source media lies in a different location than the labels

In either workflow, the remaining parameters of this method can be provided to further configure the import.

See :ref:`this guide <loading-datasets-from-disk>` for example usages of this method and descriptions of the available dataset types.

Parameters
dataset_dir (None): the dataset directory. This can be omitted if you provide arguments such as data_path and labels_path
dataset_type (None): the fiftyone.types.Dataset type of the dataset
data_path (None): an optional parameter that enables explicit control over the location of the media for certain dataset types. Can be any of the following:

  • a folder name like "data" or "data/" specifying a subfolder of dataset_dir in which the media lies
  • an absolute directory path in which the media lies. In this case, dataset_dir has no effect on the location of the data
  • a filename like "data.json" specifying the filename of a JSON manifest file in dataset_dir that maps UUIDs to media filepaths. Files of this format are generated when passing the export_media="manifest" option to fiftyone.core.collections.SampleCollection.export
  • an absolute filepath to a JSON manifest file. In this case, dataset_dir has no effect on the location of the data
  • a dict mapping filenames to absolute filepaths

By default, it is assumed that the data can be located in the default location within dataset_dir for the dataset type

labels_path (None): an optional parameter that enables explicit control over the location of the labels. Only applicable when importing certain labeled dataset formats. Can be any of the following:

  • a type-specific folder name like "labels" or "labels/" or a filename like "labels.json" or "labels.xml" specifying the location in dataset_dir of the labels file(s)
  • an absolute directory or filepath containing the labels file(s). In this case, dataset_dir has no effect on the location of the labels

For labeled datasets, this parameter defaults to the location in dataset_dir of the labels for the default layout of the dataset type being imported

name (None): a name for the dataset. By default, get_default_dataset_name is used
persistent (False): whether the dataset should persist in the database after the session terminates
overwrite (False): whether to overwrite an existing dataset of the same name
label_field (None): controls the field(s) in which imported labels are stored. Only applicable if the dataset importer is a fiftyone.utils.data.importers.LabeledImageDatasetImporter or fiftyone.utils.data.importers.LabeledVideoDatasetImporter. If the importer produces a single fiftyone.core.labels.Label instance per sample/frame, this argument specifies the name of the field to use; the default is "ground_truth". If the importer produces a dictionary of labels per sample, this argument can be either a string prefix to prepend to each label key or a dict mapping label keys to field names; the default in this case is to directly use the keys of the imported label dictionaries as field names
tags (None): an optional tag or iterable of tags to attach to each sample
dynamic (False): whether to declare dynamic attributes of embedded document fields that are encountered
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
**kwargs: optional keyword arguments to pass to the constructor of the fiftyone.utils.data.importers.DatasetImporter for the specified dataset_type
Returns
a Dataset
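The "data.json" manifest accepted by data_path maps UUIDs to media filepaths. A minimal stdlib sketch of writing and reading such a manifest (here the UUID is simply the file stem, which is an illustrative assumption):

```python
import json
import os
import tempfile

tmp = tempfile.mkdtemp()

# media files that the manifest will point to
filepaths = [os.path.join(tmp, name) for name in ("0001.jpg", "0002.jpg")]
for fp in filepaths:
    open(fp, "w").close()

# a manifest mapping UUIDs to absolute media filepaths, as described above
manifest = {os.path.splitext(os.path.basename(fp))[0]: fp for fp in filepaths}
manifest_path = os.path.join(tmp, "data.json")
with open(manifest_path, "w") as f:
    json.dump(manifest, f, indent=4)

# a consumer can then resolve each UUID back to its media file
with open(manifest_path) as f:
    loaded = json.load(f)
```

In a labels-only import, a manifest like this (or a dict of filenames to filepaths) is what lets the labels live in dataset_dir while the media stays wherever it already is.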
@classmethod
def from_images(cls, paths_or_samples, sample_parser=None, name=None, persistent=False, overwrite=False, tags=None, progress=None): (source)

Creates a Dataset from the given images.

This operation does not read the images.

See :ref:`this guide <custom-sample-parser>` for more details about providing a custom UnlabeledImageSampleParser to load image samples into FiftyOne.

Parameters
paths_or_samples: an iterable of data. If no sample_parser is provided, this must be an iterable of image paths. If a sample_parser is provided, this can be an arbitrary iterable whose elements can be parsed by the sample parser
sample_parser (None): a fiftyone.utils.data.parsers.UnlabeledImageSampleParser instance to use to parse the samples
name (None): a name for the dataset. By default, get_default_dataset_name is used
persistent (False): whether the dataset should persist in the database after the session terminates
overwrite (False): whether to overwrite an existing dataset of the same name
tags (None): an optional tag or iterable of tags to attach to each sample
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
Returns
a Dataset
@classmethod
def from_images_dir(cls, images_dir, name=None, persistent=False, overwrite=False, tags=None, recursive=True, progress=None): (source)

Creates a Dataset from the given directory of images.

This operation does not read the images.

Parameters
images_dir: a directory of images
name (None): a name for the dataset. By default, get_default_dataset_name is used
persistent (False): whether the dataset should persist in the database after the session terminates
overwrite (False): whether to overwrite an existing dataset of the same name
tags (None): an optional tag or iterable of tags to attach to each sample
recursive (True): whether to recursively traverse subdirectories
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
Returns
a Dataset
@classmethod
def from_images_patt(cls, images_patt, name=None, persistent=False, overwrite=False, tags=None, progress=None): (source)

Creates a Dataset from the given glob pattern of images.

This operation does not read the images.

Parameters
images_patt: a glob pattern of images like /path/to/images/*.jpg
name (None): a name for the dataset. By default, get_default_dataset_name is used
persistent (False): whether the dataset should persist in the database after the session terminates
overwrite (False): whether to overwrite an existing dataset of the same name
tags (None): an optional tag or iterable of tags to attach to each sample
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
Returns
a Dataset
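Patterns like /path/to/images/*.jpg above are standard glob syntax. A quick stdlib sketch of what such a pattern matches:

```python
import glob
import os
import tempfile

tmp = tempfile.mkdtemp()
for name in ("a.jpg", "b.jpg", "notes.txt"):
    open(os.path.join(tmp, name), "w").close()

# only the .jpg files match; sorting makes the order deterministic,
# since glob itself does not guarantee one
images = sorted(glob.glob(os.path.join(tmp, "*.jpg")))
basenames = [os.path.basename(p) for p in images]
```

Each matched path becomes one sample's filepath; the image files themselves are not opened.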
@classmethod
def from_importer(cls, dataset_importer, name=None, persistent=False, overwrite=False, label_field=None, tags=None, dynamic=False, progress=None): (source)

Creates a Dataset by importing the samples in the given fiftyone.utils.data.importers.DatasetImporter.

See :ref:`this guide <custom-dataset-importer>` for more details about providing a custom DatasetImporter to import datasets into FiftyOne.

Parameters
dataset_importer: a fiftyone.utils.data.importers.DatasetImporter
name (None): a name for the dataset. By default, get_default_dataset_name is used
persistent (False): whether the dataset should persist in the database after the session terminates
overwrite (False): whether to overwrite an existing dataset of the same name
label_field (None): controls the field(s) in which imported labels are stored. Only applicable if dataset_importer is a fiftyone.utils.data.importers.LabeledImageDatasetImporter or fiftyone.utils.data.importers.LabeledVideoDatasetImporter. If the importer produces a single fiftyone.core.labels.Label instance per sample/frame, this argument specifies the name of the field to use; the default is "ground_truth". If the importer produces a dictionary of labels per sample, this argument can be either a string prefix to prepend to each label key or a dict mapping label keys to field names; the default in this case is to directly use the keys of the imported label dictionaries as field names
tags (None): an optional tag or iterable of tags to attach to each sample
dynamic (False): whether to declare dynamic attributes of embedded document fields that are encountered
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
Returns
a Dataset
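At its core, a DatasetImporter is an iterable that yields one sample's worth of information at a time. The plain-Python sketch below mimics that iteration pattern without depending on FiftyOne; the class name and the (filepath, metadata, label) triple are illustrative assumptions, not the real base-class contract documented in the guide linked above.

```python
class TinyImporter:
    """Iterates over (filepath, metadata, label) triples, importer-style."""

    def __init__(self, records):
        self._records = records
        self._iter = None

    def __iter__(self):
        self._iter = iter(self._records)
        return self

    def __next__(self):
        # raises StopIteration when the records are exhausted,
        # which is how consumers know the import is complete
        filepath, label = next(self._iter)
        metadata = None  # a real importer might compute media metadata here
        return filepath, metadata, label

records = [("/tmp/img1.jpg", "cat"), ("/tmp/img2.jpg", "dog")]
labels = [label for _, _, label in TinyImporter(records)]
```

from_importer drives an iteration like this, storing each yielded label under label_field.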
@classmethod
def from_json(cls, path_or_str, name=None, persistent=False, overwrite=False, rel_dir=None, frame_labels_dir=None, progress=None): (source)

Loads a Dataset from JSON generated by fiftyone.core.collections.SampleCollection.write_json or fiftyone.core.collections.SampleCollection.to_json.

The JSON file can contain an export of any fiftyone.core.collections.SampleCollection, e.g., Dataset or fiftyone.core.view.DatasetView.

Parameters
path_or_str: the path to a JSON file on disk or a JSON string
name (None): a name for the new dataset
persistent (False): whether the dataset should persist in the database after the session terminates
overwrite (False): whether to overwrite an existing dataset of the same name
rel_dir (None): a relative directory to prepend to the filepath of each sample if the filepath is not absolute (begins with a path separator). The path is converted to an absolute path (if necessary) via fiftyone.core.storage.normalize_path
frame_labels_dir (None): a directory of per-sample JSON files containing the frame labels for video samples. If omitted, it is assumed that the frame labels are included directly in the provided JSON. Only applicable to datasets that contain videos
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
Returns
a Dataset
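The rel_dir behavior (prepend a directory only when a sample's filepath is not absolute) can be sketched with os.path. fiftyone.core.storage.normalize_path presumably handles more cases, so treat this helper as a simplification with a hypothetical name:

```python
import os

def resolve_filepath(filepath, rel_dir=None):
    # absolute paths are kept as-is; relative paths get rel_dir prepended,
    # and the result is normalized to an absolute path either way
    if not os.path.isabs(filepath) and rel_dir is not None:
        filepath = os.path.join(rel_dir, filepath)
    return os.path.abspath(filepath)

abs_in = resolve_filepath("/data/images/001.jpg", rel_dir="/datasets/demo")
rel_in = resolve_filepath("images/001.jpg", rel_dir="/datasets/demo")
# on POSIX: abs_in is unchanged, while rel_in gains the rel_dir prefix
```

This is what lets an exported JSON store portable relative filepaths that are re-anchored at load time.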
@classmethod
def from_labeled_images(cls, samples, sample_parser, name=None, persistent=False, overwrite=False, label_field=None, tags=None, dynamic=False, progress=None): (source)

Creates a Dataset from the given labeled images.

This operation will iterate over all provided samples, but the images will not be read.

See :ref:`this guide <custom-sample-parser>` for more details about providing a custom LabeledImageSampleParser to load labeled image samples into FiftyOne.

Parameters
samples: an iterable of data
sample_parser: a fiftyone.utils.data.parsers.LabeledImageSampleParser instance to use to parse the samples
name (None): a name for the dataset. By default, get_default_dataset_name is used
persistent (False): whether the dataset should persist in the database after the session terminates
overwrite (False): whether to overwrite an existing dataset of the same name
label_field (None): controls the field(s) in which imported labels are stored. If the parser produces a single fiftyone.core.labels.Label instance per sample, this argument specifies the name of the field to use; the default is "ground_truth". If the parser produces a dictionary of labels per sample, this argument can be either a string prefix to prepend to each label key or a dict mapping label keys to field names; the default in this case is to directly use the keys of the imported label dictionaries as field names
tags (None): an optional tag or iterable of tags to attach to each sample
dynamic (False): whether to declare dynamic attributes of embedded document fields that are encountered
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
Returns
a Dataset
@classmethod
def from_labeled_videos(cls, samples, sample_parser, name=None, persistent=False, overwrite=False, label_field=None, tags=None, dynamic=False, progress=None): (source)

Creates a Dataset from the given labeled videos.

This operation will iterate over all provided samples, but the videos will not be read/decoded/etc.

See :ref:`this guide <custom-sample-parser>` for more details about providing a custom LabeledVideoSampleParser to load labeled video samples into FiftyOne.

Parameters
samples: an iterable of data
sample_parser: a fiftyone.utils.data.parsers.LabeledVideoSampleParser instance to use to parse the samples
name (None): a name for the dataset. By default, get_default_dataset_name is used
persistent (False): whether the dataset should persist in the database after the session terminates
overwrite (False): whether to overwrite an existing dataset of the same name
label_field (None): controls the field(s) in which imported labels are stored. If the parser produces a single fiftyone.core.labels.Label instance per sample/frame, this argument specifies the name of the field to use; the default is "ground_truth". If the parser produces a dictionary of labels per sample/frame, this argument can be either a string prefix to prepend to each label key or a dict mapping label keys to field names; the default in this case is to directly use the keys of the imported label dictionaries as field names
tags (None): an optional tag or iterable of tags to attach to each sample
dynamic (False): whether to declare dynamic attributes of embedded document fields that are encountered
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
Returns
a Dataset
@classmethod
def from_videos(cls, paths_or_samples, sample_parser=None, name=None, persistent=False, overwrite=False, tags=None, progress=None): (source)

Creates a Dataset from the given videos.

This operation does not read/decode the videos.

See :ref:`this guide <custom-sample-parser>` for more details about providing a custom UnlabeledVideoSampleParser to load video samples into FiftyOne.

Parameters
paths_or_samples: an iterable of data. If no sample_parser is provided, this must be an iterable of video paths. If a sample_parser is provided, this can be an arbitrary iterable whose elements can be parsed by the sample parser
sample_parser (None): a fiftyone.utils.data.parsers.UnlabeledVideoSampleParser instance to use to parse the samples
name (None): a name for the dataset. By default, get_default_dataset_name is used
persistent (False): whether the dataset should persist in the database after the session terminates
overwrite (False): whether to overwrite an existing dataset of the same name
tags (None): an optional tag or iterable of tags to attach to each sample
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
Returns
a Dataset
@classmethod
def from_videos_dir(cls, videos_dir, name=None, persistent=False, overwrite=False, tags=None, recursive=True, progress=None): (source)

Creates a Dataset from the given directory of videos.

This operation does not read/decode the videos.

Parameters
videos_dir: a directory of videos
name (None): a name for the dataset. By default, get_default_dataset_name is used
persistent (False): whether the dataset should persist in the database after the session terminates
overwrite (False): whether to overwrite an existing dataset of the same name
tags (None): an optional tag or iterable of tags to attach to each sample
recursive (True): whether to recursively traverse subdirectories
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
Returns
a Dataset
@classmethod
def from_videos_patt(cls, videos_patt, name=None, persistent=False, overwrite=False, tags=None, progress=None): (source)

Creates a Dataset from the given glob pattern of videos.

This operation does not read/decode the videos.

Parameters
videos_patt: a glob pattern of videos like /path/to/videos/*.mp4
name (None): a name for the dataset. By default, get_default_dataset_name is used
persistent (False): whether the dataset should persist in the database after the session terminates
overwrite (False): whether to overwrite an existing dataset of the same name
tags (None): an optional tag or iterable of tags to attach to each sample
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
Returns
a Dataset
def __copy__(self): (source)

Undocumented

def __deepcopy__(self, memo): (source)

Undocumented

def __delitem__(self, samples_or_ids): (source)

Undocumented

def __eq__(self, other): (source)

Undocumented

def __getattribute__(self, name): (source)

Undocumented

def __getitem__(self, id_filepath_slice): (source)
def __init__(self, name=None, persistent=False, overwrite=False, _create=True, _virtual=False, **kwargs): (source)

Undocumented

def add_archive(self, archive_path, dataset_type=None, data_path=None, labels_path=None, label_field=None, tags=None, expand_schema=True, dynamic=False, add_info=True, cleanup=True, progress=None, **kwargs): (source)

Adds the contents of the given archive to the dataset.

If a directory with the same root name as archive_path exists, it is assumed that this directory contains the extracted contents of the archive, and thus the archive is not re-extracted.

See :ref:`this guide <loading-datasets-from-disk>` for example usages of this method and descriptions of the available dataset types.

Note

The following archive formats are explicitly supported:

.zip, .tar, .tar.gz, .tgz, .tar.bz, .tbz

If an archive not in the above list is found, extraction will be attempted via the patool package, which supports many formats but may require that additional system packages be installed.

Parameters
archive_path: the path to an archive of a dataset directory
dataset_type (None): the fiftyone.types.Dataset type of the dataset in archive_path
data_path (None): an optional parameter that enables explicit control over the location of the media for certain dataset types. Can be any of the following:

  • a folder name like "data" or "data/" specifying a subfolder of archive_path in which the media lies
  • an absolute directory path in which the media lies. In this case, archive_path has no effect on the location of the data
  • a filename like "data.json" specifying the filename of a JSON manifest file in archive_path that maps UUIDs to media filepaths. Files of this format are generated when passing the export_media="manifest" option to fiftyone.core.collections.SampleCollection.export
  • an absolute filepath to a JSON manifest file. In this case, archive_path has no effect on the location of the data
  • a dict mapping filenames to absolute filepaths

By default, it is assumed that the data can be located in the default location within archive_path for the dataset type

labels_path (None): an optional parameter that enables explicit control over the location of the labels. Only applicable when importing certain labeled dataset formats. Can be any of the following:

  • a type-specific folder name like "labels" or "labels/" or a filename like "labels.json" or "labels.xml" specifying the location in archive_path of the labels file(s)
  • an absolute directory or filepath containing the labels file(s). In this case, archive_path has no effect on the location of the labels

For labeled datasets, this parameter defaults to the location in archive_path of the labels for the default layout of the dataset type being imported

label_field (None): controls the field(s) in which imported labels are stored. Only applicable if the dataset importer is a fiftyone.utils.data.importers.LabeledImageDatasetImporter or fiftyone.utils.data.importers.LabeledVideoDatasetImporter. If the importer produces a single fiftyone.core.labels.Label instance per sample/frame, this argument specifies the name of the field to use; the default is "ground_truth". If the importer produces a dictionary of labels per sample, this argument can be either a string prefix to prepend to each label key or a dict mapping label keys to field names; the default in this case is to directly use the keys of the imported label dictionaries as field names
tags (None): an optional tag or iterable of tags to attach to each sample
expand_schema (True): whether to dynamically add new sample fields encountered to the dataset schema. If False, an error is raised if a sample's schema is not a subset of the dataset schema
dynamic (False): whether to declare dynamic attributes of embedded document fields that are encountered
add_info (True): whether to add dataset info from the importer (if any) to the dataset's info
cleanup (True): whether to delete the archive after extracting it
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
**kwargs: optional keyword arguments to pass to the constructor of the fiftyone.utils.data.importers.DatasetImporter for the specified dataset_type
Returns
a list of IDs of the samples that were added to the dataset
def add_collection(self, sample_collection, include_info=True, overwrite_info=False, new_ids=False, progress=None): (source)

Adds the contents of the given collection to the dataset.

This method is a special case of Dataset.merge_samples that adds samples with new IDs to this dataset and omits any samples with existing IDs (the latter would only happen in rare cases).

Use Dataset.merge_samples if you have multiple datasets whose samples refer to the same source media.

Parameters
sample_collectiona fiftyone.core.collections.SampleCollection
include_info:Truewhether to merge dataset-level information such as info and classes
overwrite_info:Falsewhether to overwrite existing dataset-level information. Only applicable when include_info is True
new_ids:Falsewhether to generate new sample/frame/group IDs. By default, the IDs of the input collection are retained
progress:Nonewhether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
Returns
a list of IDs of the samples that were added to this dataset
def add_dir(self, dataset_dir=None, dataset_type=None, data_path=None, labels_path=None, label_field=None, tags=None, expand_schema=True, dynamic=False, add_info=True, progress=None, **kwargs): (source)

Adds the contents of the given directory to the dataset.

You can perform imports with this method via the following basic patterns:

  1. Provide dataset_dir and dataset_type to import the contents of a directory that is organized in the default layout for the dataset type as documented in :ref:`this guide <loading-datasets-from-disk>`
  2. Provide dataset_type along with data_path, labels_path, or other type-specific parameters to perform a customized import. This syntax provides the flexibility to, for example, perform labels-only imports or imports where the source media lies in a different location than the labels

In either workflow, the remaining parameters of this method can be provided to further configure the import.

See :ref:`this guide <loading-datasets-from-disk>` for example usages of this method and descriptions of the available dataset types.

Parameters
dataset_dir:Nonethe dataset directory. This can be omitted for certain dataset formats if you provide arguments such as data_path and labels_path
dataset_type:Nonethe fiftyone.types.Dataset type of the dataset
data_path:None

an optional parameter that enables explicit control over the location of the media for certain dataset types. Can be any of the following:

  • a folder name like "data" or "data/" specifying a subfolder of dataset_dir in which the media lies
  • an absolute directory path in which the media lies. In this case, the dataset_dir has no effect on the location of the data
  • a filename like "data.json" specifying the filename of a JSON manifest file in dataset_dir that maps UUIDs to media filepaths. Files of this format are generated when passing the export_media="manifest" option to fiftyone.core.collections.SampleCollection.export
  • an absolute filepath to a JSON manifest file. In this case, dataset_dir has no effect on the location of the data
  • a dict mapping filenames to absolute filepaths

By default, it is assumed that the data can be located in the default location within dataset_dir for the dataset type

labels_path:None

an optional parameter that enables explicit control over the location of the labels. Only applicable when importing certain labeled dataset formats. Can be any of the following:

  • a type-specific folder name like "labels" or "labels/" or a filename like "labels.json" or "labels.xml" specifying the location in dataset_dir of the labels file(s)
  • an absolute directory or filepath containing the labels file(s). In this case, dataset_dir has no effect on the location of the labels

For labeled datasets, this parameter defaults to the location in dataset_dir of the labels for the default layout of the dataset type being imported

label_field:Nonecontrols the field(s) in which imported labels are stored. Only applicable if dataset_importer is a fiftyone.utils.data.importers.LabeledImageDatasetImporter or fiftyone.utils.data.importers.LabeledVideoDatasetImporter. If the importer produces a single fiftyone.core.labels.Label instance per sample/frame, this argument specifies the name of the field to use; the default is "ground_truth". If the importer produces a dictionary of labels per sample, this argument can be either a string prefix to prepend to each label key or a dict mapping label keys to field names; the default in this case is to directly use the keys of the imported label dictionaries as field names
tags:Nonean optional tag or iterable of tags to attach to each sample
expand_schema:Truewhether to dynamically add new sample fields encountered to the dataset schema. If False, an error is raised if a sample's schema is not a subset of the dataset schema
dynamic:Falsewhether to declare dynamic attributes of embedded document fields that are encountered
add_info:Truewhether to add dataset info from the importer (if any) to the dataset's info
progress:Nonewhether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
**kwargsoptional keyword arguments to pass to the constructor of the fiftyone.utils.data.importers.DatasetImporter for the specified dataset_type
Returns
a list of IDs of the samples that were added to the dataset
def add_dynamic_frame_fields(self, fields=None, recursive=True, add_mixed=False): (source)

Adds all dynamic frame fields to the dataset's schema.

Dynamic fields are embedded document fields with at least one non-None value that have not been declared on the dataset's schema.

Parameters
fields:Nonean optional field or iterable of fields for which to add dynamic fields. By default, all fields are considered
recursive:Truewhether to recursively inspect nested lists and embedded documents for dynamic fields
add_mixed:Falsewhether to declare fields that contain values of mixed types as generic fiftyone.core.fields.Field instances (True) or to skip such fields (False)
def add_dynamic_sample_fields(self, fields=None, recursive=True, add_mixed=False): (source)

Adds all dynamic sample fields to the dataset's schema.

Dynamic fields are embedded document fields with at least one non-None value that have not been declared on the dataset's schema.

Parameters
fields:Nonean optional field or iterable of fields for which to add dynamic fields. By default, all fields are considered
recursive:Truewhether to recursively inspect nested lists and embedded documents for dynamic fields
add_mixed:Falsewhether to declare fields that contain values of mixed types as generic fiftyone.core.fields.Field instances (True) or to skip such fields (False)
def add_frame_field(self, field_name, ftype, embedded_doc_type=None, subfield=None, fields=None, description=None, info=None, read_only=False, **kwargs): (source)

Adds a new frame-level field or embedded field to the dataset, if necessary.

Only applicable to datasets that contain videos.

Parameters
field_namethe field name or embedded.field.name
ftypethe field type to create. Must be a subclass of fiftyone.core.fields.Field
embedded_doc_type:Nonethe fiftyone.core.odm.BaseEmbeddedDocument type of the field. Only applicable when ftype is fiftyone.core.fields.EmbeddedDocumentField
subfield:Nonethe fiftyone.core.fields.Field type of the contained field. Only applicable when ftype is fiftyone.core.fields.ListField or fiftyone.core.fields.DictField
fields:Nonea list of fiftyone.core.fields.Field instances defining embedded document attributes. Only applicable when ftype is fiftyone.core.fields.EmbeddedDocumentField
description:Nonean optional description
info:Nonean optional info dict
read_only:Falsewhether the field should be read-only
**kwargsUndocumented
Raises
ValueErrorif a field of the same name already exists and it is not compliant with the specified values
def add_group_field(self, field_name, default=None, description=None, info=None, read_only=False): (source)

Adds a group field to the dataset, if necessary.

Parameters
field_namethe field name
default:Nonea default group slice for the field
description:Nonean optional description
info:Nonean optional info dict
read_only:Falsewhether the field should be read-only
Raises
ValueErrorif a group field with another name already exists
def add_group_slice(self, name, media_type): (source)

Adds a group slice with the given media type to the dataset, if necessary.

Parameters
namea group slice name
media_typethe media type of the slice
def add_images(self, paths_or_samples, sample_parser=None, tags=None, progress=None): (source)

Adds the given images to the dataset.

This operation does not read the images.

See :ref:`this guide <custom-sample-parser>` for more details about adding images to a dataset by defining your own UnlabeledImageSampleParser.

Parameters
paths_or_samplesan iterable of data. If no sample_parser is provided, this must be an iterable of image paths. If a sample_parser is provided, this can be an arbitrary iterable whose elements can be parsed by the sample parser
sample_parser:Nonea fiftyone.utils.data.parsers.UnlabeledImageSampleParser instance to use to parse the samples
tags:Nonean optional tag or iterable of tags to attach to each sample
progress:Nonewhether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
Returns
a list of IDs of the samples that were added to the dataset
def add_images_dir(self, images_dir, tags=None, recursive=True, progress=None): (source)

Adds the given directory of images to the dataset.

See fiftyone.types.ImageDirectory for format details. In particular, note that files with non-image MIME types are omitted.

This operation does not read the images.

Parameters
images_dira directory of images
tags:Nonean optional tag or iterable of tags to attach to each sample
recursive:Truewhether to recursively traverse subdirectories
progress:Nonewhether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
Returns
a list of IDs of the samples in the dataset
def add_images_patt(self, images_patt, tags=None, progress=None): (source)

Adds the given glob pattern of images to the dataset.

This operation does not read the images.

Parameters
images_patta glob pattern of images like /path/to/images/*.jpg
tags:Nonean optional tag or iterable of tags to attach to each sample
progress:Nonewhether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
Returns
a list of IDs of the samples in the dataset
def add_importer(self, dataset_importer, label_field=None, tags=None, expand_schema=True, dynamic=False, add_info=True, progress=None): (source)

Adds the samples from the given fiftyone.utils.data.importers.DatasetImporter to the dataset.

See :ref:`this guide <custom-dataset-importer>` for more details about importing datasets in custom formats by defining your own DatasetImporter.

Parameters
dataset_importera fiftyone.utils.data.importers.DatasetImporter
label_field:Nonecontrols the field(s) in which imported labels are stored. Only applicable if dataset_importer is a fiftyone.utils.data.importers.LabeledImageDatasetImporter or fiftyone.utils.data.importers.LabeledVideoDatasetImporter. If the importer produces a single fiftyone.core.labels.Label instance per sample/frame, this argument specifies the name of the field to use; the default is "ground_truth". If the importer produces a dictionary of labels per sample, this argument can be either a string prefix to prepend to each label key or a dict mapping label keys to field names; the default in this case is to directly use the keys of the imported label dictionaries as field names
tags:Nonean optional tag or iterable of tags to attach to each sample
expand_schema:Truewhether to dynamically add new sample fields encountered to the dataset schema. If False, an error is raised if a sample's schema is not a subset of the dataset schema
dynamic:Falsewhether to declare dynamic attributes of embedded document fields that are encountered
add_info:Truewhether to add dataset info from the importer (if any) to the dataset's info
progress:Nonewhether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
Returns
a list of IDs of the samples that were added to the dataset
def add_labeled_images(self, samples, sample_parser, label_field=None, tags=None, expand_schema=True, dynamic=False, progress=None): (source)

Adds the given labeled images to the dataset.

This operation will iterate over all provided samples, but the images will not be read (unless the sample parser requires it in order to compute image metadata).

See :ref:`this guide <custom-sample-parser>` for more details about adding labeled images to a dataset by defining your own LabeledImageSampleParser.

Parameters
samplesan iterable of data
sample_parsera fiftyone.utils.data.parsers.LabeledImageSampleParser instance to use to parse the samples
label_field:Nonecontrols the field(s) in which imported labels are stored. If the parser produces a single fiftyone.core.labels.Label instance per sample, this argument specifies the name of the field to use; the default is "ground_truth". If the parser produces a dictionary of labels per sample, this argument can be either a string prefix to prepend to each label key or a dict mapping label keys to field names; the default in this case is to directly use the keys of the imported label dictionaries as field names
tags:Nonean optional tag or iterable of tags to attach to each sample
expand_schema:Truewhether to dynamically add new sample fields encountered to the dataset schema. If False, an error is raised if a sample's schema is not a subset of the dataset schema
dynamic:Falsewhether to declare dynamic attributes of embedded document fields that are encountered
progress:Nonewhether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
Returns
a list of IDs of the samples that were added to the dataset
def add_labeled_videos(self, samples, sample_parser, label_field=None, tags=None, expand_schema=True, dynamic=False, progress=None): (source)

Adds the given labeled videos to the dataset.

This operation will iterate over all provided samples, but the videos will not be read/decoded/etc.

See :ref:`this guide <custom-sample-parser>` for more details about adding labeled videos to a dataset by defining your own LabeledVideoSampleParser.

Parameters
samplesan iterable of data
sample_parsera fiftyone.utils.data.parsers.LabeledVideoSampleParser instance to use to parse the samples
label_field:"ground_truth"the name (or root name) of the frame field(s) to use for the labels
tags:Nonean optional tag or iterable of tags to attach to each sample
expand_schema:Truewhether to dynamically add new sample fields encountered to the dataset schema. If False, an error is raised if a sample's schema is not a subset of the dataset schema
dynamic:Falsewhether to declare dynamic attributes of embedded document fields that are encountered
progress:Nonewhether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
Returns
a list of IDs of the samples that were added to the dataset
def add_sample(self, sample, expand_schema=True, dynamic=False, validate=True): (source)

Adds the given sample to the dataset.

If the sample instance does not belong to a dataset, it is updated in-place to reflect its membership in this dataset. If the sample instance belongs to another dataset, it is not modified.

Parameters
samplea fiftyone.core.sample.Sample
expand_schema:Truewhether to dynamically add new sample fields encountered to the dataset schema. If False, an error is raised if the sample's schema is not a subset of the dataset schema
dynamic:Falsewhether to declare dynamic attributes of embedded document fields that are encountered
validate:Truewhether to validate that the fields of the sample are compliant with the dataset schema before adding it
Returns
the ID of the sample in the dataset
def add_sample_field(self, field_name, ftype, embedded_doc_type=None, subfield=None, fields=None, description=None, info=None, read_only=False, **kwargs): (source)

Adds a new sample field or embedded field to the dataset, if necessary.

Parameters
field_namethe field name or embedded.field.name
ftypethe field type to create. Must be a subclass of fiftyone.core.fields.Field
embedded_doc_type:Nonethe fiftyone.core.odm.BaseEmbeddedDocument type of the field. Only applicable when ftype is fiftyone.core.fields.EmbeddedDocumentField
subfield:Nonethe fiftyone.core.fields.Field type of the contained field. Only applicable when ftype is fiftyone.core.fields.ListField or fiftyone.core.fields.DictField
fields:Nonea list of fiftyone.core.fields.Field instances defining embedded document attributes. Only applicable when ftype is fiftyone.core.fields.EmbeddedDocumentField
description:Nonean optional description
info:Nonean optional info dict
read_only:Falsewhether the field should be read-only
**kwargsUndocumented
Raises
ValueErrorif a field of the same name already exists and it is not compliant with the specified values
def add_samples(self, samples, expand_schema=True, dynamic=False, validate=True, progress=None, num_samples=None): (source)

Adds the given samples to the dataset.

Any sample instances that do not belong to a dataset are updated in-place to reflect membership in this dataset. Any sample instances that belong to other datasets are not modified.

Parameters
samplesan iterable of fiftyone.core.sample.Sample instances or a fiftyone.core.collections.SampleCollection
expand_schema:Truewhether to dynamically add new sample fields encountered to the dataset schema. If False, an error is raised if a sample's schema is not a subset of the dataset schema
dynamic:Falsewhether to declare dynamic attributes of embedded document fields that are encountered
validate:Truewhether to validate that the fields of each sample are compliant with the dataset schema before adding it
progress:Nonewhether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
num_samples:Nonethe number of samples in samples. If not provided, this is computed (if possible) via len(samples) if needed for progress tracking
Returns
a list of IDs of the samples in the dataset
def add_videos(self, paths_or_samples, sample_parser=None, tags=None, progress=None): (source)

Adds the given videos to the dataset.

This operation does not read the videos.

See :ref:`this guide <custom-sample-parser>` for more details about adding videos to a dataset by defining your own UnlabeledVideoSampleParser.

Parameters
paths_or_samplesan iterable of data. If no sample_parser is provided, this must be an iterable of video paths. If a sample_parser is provided, this can be an arbitrary iterable whose elements can be parsed by the sample parser
sample_parser:Nonea fiftyone.utils.data.parsers.UnlabeledVideoSampleParser instance to use to parse the samples
tags:Nonean optional tag or iterable of tags to attach to each sample
progress:Nonewhether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
Returns
a list of IDs of the samples that were added to the dataset
def add_videos_dir(self, videos_dir, tags=None, recursive=True, progress=None): (source)

Adds the given directory of videos to the dataset.

See fiftyone.types.VideoDirectory for format details. In particular, note that files with non-video MIME types are omitted.

This operation does not read/decode the videos.

Parameters
videos_dira directory of videos
tags:Nonean optional tag or iterable of tags to attach to each sample
recursive:Truewhether to recursively traverse subdirectories
progress:Nonewhether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
Returns
a list of IDs of the samples in the dataset
def add_videos_patt(self, videos_patt, tags=None, progress=None): (source)

Adds the given glob pattern of videos to the dataset.

This operation does not read/decode the videos.

Parameters
videos_patta glob pattern of videos like /path/to/videos/*.mp4
tags:Nonean optional tag or iterable of tags to attach to each sample
progress:Nonewhether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
Returns
a list of IDs of the samples in the dataset
@app_config.setter
def app_config(self, config): (source)

Undocumented

def check_summary_fields(self): (source)

Returns a list of summary fields that may need to be updated.

Summary fields may need to be updated whenever there have been modifications to the dataset's samples since the summaries were last generated.

Note that inclusion in this list is only a heuristic, as any sample modifications may not have affected the summary's source field.

Returns
list of summary field names
@classes.setter
def classes(self, classes): (source)

Undocumented

def clear(self): (source)

Removes all samples from the dataset.

If a reference to a sample exists in memory, the sample will be updated such that sample.in_dataset is False.

def clear_cache(self): (source)

Clears the dataset's in-memory cache.

Dataset caches may contain sample/frame singletons and annotation/brain/evaluation/custom runs.

def clear_frame_field(self, field_name): (source)

Clears the values of the frame-level field from all samples in the dataset.

The field will remain in the dataset's frame schema, and all frames will have the value None for the field.

You can use dot notation (embedded.field.name) to clear embedded frame fields.

Only applicable to datasets that contain videos.

Parameters
field_namethe field name or embedded.field.name
def clear_frame_fields(self, field_names): (source)

Clears the values of the frame-level fields from all samples in the dataset.

The fields will remain in the dataset's frame schema, and all frames will have the value None for each field.

You can use dot notation (embedded.field.name) to clear embedded frame fields.

Only applicable to datasets that contain videos.

Parameters
field_namesthe field name or iterable of field names
def clear_frames(self): (source)

Removes all frame labels from the dataset.

If a reference to a frame exists in memory, the frame will be updated such that frame.in_dataset is False.

def clear_sample_field(self, field_name): (source)

Clears the values of the field from all samples in the dataset.

The field will remain in the dataset's schema, and all samples will have the value None for the field.

You can use dot notation (embedded.field.name) to clear embedded fields.

Parameters
field_namethe field name or embedded.field.name
def clear_sample_fields(self, field_names): (source)

Clears the values of the fields from all samples in the dataset.

The fields will remain in the dataset's schema, and all samples will have the value None for each field.

You can use dot notation (embedded.field.name) to clear embedded fields.

Parameters
field_namesthe field name or iterable of field names
def clone(self, name=None, persistent=False): (source)

Creates a copy of the dataset.

Dataset clones contain deep copies of all samples and dataset-level information in the source dataset. The source media files, however, are not copied.

Parameters
name:Nonea name for the cloned dataset. By default, get_default_dataset_name is used
persistent:Falsewhether the cloned dataset should be persistent
Returns
the new Dataset
def clone_frame_field(self, field_name, new_field_name): (source)

Clones the frame-level field into a new field.

You can use dot notation (embedded.field.name) to clone embedded frame fields.

Only applicable to datasets that contain videos.

Parameters
field_namethe field name or embedded.field.name
new_field_namethe new field name or embedded.field.name
def clone_frame_fields(self, field_mapping): (source)

Clones the frame-level fields into new fields.

You can use dot notation (embedded.field.name) to clone embedded frame fields.

Only applicable to datasets that contain videos.

Parameters
field_mappinga dict mapping field names to new field names into which to clone each field
def clone_sample_field(self, field_name, new_field_name): (source)

Clones the given sample field into a new field of the dataset.

You can use dot notation (embedded.field.name) to clone embedded fields.

Parameters
field_namethe field name or embedded.field.name
new_field_namethe new field name or embedded.field.name
def clone_sample_fields(self, field_mapping): (source)

Clones the given sample fields into new fields of the dataset.

You can use dot notation (embedded.field.name) to clone embedded fields.

Parameters
field_mappinga dict mapping field names to new field names into which to clone each field
def create_summary_field(self, path, field_name=None, sidebar_group=None, include_counts=False, group_by=None, read_only=True, create_index=True): (source)

Populates a sample-level field that records the unique values or numeric ranges that appear in the specified field on each sample in the dataset.

This method is particularly useful for summarizing frame-level fields of video datasets, in which case the sample-level field records the unique values or numeric ranges that appear in the specified frame-level field across all frames of that sample. This summary field can then be efficiently queried to retrieve samples that contain specific values of interest in at least one frame.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("quickstart-video")
dataset.set_field("frames.detections.detections.confidence", F.rand()).save()

# Generate a summary field for object labels
dataset.create_summary_field("frames.detections.detections.label")

# Generate a summary field for [min, max] confidences
dataset.create_summary_field("frames.detections.detections.confidence")

# Generate a summary field for object labels and counts
dataset.create_summary_field(
    "frames.detections.detections.label",
    field_name="frames_detections_label2",
    include_counts=True,
)

# Generate a summary field for per-label [min, max] confidences
dataset.create_summary_field(
    "frames.detections.detections.confidence",
    field_name="frames_detections_confidence2",
    group_by="label",
)

print(dataset.list_summary_fields())
Parameters
pathan input field path
field_name:Nonethe sample-level field in which to store the summary data. By default, a suitable name is derived from the given path
sidebar_group:Nonethe name of a :ref:`App sidebar group <app-sidebar-groups>` to which to add the summary field. By default, all summary fields are added to a "summaries" group. You can pass False to skip sidebar group modification
include_counts:Falsewhether to include per-value counts when summarizing categorical fields
group_by:Nonean optional attribute to group by when path is a numeric field to generate per-attribute [min, max] ranges. This may either be an absolute path or an attribute name that is interpreted relative to path
read_only:Truewhether to mark the summary field as read-only
create_index:Truewhether to create database index(es) for the summary field
Returns
the summary field name
@default_classes.setter
def default_classes(self, classes): (source)

Undocumented

@default_group_slice.setter
def default_group_slice(self, slice_name): (source)

Undocumented

@default_mask_targets.setter
def default_mask_targets(self, targets): (source)

Undocumented

@default_skeleton.setter
def default_skeleton(self, skeleton): (source)

Undocumented

def delete(self): (source)

Deletes the dataset.

Once deleted, only the name and deleted attributes of a dataset may be accessed.

If a reference to a sample exists in memory, the sample will be updated such that sample.in_dataset is False.

def delete_frame_field(self, field_name, error_level=0): (source)

Deletes the frame-level field from all samples in the dataset.

You can use dot notation (embedded.field.name) to delete embedded frame fields.

Only applicable to datasets that contain videos.

Parameters
field_namethe field name or embedded.field.name
error_level:0the error level to use. Valid values are:
- 0: raise an error if a top-level field cannot be deleted
- 1: log a warning if a top-level field cannot be deleted
- 2: ignore top-level fields that cannot be deleted
def delete_frame_fields(self, field_names, error_level=0): (source)

Deletes the frame-level fields from all samples in the dataset.

You can use dot notation (embedded.field.name) to delete embedded frame fields.

Only applicable to datasets that contain videos.

Parameters
field_namesa field name or iterable of field names
error_level:0the error level to use. Valid values are:
- 0: raise an error if a top-level field cannot be deleted
- 1: log a warning if a top-level field cannot be deleted
- 2: ignore top-level fields that cannot be deleted
def delete_frames(self, frames_or_ids): (source)

Deletes the given frame(s) from the dataset.

If a reference to a frame exists in memory, the frame will be updated such that frame.in_dataset is False.

Parameters
frames_or_ids

the frame(s) to delete. Can be any of the following:

def delete_group_slice(self, name): (source)

Deletes all samples in the given group slice from the dataset.

Parameters
namea group slice name
def delete_groups(self, groups_or_ids): (source)

Deletes the given group(s) from the dataset.

If a reference to a sample exists in memory, the sample will be updated such that sample.in_dataset is False.

Parameters
groups_or_ids

the group(s) to delete. Can be any of the following:

def delete_labels(self, labels=None, ids=None, tags=None, view=None, fields=None): (source)

Deletes the specified labels from the dataset.

You can specify the labels to delete via any of the following methods:

  • Provide the labels argument, which should contain a list of dicts in the format returned by fiftyone.core.session.Session.selected_labels
  • Provide the ids or tags arguments to specify the labels to delete via their IDs and/or tags
  • Provide the view argument to delete all of the labels in a view into this dataset. This syntax is useful if you have constructed a fiftyone.core.view.DatasetView defining the labels to delete

Additionally, you can specify the fields argument to restrict deletion to specific field(s), either for efficiency or to ensure that labels from other fields are not deleted if their contents are included in the other arguments.

Parameters
labels:Nonea list of dicts specifying the labels to delete in the format returned by fiftyone.core.session.Session.selected_labels
ids:Nonean ID or iterable of IDs of the labels to delete
tags:Nonea tag or iterable of tags of the labels to delete
view:Nonea fiftyone.core.view.DatasetView into this dataset containing the labels to delete
fields:Nonea field or iterable of fields from which to delete labels
def delete_sample_field(self, field_name, error_level=0): (source)

Deletes the field from all samples in the dataset.

You can use dot notation (embedded.field.name) to delete embedded fields.

Parameters
field_name: the field name or embedded.field.name
error_level (0): the error level to use. Valid values are:
- 0: raise an error if a top-level field cannot be deleted
- 1: log a warning if a top-level field cannot be deleted
- 2: ignore top-level fields that cannot be deleted
def delete_sample_fields(self, field_names, error_level=0): (source)

Deletes the fields from all samples in the dataset.

You can use dot notation (embedded.field.name) to delete embedded fields.

Parameters
field_names: the field name or iterable of field names
error_level (0): the error level to use. Valid values are:
- 0: raise an error if a top-level field cannot be deleted
- 1: log a warning if a top-level field cannot be deleted
- 2: ignore top-level fields that cannot be deleted
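Conceptually, dot notation resolves an embedded field path one level per dot, much like walking nested dictionary keys. The following is a plain-Python illustration of that resolution, not FiftyOne's internal implementation; the actual FiftyOne calls are shown as comments:

```python
def delete_nested(doc, path):
    """Delete the value at a dotted path from a nested dict."""
    *parents, leaf = path.split(".")
    for key in parents:
        doc = doc[key]  # descend one level per dot
    del doc[leaf]

sample = {"metadata": {"size_bytes": 1024, "mime_type": "image/png"}}
delete_nested(sample, "metadata.mime_type")
assert sample == {"metadata": {"size_bytes": 1024}}

# The FiftyOne equivalents (which require a dataset) would be, e.g.:
# dataset.delete_sample_field("metadata.mime_type")
# dataset.delete_sample_fields(["foo", "embedded.bar"], error_level=1)
```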
def delete_samples(self, samples_or_ids): (source)

Deletes the given sample(s) from the dataset.

If a reference to a sample exists in memory, the sample will be updated such that sample.in_dataset is False.

Parameters
samples_or_ids

the sample(s) to delete. Can be any of the following:

def delete_saved_view(self, name): (source)

Deletes the saved view with the given name.

Parameters
name: the name of a saved view
def delete_saved_views(self): (source)

Deletes all saved views from this dataset.

def delete_summary_field(self, field_name, error_level=0): (source)

Deletes the summary field from all samples in the dataset.

Parameters
field_name: the summary field
error_level (0): the error level to use. Valid values are:
- 0: raise an error if a summary field cannot be deleted
- 1: log a warning if a summary field cannot be deleted
- 2: ignore summary fields that cannot be deleted
def delete_summary_fields(self, field_names, error_level=0): (source)

Deletes the summary fields from all samples in the dataset.

Parameters
field_names: the summary field or iterable of summary fields
error_level (0): the error level to use. Valid values are:
- 0: raise an error if a summary field cannot be deleted
- 1: log a warning if a summary field cannot be deleted
- 2: ignore summary fields that cannot be deleted
def delete_workspace(self, name): (source)

Deletes the saved workspace with the given name.

Parameters
name: the name of a saved workspace
Raises
ValueError: if name is not a saved workspace
def delete_workspaces(self): (source)

Deletes all saved workspaces from this dataset.

def ensure_frames(self): (source)

Ensures that the video dataset contains frame instances for every frame of each sample's source video.

Empty frames will be inserted for missing frames, and already existing frames are left unchanged.

def first(self): (source)

Returns the first sample in the dataset.

Returns
a fiftyone.core.sample.Sample
def get_field_schema(self, ftype=None, embedded_doc_type=None, read_only=None, info_keys=None, created_after=None, include_private=False, flat=False, mode=None): (source)

Returns a schema dictionary describing the fields of the samples in the dataset.

Parameters
ftype (None): an optional field type or iterable of types to which to restrict the returned schema. Must be subclass(es) of fiftyone.core.fields.Field
embedded_doc_type (None): an optional embedded document type or iterable of types to which to restrict the returned schema. Must be subclass(es) of fiftyone.core.odm.BaseEmbeddedDocument
read_only (None): whether to restrict to (True) or exclude (False) read-only fields. By default, all fields are included
info_keys (None): an optional key or list of keys that must be in the field's info dict
created_after (None): an optional datetime specifying a minimum creation date
include_private (False): whether to include fields that start with _ in the returned schema
flat (False): whether to return a flattened schema where all embedded document fields are included as top-level keys
mode (None): whether to apply the above constraints before and/or after flattening the schema. Only applicable when flat is True. Supported values are ("before", "after", "both"). The default is "after"
Returns
a dict mapping field names to fiftyone.core.fields.Field instances
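The ftype and embedded_doc_type constraints are subclass checks against the field instances in the schema. The stand-in classes below are hypothetical (the real ones live in fiftyone.core.fields), but they illustrate the filtering semantics; with FiftyOne installed, the equivalent call would be, e.g., dataset.get_field_schema(ftype=fo.StringField):

```python
# Stand-in field types; FiftyOne's real ones are in fiftyone.core.fields
class Field: ...
class StringField(Field): ...
class IntField(Field): ...

schema = {
    "filepath": StringField(),
    "tags": StringField(),
    "uniqueness": IntField(),
}

def filter_schema(schema, ftype):
    """Restrict a schema dict to fields of the given type(s), as ftype does."""
    types = ftype if isinstance(ftype, tuple) else (ftype,)
    return {name: f for name, f in schema.items() if isinstance(f, types)}

string_fields = filter_schema(schema, StringField)
assert set(string_fields) == {"filepath", "tags"}
```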
def get_frame_field_schema(self, ftype=None, embedded_doc_type=None, read_only=None, info_keys=None, created_after=None, include_private=False, flat=False, mode=None): (source)

Returns a schema dictionary describing the fields of the frames of the samples in the dataset.

Only applicable for datasets that contain videos.

Parameters
ftype (None): an optional field type or iterable of types to which to restrict the returned schema. Must be subclass(es) of fiftyone.core.fields.Field
embedded_doc_type (None): an optional embedded document type or iterable of types to which to restrict the returned schema. Must be subclass(es) of fiftyone.core.odm.BaseEmbeddedDocument
read_only (None): whether to restrict to (True) or exclude (False) read-only fields. By default, all fields are included
info_keys (None): an optional key or list of keys that must be in the field's info dict
created_after (None): an optional datetime specifying a minimum creation date
include_private (False): whether to include fields that start with _ in the returned schema
flat (False): whether to return a flattened schema where all embedded document fields are included as top-level keys
mode (None): whether to apply the above constraints before and/or after flattening the schema. Only applicable when flat is True. Supported values are ("before", "after", "both"). The default is "after"
Returns
a dict mapping field names to fiftyone.core.fields.Field instances, or None if the dataset does not contain videos
def get_group(self, group_id, group_slices=None): (source)

Returns a dict containing the samples for the given group ID.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart-groups")

group_id = dataset.take(1).first().group.id
group = dataset.get_group(group_id)

print(group.keys())
# ['left', 'right', 'pcd']
Parameters
group_id: a group ID
group_slices (None): an optional subset of group slices to load
Returns
a dict mapping group slice names to fiftyone.core.sample.Sample instances
Raises
KeyError: if the group ID is not found
def get_saved_view_info(self, name): (source)

Loads the editable information about the saved view with the given name.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

view = dataset.limit(10)
dataset.save_view("test", view)

print(dataset.get_saved_view_info("test"))
Parameters
name: the name of a saved view
Returns
a dict of editable info
def get_workspace_info(self, name): (source)

Gets the information about the workspace with the given name.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

workspace = fo.Space()
description = "A really cool (apparently empty?) workspace"
dataset.save_workspace("test", workspace, description=description)

print(dataset.get_workspace_info("test"))
Parameters
name: the name of a saved workspace
Returns
a dict of editable info
@group_slice.setter
def group_slice(self, slice_name): (source)

Undocumented

def has_saved_view(self, name): (source)

Whether this dataset has a saved view with the given name.

Parameters
name: a saved view name
Returns
True/False
def has_workspace(self, name): (source)

Whether this dataset has a saved workspace with the given name.

Parameters
name: a saved workspace name
Returns
True/False
def head(self, num_samples=3): (source)

Returns a list of the first few samples in the dataset.

If fewer than num_samples samples are in the dataset, only the available samples are returned.

Parameters
num_samples (3): the number of samples
Returns
a list of fiftyone.core.sample.Sample objects
@info.setter
def info(self, info): (source)

Undocumented

def ingest_images(self, paths_or_samples, sample_parser=None, tags=None, dataset_dir=None, image_format=None, progress=None): (source)

Ingests the given iterable of images into the dataset.

The images are read in-memory and written to dataset_dir.

See :ref:`this guide <custom-sample-parser>` for more details about ingesting images into a dataset by defining your own UnlabeledImageSampleParser.

Parameters
paths_or_samples: an iterable of data. If no sample_parser is provided, this must be an iterable of image paths. If a sample_parser is provided, this can be an arbitrary iterable whose elements can be parsed by the sample parser
sample_parser (None): a fiftyone.utils.data.parsers.UnlabeledImageSampleParser instance to use to parse the samples
tags (None): an optional tag or iterable of tags to attach to each sample
dataset_dir (None): the directory in which the images will be written. By default, get_default_dataset_dir is used
image_format (None): the image format to use to write the images to disk. By default, fiftyone.config.default_image_ext is used
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
Returns
a list of IDs of the samples in the dataset
def ingest_labeled_images(self, samples, sample_parser, label_field=None, tags=None, expand_schema=True, dynamic=False, dataset_dir=None, image_format=None, progress=None): (source)

Ingests the given iterable of labeled image samples into the dataset.

The images are read in-memory and written to dataset_dir.

See :ref:`this guide <custom-sample-parser>` for more details about ingesting labeled images into a dataset by defining your own LabeledImageSampleParser.

Parameters
samples: an iterable of data
sample_parser: a fiftyone.utils.data.parsers.LabeledImageSampleParser instance to use to parse the samples
label_field (None): controls the field(s) in which imported labels are stored. If the parser produces a single fiftyone.core.labels.Label instance per sample, this argument specifies the name of the field to use; the default is "ground_truth". If the parser produces a dictionary of labels per sample, this argument can be either a string prefix to prepend to each label key or a dict mapping label keys to field names; the default in this case is to directly use the keys of the imported label dictionaries as field names
tags (None): an optional tag or iterable of tags to attach to each sample
expand_schema (True): whether to dynamically add new sample fields encountered to the dataset schema. If False, an error is raised if the sample's schema is not a subset of the dataset schema
dynamic (False): whether to declare dynamic attributes of embedded document fields that are encountered
dataset_dir (None): the directory in which the images will be written. By default, get_default_dataset_dir is used
image_format (None): the image format to use to write the images to disk. By default, fiftyone.config.default_image_ext is used
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
Returns
a list of IDs of the samples in the dataset
def ingest_labeled_videos(self, samples, sample_parser, tags=None, expand_schema=True, dynamic=False, dataset_dir=None, progress=None): (source)

Ingests the given iterable of labeled video samples into the dataset.

The videos are copied to dataset_dir.

See :ref:`this guide <custom-sample-parser>` for more details about ingesting labeled videos into a dataset by defining your own LabeledVideoSampleParser.

Parameters
samples: an iterable of data
sample_parser: a fiftyone.utils.data.parsers.LabeledVideoSampleParser instance to use to parse the samples
tags (None): an optional tag or iterable of tags to attach to each sample
expand_schema (True): whether to dynamically add new sample fields encountered to the dataset schema. If False, an error is raised if the sample's schema is not a subset of the dataset schema
dynamic (False): whether to declare dynamic attributes of embedded document fields that are encountered
dataset_dir (None): the directory in which the videos will be written. By default, get_default_dataset_dir is used
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
Returns
a list of IDs of the samples in the dataset
def ingest_videos(self, paths_or_samples, sample_parser=None, tags=None, dataset_dir=None, progress=None): (source)

Ingests the given iterable of videos into the dataset.

The videos are copied to dataset_dir.

See :ref:`this guide <custom-sample-parser>` for more details about ingesting videos into a dataset by defining your own UnlabeledVideoSampleParser.

Parameters
paths_or_samples: an iterable of data. If no sample_parser is provided, this must be an iterable of video paths. If a sample_parser is provided, this can be an arbitrary iterable whose elements can be parsed by the sample parser
sample_parser (None): a fiftyone.utils.data.parsers.UnlabeledVideoSampleParser instance to use to parse the samples
tags (None): an optional tag or iterable of tags to attach to each sample
dataset_dir (None): the directory in which the videos will be written. By default, get_default_dataset_dir is used
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
Returns
a list of IDs of the samples in the dataset
def iter_groups(self, group_slices=None, progress=False, autosave=False, batch_size=None, batching_strategy=None): (source)

Returns an iterator over the groups in the dataset.

Examples:

import random as r
import string as s

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart-groups")

def make_label():
    return "".join(r.choice(s.ascii_letters) for i in range(10))

# No save context
for group in dataset.iter_groups(progress=True):
    for sample in group.values():
        sample["test"] = make_label()
        sample.save()

# Save using default batching strategy
for group in dataset.iter_groups(progress=True, autosave=True):
    for sample in group.values():
        sample["test"] = make_label()

# Save in batches of 10
for group in dataset.iter_groups(
    progress=True, autosave=True, batch_size=10
):
    for sample in group.values():
        sample["test"] = make_label()

# Save every 0.5 seconds
for group in dataset.iter_groups(
    progress=True, autosave=True, batch_size=0.5
):
    for sample in group.values():
        sample["test"] = make_label()
Parameters
group_slices (None): an optional subset of group slices to load
progress (False): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
autosave (False): whether to automatically save changes to samples emitted by this iterator
batch_size (None): the batch size to use when autosaving samples. If a batching_strategy is provided, this parameter configures the strategy as described below. If no batching_strategy is provided, this can either be an integer specifying the number of samples to save in a batch (in which case batching_strategy is implicitly set to "static") or a float number of seconds between batched saves (in which case batching_strategy is implicitly set to "latency")
batching_strategy (None):

the batching strategy to use for each save operation when autosaving samples. Supported values are:

  • "static": a fixed sample batch size for each save
  • "size": a target batch size, in bytes, for each save
  • "latency": a target latency, in seconds, between saves

By default, fo.config.default_batcher is used

Returns
an iterator that emits dicts mapping group slice names to fiftyone.core.sample.Sample instances, one per group
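When batching_strategy is omitted, the type of batch_size implicitly selects the strategy, per the parameter descriptions above. A plain-Python restatement of that defaulting rule (not FiftyOne's internal code):

```python
def infer_strategy(batch_size, batching_strategy=None):
    """Mirror the documented defaulting: int -> "static", float -> "latency"."""
    if batching_strategy is not None:
        return batching_strategy  # explicit strategy always wins
    if batch_size is None:
        return "default"  # falls back to fo.config.default_batcher
    return "static" if isinstance(batch_size, int) else "latency"

assert infer_strategy(10) == "static"     # integer batch size
assert infer_strategy(0.5) == "latency"   # float = seconds between saves
assert infer_strategy(10, "size") == "size"
```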
def iter_samples(self, progress=False, autosave=False, batch_size=None, batching_strategy=None): (source)

Returns an iterator over the samples in the dataset.

Examples:

import random as r
import string as s

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("cifar10", split="test")

def make_label():
    return "".join(r.choice(s.ascii_letters) for i in range(10))

# No save context
for sample in dataset.iter_samples(progress=True):
    sample.ground_truth.label = make_label()
    sample.save()

# Save using default batching strategy
for sample in dataset.iter_samples(progress=True, autosave=True):
    sample.ground_truth.label = make_label()

# Save in batches of 10
for sample in dataset.iter_samples(
    progress=True, autosave=True, batch_size=10
):
    sample.ground_truth.label = make_label()

# Save every 0.5 seconds
for sample in dataset.iter_samples(
    progress=True, autosave=True, batch_size=0.5
):
    sample.ground_truth.label = make_label()
Parameters
progress (False): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
autosave (False): whether to automatically save changes to samples emitted by this iterator
batch_size (None): the batch size to use when autosaving samples. If a batching_strategy is provided, this parameter configures the strategy as described below. If no batching_strategy is provided, this can either be an integer specifying the number of samples to save in a batch (in which case batching_strategy is implicitly set to "static") or a float number of seconds between batched saves (in which case batching_strategy is implicitly set to "latency")
batching_strategy (None):

the batching strategy to use for each save operation when autosaving samples. Supported values are:

  • "static": a fixed sample batch size for each save
  • "size": a target batch size, in bytes, for each save
  • "latency": a target latency, in seconds, between saves

By default, fo.config.default_batcher is used

Returns
an iterator over fiftyone.core.sample.Sample instances
def last(self): (source)

Returns the last sample in the dataset.

Returns
a fiftyone.core.sample.Sample
def list_saved_views(self, info=False): (source)

List saved views on this dataset.

Parameters
info (False): whether to return info dicts describing each saved view rather than just their names
Returns
a list of saved view names or info dicts
def list_summary_fields(self): (source)

Lists the summary fields on the dataset.

Use create_summary_field to create summary fields, and use delete_summary_field to delete them.

Returns
a list of summary field names
def list_workspaces(self, info=False): (source)

List saved workspaces on this dataset.

Parameters
info (False): whether to return info dicts describing each saved workspace rather than just their names
Returns
a list of saved workspace names or info dicts
def load_saved_view(self, name): (source)

Loads the saved view with the given name.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("quickstart")
view = dataset.filter_labels("ground_truth", F("label") == "cat")

dataset.save_view("cats", view)

also_view = dataset.load_saved_view("cats")
assert view == also_view
Parameters
name: the name of a saved view
Returns
a fiftyone.core.view.DatasetView
def load_workspace(self, name): (source)

Loads the saved workspace with the given name.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

embeddings_panel = fo.Panel(
    type="Embeddings",
    state=dict(brainResult="img_viz", colorByField="metadata.size_bytes"),
)
workspace = fo.Space(children=[embeddings_panel])
workspace_name = "embeddings-workspace"
dataset.save_workspace(workspace_name, workspace)

# Some time later ... load the workspace
loaded_workspace = dataset.load_workspace(workspace_name)
assert workspace == loaded_workspace

# Launch app with the loaded workspace!
session = fo.launch_app(dataset, spaces=loaded_workspace)

# Or set via session later on
session.spaces = loaded_workspace
Parameters
name: the name of a saved workspace
Returns
a fiftyone.core.odm.workspace.Space
Raises
ValueError: if name is not a saved workspace
@mask_targets.setter
def mask_targets(self, targets): (source)

Undocumented

@media_type.setter
def media_type(self, media_type): (source)

Undocumented

def merge_archive(self, archive_path, dataset_type=None, data_path=None, labels_path=None, label_field=None, tags=None, key_field='filepath', key_fcn=None, skip_existing=False, insert_new=True, fields=None, omit_fields=None, merge_lists=True, overwrite=True, expand_schema=True, dynamic=False, add_info=True, cleanup=True, progress=None, **kwargs): (source)

Merges the contents of the given archive into the dataset.

Note

This method requires the ability to create unique indexes on the key_field of each collection.

See add_archive if you want to add samples without a uniqueness constraint.

If a directory with the same root name as archive_path exists, it is assumed that this directory contains the extracted contents of the archive, and thus the archive is not re-extracted.

See :ref:`this guide <loading-datasets-from-disk>` for example usages of this method and descriptions of the available dataset types.

Note

The following archive formats are explicitly supported:

.zip, .tar, .tar.gz, .tgz, .tar.bz, .tbz

If an archive not in the above list is found, extraction will be attempted via the patool package, which supports many formats but may require that additional system packages be installed.

By default, samples with the same absolute filepath are merged, but you can customize this behavior via the key_field and key_fcn parameters. For example, you could set key_fcn = lambda sample: os.path.basename(sample.filepath) to merge samples with the same base filename.

The behavior of this method is highly customizable. By default, all top-level fields from the imported samples are merged in, overwriting any existing values for those fields, with the exception of list fields (e.g., tags) and label list fields (e.g., fiftyone.core.labels.Detections fields), in which case the elements of the lists themselves are merged. In the case of label list fields, labels with the same id in both collections are updated rather than duplicated.

To avoid confusion between missing fields and fields whose value is None, None-valued fields are always treated as missing while merging.

This method can be configured in numerous ways, including:

  • Whether existing samples should be modified or skipped
  • Whether new samples should be added or omitted
  • Whether new fields can be added to the dataset schema
  • Whether list fields should be treated as ordinary fields and merged as a whole rather than merging their elements
  • Whether to merge only specific fields, or all but certain fields
  • Mapping input fields to different field names of this dataset
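The key_fcn hook receives each sample and returns its merge key. A stand-alone sketch of the base-filename example from above, using a simple namespace in place of a real fiftyone.core.sample.Sample; the merge_archive call is shown as a comment since it requires a dataset and an archive:

```python
import os
from types import SimpleNamespace

# Merge on base filename rather than absolute filepath
key_fcn = lambda sample: os.path.basename(sample.filepath)

# Stand-ins for two samples whose media live in different directories
a = SimpleNamespace(filepath="/data/v1/img001.jpg")
b = SimpleNamespace(filepath="/backup/v2/img001.jpg")

# Both resolve to the same key, so they would be merged as one sample
assert key_fcn(a) == key_fcn(b) == "img001.jpg"

# With a real dataset, the call would look like:
# dataset.merge_archive("/path/to/archive.zip", dataset_type=..., key_fcn=key_fcn)
```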
Parameters
archive_path: the path to an archive of a dataset directory
dataset_type (None): the fiftyone.types.Dataset type of the dataset in archive_path
data_path (None):

an optional parameter that enables explicit control over the location of the media for certain dataset types. Can be any of the following:

  • a folder name like "data" or "data/" specifying a subfolder of dataset_dir in which the media lies
  • an absolute directory path in which the media lies. In this case, the archive_path has no effect on the location of the data
  • a filename like "data.json" specifying the filename of a JSON manifest file in archive_path that maps UUIDs to media filepaths. Files of this format are generated when passing the export_media="manifest" option to fiftyone.core.collections.SampleCollection.export
  • an absolute filepath to a JSON manifest file. In this case, archive_path has no effect on the location of the data
  • a dict mapping filenames to absolute filepaths

By default, it is assumed that the data can be located in the default location within archive_path for the dataset type

labels_path (None):

an optional parameter that enables explicit control over the location of the labels. Only applicable when importing certain labeled dataset formats. Can be any of the following:

  • a type-specific folder name like "labels" or "labels/" or a filename like "labels.json" or "labels.xml" specifying the location in archive_path of the labels file(s)
  • an absolute directory or filepath containing the labels file(s). In this case, archive_path has no effect on the location of the labels

For labeled datasets, this parameter defaults to the location in archive_path of the labels for the default layout of the dataset type being imported

label_field (None): controls the field(s) in which imported labels are stored. Only applicable if dataset_importer is a fiftyone.utils.data.importers.LabeledImageDatasetImporter or fiftyone.utils.data.importers.LabeledVideoDatasetImporter. If the importer produces a single fiftyone.core.labels.Label instance per sample/frame, this argument specifies the name of the field to use; the default is "ground_truth". If the importer produces a dictionary of labels per sample, this argument can be either a string prefix to prepend to each label key or a dict mapping label keys to field names; the default in this case is to directly use the keys of the imported label dictionaries as field names
tags (None): an optional tag or iterable of tags to attach to each sample
key_field ("filepath"): the sample field to use to decide whether to join with an existing sample
key_fcn (None): a function that accepts a fiftyone.core.sample.Sample instance and computes a key to decide if two samples should be merged. If a key_fcn is provided, key_field is ignored
skip_existing (False): whether to skip existing samples (True) or merge them (False)
insert_new (True): whether to insert new samples (True) or skip them (False)
fields (None): an optional field or iterable of fields to which to restrict the merge. If provided, fields other than these are omitted from samples when merging or adding samples. One exception is that filepath is always included when adding new samples, since the field is required. This can also be a dict mapping field names of the input collection to field names of this dataset
omit_fields (None): an optional field or iterable of fields to exclude from the merge. If provided, these fields are omitted from imported samples, if present. One exception is that filepath is always included when adding new samples, since the field is required
merge_lists (True): whether to merge the elements of list fields (e.g., tags) and label list fields (e.g., fiftyone.core.labels.Detections fields) rather than merging the entire top-level field like other field types. For label list fields, existing fiftyone.core.labels.Label elements are either replaced (when overwrite is True) or kept (when overwrite is False) when their id matches a label from the provided samples
overwrite (True): whether to overwrite (True) or skip (False) existing fields and label elements
expand_schema (True): whether to dynamically add new fields encountered to the dataset schema. If False, an error is raised if a sample's schema is not a subset of the dataset schema
dynamic (False): whether to declare dynamic attributes of embedded document fields that are encountered
add_info (True): whether to add dataset info from the importer (if any) to the dataset
cleanup (True): whether to delete the archive after extracting it
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
**kwargs: optional keyword arguments to pass to the constructor of the fiftyone.utils.data.importers.DatasetImporter for the specified dataset_type
def merge_dir(self, dataset_dir=None, dataset_type=None, data_path=None, labels_path=None, label_field=None, tags=None, key_field='filepath', key_fcn=None, skip_existing=False, insert_new=True, fields=None, omit_fields=None, merge_lists=True, overwrite=True, expand_schema=True, dynamic=False, add_info=True, progress=None, **kwargs): (source)

Merges the contents of the given directory into the dataset.

Note

This method requires the ability to create unique indexes on the key_field of each collection.

See add_dir if you want to add samples without a uniqueness constraint.

You can perform imports with this method via the following basic patterns:

  1. Provide dataset_dir and dataset_type to import the contents of a directory that is organized in the default layout for the dataset type as documented in :ref:`this guide <loading-datasets-from-disk>`
  2. Provide dataset_type along with data_path, labels_path, or other type-specific parameters to perform a customized import. This syntax provides the flexibility to, for example, perform labels-only imports or imports where the source media lies in a different location than the labels

In either workflow, the remaining parameters of this method can be provided to further configure the import.

See :ref:`this guide <loading-datasets-from-disk>` for example usages of this method and descriptions of the available dataset types.

By default, samples with the same absolute filepath are merged, but you can customize this behavior via the key_field and key_fcn parameters. For example, you could set key_fcn = lambda sample: os.path.basename(sample.filepath) to merge samples with the same base filename.

The behavior of this method is highly customizable. By default, all top-level fields from the imported samples are merged in, overwriting any existing values for those fields, with the exception of list fields (e.g., tags) and label list fields (e.g., fiftyone.core.labels.Detections fields), in which case the elements of the lists themselves are merged. In the case of label list fields, labels with the same id in both collections are updated rather than duplicated.

To avoid confusion between missing fields and fields whose value is None, None-valued fields are always treated as missing while merging.

This method can be configured in numerous ways, including:

  • Whether existing samples should be modified or skipped
  • Whether new samples should be added or omitted
  • Whether new fields can be added to the dataset schema
  • Whether list fields should be treated as ordinary fields and merged as a whole rather than merging their elements
  • Whether to merge only specific fields, or all but certain fields
  • Mapping input fields to different field names of this dataset
Parameters
dataset_dir (None): the dataset directory. This can be omitted for certain dataset formats if you provide arguments such as data_path and labels_path
dataset_type (None): the fiftyone.types.Dataset type of the dataset
data_path (None):

an optional parameter that enables explicit control over the location of the media for certain dataset types. Can be any of the following:

  • a folder name like "data" or "data/" specifying a subfolder of dataset_dir in which the media lies
  • an absolute directory path in which the media lies. In this case, the dataset_dir has no effect on the location of the data
  • a filename like "data.json" specifying the filename of a JSON manifest file in dataset_dir that maps UUIDs to media filepaths. Files of this format are generated when passing the export_media="manifest" option to fiftyone.core.collections.SampleCollection.export
  • an absolute filepath to a JSON manifest file. In this case, dataset_dir has no effect on the location of the data
  • a dict mapping filenames to absolute filepaths

By default, it is assumed that the data can be located in the default location within dataset_dir for the dataset type

labels_path (None):

an optional parameter that enables explicit control over the location of the labels. Only applicable when importing certain labeled dataset formats. Can be any of the following:

  • a type-specific folder name like "labels" or "labels/" or a filename like "labels.json" or "labels.xml" specifying the location in dataset_dir of the labels file(s)
  • an absolute directory or filepath containing the labels file(s). In this case, dataset_dir has no effect on the location of the labels

For labeled datasets, this parameter defaults to the location in dataset_dir of the labels for the default layout of the dataset type being imported

label_field:Nonecontrols the field(s) in which imported labels are stored. Only applicable if dataset_importer is a fiftyone.utils.data.importers.LabeledImageDatasetImporter or fiftyone.utils.data.importers.LabeledVideoDatasetImporter. If the importer produces a single fiftyone.core.labels.Label instance per sample/frame, this argument specifies the name of the field to use; the default is "ground_truth". If the importer produces a dictionary of labels per sample, this argument can be either a string prefix to prepend to each label key or a dict mapping label keys to field names; the default in this case is to directly use the keys of the imported label dictionaries as field names
tags:Nonean optional tag or iterable of tags to attach to each sample
key_field:"filepath"the sample field to use to decide whether to join with an existing sample
key_fcn:Nonea function that accepts a fiftyone.core.sample.Sample instance and computes a key to decide if two samples should be merged. If a key_fcn is provided, key_field is ignored
skip_existing:Falsewhether to skip existing samples (True) or merge them (False)
insert_new:Truewhether to insert new samples (True) or skip them (False)
fields:Nonean optional field or iterable of fields to which to restrict the merge. If provided, fields other than these are omitted from samples when merging or adding samples. One exception is that filepath is always included when adding new samples, since the field is required. This can also be a dict mapping field names of the input collection to field names of this dataset
omit_fields:Nonean optional field or iterable of fields to exclude from the merge. If provided, these fields are omitted from imported samples, if present. One exception is that filepath is always included when adding new samples, since the field is required
merge_lists:Truewhether to merge the elements of list fields (e.g., tags) and label list fields (e.g., fiftyone.core.labels.Detections fields) rather than merging the entire top-level field like other field types. For label list fields, existing fiftyone.core.labels.Label elements are either replaced (when overwrite is True) or kept (when overwrite is False) when their id matches a label from the provided samples
overwrite:Truewhether to overwrite (True) or skip (False) existing fields and label elements
expand_schema:Truewhether to dynamically add new fields encountered to the dataset schema. If False, an error is raised if a sample's schema is not a subset of the dataset schema
dynamic:Falsewhether to declare dynamic attributes of embedded document fields that are encountered
add_info:Truewhether to add dataset info from the importer (if any) to the dataset
progress:Nonewhether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
**kwargsoptional keyword arguments to pass to the constructor of the fiftyone.utils.data.importers.DatasetImporter for the specified dataset_type
def merge_importer(self, dataset_importer, label_field=None, tags=None, key_field='filepath', key_fcn=None, skip_existing=False, insert_new=True, fields=None, omit_fields=None, merge_lists=True, overwrite=True, expand_schema=True, dynamic=False, add_info=True, progress=None): (source)

Merges the samples from the given fiftyone.utils.data.importers.DatasetImporter into the dataset.

Note

This method requires the ability to create unique indexes on the key_field of each collection.

See add_importer if you want to add samples without a uniqueness constraint.

See :ref:`this guide <custom-dataset-importer>` for more details about importing datasets in custom formats by defining your own DatasetImporter.

By default, samples with the same absolute filepath are merged, but you can customize this behavior via the key_field and key_fcn parameters. For example, you could set key_fcn = lambda sample: os.path.basename(sample.filepath) to merge samples with the same base filename.

The behavior of this method is highly customizable. By default, all top-level fields from the imported samples are merged in, overwriting any existing values for those fields, with the exception of list fields (e.g., tags) and label list fields (e.g., fiftyone.core.labels.Detections fields), in which case the elements of the lists themselves are merged. In the case of label list fields, labels with the same id in both collections are updated rather than duplicated.

To avoid confusion between missing fields and fields whose value is None, None-valued fields are always treated as missing while merging.

This method can be configured in numerous ways, including:

  • Whether existing samples should be modified or skipped
  • Whether new samples should be added or omitted
  • Whether new fields can be added to the dataset schema
  • Whether list fields should be treated as ordinary fields and merged as a whole rather than merging their elements
  • Whether to merge only specific fields, or all but certain fields
  • Mapping input fields to different field names of this dataset
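The keying behavior described above can be sketched in plain Python. This is a minimal illustration of the documented semantics, not FiftyOne's implementation; the `Sample` class below is a hypothetical stand-in for fiftyone.core.sample.Sample:

```python
import os

# Hypothetical stand-in for fiftyone.core.sample.Sample; only the
# `filepath` attribute matters for keying
class Sample:
    def __init__(self, filepath):
        self.filepath = filepath

def merge_key(sample, key_field="filepath", key_fcn=None):
    # When a key_fcn is provided, key_field is ignored
    if key_fcn is not None:
        return key_fcn(sample)

    return getattr(sample, key_field)

sample = Sample("/datasets/train/img001.jpg")

# Default: samples with the same absolute filepath are merged
print(merge_key(sample))  # /datasets/train/img001.jpg

# Custom: merge samples with the same base filename
key_fcn = lambda s: os.path.basename(s.filepath)
print(merge_key(sample, key_fcn=key_fcn))  # img001.jpg
```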
Parameters
dataset_importera fiftyone.utils.data.importers.DatasetImporter
label_field:Nonecontrols the field(s) in which imported labels are stored. Only applicable if dataset_importer is a fiftyone.utils.data.importers.LabeledImageDatasetImporter or fiftyone.utils.data.importers.LabeledVideoDatasetImporter. If the importer produces a single fiftyone.core.labels.Label instance per sample/frame, this argument specifies the name of the field to use; the default is "ground_truth". If the importer produces a dictionary of labels per sample, this argument can be either a string prefix to prepend to each label key or a dict mapping label keys to field names; the default in this case is to directly use the keys of the imported label dictionaries as field names
tags:Nonean optional tag or iterable of tags to attach to each sample
key_field:"filepath"the sample field to use to decide whether to join with an existing sample
key_fcn:Nonea function that accepts a fiftyone.core.sample.Sample instance and computes a key to decide if two samples should be merged. If a key_fcn is provided, key_field is ignored
skip_existing:Falsewhether to skip existing samples (True) or merge them (False)
insert_new:Truewhether to insert new samples (True) or skip them (False)
fields:Nonean optional field or iterable of fields to which to restrict the merge. If provided, fields other than these are omitted from samples when merging or adding samples. One exception is that filepath is always included when adding new samples, since the field is required. This can also be a dict mapping field names of the input collection to field names of this dataset
omit_fields:Nonean optional field or iterable of fields to exclude from the merge. If provided, these fields are omitted from imported samples, if present. One exception is that filepath is always included when adding new samples, since the field is required
merge_lists:Truewhether to merge the elements of list fields (e.g., tags) and label list fields (e.g., fiftyone.core.labels.Detections fields) rather than merging the entire top-level field like other field types. For label list fields, existing fiftyone.core.labels.Label elements are either replaced (when overwrite is True) or kept (when overwrite is False) when their id matches a label from the provided samples
overwrite:Truewhether to overwrite (True) or skip (False) existing fields and label elements
expand_schema:Truewhether to dynamically add new fields encountered to the dataset schema. If False, an error is raised if a sample's schema is not a subset of the dataset schema
dynamic:Falsewhether to declare dynamic attributes of embedded document fields that are encountered
add_info:Truewhether to add dataset info from the importer (if any) to the dataset
progress:Nonewhether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
def merge_sample(self, sample, key_field='filepath', skip_existing=False, insert_new=True, fields=None, omit_fields=None, merge_lists=True, overwrite=True, expand_schema=True, validate=True, dynamic=False): (source)

Merges the fields of the given sample into this dataset.

By default, the sample is merged with an existing sample with the same absolute filepath, if one exists. Otherwise a new sample is inserted. You can customize this behavior via the key_field, skip_existing, and insert_new parameters.

The behavior of this method is highly customizable. By default, all top-level fields from the provided sample are merged in, overwriting any existing values for those fields, with the exception of list fields (e.g., tags) and label list fields (e.g., fiftyone.core.labels.Detections fields), in which case the elements of the lists themselves are merged. In the case of label list fields, labels with the same id in both samples are updated rather than duplicated.

To avoid confusion between missing fields and fields whose value is None, None-valued fields are always treated as missing while merging.

This method can be configured in numerous ways, including:

  • Whether new fields can be added to the dataset schema
  • Whether list fields should be treated as ordinary fields and merged as a whole rather than merging their elements
  • Whether to merge only specific fields, or all but certain fields
  • Mapping input sample fields to different field names of this sample
Parameters
samplea fiftyone.core.sample.Sample
key_field:"filepath"the sample field to use to decide whether to join with an existing sample
skip_existing:Falsewhether to skip existing samples (True) or merge them (False)
insert_new:Truewhether to insert new samples (True) or skip them (False)
fields:Nonean optional field or iterable of fields to which to restrict the merge. May contain frame fields for video samples. This can also be a dict mapping field names of the input sample to field names of this dataset
omit_fields:Nonean optional field or iterable of fields to exclude from the merge. May contain frame fields for video samples
merge_lists:Truewhether to merge the elements of list fields (e.g., tags) and label list fields (e.g., fiftyone.core.labels.Detections fields) rather than merging the entire top-level field like other field types. For label list fields, existing fiftyone.core.labels.Label elements are either replaced (when overwrite is True) or kept (when overwrite is False) when their id matches a label from the provided sample
overwrite:Truewhether to overwrite (True) or skip (False) existing fields and label elements
expand_schema:Truewhether to dynamically add new fields encountered to the dataset schema. If False, an error is raised if any fields are not in the dataset schema
validate:Truewhether to validate values for existing fields
dynamic:Falsewhether to declare dynamic embedded document fields
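The label list merging rule described above (matching elements by id, replacing when overwrite is True and keeping when overwrite is False) can be sketched with plain dicts as stand-ins for fiftyone.core.labels.Label elements; this is an illustration of the documented behavior, not FiftyOne's implementation:

```python
def merge_label_list(existing, incoming, overwrite=True):
    """Merge two lists of label dicts (each with an "id" key),
    matching elements by id."""
    merged = {label["id"]: label for label in existing}
    for label in incoming:
        if label["id"] in merged:
            # Matching id: replace when overwrite=True, keep otherwise
            if overwrite:
                merged[label["id"]] = label
        else:
            # New label: always appended rather than duplicated
            merged[label["id"]] = label

    return list(merged.values())

existing = [{"id": "a", "label": "cat"}]
incoming = [{"id": "a", "label": "dog"}, {"id": "b", "label": "dog"}]

print(merge_label_list(existing, incoming, overwrite=True))
# [{'id': 'a', 'label': 'dog'}, {'id': 'b', 'label': 'dog'}]

print(merge_label_list(existing, incoming, overwrite=False))
# [{'id': 'a', 'label': 'cat'}, {'id': 'b', 'label': 'dog'}]
```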
def merge_samples(self, samples, key_field='filepath', key_fcn=None, skip_existing=False, insert_new=True, fields=None, omit_fields=None, merge_lists=True, overwrite=True, expand_schema=True, dynamic=False, include_info=True, overwrite_info=False, progress=None, num_samples=None): (source)

Merges the given samples into this dataset.

Note

This method requires the ability to create unique indexes on the key_field of each collection.

See add_collection if you want to add samples from one collection to another dataset without a uniqueness constraint.

By default, samples with the same absolute filepath are merged, but you can customize this behavior via the key_field and key_fcn parameters. For example, you could set key_fcn = lambda sample: os.path.basename(sample.filepath) to merge samples with the same base filename.

The behavior of this method is highly customizable. By default, all top-level fields from the provided samples are merged in, overwriting any existing values for those fields, with the exception of list fields (e.g., tags) and label list fields (e.g., fiftyone.core.labels.Detections fields), in which case the elements of the lists themselves are merged. In the case of label list fields, labels with the same id in both collections are updated rather than duplicated.

To avoid confusion between missing fields and fields whose value is None, None-valued fields are always treated as missing while merging.

This method can be configured in numerous ways, including:

  • Whether existing samples should be modified or skipped
  • Whether new samples should be added or omitted
  • Whether new fields can be added to the dataset schema
  • Whether list fields should be treated as ordinary fields and merged as a whole rather than merging their elements
  • Whether to merge only specific fields, or all but certain fields
  • Mapping input fields to different field names of this dataset
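The interplay of skip_existing and insert_new can be summarized as a small decision function. This sketch mirrors the documented semantics only; it is not the actual implementation:

```python
def merge_decision(key_exists, skip_existing=False, insert_new=True):
    """Return the action taken for one incoming sample, given whether
    an existing sample shares its merge key."""
    if key_exists:
        # Existing samples are merged unless skip_existing=True
        return "skip" if skip_existing else "merge"

    # New samples are inserted unless insert_new=False
    return "insert" if insert_new else "skip"

# Defaults: existing samples are merged, new samples are inserted
assert merge_decision(True) == "merge"
assert merge_decision(False) == "insert"

# skip_existing=True leaves matching samples untouched
assert merge_decision(True, skip_existing=True) == "skip"

# insert_new=False ignores samples with no match
assert merge_decision(False, insert_new=False) == "skip"
```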
Parameters
samplesa fiftyone.core.collections.SampleCollection or iterable of fiftyone.core.sample.Sample instances
key_field:"filepath"the sample field to use to decide whether to join with an existing sample
key_fcn:Nonea function that accepts a fiftyone.core.sample.Sample instance and computes a key to decide if two samples should be merged. If a key_fcn is provided, key_field is ignored
skip_existing:Falsewhether to skip existing samples (True) or merge them (False)
insert_new:Truewhether to insert new samples (True) or skip them (False)
fields:Nonean optional field or iterable of fields to which to restrict the merge. If provided, fields other than these are omitted from samples when merging or adding samples. One exception is that filepath is always included when adding new samples, since the field is required. This can also be a dict mapping field names of the input collection to field names of this dataset
omit_fields:Nonean optional field or iterable of fields to exclude from the merge. If provided, these fields are omitted from samples, if present, when merging or adding samples. One exception is that filepath is always included when adding new samples, since the field is required
merge_lists:Truewhether to merge the elements of list fields (e.g., tags) and label list fields (e.g., fiftyone.core.labels.Detections fields) rather than merging the entire top-level field like other field types. For label list fields, existing fiftyone.core.labels.Label elements are either replaced (when overwrite is True) or kept (when overwrite is False) when their id matches a label from the provided samples
overwrite:Truewhether to overwrite (True) or skip (False) existing fields and label elements
expand_schema:Truewhether to dynamically add new fields encountered to the dataset schema. If False, an error is raised if a sample's schema is not a subset of the dataset schema
dynamic:Falsewhether to declare dynamic attributes of embedded document fields that are encountered. Only applicable when samples is not a fiftyone.core.collections.SampleCollection
include_info:Truewhether to merge dataset-level information such as info and classes. Only applicable when samples is a fiftyone.core.collections.SampleCollection
overwrite_info:Falsewhether to overwrite existing dataset-level information. Only applicable when samples is a fiftyone.core.collections.SampleCollection and include_info is True
progress:Nonewhether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
num_samples:Nonethe number of samples in samples. If not provided, this is computed (if possible) via len(samples) if needed for progress tracking
@name.setter
def name(self, name): (source)

Undocumented

def one(self, expr, exact=False): (source)

Returns a single sample in this dataset matching the expression.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("quickstart")

#
# Get a sample by filepath
#

# A random filepath in the dataset
filepath = dataset.take(1).first().filepath

# Get sample by filepath
sample = dataset.one(F("filepath") == filepath)

#
# Dealing with multiple matches
#

# Get a sample whose image is JPEG
sample = dataset.one(F("filepath").ends_with(".jpg"))

# Raises an error since there are multiple JPEGs
dataset.one(F("filepath").ends_with(".jpg"), exact=True)
Parameters
expra fiftyone.core.expressions.ViewExpression or MongoDB expression that evaluates to True for the sample to match
exact:Falsewhether to raise an error if multiple samples match the expression
Returns
a fiftyone.core.sample.Sample
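The exact semantics can be sketched over a plain iterable. Note that the real method evaluates the expression against the dataset rather than in Python; this hypothetical helper only illustrates the documented matching behavior:

```python
def one(iterable, pred, exact=False):
    """Return a single element matching `pred`, raising if there is no
    match, or, when exact=True, if the match is not unique."""
    matches = (x for x in iterable if pred(x))
    try:
        first = next(matches)
    except StopIteration:
        raise ValueError("No samples match the expression")

    if exact and next(matches, None) is not None:
        raise ValueError("Expected one match, but found multiple")

    return first

filepaths = ["a.jpg", "b.jpg", "c.png"]

print(one(filepaths, lambda f: f.endswith(".png")))  # c.png

try:
    # Raises since multiple elements match and exact=True
    one(filepaths, lambda f: f.endswith(".jpg"), exact=True)
except ValueError as e:
    print(e)  # Expected one match, but found multiple
```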
@persistent.setter
def persistent(self, value): (source)

Undocumented

def reload(self): (source)

Reloads the dataset and any in-memory samples from the database.

def remove_dynamic_frame_field(self, field_name, error_level=0): (source)

Removes the dynamic embedded frame field from the dataset's schema.

The underlying data is not deleted from the frames.

Parameters
field_namethe embedded.field.name
error_level:0the error level to use. Valid values are:
- 0: raise an error if a top-level field cannot be removed
- 1: log a warning if a top-level field cannot be removed
- 2: ignore top-level fields that cannot be removed
def remove_dynamic_frame_fields(self, field_names, error_level=0): (source)

Removes the dynamic embedded frame fields from the dataset's schema.

The underlying data is not deleted from the frames.

Parameters
field_namesthe embedded.field.name or iterable of field names
error_level:0the error level to use. Valid values are:
- 0: raise an error if a top-level field cannot be removed
- 1: log a warning if a top-level field cannot be removed
- 2: ignore top-level fields that cannot be removed
def remove_dynamic_sample_field(self, field_name, error_level=0): (source)

Removes the dynamic embedded sample field from the dataset's schema.

The underlying data is not deleted from the samples.

Parameters
field_namethe embedded.field.name
error_level:0the error level to use. Valid values are:
- 0: raise an error if a top-level field cannot be removed
- 1: log a warning if a top-level field cannot be removed
- 2: ignore top-level fields that cannot be removed
def remove_dynamic_sample_fields(self, field_names, error_level=0): (source)

Removes the dynamic embedded sample fields from the dataset's schema.

The underlying data is not deleted from the samples.

Parameters
field_namesthe embedded.field.name or iterable of field names
error_level:0the error level to use. Valid values are:
- 0: raise an error if a top-level field cannot be removed
- 1: log a warning if a top-level field cannot be removed
- 2: ignore top-level fields that cannot be removed
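The error_level convention shared by these methods can be sketched as follows; this is an illustration of the documented levels, not FiftyOne's internal error handling:

```python
import logging

def handle_failure(field_name, error_level=0):
    """Apply the documented error_level convention to a field that
    cannot be removed."""
    msg = "Cannot remove top-level field '%s'" % field_name
    if error_level <= 0:
        raise ValueError(msg)    # 0: raise an error
    if error_level == 1:
        logging.warning(msg)     # 1: log a warning
    # 2: silently ignore the failure

handle_failure("tags", error_level=2)  # no effect

try:
    handle_failure("tags", error_level=0)
except ValueError as e:
    print(e)  # Cannot remove top-level field 'tags'
```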
def rename_frame_field(self, field_name, new_field_name): (source)

Renames the frame-level field to the given new name.

You can use dot notation (embedded.field.name) to rename embedded frame fields.

Only applicable to datasets that contain videos.

Parameters
field_namethe field name or embedded.field.name
new_field_namethe new field name or embedded.field.name
def rename_frame_fields(self, field_mapping): (source)

Renames the frame-level fields to the given new names.

You can use dot notation (embedded.field.name) to rename embedded frame fields.

Parameters
field_mappinga dict mapping field names to new field names
def rename_group_slice(self, name, new_name): (source)

Renames the group slice with the given name.

Parameters
namethe group slice name
new_namethe new group slice name
def rename_sample_field(self, field_name, new_field_name): (source)

Renames the sample field to the given new name.

You can use dot notation (embedded.field.name) to rename embedded fields.

Parameters
field_namethe field name or embedded.field.name
new_field_namethe new field name or embedded.field.name
def rename_sample_fields(self, field_mapping): (source)

Renames the sample fields to the given new names.

You can use dot notation (embedded.field.name) to rename embedded fields.

Parameters
field_mappinga dict mapping field names to new field names
def save(self): (source)

Saves the dataset to the database.

This only needs to be called when dataset-level information such as its Dataset.info is modified.

def save_view(self, name, view, description=None, color=None, overwrite=False): (source)

Saves the given view into this dataset under the given name so it can be loaded later via load_saved_view.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("quickstart")
view = dataset.filter_labels("ground_truth", F("label") == "cat")

dataset.save_view("cats", view)

also_view = dataset.load_saved_view("cats")
assert view == also_view
Parameters
namea name for the saved view
viewa fiftyone.core.view.DatasetView
description:Nonean optional string description
color:Nonean optional RGB hex string like '#FF6D04'
overwrite:Falsewhether to overwrite an existing saved view with the same name
def save_workspace(self, name, workspace, description=None, color=None, overwrite=False): (source)

Saves a workspace into this dataset under the given name so it can be loaded later via load_workspace.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

embeddings_panel = fo.Panel(
    type="Embeddings",
    state=dict(
        brainResult="img_viz",
        colorByField="metadata.size_bytes"
    ),
)
workspace = fo.Space(children=[embeddings_panel])

workspace_name = "embeddings-workspace"
description = "Show embeddings only"
dataset.save_workspace(
    workspace_name,
    workspace,
    description=description
)
assert dataset.has_workspace(workspace_name)

also_workspace = dataset.load_workspace(workspace_name)
assert workspace == also_workspace
Parameters
namea name for the saved workspace
workspacea fiftyone.core.odm.workspace.Space
description:Nonean optional string description
color:Nonean optional RGB hex string like '#FF6D04'
overwrite:Falsewhether to overwrite an existing workspace with the same name
Raises
ValueErrorif overwrite==False and workspace with name already exists
@skeletons.setter
def skeletons(self, skeletons): (source)

Undocumented

def stats(self, include_media=False, include_indexes=False, compressed=False): (source)

Returns stats about the dataset on disk.

The samples keys refer to the sample documents stored in the database.

For video datasets, the frames keys refer to the frame documents stored in the database.

The media keys refer to the raw media associated with each sample on disk.

The indexes keys refer to the indexes associated with the dataset.

Note that dataset-level metadata such as annotation runs are not included in this computation.

Parameters
include_media:Falsewhether to include stats about the size of the raw media in the dataset
include_indexes:Falsewhether to include stats on the dataset's indexes
compressed:Falsewhether to return the sizes of collections in their compressed form on disk (True) or the logical uncompressed size of the collections (False)
Returns
a stats dict
def summary(self): (source)

Returns a string summary of the dataset.

Returns
a string summary
def tail(self, num_samples=3): (source)

Returns a list of the last few samples in the dataset.

If fewer than num_samples samples are in the dataset, only the available samples are returned.

Parameters
num_samples:3the number of samples
Returns
a list of fiftyone.core.sample.Sample objects
def update_saved_view_info(self, name, info): (source)

Updates the editable information for the saved view with the given name.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

view = dataset.limit(10)
dataset.save_view("test", view)

# Update the saved view's name and add a description
info = dict(
    name="a new name",
    description="a description",
)
dataset.update_saved_view_info("test", info)
Parameters
namethe name of a saved view
infoa dict whose keys are a subset of the keys returned by get_saved_view_info
def update_summary_field(self, field_name): (source)

Updates the summary field based on the current values of its source field.

Parameters
field_namethe summary field
def update_workspace_info(self, name, info): (source)

Updates the editable information for the saved workspace with the given name.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

workspace = fo.Space()
dataset.save_workspace("test", workspace)

# Update the workspace's name and add a description, color
info = dict(
    name="a new name",
    color="#FF6D04",
    description="a description",
)
dataset.update_workspace_info("test", info)
Parameters
namethe name of a saved workspace
infoa dict whose keys are a subset of the keys returned by get_workspace_info
@property
group_slice = (source)

The current group slice of the dataset, or None if the dataset is not grouped.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart-groups")

print(dataset.group_slices)
# ['left', 'right', 'pcd']

print(dataset.group_slice)
# left

# Change the current group slice
dataset.group_slice = "right"

print(dataset.group_slice)
# right

@property
media_type = (source)

The media type of the dataset.

@property
app_config = (source)

A fiftyone.core.odm.dataset.DatasetAppConfig that customizes how this dataset is visualized in the :ref:`FiftyOne App <fiftyone-app>`.

Examples:

import fiftyone as fo
import fiftyone.utils.image as foui
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

# View the dataset's current App config
print(dataset.app_config)

# Generate some thumbnail images
foui.transform_images(
    dataset,
    size=(-1, 32),
    output_field="thumbnail_path",
    output_dir="/tmp/thumbnails",
)

# Modify the dataset's App config
dataset.app_config.media_fields = ["filepath", "thumbnail_path"]
dataset.app_config.grid_media_field = "thumbnail_path"
dataset.save()  # must save after edits

session = fo.launch_app(dataset)

@property
classes = (source)

A dict mapping field names to lists of class label strings for the corresponding fields of the dataset.

Examples:

import fiftyone as fo

dataset = fo.Dataset()

# Set classes for the `ground_truth` and `predictions` fields
dataset.classes = {
    "ground_truth": ["cat", "dog"],
    "predictions": ["cat", "dog", "other"],
}

# Edit an existing classes list
dataset.classes["ground_truth"].append("other")
dataset.save()  # must save after edits

@property
created_at = (source)

The datetime that the dataset was created.

@property
default_classes = (source)

A list of class label strings for all fiftyone.core.labels.Label fields of this dataset that do not have customized classes defined in classes.

Examples:

import fiftyone as fo

dataset = fo.Dataset()

# Set default classes
dataset.default_classes = ["cat", "dog"]

# Edit the default classes
dataset.default_classes.append("rabbit")
dataset.save()  # must save after edits
@property
default_group_slice = (source)

The default group slice of the dataset, or None if the dataset is not grouped.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart-groups")

print(dataset.default_group_slice)
# left

# Change the default group slice
dataset.default_group_slice = "right"

print(dataset.default_group_slice)
# right
@property
default_mask_targets = (source)

A dict defining a default mapping between pixel values (2D masks) or RGB hex strings (3D masks) and label strings for the segmentation masks of all fiftyone.core.labels.Segmentation fields of this dataset that do not have customized mask targets defined in mask_targets.

Examples:

import fiftyone as fo

#
# 2D masks
#

dataset = fo.Dataset()

# Set default mask targets
dataset.default_mask_targets = {1: "cat", 2: "dog"}

# Or, for RGB mask targets
dataset.default_mask_targets = {"#3f0a44": "road", "#eeffee": "building", "#ffffff": "other"}

# Edit the default mask targets
dataset.default_mask_targets[255] = "other"
dataset.save()  # must save after edits

#
# 3D masks
#

dataset = fo.Dataset()

# Set default mask targets
dataset.default_mask_targets = {"#499CEF": "cat", "#6D04FF": "dog"}

# Edit the default mask targets
dataset.default_mask_targets["#FF6D04"] = "person"
dataset.save()  # must save after edits
@property
default_skeleton = (source)

A default fiftyone.core.odm.dataset.KeypointSkeleton defining the semantic labels and point connectivity for all fiftyone.core.labels.Keypoint fields of this dataset that do not have customized skeletons defined in skeletons.

Examples:

import fiftyone as fo

dataset = fo.Dataset()

# Set default keypoint skeleton
dataset.default_skeleton = fo.KeypointSkeleton(
    labels=[
        "left hand", "left shoulder", "right shoulder", "right hand",
        "left eye", "right eye", "mouth",
    ],
    edges=[[0, 1, 2, 3], [4, 5, 6]],
)

# Edit the default skeleton
dataset.default_skeleton.labels[-1] = "lips"
dataset.save()  # must save after edits

@property
deleted = (source)

Whether the dataset is deleted.

@property
description = (source)

A string description on the dataset.

Examples:

import fiftyone as fo

dataset = fo.Dataset()

# Store a description on the dataset
dataset.description = "Your description here"
@property
group_field = (source)

The group field of the dataset, or None if the dataset is not grouped.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart-groups")

print(dataset.group_field)
# group
@property
group_media_types = (source)

A dict mapping group slices to media types, or None if the dataset is not grouped.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart-groups")

print(dataset.group_media_types)
# {'left': 'image', 'right': 'image', 'pcd': 'point-cloud'}
@property
group_slices = (source)

The list of group slices of the dataset, or None if the dataset is not grouped.

Examples:

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart-groups")

print(dataset.group_slices)
# ['left', 'right', 'pcd']
@property
has_saved_views = (source)

Whether this dataset has any saved views.

@property
has_workspaces = (source)

Whether this dataset has any saved workspaces.

@property
info = (source)

A user-facing dictionary of information about the dataset.

Examples:

import fiftyone as fo

dataset = fo.Dataset()

# Store a class list in the dataset's info
dataset.info = {"classes": ["cat", "dog"]}

# Edit the info
dataset.info["other_classes"] = ["bird", "plane"]
dataset.save()  # must save after edits
@property
last_loaded_at = (source)

The datetime that the dataset was last loaded.

@property
last_modified_at = (source)

The datetime that the dataset was last modified.

@property
mask_targets = (source)

A dict mapping field names to mask target dicts, each of which defines a mapping between pixel values (2D masks) or RGB hex strings (3D masks) and label strings for the segmentation masks in the corresponding field of the dataset.

Examples:

import fiftyone as fo

#
# 2D masks
#

dataset = fo.Dataset()

# Set mask targets for the `ground_truth` and `predictions` fields
dataset.mask_targets = {
    "ground_truth": {1: "cat", 2: "dog"},
    "predictions": {1: "cat", 2: "dog", 255: "other"},
}

# Or, for RGB mask targets
dataset.mask_targets = {
    "segmentations": {"#3f0a44": "road", "#eeffee": "building", "#ffffff": "other"}
}

# Edit an existing mask target
dataset.mask_targets["ground_truth"][255] = "other"
dataset.save()  # must save after edits

#
# 3D masks
#

dataset = fo.Dataset()

# Set mask targets for the `ground_truth` and `predictions` fields
dataset.mask_targets = {
    "ground_truth": {"#499CEF": "cat", "#6D04FF": "dog"},
    "predictions": {
        "#499CEF": "cat", "#6D04FF": "dog", "#FF6D04": "person"
    },
}

# Edit an existing mask target
dataset.mask_targets["ground_truth"]["#FF6D04"] = "person"
dataset.save()  # must save after edits

@property
persistent = (source)

Whether the dataset persists in the database after a session is terminated.

@property
skeletons = (source)

A dict mapping field names to fiftyone.core.odm.dataset.KeypointSkeleton instances, each of which defines the semantic labels and point connectivity for the fiftyone.core.labels.Keypoint instances in the corresponding field of the dataset.

Examples:

import fiftyone as fo

dataset = fo.Dataset()

# Set keypoint skeleton for the `ground_truth` field
dataset.skeletons = {
    "ground_truth": fo.KeypointSkeleton(
        labels=[
            "left hand", "left shoulder", "right shoulder", "right hand",
            "left eye", "right eye", "mouth",
        ],
        edges=[[0, 1, 2, 3], [4, 5, 6]],
    )
}

# Edit an existing skeleton
dataset.skeletons["ground_truth"].labels[-1] = "lips"
dataset.save()  # must save after edits
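The `edges` list above uses chained index lists: assuming each inner list is drawn as a polyline connecting consecutive points, a small helper (hypothetical, not part of FiftyOne's API) can expand the chains into explicit point-index pairs:

```python
# Hypothetical helper that expands chained `edges` lists into explicit
# (start, end) index pairs, assuming each inner list is a polyline chain
def edge_pairs(edges):
    return [
        (chain[i], chain[i + 1])
        for chain in edges
        for i in range(len(chain) - 1)
    ]

print(edge_pairs([[0, 1, 2, 3], [4, 5, 6]]))
# [(0, 1), (1, 2), (2, 3), (4, 5), (5, 6)]
```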

@property
slug = (source)

The slug of the dataset.

@property
tags = (source)

A list of tags on the dataset.

Examples:

import fiftyone as fo

dataset = fo.Dataset()

# Add some tags
dataset.tags = ["test", "projectA"]

# Edit the tags
dataset.tags.pop()
dataset.tags.append("projectB")
dataset.save()  # must save after edits

@property
version = (source)

The version of the fiftyone package for which the dataset is formatted.

def _add_group_field(self, field_name, default=None, **kwargs): (source)

Undocumented

def _add_implied_frame_field(self, field_name, value, dynamic=False, validate=True): (source)

Undocumented

def _add_implied_sample_field(self, field_name, value, dynamic=False, validate=True): (source)

Undocumented

def _add_samples_batch(self, samples, expand_schema, dynamic, validate, batcher=None): (source)

Undocumented

def _add_view_stage(self, stage): (source)

Returns a fiftyone.core.view.DatasetView containing the contents of the collection with the given fiftyone.core.stages.ViewStage appended to its aggregation pipeline.

Subclasses are responsible for performing any validation on the view stage to ensure that it is a valid stage to add to this collection.

Parameters
stage: a fiftyone.core.stages.ViewStage
Returns
a fiftyone.core.view.DatasetView
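The pattern this method implements can be sketched with a toy class (names hypothetical, not FiftyOne internals): appending a stage returns a new view and leaves the original collection untouched.

```python
# Illustrative sketch of the view-stage pattern: add_stage() returns a new
# view with the stage appended; the original view is never mutated
class MiniView:
    def __init__(self, stages=None):
        self._stages = list(stages or [])

    def add_stage(self, stage):
        # subclasses would validate `stage` here before appending
        return MiniView(self._stages + [stage])

base = MiniView()
matched = base.add_stage({"$match": {"tags": "train"}})
print(len(base._stages), len(matched._stages))  # 0 1
```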
def _aggregate(self, pipeline=None, media_type=None, attach_frames=False, detach_frames=False, frames_only=False, support=None, group_slice=None, group_slices=None, detach_groups=False, groups_only=False, manual_group_select=False, post_pipeline=None): (source)

Runs the MongoDB aggregation pipeline on the collection and returns the result.

Parameters
pipeline (None): a MongoDB aggregation pipeline (list of dicts) to append to the current pipeline
media_type (None): the media type of the collection, if different than the source dataset's media type
attach_frames (False): whether to attach the frame documents immediately prior to executing pipeline. Only applicable to datasets that contain videos
detach_frames (False): whether to detach the frame documents at the end of the pipeline. Only applicable to datasets that contain videos
frames_only (False): whether to generate a pipeline that contains only the frames in the collection
support (None): an optional [first, last] range of frames to attach. Only applicable when attaching frames
group_slice (None): the current group slice of the collection, if different than the source dataset's group slice. Only applicable for grouped collections
group_slices (None): an optional list of group slices to attach when groups_only is True
detach_groups (False): whether to detach the group documents at the end of the pipeline. Only applicable to grouped collections
groups_only (False): whether to generate a pipeline that contains only the flattened group documents for the collection
manual_group_select (False): whether the pipeline has manually handled the initial group selection. Only applicable to grouped collections
post_pipeline (None): a MongoDB aggregation pipeline (list of dicts) to append to the very end of the pipeline, after all other arguments are applied
Returns
the aggregation result dict
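The `pipeline` and `post_pipeline` parameters take standard MongoDB aggregation pipelines. An illustrative sketch of their shape (the `$` operators are standard MongoDB; the field names are hypothetical):

```python
# An illustrative MongoDB aggregation pipeline (list of dicts) of the kind
# accepted by the `pipeline` parameter; field names are hypothetical
pipeline = [
    {"$match": {"tags": "validation"}},
    {"$project": {"filepath": 1, "ground_truth": 1}},
    {"$limit": 10},
]

# Every stage is a single-key dict whose key is a "$" operator
assert all(len(s) == 1 and next(iter(s)).startswith("$") for s in pipeline)
```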
def _apply_frame_field_schema(self, schema): (source)

Undocumented

def _apply_sample_field_schema(self, schema): (source)

Undocumented

def _attach_frames_pipeline(self, support=None): (source)

A pipeline that attaches the frame documents for each document.

def _attach_groups_pipeline(self, group_slices=None): (source)

A pipeline that attaches the requested group slice(s) for each document and stores them under groups.<slice> keys.

def _bulk_write(self, ops, ids=None, frames=False, ordered=False, progress=False): (source)

Undocumented

def _clear(self, view=None, sample_ids=None): (source)

Undocumented

def _clear_frame_fields(self, field_names, view=None): (source)

Undocumented

def _clear_frames(self, view=None, sample_ids=None, frame_ids=None): (source)

Undocumented

def _clear_groups(self, view=None, group_ids=None): (source)

Undocumented

def _clear_sample_fields(self, field_names, view=None): (source)

Undocumented

def _clone(self, name=None, persistent=False, view=None): (source)

Undocumented

def _clone_frame_fields(self, field_mapping, view=None): (source)

Undocumented

def _clone_sample_fields(self, field_mapping, view=None): (source)

Undocumented

def _delete(self): (source)

Undocumented

def _delete_frame_fields(self, field_names, error_level): (source)

Undocumented

def _delete_labels(self, labels, fields=None): (source)
def _delete_sample_fields(self, field_names, error_level): (source)

Undocumented

def _delete_saved_view(self, name): (source)

Undocumented

def _delete_summary_fields(self, field_names, error_level): (source)

Undocumented

def _delete_workspace(self, name): (source)

Undocumented

def _ensure_frames(self, view=None): (source)

Undocumented

def _ensure_label_field(self, label_field, label_cls): (source)

Undocumented

def _estimated_count(self, frames=False): (source)

Undocumented

def _expand_frame_schema(self, frames, dynamic): (source)

Undocumented

def _expand_group_schema(self, field_name, slice_name, media_type): (source)

Undocumented

def _expand_schema(self, samples, dynamic): (source)

Undocumented

def _frame_collstats(self): (source)

Undocumented

def _frame_dict_to_doc(self, d): (source)

Undocumented

def _get_default_summary_field_name(self, path): (source)

Undocumented

def _get_frame_collection(self, write_concern=None): (source)

Undocumented

def _get_sample_collection(self, write_concern=None): (source)

Undocumented

def _get_saved_view_doc(self, name, pop=False, slug=False): (source)

Undocumented

def _get_summarized_fields_map(self): (source)

Undocumented

def _get_workspace_doc(self, name, pop=False, slug=False): (source)

Undocumented

def _group_select_pipeline(self, slice_name): (source)

A pipeline that selects only the given slice's documents from the pipeline.

def _groups_only_pipeline(self, group_slices=None): (source)

A pipeline that looks up the requested group slices for each document and returns (only) the unwound group slices.

def _init_frames(self): (source)

Undocumented

def _iter_groups(self, group_slices=None, pipeline=None): (source)

Undocumented

def _iter_samples(self, pipeline=None): (source)

Undocumented

def _keep(self, view=None, sample_ids=None): (source)

Undocumented

def _keep_fields(self, view=None): (source)

Undocumented

def _keep_frames(self, view=None, frame_ids=None): (source)

Undocumented

def _load_saved_view_from_doc(self, view_doc): (source)

Undocumented

def _make_dict(self, sample, include_id=False, created_at=None, last_modified_at=None): (source)

Undocumented

def _make_frame(self, d): (source)

Undocumented

def _make_sample(self, d): (source)

Undocumented

def _merge_doc(self, doc, fields=None, omit_fields=None, expand_schema=True, merge_info=True, overwrite_info=False): (source)

Undocumented

def _merge_frame_field_schema(self, schema, expand_schema=True, recursive=True, validate=True): (source)

Undocumented

def _merge_sample_field_schema(self, schema, expand_schema=True, recursive=True, validate=True): (source)

Undocumented

def _pipeline(self, pipeline=None, media_type=None, attach_frames=False, detach_frames=False, frames_only=False, support=None, group_slice=None, group_slices=None, detach_groups=False, groups_only=False, manual_group_select=False, post_pipeline=None): (source)

Returns the MongoDB aggregation pipeline for the collection.

Parameters
pipeline (None): a MongoDB aggregation pipeline (list of dicts) to append to the current pipeline
media_type (None): the media type of the collection, if different than the source dataset's media type
attach_frames (False): whether to attach the frame documents immediately prior to executing pipeline. Only applicable to datasets that contain videos
detach_frames (False): whether to detach the frame documents at the end of the pipeline. Only applicable to datasets that contain videos
frames_only (False): whether to generate a pipeline that contains only the frames in the collection
support (None): an optional [first, last] range of frames to attach. Only applicable when attaching frames
group_slice (None): the current group slice of the collection, if different than the source dataset's group slice. Only applicable for grouped collections
group_slices (None): an optional list of group slices to attach when groups_only is True
detach_groups (False): whether to detach the group documents at the end of the pipeline. Only applicable to grouped collections
groups_only (False): whether to generate a pipeline that contains only the flattened group documents for the collection
manual_group_select (False): whether the pipeline has manually handled the initial group selection. Only applicable to grouped collections
post_pipeline (None): a MongoDB aggregation pipeline (list of dicts) to append to the very end of the pipeline, after all other arguments are applied
Returns
the aggregation pipeline
def _populate_summary_field(self, field_name, summary_info): (source)

Undocumented

def _reload(self, hard=False): (source)

Undocumented

def _reload_docs(self, hard=False): (source)

Undocumented

def _remove_dynamic_frame_fields(self, field_names, error_level): (source)

Undocumented

def _remove_dynamic_sample_fields(self, field_names, error_level): (source)

Undocumented

def _rename_frame_fields(self, field_mapping, view=None): (source)

Undocumented

def _rename_sample_fields(self, field_mapping, view=None): (source)

Undocumented

def _sample_collstats(self): (source)

Undocumented

def _sample_dict_to_doc(self, d): (source)

Undocumented

def _save(self, view=None, fields=None): (source)

Undocumented

def _save_field(self, field, _enforce_read_only=True): (source)

Undocumented

def _set_media_type(self, media_type): (source)

Undocumented

def _unwind_frames_pipeline(self): (source)

A pipeline that returns (only) the unwound frames documents.

def _unwind_groups_pipeline(self): (source)

A pipeline that returns (only) the unwound groups documents.

def _update_last_loaded_at(self, force=False): (source)

Undocumented

def _update_metadata_field(self, media_type): (source)

Undocumented

def _upsert_samples(self, samples, expand_schema=True, dynamic=False, validate=True, progress=None, num_samples=None): (source)

Undocumented

def _upsert_samples_batch(self, samples, expand_schema, dynamic, validate, batcher=None): (source)

Undocumented

def _validate_samples(self, samples): (source)

Undocumented

def _validate_saved_view_name(self, name, skip=None, overwrite=False): (source)

Undocumented

def _validate_workspace_name(self, name, skip=None, overwrite=False): (source)

Undocumented

_annotation_cache = (source)

Undocumented

_brain_cache = (source)

Undocumented

_deleted: bool = (source)

Undocumented

_evaluation_cache = (source)

Undocumented

_frame_doc_cls = (source)

Undocumented

_group_slice = (source)

Undocumented

_run_cache = (source)

Undocumented

_sample_doc_cls = (source)

Undocumented

@property
_dataset = (source)

The fiftyone.core.dataset.Dataset that serves the samples in this collection.

@property
_frame_collection = (source)

Undocumented

@property
_frame_collection_name = (source)

Undocumented

@property
_is_clips = (source)

Whether this collection contains clips.

@property
_is_dynamic_groups = (source)

Whether this collection contains dynamic groups.

@property
_is_frames = (source)

Whether this collection contains frames of a video dataset.

@property
_is_generated = (source)

Whether this collection's contents are generated from another collection.

@property
_is_patches = (source)

Whether this collection contains patches.

@property
_root_dataset = (source)

The root fiftyone.core.dataset.Dataset from which this collection is derived.

This is typically the same as _dataset but may differ in cases such as patches views.

@property
_sample_collection = (source)

Undocumented

@property
_sample_collection_name = (source)

Undocumented