module documentation

Dataset exporters.

Copyright 2017-2025, Voxel51, Inc.

Class BatchDatasetExporter Base interface for exporters that export entire fiftyone.core.collections.SampleCollection instances in a single batch.
Class DatasetExporter Base interface for exporting datasets.
Class ExportPathsMixin Mixin for DatasetExporter classes that provides convenience methods for parsing the data_path, labels_path, and export_media parameters supported by many exporters.
Class FiftyOneDatasetExporter Exporter that writes an entire FiftyOne dataset to disk in a serialized JSON format along with its source media.
Class FiftyOneImageClassificationDatasetExporter Exporter that writes an image classification dataset to disk in a simple JSON format.
Class FiftyOneImageDetectionDatasetExporter Exporter that writes an image detection dataset to disk in a simple JSON format.
Class FiftyOneImageLabelsDatasetExporter Exporter that writes a labeled image dataset to disk with labels stored in ETA ImageLabels format.
Class FiftyOneTemporalDetectionDatasetExporter Exporter that writes a temporal video detection dataset to disk in a simple JSON format.
Class FiftyOneVideoLabelsDatasetExporter Exporter that writes a labeled video dataset with labels stored in ETA VideoLabels format.
Class GenericSampleDatasetExporter Interface for exporting datasets of arbitrary fiftyone.core.sample.Sample instances.
Class GroupDatasetExporter Interface for exporting grouped datasets.
Class ImageClassificationDirectoryTreeExporter Exporter that writes an image classification directory tree to disk.
Class ImageDirectoryExporter Exporter that writes a directory of images to disk.
Class ImageExporter Utility class for DatasetExporter instances that export images.
Class ImageSegmentationDirectoryExporter Exporter that writes an image segmentation dataset to disk.
Class LabeledImageDatasetExporter Interface for exporting datasets of labeled image samples.
Class LabeledVideoDatasetExporter Interface for exporting datasets of labeled video samples.
Class LegacyFiftyOneDatasetExporter Legacy exporter that writes an entire FiftyOne dataset to disk in a serialized JSON format along with its source media.
Class MediaDirectoryExporter Exporter that writes a directory of media files of arbitrary type to disk.
Class MediaExporter Base class for DatasetExporter utilities that provide support for populating a directory or manifest of media files.
Class UnlabeledImageDatasetExporter Interface for exporting datasets of unlabeled image samples.
Class UnlabeledMediaDatasetExporter Interface for exporting datasets of unlabeled samples.
Class UnlabeledVideoDatasetExporter Interface for exporting datasets of unlabeled video samples.
Class VideoClassificationDirectoryTreeExporter Exporter that writes a video classification directory tree to disk.
Class VideoDirectoryExporter Exporter that writes a directory of videos to disk.
Class VideoExporter Utility class for DatasetExporter instances that export videos.
Function build_dataset_exporter Builds the DatasetExporter instance for the given parameters.
Function export_samples Exports the given samples to disk.
Function write_dataset Writes the samples to disk as a dataset in the specified format.
Variable logger Undocumented
Function _check_for_clips_export Undocumented
Function _check_for_patches_export Undocumented
Function _classification_to_detections Undocumented
Function _export_annotation_results Undocumented
Function _export_brain_results Undocumented
Function _export_evaluation_results Undocumented
Function _export_run_results Undocumented
Function _make_label_coercion_functions Undocumented
Function _make_single_label_to_list_fcn Undocumented
Function _parse_attributes Undocumented
Function _parse_classifications Undocumented
Function _parse_detections Undocumented
Function _parse_temporal_detections Undocumented
Function _to_labels_map_rev Undocumented
Function _write_batch_dataset Undocumented
Function _write_generic_sample_dataset Undocumented
Function _write_group_dataset Undocumented
Function _write_image_dataset Undocumented
Function _write_unlabeled_dataset Undocumented
Function _write_video_dataset Undocumented
def build_dataset_exporter(dataset_type, strip_none=True, warn_unused=True, **kwargs): (source)

Builds the DatasetExporter instance for the given parameters.

Parameters
dataset_type: the fiftyone.types.Dataset type
strip_none (True): whether to exclude None-valued items from kwargs
warn_unused (True): whether to issue warnings for any non-None unused parameters encountered
**kwargs: keyword arguments to pass to the dataset exporter's constructor via DatasetExporter(**kwargs)
Returns
a tuple of
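
The strip_none and warn_unused behavior can be illustrated with a small standalone sketch. This is not FiftyOne's implementation: ToyExporter and build_with_kwargs are hypothetical names, the sketch takes a class directly rather than a fiftyone.types.Dataset type, and returning the unused keyword arguments alongside the instance is an assumption made for illustration.

```python
import inspect
import warnings


class ToyExporter:
    """Hypothetical stand-in for a DatasetExporter subclass."""

    def __init__(self, export_dir=None, image_format=None):
        self.export_dir = export_dir
        self.image_format = image_format


def build_with_kwargs(exporter_cls, strip_none=True, warn_unused=True, **kwargs):
    """Instantiate exporter_cls using only the kwargs its constructor accepts.

    Returns a tuple of (instance, unused_kwargs). Illustrative only.
    """
    if strip_none:
        # Drop None-valued items so constructor defaults apply
        kwargs = {k: v for k, v in kwargs.items() if v is not None}

    # Names accepted by the constructor, excluding `self`
    accepted = set(inspect.signature(exporter_cls.__init__).parameters) - {"self"}
    used = {k: v for k, v in kwargs.items() if k in accepted}
    unused = {k: v for k, v in kwargs.items() if k not in accepted}

    if warn_unused:
        for name, value in unused.items():
            # Only non-None unused parameters trigger a warning
            if value is not None:
                warnings.warn(f"Ignoring unsupported parameter '{name}'")

    return exporter_cls(**used), unused


exporter, unused = build_with_kwargs(
    ToyExporter,
    warn_unused=False,
    export_dir="/tmp/out",
    image_format=None,  # stripped because strip_none=True
    classes=["cat"],  # not accepted by ToyExporter, so reported as unused
)
```

Here image_format never reaches the constructor, and classes ends up in the unused dictionary instead of raising an error.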
def export_samples(samples, export_dir=None, dataset_type=None, data_path=None, labels_path=None, export_media=None, rel_dir=None, dataset_exporter=None, label_field=None, frame_labels_field=None, progress=None, num_samples=None, **kwargs): (source)

Exports the given samples to disk.

You can perform exports with this method via the following basic patterns:

  1. Provide export_dir and dataset_type to export the content to a directory in the default layout for the specified format, as documented on the exporting-datasets page
  2. Provide dataset_type along with data_path, labels_path, and/or export_media to directly specify where to export the source media and/or labels (if applicable) in your desired format. This syntax provides the flexibility to, for example, perform workflows like labels-only exports
  3. Provide a dataset_exporter to which to feed samples to perform a fully-customized export

In all workflows, the remaining parameters of this method can be provided to further configure the export.

See the exporting-datasets page for more information about the available export formats and examples of using this method.

See the custom-dataset-exporter guide for more details about exporting datasets in custom formats by defining your own fiftyone.utils.data.exporters.DatasetExporter.

This method will automatically coerce the data to match the requested export in the following cases:

Parameters
samples: a fiftyone.core.collections.SampleCollection
export_dir (None): the directory to which to export the samples in format dataset_type
dataset_type (None): the fiftyone.types.Dataset type to write
data_path (None): an optional parameter that enables explicit control over the location of the exported media for certain export formats. Can be any of the following:

  • a folder name like "data" or "data/" specifying a subfolder of export_dir in which to export the media
  • an absolute directory path in which to export the media. In this case, the export_dir has no effect on the location of the data
  • a filename like "data.json" specifying the filename of a JSON manifest file in export_dir generated when export_media is "manifest"
  • an absolute filepath specifying the location to write the JSON manifest file when export_media is "manifest". In this case, export_dir has no effect on the location of the data

If None, a default value of this parameter will be chosen based on the value of the export_media parameter. Note that this parameter is not applicable to certain export formats such as binary types like TF records
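
The four data_path cases above reduce to one rule: a relative value is resolved against export_dir, while an absolute value is used as-is. A minimal sketch of that rule follows, assuming POSIX-style paths. resolve_data_path is a hypothetical helper, not part of FiftyOne, and the defaults it picks when data_path is None ("data" and "data.json") are assumptions for illustration.

```python
import posixpath


def resolve_data_path(export_dir, data_path=None, export_media=True):
    """Return the location of the exported media or manifest (sketch only)."""
    if data_path is None:
        # Assumed defaults: a manifest file for "manifest", else a media folder
        data_path = "data.json" if export_media == "manifest" else "data"

    if posixpath.isabs(data_path):
        # Absolute paths ignore export_dir entirely
        return posixpath.normpath(data_path)

    if export_dir is None:
        raise ValueError("export_dir is required when data_path is relative")

    # Folder names and filenames are both resolved inside export_dir
    return posixpath.normpath(posixpath.join(export_dir, data_path))


subfolder = resolve_data_path("/exports/run1", "data/")
absolute = resolve_data_path("/exports/run1", "/mnt/media")
manifest = resolve_data_path("/exports/run1", export_media="manifest")
```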

labels_path (None): an optional parameter that enables explicit control over the location of the exported labels. Only applicable when exporting in certain labeled dataset formats. Can be any of the following:

  • a type-specific folder name like "labels" or "labels/" or a filename like "labels.json" or "labels.xml" specifying the location in export_dir in which to export the labels
  • an absolute directory or filepath in which to export the labels. In this case, the export_dir has no effect on the location of the labels

For labeled datasets, the default value of this parameter will be chosen based on the export format so that the labels will be exported into export_dir

export_media (None): controls how to export the raw media. The supported values are:

  • True: copy all media files into the output directory
  • False: don't export media. This option is only useful when exporting labeled datasets whose label format stores sufficient information to locate the associated media
  • "move": move all media files into the output directory
  • "symlink": create symlinks to the media files in the output directory
  • "manifest": create a data.json in the output directory that maps UUIDs used in the labels files to the filepaths of the source media, rather than exporting the actual media

If None, an appropriate default value of this parameter will be chosen based on the value of the data_path parameter. Note that some dataset formats may not support certain values for this parameter (e.g., when exporting in binary formats such as TF records, "symlink" is not an option)
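
To make the difference between these export_media values concrete, here is a standard-library sketch that handles a single media file under each mode. export_media_file is a hypothetical helper, not FiftyOne's implementation, and using the filename stem as the manifest UUID is an assumption made for illustration.

```python
import json
import os
import shutil
import tempfile


def export_media_file(src, out_dir, export_media=True, manifest=None):
    """Handle one media file according to export_media (sketch only).

    Returns the output path, or None when no media file is written.
    """
    dst = os.path.join(out_dir, os.path.basename(src))

    if export_media is True:
        shutil.copy(src, dst)  # source file is left in place
    elif export_media == "move":
        shutil.move(src, dst)  # source file no longer exists afterwards
    elif export_media == "symlink":
        os.symlink(os.path.abspath(src), dst)
    elif export_media == "manifest":
        # Record where the media lives instead of exporting it
        uuid = os.path.splitext(os.path.basename(src))[0]  # assumed UUID scheme
        manifest[uuid] = os.path.abspath(src)
        return None
    elif export_media is False:
        return None
    else:
        raise ValueError(f"Unsupported export_media={export_media!r}")

    return dst


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "img001.jpg")
    with open(src, "wb") as f:
        f.write(b"fake image bytes")

    out_dir = os.path.join(tmp, "export", "data")
    os.makedirs(out_dir)

    # True: the file is copied and the original remains
    copied = export_media_file(src, out_dir, export_media=True)
    copy_exists = os.path.isfile(copied) and os.path.isfile(src)

    # "manifest": only a JSON mapping is written, no media
    manifest = {}
    export_media_file(src, out_dir, export_media="manifest", manifest=manifest)
    with open(os.path.join(tmp, "export", "data.json"), "w") as f:
        json.dump(manifest, f)
    manifest_keys = sorted(manifest)
```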

rel_dir (None): an optional relative directory to strip from each input filepath to generate a unique identifier for each media. When exporting media, this identifier is joined with data_path to generate an output path for each exported media. This argument allows for populating nested subdirectories that match the shape of the input paths. The path is converted to an absolute path (if necessary) via fiftyone.core.storage.normalize_path
dataset_exporter (None): a DatasetExporter to use to write the dataset
label_field (None): the name of the label field to export, or a dictionary mapping field names to output keys describing the label fields to export. Only applicable if dataset_exporter is a LabeledImageDatasetExporter or LabeledVideoDatasetExporter, or if you are exporting image patches
frame_labels_field (None): the name of the frame label field to export, or a dictionary mapping field names to output keys describing the frame label fields to export. Only applicable if dataset_exporter is a LabeledVideoDatasetExporter
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
num_samples (None): the number of samples in samples. If omitted, this is computed (if possible) via len(samples) if needed for progress tracking
**kwargs: optional keyword arguments to pass to the dataset exporter's constructor. If you are exporting image patches, this can also contain keyword arguments for fiftyone.utils.patches.ImagePatchesExtractor
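
The effect of rel_dir described above can be sketched with plain path arithmetic: the prefix is stripped from the input filepath and the remainder is joined with data_path. output_path_for is a hypothetical helper, not FiftyOne's implementation. The sketch assumes absolute POSIX-style inputs, and falling back to the bare filename when rel_dir is None is an assumption.

```python
import posixpath


def output_path_for(filepath, data_path, rel_dir=None):
    """Compute where a media file would be written (sketch only)."""
    if rel_dir is None:
        # Assumed fallback: flatten everything into data_path
        identifier = posixpath.basename(filepath)
    else:
        # Strip rel_dir so the nested layout of the input is preserved
        identifier = posixpath.relpath(filepath, rel_dir)

    return posixpath.join(data_path, identifier)


nested = output_path_for(
    "/datasets/raw/train/cats/001.jpg", "/exports/data", rel_dir="/datasets/raw"
)
flat = output_path_for("/datasets/raw/train/cats/001.jpg", "/exports/data")
```

With rel_dir set, the train/cats/ subdirectories are reproduced under data_path. Without it, two inputs with the same filename in different folders would map to the same output path in this sketch.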
def write_dataset(samples, sample_parser, dataset_exporter, sample_collection=None, progress=None, num_samples=None): (source)

Writes the samples to disk as a dataset in the specified format.

Parameters
samples: an iterable of samples that can be parsed by sample_parser
sample_parser: a fiftyone.utils.data.parsers.SampleParser to use to parse the samples
dataset_exporter: a DatasetExporter to use to write the dataset
sample_collection (None): the fiftyone.core.collections.SampleCollection from which samples were extracted. If samples is itself a fiftyone.core.collections.SampleCollection, this parameter defaults to samples. This parameter is optional and is only passed to DatasetExporter.log_collection
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
num_samples (None): the number of samples in samples. If omitted, this is computed (if possible) via len(samples) if needed for progress tracking
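
The "computed (if possible)" wording for num_samples matters because samples may be any iterable, and generators have no length. A minimal sketch of that fallback, using a hypothetical helper count_if_possible rather than FiftyOne's own code:

```python
def count_if_possible(samples, num_samples=None):
    """Return the sample count for progress tracking, or None if unknown."""
    if num_samples is not None:
        return num_samples  # an explicit count always wins

    try:
        return len(samples)
    except TypeError:
        # Generators and other unsized iterables have no len()
        return None


from_list = count_if_possible(["a.jpg", "b.jpg", "c.jpg"])
from_generator = count_if_possible(path for path in ["a.jpg", "b.jpg"])
from_hint = count_if_possible(iter([]), num_samples=10)
```

Passing num_samples explicitly is what lets a progress bar show a total when samples is a generator.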

logger

Undocumented

def _check_for_clips_export(samples, dataset_exporter, label_field, kwargs): (source)

Undocumented

def _check_for_patches_export(samples, dataset_exporter, label_field, kwargs): (source)

Undocumented

def _classification_to_detections(label): (source)

Undocumented

def _export_annotation_results(sample_collection, anno_dir): (source)

Undocumented

def _export_brain_results(sample_collection, brain_dir): (source)

Undocumented

def _export_evaluation_results(sample_collection, eval_dir): (source)

Undocumented

def _export_run_results(sample_collection, runs_dir): (source)

Undocumented

def _make_label_coercion_functions(label_field_or_dict, sample_collection, dataset_exporter, frames=False, validate=True): (source)

Undocumented

def _make_single_label_to_list_fcn(label_cls): (source)

Undocumented

def _parse_attributes(label_dict, label, include_confidence=None, include_attributes=None): (source)

Undocumented

def _parse_classifications(label, labels_map_rev=None, include_confidence=False, include_attributes=None): (source)

Undocumented

def _parse_detections(detections, labels_map_rev=None, include_confidence=None, include_attributes=None): (source)

Undocumented

def _parse_temporal_detections(temporal_detections, labels_map_rev=None, metadata=None, use_timestamps=False, include_confidence=None, include_attributes=None): (source)

Undocumented

def _to_labels_map_rev(classes): (source)

Undocumented

def _write_batch_dataset(dataset_exporter, samples, progress=None): (source)

Undocumented

def _write_generic_sample_dataset(dataset_exporter, samples, sample_collection=None, progress=None, num_samples=None): (source)

Undocumented

def _write_group_dataset(dataset_exporter, samples, sample_collection=None, progress=None, num_samples=None): (source)

Undocumented

def _write_image_dataset(dataset_exporter, samples, sample_parser, sample_collection=None, progress=None, num_samples=None): (source)

Undocumented

def _write_unlabeled_dataset(dataset_exporter, samples, sample_parser, sample_collection=None, progress=None, num_samples=None): (source)

Undocumented

def _write_video_dataset(dataset_exporter, samples, sample_parser, sample_collection=None, progress=None, num_samples=None): (source)

Undocumented