
Loading data into FiftyOne

The first step to using FiftyOne is to load your data into a dataset. FiftyOne can automatically load datasets stored in many common formats. If your dataset is stored in a custom format, FiftyOne also makes it easy to load it with a few lines of Python.

Check out the sections below to see which import pattern is the best fit for your data.

Note

Did you know? You can import media and/or labels from within the FiftyOne App by installing the @voxel51/io plugin!

Note

When you create a Dataset, its samples and all of their fields (metadata, labels, custom fields, etc.) are written to FiftyOne’s backing database.

Important: Samples only store the filepath to the media, not the raw media itself. FiftyOne does not create duplicate copies of your data!

Common formats

If your data is stored on disk in one of the many common formats supported natively by FiftyOne, then you can automatically load your data into a Dataset via the following simple pattern:

import fiftyone as fo

# A name for the dataset
name = "my-dataset"

# The directory containing the dataset to import
dataset_dir = "/path/to/dataset"

# The type of the dataset being imported
dataset_type = fo.types.COCODetectionDataset  # for example

dataset = fo.Dataset.from_dir(
    dataset_dir=dataset_dir,
    dataset_type=dataset_type,
    name=name,
)

Note

See the FiftyOne documentation on loading datasets from disk for more details about the supported common formats!

Custom formats

The simplest and most flexible approach to loading your data into FiftyOne is to iterate over your data in a simple Python loop, create a Sample for each data + label(s) pair, and then add those samples to a Dataset.

FiftyOne provides label types for common tasks such as classification, detection, segmentation, and many more. The examples below give you a sense of the basic workflow for a few tasks:

Note that using Dataset.add_samples() to add batches of samples to your datasets can be significantly more efficient than adding samples one-by-one via Dataset.add_sample().
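As a rough illustration of this loop for a classification task, the sketch below assumes a hypothetical directory layout in which each subdirectory is named after a class (for example, root/cat/001.jpg). The helper names collect_labeled_images and build_classification_dataset and the ground_truth field name are illustrative choices, not part of FiftyOne. The fiftyone import sits inside the function that needs it, so the path-collecting helper can be used on its own.

```python
from pathlib import Path


def collect_labeled_images(root_dir):
    """Returns (filepath, label) pairs for a hypothetical layout in which
    each subdirectory of ``root_dir`` is named after a class."""
    pairs = []
    for path in sorted(Path(root_dir).glob("*/*")):
        if path.is_file():
            # Absolute filepath plus the parent directory name as the label
            pairs.append((str(path.resolve()), path.parent.name))
    return pairs


def build_classification_dataset(pairs, name=None):
    """Creates a dataset with one Sample per (filepath, label) pair."""
    import fiftyone as fo  # requires FiftyOne to be installed

    samples = []
    for filepath, label in pairs:
        sample = fo.Sample(filepath=filepath)
        # "ground_truth" is an arbitrary field name chosen for this sketch
        sample["ground_truth"] = fo.Classification(label=label)
        samples.append(sample)

    dataset = fo.Dataset(name)
    dataset.add_samples(samples)  # one batched call instead of many add_sample() calls
    return dataset
```

Calling build_classification_dataset(collect_labeled_images("/path/to/images"), name="my-dataset") would then create a dataset with one sample per file. Other tasks follow the same shape, with a different label type stored on each sample.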

Note

If you use the same custom data format frequently in your workflows, then writing a custom dataset importer is a great way to abstract and streamline the loading of your data into FiftyOne.

Loading images

If you’re just getting started with a project and all you have is a bunch of image files, you can easily load them into a FiftyOne dataset and start visualizing them in the App:
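A minimal sketch of one way to do this, assuming the images live somewhere under a single directory. The find_images and load_images helpers and the IMAGE_EXTENSIONS set are hypothetical names introduced for illustration; Dataset.from_images() is the FiftyOne call that accepts a list of image paths.

```python
from pathlib import Path

# Assumption: adjust this set to the image types you actually have
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def find_images(images_dir):
    """Recursively collects absolute paths of image files under ``images_dir``."""
    return sorted(
        str(p.resolve())
        for p in Path(images_dir).rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def load_images(images_dir, name=None):
    """Creates an unlabeled image dataset from the files found above."""
    import fiftyone as fo  # requires FiftyOne to be installed

    return fo.Dataset.from_images(find_images(images_dir), name=name)
```

If no filtering is needed, fo.Dataset.from_images_dir("/path/to/images") loads a whole directory directly. Either way, passing the resulting dataset to fo.launch_app() opens it in the App.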

Loading videos

If you’re just getting started with a project and all you have is a bunch of video files, you can easily load them into a FiftyOne dataset and start visualizing them in the App:
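The video case can be sketched the same way. The load_videos and open_in_app wrappers below are hypothetical helpers for illustration; they call Dataset.from_videos_dir() and fo.launch_app(). Working with video typically also requires ffmpeg to be installed on the machine.

```python
def load_videos(videos_dir, name=None):
    """Creates an unlabeled video dataset from a directory of video files."""
    import fiftyone as fo  # requires FiftyOne to be installed

    return fo.Dataset.from_videos_dir(videos_dir, name=name)


def open_in_app(dataset):
    """Launches the FiftyOne App on the given dataset and returns the session."""
    import fiftyone as fo

    return fo.launch_app(dataset)
```

For example, open_in_app(load_videos("/path/to/videos")) would load the directory and open it for browsing.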

Model predictions

Once you’ve created a dataset and ground truth labels, you can easily add model predictions to take advantage of FiftyOne’s evaluation capabilities.
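One possible shape for this step, shown for classification: loop over the samples, run the model on each filepath, and store the result in a new field. Here dummy_predict is a hypothetical stand-in for a real model, and add_predictions and the predictions field name are illustrative choices rather than FiftyOne APIs.

```python
from pathlib import Path


def dummy_predict(filepath):
    """Hypothetical stand-in for a real model: returns (label, confidence)."""
    stem = Path(filepath).stem.lower()
    label = "cat" if "cat" in stem else "other"
    return label, 0.5


def add_predictions(dataset, predict_fn=dummy_predict, field="predictions"):
    """Stores one Classification per sample in ``field``."""
    import fiftyone as fo  # requires FiftyOne to be installed

    for sample in dataset:
        label, confidence = predict_fn(sample.filepath)
        sample[field] = fo.Classification(label=label, confidence=confidence)
        sample.save()  # persist the new field to the backing database
```

With ground truth stored in a ground_truth field, dataset.evaluate_classifications("predictions", gt_field="ground_truth") should then return a results object whose print_report() method summarizes per-class metrics.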

Need data?

The FiftyOne Dataset Zoo contains dozens of popular public datasets that you can load into FiftyOne in a single line of code:

import fiftyone.zoo as foz

# List available datasets
print(foz.list_zoo_datasets())
# ['coco-2014', ...,  'kitti', ..., 'voc-2012', ...]

# Load a split of a zoo dataset
dataset = foz.load_zoo_dataset("cifar10", split="train")

Note

Check out the available zoo datasets!