The FiftyOne Dataset Zoo.
This package defines a collection of open source datasets made available for download via FiftyOne.
Module | base | FiftyOne Zoo Datasets provided natively by the library.
Module | tf | FiftyOne Zoo Datasets provided by tensorflow_datasets.
Module | torch | FiftyOne Zoo Datasets provided by torchvision.datasets.
From __init__.py:
Class | | Class representing a zoo dataset that no longer exists in the FiftyOne Dataset Zoo.
Class | | Class for working with remotely-sourced datasets that are compatible with the FiftyOne Dataset Zoo.
Class | | Base class for datasets made available in the FiftyOne Dataset Zoo.
Class | | Class containing info about a dataset in the FiftyOne Dataset Zoo.
Class | | Class containing info about a split of a dataset in the FiftyOne Dataset Zoo.
Function | delete | Deletes the zoo dataset from local disk, if necessary.
Function | download | Downloads the specified dataset from the FiftyOne Dataset Zoo.
Function | find | Returns the directory containing the given zoo dataset.
Function | get | Returns the ZooDataset instance for the given dataset.
Function | list | Returns information about the zoo datasets that have been downloaded.
Function | list | Returns the list of available zoo dataset sources.
Function | list | Lists the available datasets in the FiftyOne Dataset Zoo.
Function | load | Loads the specified dataset from the FiftyOne Dataset Zoo.
Function | load | Loads the ZooDatasetInfo for the specified zoo dataset.
Constant | DATASET | Undocumented
Variable | logger | Undocumented
Function | _download | Undocumented
Function | _download | Undocumented
Function | _find | Undocumented
Function | _get | Undocumented
Function | _get | Undocumented
Function | _get | Undocumented
Function | _init | Undocumented
Function | _list | Undocumented
Function | _load | Undocumented
Function | _load | Undocumented
Function | _migrate | Undocumented
Function | _normalize | Undocumented
Function | _overwrite | Undocumented
Function | _parse | Undocumented
Function | _parse | Undocumented
Function | _parse | Undocumented
Deletes the zoo dataset from local disk, if necessary.
If a split is provided, only that split is deleted.
Parameters | |
name | the name of the zoo dataset, or its remote source, which can be:
|
split:None |
Downloads the specified dataset from the FiftyOne Dataset Zoo.
Any dataset splits that have already been downloaded are not re-downloaded, unless overwrite == True is specified.
Note
To download from a private GitHub repository that you have access to, provide your GitHub personal access token by setting the GITHUB_TOKEN environment variable.
Parameters | |
name | the name of the zoo dataset to download, or the remote source to download it from, which can be:
|
split:None | a split to download, if applicable. Typical values are
("train", "validation", "test"). If neither split nor
splits are provided, all available splits are downloaded.
Consult the documentation for the ZooDataset you specified
to see the supported splits |
splits:None | a list of splits to download, if applicable. Typical
values are ("train", "validation", "test"). If neither
split nor splits are provided, all available splits are
downloaded. Consult the documentation for the ZooDataset
you specified to see the supported splits |
overwrite:False | whether to overwrite any existing files |
cleanup:True | whether to clean up any temporary files generated during download |
**kwargs | optional arguments for the ZooDataset constructor
or the remote dataset's download_and_prepare() method |
Returns | |
a tuple of |
|
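The split-selection rule described in the parameters above (no `split` and no `splits` means every available split) can be sketched in plain Python. This is only an illustration of the documented behavior, not FiftyOne's implementation: `resolve_splits` and its `available` argument are hypothetical names, and merging `split` with `splits` when both are given is an assumption.

```python
def resolve_splits(split=None, splits=None,
                   available=("train", "validation", "test")):
    """Return the splits that would be processed (illustrative helper only)."""
    if split is None and splits is None:
        # Neither argument given: fall back to every available split
        return list(available)

    requested = []
    if split is not None:
        requested.append(split)
    if splits is not None:
        # Assumption: accept a single string as well as a list of strings
        requested.extend([splits] if isinstance(splits, str) else splits)

    # Drop duplicates while keeping the order in which splits were requested
    ordered = list(dict.fromkeys(requested))

    unknown = [s for s in ordered if s not in available]
    if unknown:
        raise ValueError(f"Unsupported splits: {unknown}")
    return ordered


print(resolve_splits())
print(resolve_splits(split="test"))
print(resolve_splits(split="train", splits=["train", "validation"]))
```

With FiftyOne installed, the resulting list could be passed as the `splits` argument of `foz.download_zoo_dataset`; which splits a given dataset actually supports depends on its ZooDataset.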
Returns the directory containing the given zoo dataset.
If a split is provided, the path to the dataset split is returned; otherwise, the path to the root directory is returned.
The dataset must be downloaded. Use download_zoo_dataset to download datasets.
Parameters | |
name | the name of the zoo dataset or its remote source, which can be:
|
split:None | a specific split to locate |
Returns | |
the directory containing the dataset or split | |
Raises | |
ValueError | if the dataset or split does not exist or has not been downloaded |
Returns the ZooDataset instance for the given dataset.
If the dataset is available from multiple sources, the default source is used.
Parameters | |
name | the name of the zoo dataset, or its remote source, which can be:
|
overwrite:False | whether to overwrite existing metadata if it has already been downloaded. Only applicable when name_or_url is a remote source |
**kwargs | optional arguments for ZooDataset |
Returns | |
the ZooDataset instance |
Returns information about the zoo datasets that have been downloaded.
Returns | |
a dict mapping dataset names to (dataset_dir, ZooDatasetInfo) tuples |
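As a small illustration of the return shape, the following sketch walks a dict of `name -> (dataset_dir, info)` tuples and builds one report line per dataset. `summarize_downloaded` is a hypothetical helper, and the call to `foz.list_downloaded_zoo_datasets()` assumes that is the public name of this function; the demonstration uses a hand-built dict so it runs without FiftyOne.

```python
def summarize_downloaded(downloaded):
    """Format a {name: (dataset_dir, info)} mapping as sorted report lines."""
    lines = []
    for name in sorted(downloaded):
        dataset_dir, _info = downloaded[name]  # info object is not inspected here
        lines.append(f"{name}: {dataset_dir}")
    return lines


def downloaded_report():
    """Build the report from the real zoo (requires FiftyOne to be installed)."""
    import fiftyone.zoo as foz
    return summarize_downloaded(foz.list_downloaded_zoo_datasets())


# Hand-built stand-in for the real return value; object() replaces ZooDatasetInfo
fake = {
    "quickstart": ("/data/zoo/quickstart", object()),
    "cifar10": ("/data/zoo/cifar10", object()),
}
for line in summarize_downloaded(fake):
    print(line)
```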
Lists the available datasets in the FiftyOne Dataset Zoo.
Also includes any remotely-sourced zoo datasets that you've downloaded.
Example usage:
    import fiftyone as fo
    import fiftyone.zoo as foz

    #
    # List all zoo datasets
    #

    names = foz.list_zoo_datasets()
    print(names)

    #
    # List all zoo datasets with (both of) the specified tags
    #

    names = foz.list_zoo_datasets(tags=["image", "detection"])
    print(names)

    #
    # List all zoo datasets available via the given source
    #

    names = foz.list_zoo_datasets(source="torch")
    print(names)
Parameters | |
tags:None | only include datasets that have the specified tag or list of tags |
source:None | only include datasets available via the given source or list of sources |
Returns | |
a sorted list of dataset names |
Loads the specified dataset from the FiftyOne Dataset Zoo.
By default, the dataset will be downloaded if necessary.
Note
To download from a private GitHub repository that you have access to, provide your GitHub personal access token by setting the GITHUB_TOKEN environment variable.
If you do not specify a custom dataset_name and you have previously loaded the same zoo dataset and split(s) into FiftyOne, the existing dataset will be returned.
Parameters | |
name | the name of the zoo dataset to load, or the remote source to load it from, which can be:
|
split:None | a split to load, if applicable. Typical values are
("train", "validation", "test"). If neither split nor
splits are provided, all available splits are loaded. Consult
the documentation for the ZooDataset you specified to see
the supported splits |
splits:None | a list of splits to load, if applicable. Typical values
are ("train", "validation", "test"). If neither split nor
splits are provided, all available splits are loaded. Consult
the documentation for the ZooDataset you specified to see
the supported splits |
label_field:None | the label field (or prefix, if the dataset contains multiple label fields) in which to store the dataset's labels. By default, this is "ground_truth" if the dataset contains a single label field. If the dataset contains multiple label fields and this value is not provided, the labels will be stored under dataset-specific field names |
dataset_name:None | an optional name to give the returned
fiftyone.core.dataset.Dataset. By default, a name will be
constructed based on the dataset and split(s) you are loading |
download_if_necessary:True | whether to download the dataset if it is not found in the specified dataset directory |
drop_existing_dataset:False | whether to drop an existing dataset with the same name if it exists |
persistent:False | whether the dataset should persist in the database after the session terminates |
overwrite:False | whether to overwrite any existing files if the dataset is to be downloaded |
cleanup:True | whether to clean up any temporary files generated during download |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
**kwargs | optional arguments to pass to the
fiftyone.utils.data.importers.DatasetImporter constructor
or the remote dataset's load_dataset() method. If
download_if_necessary == True, then kwargs can also contain
arguments for download_zoo_dataset |
Returns | |
a fiftyone.core.dataset.Dataset |
Loads the ZooDatasetInfo for the specified zoo dataset.
The dataset must be downloaded. Use download_zoo_dataset to download datasets.
Parameters | |
name | the name of the zoo dataset or its remote source, which can be:
|
Returns | |
the ZooDatasetInfo for the dataset | |
Raises | |
ValueError | if the dataset has not been downloaded |