class OpenImagesV7Dataset(FiftyOneDataset):
Constructor: OpenImagesV7Dataset(label_types, classes, attrs, image_ids, ...)
Open Images V7 is a dataset of ~9 million images, roughly 2 million of which are annotated and available via this zoo dataset.
The dataset contains annotations for classification, detection, segmentation, point labels, and visual relationship tasks for the 600 boxable object classes.
This dataset supports partial downloads:
- You can specify subsets of data to download via the label_types, classes, attrs, and max_samples parameters
- You can load specific images via the image_ids parameter
See :ref:`this page <dataset-zoo-open-images-v7>` for more information about partial downloads of this dataset.
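As a minimal sketch of loading specific images, the helper below assembles keyword arguments for foz.load_zoo_dataset() from a list of image IDs. The names image_id_kwargs and load_by_ids are hypothetical, and the IDs shown are placeholders rather than real Open Images IDs.

```python
def image_id_kwargs(image_ids, split="validation", label_types=None):
    """Build keyword arguments for loading specific images (hypothetical helper)."""
    ids = [i.strip() for i in image_ids if i and i.strip()]
    if not ids:
        raise ValueError("at least one image ID is required")
    # Drop duplicates while preserving the caller's order
    ids = list(dict.fromkeys(ids))
    kwargs = {"split": split, "image_ids": ids}
    if label_types is not None:
        kwargs["label_types"] = label_types
    return kwargs


def load_by_ids(image_ids, **options):
    # Imported lazily so that building kwargs does not require FiftyOne
    import fiftyone.zoo as foz

    kwargs = image_id_kwargs(image_ids, **options)
    return foz.load_zoo_dataset("open-images-v7", **kwargs)


# Placeholder IDs for illustration only; these are not real image IDs
kwargs = image_id_kwargs(["<image-id-1>", "<image-id-2>", "<image-id-1>"])
```

Calling load_by_ids() with real IDs triggers the actual load, which should download only images that are not already on disk.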
Full split stats:
- Train split: 1,743,042 images (513 GB)
- Test split: 125,436 images (36 GB)
- Validation split: 41,620 images (12 GB)
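The split figures above can be used for a rough disk estimate before a partial download. The sketch below sums the listed splits (1,910,098 images and 561 GB in total) and scales a split's size by the number of requested samples. It assumes every image in a split has the same average size and ignores label files, so treat the result as an approximation. The function name estimated_gb is hypothetical.

```python
# Split statistics copied from the list above: (image count, size in GB)
SPLITS = {
    "train": (1_743_042, 513),
    "test": (125_436, 36),
    "validation": (41_620, 12),
}

total_images = sum(count for count, _ in SPLITS.values())
total_gb = sum(size for _, size in SPLITS.values())


def estimated_gb(split, num_samples):
    """Rough size estimate, assuming a uniform average image size per split."""
    count, size_gb = SPLITS[split]
    # Never estimate more than the full split
    return size_gb * min(num_samples, count) / count


print(total_images, total_gb)
print(round(estimated_gb("validation", 50), 3))
```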
Notes:
- Not all images contain all types of labels
- All images have been rescaled so that their largest dimension is at most 1024 pixels
Example usage:
    import fiftyone as fo
    import fiftyone.zoo as foz

    #
    # Load 50 random samples from the validation split
    #
    # By default, all label types are loaded, including "points"
    #

    dataset = foz.load_zoo_dataset(
        "open-images-v7",
        split="validation",
        max_samples=50,
        shuffle=True,
    )

    session = fo.launch_app(dataset)

    #
    # Load detections, classifications, and points for 25 samples from the
    # validation split that contain fedoras and pianos
    #
    # Images that contain all `label_types` and `classes` will be
    # prioritized first, followed by images that contain at least one of
    # the required `classes`. If there are not enough images matching
    # `classes` in the split to meet `max_samples`, only the available
    # images will be loaded.
    #
    # Images will only be downloaded if necessary
    #

    dataset = foz.load_zoo_dataset(
        "open-images-v7",
        split="validation",
        label_types=["detections", "classifications", "points"],
        classes=["Fedora", "Piano"],
        max_samples=25,
    )

    session.dataset = dataset

    #
    # Download the entire validation split and load detections
    #
    # Subsequent partial loads of the validation split will never require
    # downloading any images
    #

    dataset = foz.load_zoo_dataset(
        "open-images-v7",
        split="validation",
        label_types=["detections"],
    )

    session.dataset = dataset
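The prioritization described in the example comments can be illustrated with a small stand-alone sketch. This is not FiftyOne's implementation; it only mimics the stated ordering, in which samples containing every required class come first, followed by samples containing at least one. The function select_samples and the toy sample data are hypothetical.

```python
import random


def select_samples(samples, classes, max_samples=None, shuffle=False, seed=None):
    """Order candidates so full matches come before partial matches.

    `samples` maps a sample ID to the set of classes it contains.
    """
    required = set(classes)
    # Keep only samples containing at least one required class
    candidates = [sid for sid, present in samples.items() if required & present]
    if shuffle:
        random.Random(seed).shuffle(candidates)
    # Stable sort: samples with every required class first, partial matches after
    candidates.sort(key=lambda sid: not required <= samples[sid])
    return candidates if max_samples is None else candidates[:max_samples]


samples = {
    "a": {"Fedora"},
    "b": {"Fedora", "Piano"},
    "c": {"Cat"},
    "d": {"Piano", "Cat"},
}
print(select_samples(samples, ["Fedora", "Piano"], max_samples=3))
```

If fewer candidates exist than max_samples, the sketch returns only the available ones, mirroring the behavior described above.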
Dataset size: 561 GB
Source: https://storage.googleapis.com/openimages/web/index.html
Parameters | |
label_types | a label type or list of label types to load. The supported values are ("detections", "classifications", "points", "relationships", "segmentations"). By default, all label types are loaded |
classes | a string or list of strings specifying required classes to load. If provided, only samples containing at least one instance of a specified class will be loaded |
attrs | a string or list of strings specifying required relationship attributes to load. Only applicable when label_types includes "relationships". If provided, only samples containing at least one instance of a specified attribute will be loaded |
image_ids | an optional list of specific image IDs to load. Can be provided in any of the following formats:
|
num_workers | a suggested number of threads to use when downloading individual images |
shuffle | whether to randomly shuffle the order in which samples are chosen for partial downloads |
seed | a random seed to use when shuffling |
max_samples | a maximum number of samples to load per split. If label_types, classes, and/or attrs are also specified, first priority will be given to samples that contain all of the specified label types, classes, and/or attributes, followed by samples that contain at least one of the specified label types or classes. The actual number of samples loaded may be less than this maximum value if the dataset does not contain sufficient samples matching your requirements. By default, all matching samples are loaded |
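The constraints in the parameter table can be checked before any download starts. The following sketch validates label types against the supported values listed above, accepts a string or a list for classes and attrs, and rejects attrs when "relationships" is excluded. The helper build_load_kwargs is hypothetical, and the attribute value "Wooden" and seed 51 are only illustrative.

```python
SUPPORTED_LABEL_TYPES = (
    "detections",
    "classifications",
    "points",
    "relationships",
    "segmentations",
)


def build_load_kwargs(label_types=None, classes=None, attrs=None,
                      max_samples=None, shuffle=False, seed=None):
    """Validate and assemble keyword arguments (hypothetical helper)."""
    def as_list(value):
        # Accept either a single string or a list of strings
        return [value] if isinstance(value, str) else list(value)

    kwargs = {}
    if label_types is not None:
        label_types = as_list(label_types)
        unknown = [t for t in label_types if t not in SUPPORTED_LABEL_TYPES]
        if unknown:
            raise ValueError(f"unsupported label types: {unknown}")
        kwargs["label_types"] = label_types
    if classes is not None:
        kwargs["classes"] = as_list(classes)
    if attrs is not None:
        # attrs only applies when relationships are loaded; omitting
        # label_types loads every label type, relationships included
        if label_types is not None and "relationships" not in label_types:
            raise ValueError('attrs requires "relationships" in label_types')
        kwargs["attrs"] = as_list(attrs)
    if max_samples is not None:
        kwargs["max_samples"] = max_samples
    if shuffle:
        kwargs["shuffle"] = True
        if seed is not None:
            kwargs["seed"] = seed
    return kwargs


kwargs = build_load_kwargs(
    label_types=["detections", "relationships"],
    attrs="Wooden",
    max_samples=25,
    shuffle=True,
    seed=51,
)
# Pass to the zoo loader with, for example:
#   foz.load_zoo_dataset("open-images-v7", split="validation", **kwargs)
```

Passing a seed together with shuffle=True keeps the selection of a partial download repeatable across runs.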
Method | __init__ | Undocumented |
Instance Variable | attrs | Undocumented |
Instance Variable | classes | Undocumented |
Instance Variable | image_ids | Undocumented |
Instance Variable | label_types | Undocumented |
Instance Variable | max_samples | Undocumented |
Instance Variable | num_workers | Undocumented |
Instance Variable | seed | Undocumented |
Instance Variable | shuffle | Undocumented |
Property | name | The name of the dataset. |
Property | supported_splits | A tuple of supported splits for the dataset, or None if the dataset does not have splits. |
Property | supports_partial_downloads | Whether the dataset supports downloading partial subsets of its splits. |
Property | tags | A tuple of tags for the dataset. |
Method | _download_and_prepare | Internal implementation of downloading the dataset and preparing it for use in the given directory. |
Inherited from ZooDataset (via FiftyOneDataset):
Static Method | get_info_path | Returns the path to the ZooDatasetInfo for the dataset. |
Static Method | has_info | Determines whether the directory contains ZooDatasetInfo. |
Static Method | load_info | Loads the ZooDatasetInfo from the given dataset directory. |
Method | download_and_prepare | Downloads the dataset and prepares it for use. |
Method | get_split_dir | Returns the directory for the given split of the dataset. |
Method | has_split | Whether the dataset has the given split. |
Method | has_tag | Whether the dataset has the given tag. |
Property | has_patches | Whether the dataset has patches that may need to be applied to already downloaded files. |
Property | has_splits | Whether the dataset has splits. |
Property | has_tags | Whether the dataset has tags. |
Property | importer_kwargs | A dict of default kwargs to pass to this dataset's fiftyone.utils.data.importers.DatasetImporter. |
Property | is_remote | Whether the dataset is remotely-sourced. |
Property | parameters | An optional dict of parameters describing the configuration of the zoo dataset when it was downloaded. |
Property | requires_manual_download | Whether this dataset requires some files to be manually downloaded by the user before the dataset can be loaded. |
Method | _get |
Undocumented |
Method | _is |
Undocumented |
Method | _is |
Undocumented |
Method | _patch_if_necessary | Internal method called when an already downloaded dataset may need to be patched. |
Undocumented
Internal implementation of downloading the dataset and preparing it for use in the given directory.
Parameters | |
dataset_dir | the directory in which to construct the dataset. If a split is provided, this is the directory for the split |
_ | Undocumented |
split | the split to download, or None if the dataset does not have splits |
scratch_dir | a scratch directory to use to download and prepare any required intermediate files |
Returns | |
tuple of |
|