fiftyone.zoo.datasets.base.ActivityNet100Dataset

class documentation

class ActivityNet100Dataset(FiftyOneDataset):

Constructor: ActivityNet100Dataset(source_dir, classes, max_duration, copy_files, ...)

ActivityNet is a large-scale video dataset for human activity understanding supporting the tasks of global video classification, trimmed activity classification, and temporal activity detection.

This version contains videos and temporal activity detections for the 100-class version of the dataset.

Notes:

ActivityNet 100 and 200 differ in the number of activity classes and videos per split
Partial downloads will download videos (if still available) from YouTube
Full splits can be loaded by first downloading the official source files from the ActivityNet maintainers
The test set does not have annotations

Full split stats:

Train split: 4,819 videos (7,151 instances)
Test split: 2,480 videos (labels withheld)
Validation split: 2,383 videos (3,582 instances)

Partial downloads:

You can specify subsets of data to download via the max_duration, classes, and max_samples parameters
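
As a minimal sketch of how these parameters combine, the helper below collects only the partial-download options that were set and forwards them to load_zoo_dataset(). The helper names partial_download_kwargs and load_partial are hypothetical and are not part of FiftyOne; the parameter names are taken from the Parameters list on this page.

```python
def partial_download_kwargs(classes=None, max_duration=None, max_samples=None):
    """Return only the partial-download parameters that were actually set."""
    candidates = {
        "classes": classes,
        "max_duration": max_duration,
        "max_samples": max_samples,
    }
    return {name: value for name, value in candidates.items() if value is not None}


def load_partial(split, **params):
    """Load part of an ActivityNet 100 split (hypothetical convenience wrapper)."""
    # Imported lazily so that building the kwargs does not require FiftyOne
    import fiftyone.zoo as foz

    return foz.load_zoo_dataset("activitynet-100", split=split, **params)


kwargs = partial_download_kwargs(
    classes=["Bathing dog"], max_duration=30, max_samples=5
)
print(kwargs)

# Uncomment to download (if necessary) and load the matching videos:
# dataset = load_partial("validation", **kwargs)
```

Leaving a parameter unset keeps it out of the call entirely, so the library defaults apply.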

Full split downloads:

Many videos have been removed from YouTube since the creation of ActivityNet. As a result, if you do not specify any of the partial download parameters described above, you must first download the official source files from the ActivityNet maintainers in order to load a full split into FiftyOne.

To download the source files, you must fill out this form.

Refer to the full split downloads guide (reference label: activitynet-full-split-downloads) to see how to load full splits by passing the source_dir parameter to load_zoo_dataset().
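
A full-split load can be sketched as follows, assuming the official source files have already been downloaded to a local directory. The helpers full_split_kwargs and load_full_split are hypothetical, as is the placeholder path in the commented call. Passing copy_files through load_zoo_dataset() is also an assumption, based on it being a constructor parameter of this class.

```python
import os


def full_split_kwargs(source_dir, copy_files=True):
    """Validate `source_dir` and build the keyword arguments for a full-split load."""
    if not os.path.isdir(source_dir):
        raise FileNotFoundError(
            f"ActivityNet source directory not found: {source_dir}"
        )
    return {"source_dir": source_dir, "copy_files": copy_files}


def load_full_split(split, source_dir, copy_files=True):
    """Load a full split from manually downloaded files (hypothetical wrapper)."""
    # Imported lazily so the validation helper above has no FiftyOne dependency
    import fiftyone.zoo as foz

    return foz.load_zoo_dataset(
        "activitynet-100",
        split=split,
        **full_split_kwargs(source_dir, copy_files),
    )


# Placeholder path; replace with the directory that holds the source files:
# dataset = load_full_split("validation", "/path/to/activitynet-source")
```

Checking the directory up front surfaces a mistyped path before any loading work begins.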

Example usage:

import fiftyone as fo
import fiftyone.zoo as foz

#
# Load 10 random samples from the validation split
#
# Only the required videos will be downloaded (if necessary)
#

dataset = foz.load_zoo_dataset(
    "activitynet-100",
    split="validation",
    max_samples=10,
    shuffle=True,
)

session = fo.launch_app(dataset)

#
# Load 10 samples from the validation split that
# contain the actions "Bathing dog" and "Walking the dog"
#
# Videos that contain all `classes` will be prioritized first, followed
# by videos that contain at least one of the required `classes`. If
# there are not enough videos matching `classes` in the split to meet
# `max_samples`, only the available videos will be loaded.
#
# Videos will only be downloaded if necessary
#
# Subsequent partial loads of the validation split will never require
# downloading any videos
#

dataset = foz.load_zoo_dataset(
    "activitynet-100",
    split="validation",
    classes=["Bathing dog", "Walking the dog"],
    max_samples=10,
)

session.dataset = dataset
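
The selection order described in the comments above (videos containing all requested classes first, then videos containing at least one) can be illustrated with a small standalone sketch. This is not FiftyOne's implementation; the function prioritize_by_classes and the sample data are made up for illustration.

```python
def prioritize_by_classes(video_classes, classes, max_samples=None):
    """Order video IDs: those containing all `classes` first, then at least one.

    `video_classes` maps a video ID to the set of activity labels it contains.
    """
    required = set(classes)
    has_all = [
        vid for vid, labels in video_classes.items() if required <= set(labels)
    ]
    has_some = [
        vid
        for vid, labels in video_classes.items()
        if required & set(labels) and not required <= set(labels)
    ]
    selected = has_all + has_some
    # If fewer videos match than requested, only the available ones are returned
    return selected if max_samples is None else selected[:max_samples]


# Made-up label data for four videos
videos = {
    "v1": {"Walking the dog"},
    "v2": {"Bathing dog", "Walking the dog"},
    "v3": {"Archery"},
    "v4": {"Bathing dog"},
}
picked = prioritize_by_classes(
    videos, ["Bathing dog", "Walking the dog"], max_samples=10
)
print(picked)
```

Here only three videos match, so asking for 10 samples still yields three, with the video that contains both actions listed first.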

Dataset size: 223 GB
Source: http://activity-net.org/index.html

Parameters
source_dir	the directory containing the manually downloaded ActivityNet files used to avoid downloading videos from YouTube
classes	a string or list of strings specifying required classes to load. If provided, only samples containing at least one instance of a specified class will be loaded
max_duration	only videos with a duration in seconds that is less than or equal to the `max_duration` will be downloaded. By default, all videos are downloaded
copy_files	whether to move (False) or create copies (True) of the source files when populating `dataset_dir`. This is only relevant when a `source_dir` is provided
num_workers	a suggested number of threads to use when downloading individual videos
shuffle	whether to randomly shuffle the order in which samples are chosen for partial downloads
seed	a random seed to use when shuffling
max_samples	a maximum number of samples to load per split. If `classes` are also specified, only up to the number of samples that contain at least one specified class will be loaded. By default, all matching samples are loaded
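
To make the interaction of max_duration, shuffle, seed, and max_samples concrete, here is a standalone sketch of the semantics described in the table above. It is an illustration only and does not reproduce the library's internal logic; choose_videos and the duration values are hypothetical.

```python
import random


def choose_videos(durations, max_duration=None, shuffle=False, seed=None,
                  max_samples=None):
    """Pick video IDs for a partial download from a {video_id: seconds} mapping."""
    # Keep only videos whose duration is less than or equal to max_duration
    ids = [
        vid
        for vid, seconds in durations.items()
        if max_duration is None or seconds <= max_duration
    ]
    if shuffle:
        # A dedicated Random instance makes the order reproducible for a given seed
        random.Random(seed).shuffle(ids)
    return ids if max_samples is None else ids[:max_samples]


# Made-up durations in seconds
durations = {"a": 12.0, "b": 95.5, "c": 30.0, "d": 45.0}
print(choose_videos(durations, max_duration=45, max_samples=2))
print(choose_videos(durations, shuffle=True, seed=51))
```

With shuffle left at its default, videos are taken in their original order; with a fixed seed, repeated calls produce the same shuffled order.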

Method	`__init__`	Undocumented
Instance Variable	`classes`	Undocumented
Instance Variable	`copy_files`	Undocumented
Instance Variable	`max_duration`	Undocumented
Instance Variable	`max_samples`	Undocumented
Instance Variable	`num_workers`	Undocumented
Instance Variable	`seed`	Undocumented
Instance Variable	`shuffle`	Undocumented
Instance Variable	`source_dir`	Undocumented
Property	`license`	The license, or comma-separated list of licenses, under which the dataset is distributed, or None if unknown.
Property	`name`	The name of the dataset.
Property	`supported_splits`	A tuple of supported splits for the dataset, or None if the dataset does not have splits.
Property	`supports_partial_downloads`	Whether the dataset supports downloading partial subsets of its splits.
Property	`tags`	A tuple of tags for the dataset.
Method	`_download_and_prepare`	Internal implementation of downloading the dataset and preparing it for use in the given directory.

Inherited from ZooDataset (via FiftyOneDataset):

Static Method	`get_info_path`	Returns the path to the `ZooDatasetInfo` for the dataset.
Static Method	`has_info`	Determines whether the directory contains `ZooDatasetInfo`.
Static Method	`load_info`	Loads the `ZooDatasetInfo` from the given dataset directory.
Method	`download_and_prepare`	Downloads the dataset and prepares it for use.
Method	`get_split_dir`	Returns the directory for the given split of the dataset.
Method	`has_split`	Whether the dataset has the given split.
Method	`has_tag`	Whether the dataset has the given tag.
Property	`has_patches`	Whether the dataset has patches that may need to be applied to already downloaded files.
Property	`has_splits`	Whether the dataset has splits.
Property	`has_tags`	Whether the dataset has tags.
Property	`importer_kwargs`	A dict of default kwargs to pass to this dataset's `fiftyone.utils.data.importers.DatasetImporter`.
Property	`is_remote`	Whether the dataset is remotely-sourced.
Property	`parameters`	An optional dict of parameters describing the configuration of the zoo dataset when it was downloaded.
Property	`requires_manual_download`	Whether this dataset requires some files to be manually downloaded by the user before the dataset can be loaded.
Method	`_get_splits_to_download`	Undocumented
Method	`_is_dataset_ready`	Undocumented
Method	`_is_split_ready`	Undocumented
Method	`_patch_if_necessary`	Internal method called when an already downloaded dataset may need to be patched.
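
The properties listed above can be inspected before anything is downloaded. The sketch below reads a few of them from any object that exposes those attributes. The helper summarize_zoo_dataset and the stand-in values are hypothetical and exist only so that the example runs without FiftyOne installed. The commented lines show how a real zoo dataset object could be passed in instead.

```python
from types import SimpleNamespace


def summarize_zoo_dataset(zoo_dataset):
    """Collect a few descriptive properties from a zoo dataset-like object."""
    fields = (
        "name",
        "supported_splits",
        "supports_partial_downloads",
        "requires_manual_download",
    )
    return {field: getattr(zoo_dataset, field, None) for field in fields}


# Stand-in object with made-up values (NOT the real values for ActivityNet 100)
stand_in = SimpleNamespace(
    name="example-dataset",
    supported_splits=("train", "validation"),
    supports_partial_downloads=True,
)
summary = summarize_zoo_dataset(stand_in)
print(summary)

# With FiftyOne installed, a real zoo dataset can be summarized instead:
# import fiftyone.zoo as foz
# print(summarize_zoo_dataset(foz.get_zoo_dataset("activitynet-100")))
```

Properties missing from the object are reported as None rather than raising an error.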

def __init__(self, source_dir=None, classes=None, max_duration=None, copy_files=True, num_workers=None, shuffle=None, seed=None, max_samples=None):

Undocumented

classes: None

Undocumented

copy_files: True

Undocumented

max_duration: None

Undocumented

max_samples: None

Undocumented

num_workers: None

Undocumented

seed: None

Undocumented

shuffle: None

Undocumented

source_dir: None

Undocumented

@property
license

overrides fiftyone.zoo.datasets.ZooDataset.license

The license, or comma-separated list of licenses, under which the dataset is distributed, or None if unknown.

@property
name

overrides fiftyone.zoo.datasets.ZooDataset.name

The name of the dataset.

@property
supported_splits

overrides fiftyone.zoo.datasets.ZooDataset.supported_splits

A tuple of supported splits for the dataset, or None if the dataset does not have splits.

@property
supports_partial_downloads

overrides fiftyone.zoo.datasets.ZooDataset.supports_partial_downloads

Whether the dataset supports downloading partial subsets of its splits.

@property
tags

overrides fiftyone.zoo.datasets.ZooDataset.tags

A tuple of tags for the dataset.

def _download_and_prepare(self, dataset_dir, _, split):

overrides fiftyone.zoo.datasets.ZooDataset._download_and_prepare

Internal implementation of downloading the dataset and preparing it for use in the given directory.

Parameters
dataset_dir	the directory in which to construct the dataset. If a `split` is provided, this is the directory for the split
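_	the positional counterpart of `scratch_dir` in the overridden method (a scratch directory to use to download and prepare any required intermediate files); the `_` name suggests that this implementation does not use it
split	the split to download, or None if the dataset does not have splits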
Returns
tuple of
	dataset_type: the `fiftyone.types.Dataset` type of the dataset
	num_samples: the number of samples in the split. For datasets that support partial downloads, this can be `None`, which indicates that all content was already downloaded
	classes: an optional list of class label strings