Module documentation

Utilities for working with the ActivityNet dataset.

Copyright 2017-2025, Voxel51, Inc.

Class ActivityNet100DatasetInfo ActivityNet 100 dataset info.
Class ActivityNet200DatasetInfo ActivityNet 200 dataset info.
Class ActivityNetDatasetImporter Class for importing ActivityNet dataset splits downloaded via download_activitynet_split.
Class ActivityNetDatasetInfo Class that stores information related to paths, labels, and sample IDs for an ActivityNet dataset download.
Class ActivityNetDatasetManager Class that manages the sample IDs and labels that need to be downloaded to load the specified subset of an ActivityNet dataset.
Class ActivityNetDownloadConfig Configuration class for downloading full or partial splits from the ActivityNet dataset.
Class ActivityNetInfo Necessary information used to parse and format annotations.
Class ActivityNetSplitInfo Class that contains information related to paths, labels, and sample IDs of a single ActivityNet split.
Function download_activitynet_split Utility that downloads full or partial splits of the ActivityNet dataset.
Variable logger Undocumented
Function _flatten_list Undocumented
Function _get_all_classes Undocumented
Constant _ANNOTATION_DOWNLOAD_LINKS Undocumented
Constant _NUM_TOTAL_SAMPLES Undocumented
Constant _SOURCE_DIR_NAMES Undocumented
Constant _SOURCE_ZIPS Undocumented
Constant _SPLIT_MAP Undocumented
Constant _SPLIT_MAP_REV Undocumented
def download_activitynet_split(dataset_dir, split, source_dir=None, classes=None, max_duration=None, copy_files=True, num_workers=None, shuffle=None, seed=None, max_samples=None, version='200'): (source)

Utility that downloads full or partial splits of the ActivityNet dataset.

Parameters
dataset_dir: the directory in which to download the dataset
split: the split to download. Supported values are ("train", "validation", "test")
source_dir (None): the directory containing the manually downloaded ActivityNet files
classes (None): a string or list of strings specifying required classes to load. If provided, only samples containing at least one instance of a specified class will be loaded
max_duration (None): only videos with a duration, in seconds, of at most max_duration will be downloaded. By default, all videos are downloaded
copy_files (True): whether to create copies (True) or move (False) the source files when populating dataset_dir. Only relevant when a source_dir is provided
num_workers (None): a suggested number of threads to use when downloading individual videos
shuffle (False): whether to randomly shuffle the order in which samples are chosen for partial downloads
seed (None): a random seed to use when shuffling
max_samples (None): a maximum number of samples to load per split. If classes are also specified, only up to the number of samples that contain at least one specified class will be loaded. By default, all matching samples are loaded
version ("200"): the ActivityNet dataset version to download. Supported values are ("100", "200")
Returns
a tuple of
  • num_samples: the total number of downloaded videos, or None if everything was already downloaded
  • classes: the list of all classes, or None if everything was already downloaded
  • did_download: whether any content was downloaded (True) or if all necessary files were already downloaded (False)
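The interplay of the classes, shuffle, seed, and max_samples parameters for partial downloads can be sketched with self-contained selection logic. Note that select_sample_ids and its inputs below are hypothetical illustrations of the documented semantics, not the library's implementation:

```python
import random

def select_sample_ids(
    all_samples, classes=None, shuffle=False, seed=None, max_samples=None
):
    """Choose which sample IDs to download (illustrative sketch only).

    all_samples: dict mapping sample ID -> list of class labels in that video
    """
    if classes is not None:
        # Accept a single class name or a list of names
        if isinstance(classes, str):
            classes = [classes]

        required = set(classes)
        # Keep only samples containing at least one required class
        ids = [i for i, labels in all_samples.items() if required & set(labels)]
    else:
        ids = list(all_samples)

    if shuffle:
        # Seeded shuffle makes partial downloads reproducible
        random.Random(seed).shuffle(ids)

    if max_samples is not None:
        ids = ids[:max_samples]

    return ids
```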

logger = (source)

Undocumented

def _flatten_list(l): (source)

Undocumented

def _get_all_classes(taxonomy): (source)

Undocumented

_ANNOTATION_DOWNLOAD_LINKS: dict[str, str] = (source)

Undocumented

Value
{'100': 'http://ec2-52-25-205-214.us-west-2.compute.amazonaws.com/files/activity_net.v1-2.min.json',
 '200': 'http://ec2-52-25-205-214.us-west-2.compute.amazonaws.com/files/activity_net.v1-3.min.json'}
_NUM_TOTAL_SAMPLES: dict = (source)

Undocumented

Value
{'100': {'train': 4819, 'test': 2480, 'validation': 2383},
 '200': {'train': 10024, 'test': 5044, 'validation': 4926}}
_SOURCE_DIR_NAMES: dict = (source)

Undocumented

Value
{'missing_files': [],
 'missing_files_v1-2_test': [],
 'missing_files_v1-3_test': [],
 'v1-2': ['test', 'train', 'val'],
 'v1-3': ['test', 'train_val']}
_SOURCE_ZIPS: list[str] = (source)

Undocumented

Value
['missing_files.zip',
 'missing_files_v1-2_test.zip',
 'missing_files_v1-3_test.zip',
 'v1-2_test.tar.gz',
 'v1-2_train.tar.gz',
 'v1-2_val.tar.gz',
 'v1-3_test.tar.gz',
...
_SPLIT_MAP: dict[str, str] = (source)

Undocumented

Value
{'train': 'training', 'test': 'testing', 'validation': 'validation'}
_SPLIT_MAP_REV = (source)

Undocumented

Value
{v: k for k, v in _SPLIT_MAP.items()}