module documentation
Utilities for working with Hugging Face.
Class |
|
Config for a Hugging Face Hub dataset. |
Class |
|
Config for a Hugging Face Hub dataset that is stored as parquet files. |
Class |
|
Undocumented |
Function | list_hub_datasets |
Lists all FiftyOne datasets available on the Hugging Face Hub. |
Function | load_from_hub |
Loads a dataset from the Hugging Face Hub into FiftyOne. |
Function | push_to_hub |
Push a FiftyOne dataset to the Hugging Face Hub. |
Constant | DATASET |
Undocumented |
Constant | DATASET |
Undocumented |
Constant | DATASETS |
Undocumented |
Constant | DATASETS |
Undocumented |
Constant | DEFAULT |
Undocumented |
Constant | DEFAULT |
Undocumented |
Constant | DEFAULT |
Undocumented |
Constant | FIFTYONE |
Undocumented |
Constant | SUPPORTED |
Undocumented |
Variable | hfh |
Undocumented |
Variable | logger |
Undocumented |
Function | _add |
Undocumented |
Function | _add |
Undocumented |
Function | _build |
Undocumented |
Function | _build |
Undocumented |
Function | _build |
Undocumented |
Function | _build |
Undocumented |
Function | _build |
Undocumented |
Function | _build |
Undocumented |
Function | _configure |
Undocumented |
Function | _convert |
Undocumented |
Function | _count |
Undocumented |
Function | _create |
Undocumented |
Function | _download |
Undocumented |
Function | _download |
Undocumented |
Function | _download |
Undocumented |
Function | _ensure |
Undocumented |
Function | _extract |
Undocumented |
Function | _generate |
Undocumented |
Function | _get |
Undocumented |
Function | _get |
Undocumented |
Function | _get |
Undocumented |
Function | _get |
Undocumented |
Function | _get |
Undocumented |
Function | _get |
Undocumented |
Function | _get |
Undocumented |
Function | _get |
Undocumented |
Function | _get |
Undocumented |
Function | _get |
Undocumented |
Function | _get |
Undocumented |
Function | _get |
Undocumented |
Function | _get |
Undocumented |
Function | _get |
Undocumented |
Function | _get |
Undocumented |
Function | _get |
Undocumented |
Function | _get |
Undocumented |
Function | _get |
Undocumented |
Function | _is |
Undocumented |
Function | _is |
Undocumented |
Function | _load |
Undocumented |
Function | _load |
Undocumented |
Function | _load |
Undocumented |
Function | _no |
Undocumented |
Function | _parse |
Undocumented |
Function | _parse |
Undocumented |
Function | _parse |
Undocumented |
Function | _populate |
Undocumented |
Function | _resolve |
Undocumented |
Function | _upload |
Undocumented |
Lists all FiftyOne datasets available on the Hugging Face Hub.
This function returns all datasets that are tagged with the FiftyOne library on the Hugging Face Hub.
Examples:
    from fiftyone.utils.huggingface import list_hub_datasets

    datasets = list_hub_datasets()
    print(datasets)
Parameters | |
info:False | whether to return dataset names (False) or huggingface_hub.hf_api.DatasetInfo objects (True) |
Returns | |
a list of dataset names or objects |
def load_from_hub(repo_id, revision=None, split=None, splits=None, subset=None, subsets=None, max_samples=None, batch_size=None, num_workers=None, overwrite=True, persistent=False, name=None, token=None, config_file=None, **kwargs):
Loads a dataset from the Hugging Face Hub into FiftyOne.
Parameters | |
repo_id | the Hugging Face Hub identifier of the dataset |
revision:None | the revision of the dataset to load |
split:None | the split of the dataset to load |
splits:None | the splits of the dataset to load |
subset:None | the subset of the dataset to load |
subsets:None | the subsets of the dataset to load |
max_samples:None | the maximum number of samples to load |
batch_size:None | the batch size to use when loading samples |
num_workers:None | a suggested number of threads to use when downloading media |
overwrite:True | whether to overwrite an existing dataset with the same name |
persistent:False | whether the dataset should be persistent |
name:None | an optional name to give the dataset |
token:None | a Hugging Face API token to use. May also be provided via the HF_TOKEN environment variable |
config_file:None | the path to a config file on disk specifying how to load the dataset if the repo has no fiftyone.yml file |
**kwargs | keyword arguments specifying config parameters to load the dataset if the repo has no fiftyone.yml file |
Returns | |
a fiftyone.core.dataset.Dataset |
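As a usage illustration, the parameters above can be combined as in the sketch below. `load_preview` and its `loader` argument are hypothetical names introduced only for this example, and the "train" split is an assumption about the target repo. The call also assumes the repo contains a fiftyone.yml file (otherwise config parameters would need to be supplied via `config_file` or `**kwargs`).

```python
def load_preview(repo_id, max_samples=25, token=None, loader=None):
    """Load a small preview of a Hub dataset (hypothetical helper).

    `loader` exists only so the function can be exercised without
    FiftyOne installed; by default the real `load_from_hub` is used.
    """
    if loader is None:
        # Imported lazily so the sketch can be imported without FiftyOne
        from fiftyone.utils.huggingface import load_from_hub as loader

    return loader(
        repo_id,
        split="train",            # assumption: the repo has a "train" split
        max_samples=max_samples,  # cap the number of samples loaded
        persistent=False,         # do not persist the dataset
        token=token,              # or set the HF_TOKEN environment variable
    )


# Example call (requires FiftyOne and network access):
# dataset = load_preview("<username>/<repo_name>")
```

Capping `max_samples` keeps the first load cheap, which is useful for checking that a repo's config parses before loading every split.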
def push_to_hub(dataset, repo_name, description=None, license=None, tags=None, private=False, exist_ok=False, dataset_type=None, min_fiftyone_version=None, label_field=None, frame_labels_field=None, token=None, preview_path=None, chunk_size=None, **data_card_kwargs):
Push a FiftyOne dataset to the Hugging Face Hub.
Parameters | |
dataset | a FiftyOne dataset |
repo_name | the name of the dataset repo to create. The repo ID will be {your_username}/{repo_name} |
description:None | a description of the dataset |
license:None | the license of the dataset |
tags:None | a list of tags for the dataset |
private:False | whether the repo should be private |
exist_ok:False | if True, do not raise an error if the repo already exists |
dataset_type:None | the type of the dataset to create |
min_fiftyone_version:None | the minimum version of FiftyOne required to load the dataset. For example, "0.23.0" |
label_field:None | controls the label field(s) to export. Only applicable to labeled datasets. Can be any of the following:
|
frame_labels_field:None | controls the frame label field(s) to export. The "frames." prefix is optional. Only applicable to labeled video datasets. Can be any of the following:
|
token:None | a Hugging Face API token to use. May also be provided via the HF_TOKEN environment variable |
preview_path:None | a path to a preview image or video to display in the README of the dataset repo |
chunk_size:None | the number of media files to put in each subdirectory, to avoid having too many files in a single directory. If None, no chunking is performed unless the dataset has more than 10,000 samples, in which case it is chunked by default to avoid exceeding the maximum number of files in a directory on the Hugging Face Hub. This parameter is only applicable to fiftyone.types.dataset_types.FiftyOneDataset datasets |
**data_card_kwargs | additional keyword arguments to pass to the DatasetCard constructor |
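The `chunk_size` description above can be illustrated with a short sketch. This is not the library's implementation: `assign_chunk_dirs`, the `data` prefix, and the `data_<n>` subdirectory naming are all assumptions made for the example. It only shows the general idea of placing at most `chunk_size` media files in each subdirectory.

```python
import posixpath


def assign_chunk_dirs(filenames, chunk_size=None, prefix="data"):
    """Map each filename to a relative upload path (hypothetical helper).

    With chunk_size=None every file lands directly under `prefix`.
    Otherwise file i is placed in subdirectory `<prefix>_<i // chunk_size>`,
    so no subdirectory holds more than `chunk_size` files.
    """
    if chunk_size is None:
        return [posixpath.join(prefix, name) for name in filenames]

    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer")

    return [
        posixpath.join(prefix, f"{prefix}_{i // chunk_size}", name)
        for i, name in enumerate(filenames)
    ]


paths = assign_chunk_dirs(["a.jpg", "b.jpg", "c.jpg"], chunk_size=2)
print(paths)
```

Integer division of the file index by `chunk_size` gives a stable chunk number without first counting files per directory.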
Undocumented
Value |
|
def _build_media_field_converter(media_field_key, media_field_name, feature, download_dir):
Undocumented
def _build_rows_request_url(repo_id, split=None, subset='default', revision=None, offset=0, length=100):
Undocumented
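Since this helper is undocumented, the following is only a guess at the kind of URL it might produce, based on its signature. It assumes the Hugging Face dataset viewer `/rows` endpoint and that `subset` maps to that endpoint's `config` query parameter. `build_rows_url` and `ROWS_ENDPOINT` are hypothetical names, and `revision` is omitted because the source does not say how it is passed.

```python
from urllib.parse import urlencode

# Assumption: the public dataset viewer endpoint for paginated rows
ROWS_ENDPOINT = "https://datasets-server.huggingface.co/rows"


def build_rows_url(repo_id, split=None, subset="default", offset=0, length=100):
    """Build a paginated rows request URL (hypothetical sketch)."""
    params = {"dataset": repo_id, "config": subset}
    if split is not None:
        params["split"] = split
    params["offset"] = offset
    params["length"] = length

    # urlencode percent-encodes the "/" in repo IDs such as "user/repo"
    return f"{ROWS_ENDPOINT}?{urlencode(params)}"


print(build_rows_url("user/repo", split="train"))
```

Using `urlencode` instead of manual string concatenation keeps repo IDs and split names safely escaped.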
def _create_dataset_card(repo_id, dataset, tags=None, license=None, preview_path=None, **dataset_card_kwargs):
Undocumented
def _download_files_in_batches(filepaths, download_dir, batch_size, **init_download_kwargs):
Undocumented
def _get_rows(repo_id, split, subset, start_index=0, end_index=100, revision=None, **kwargs):
Undocumented