module documentation

File storage utilities.

Copyright 2017-2025, Voxel51, Inc.

Class FileSystem Enumeration of the available file systems.
Class TempDir Context manager that creates and destroys a temporary directory.
Function abspath Converts the given path to an absolute path, resolving relative path indicators such as . and ...
Function copy_dir Copies the input directory to the output directory.
Function copy_file Copies the input file to the output location.
Function copy_files Copies the files to the given locations.
Function delete_dir Deletes the given directory and recursively deletes any empty directories from the resulting directory tree.
Function delete_file Deletes the file at the given path.
Function delete_files Deletes the files from the given locations.
Function ensure_basedir Makes the base directory of the given path, if necessary.
Function ensure_dir Makes the given directory, if necessary.
Function ensure_empty_dir Ensures that the given directory exists and is empty.
Function ensure_local Ensures that the given path is local.
Function exists Determines whether the given file or directory exists.
Function extract_archive Extracts the contents of an archive.
Function get_bucket_name Gets the bucket name from the given path.
Function get_file_system Returns the file system enum for the given path.
Function get_glob_matches Returns a list of file paths matching the given glob pattern.
Function get_glob_root Finds the root directory of the given glob pattern, i.e., the deepest subdirectory that contains no glob characters.
Function is_local Determines whether the given path is local.
Function isabs Determines whether the given path is absolute.
Function isdir Determines whether the given directory exists.
Function isfile Determines whether the given file exists.
Function join Joins the given path components into a single path.
Function list_available_file_systems Lists the file systems that are currently available for use with methods like list_files and list_buckets.
Function list_buckets Lists the available buckets in the given file system.
Function list_files Lists the files in the given directory.
Function list_subdirs Lists the subdirectories in the given directory, sorted alphabetically and excluding hidden directories.
Function load_json Loads JSON from the input argument.
Function load_ndjson Loads NDJSON from the input argument.
Function make_archive Makes an archive containing the given directory.
Function make_temp_dir Makes a temporary directory.
Function move_dir Moves the contents of the given directory into the given output directory.
Function move_file Moves the given file to a new location.
Function move_files Moves the files to the given locations.
Function normalize_path Normalizes the given path by converting it to an absolute path and expanding the user directory, if necessary.
Function normpath Normalizes the given path by converting all slashes to forward slashes on Unix and backslashes on Windows and removing duplicate slashes.
Function open_file Opens the given file for reading or writing.
Function open_files Opens the given files for reading or writing.
Function read_file Reads the file.
Function read_files Reads the specified files into memory.
Function read_json Reads a JSON file.
Function read_ndjson Reads an NDJSON file.
Function read_yaml Reads a YAML file.
Function realpath Converts the given path to absolute, resolving symlinks and relative path indicators such as . and ...
Function run Applies the given function to each element of the given tasks.
Function sep Returns the path separator for the given path.
Function split_prefix Splits the file system prefix from the given path.
Function write_file Writes the given string/bytes to a file.
Function write_json Writes a JSON object to a file.
Function write_ndjson Writes the list of JSON dicts in NDJSON format.
Function write_yaml Writes the object to a YAML file.
Variable logger Undocumented
Function _copy_file Undocumented
Function _copy_files Undocumented
Function _delete_file Undocumented
Function _do_copy_file Undocumented
Function _do_delete_file Undocumented
Function _do_move_file Undocumented
Function _do_open_file Undocumented
Function _do_read_file Undocumented
Function _get_local_metadata Undocumented
Function _open_file Undocumented
Function _read_file Undocumented
Function _run Undocumented
Function _to_bytes Undocumented
def abspath(path): (source)

Converts the given path to an absolute path, resolving relative path indicators such as . and ...

Parameters
path: the filepath
Returns
the absolute path
def copy_dir(indir, outdir, overwrite=True, skip_failures=False, progress=None): (source)

Copies the input directory to the output directory.

Parameters
indir: the input directory
outdir: the output directory
overwrite (True): whether to delete an existing output directory (True) or merge its contents (False)
skip_failures (False): whether to gracefully continue without raising an error if an operation fails
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
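
The difference between the two overwrite modes can be illustrated with a standard-library sketch. Note that copy_dir_sketch is a hypothetical helper written for this example, not the library's implementation; it assumes local paths and Python 3.8+ (for dirs_exist_ok).

```python
import os
import shutil
import tempfile


def copy_dir_sketch(indir, outdir, overwrite=True):
    """Hypothetical helper: replace (True) or merge into (False) outdir."""
    if overwrite and os.path.isdir(outdir):
        # Delete the existing output directory before copying
        shutil.rmtree(outdir)
    # dirs_exist_ok=True lets the copy merge into an existing directory
    shutil.copytree(indir, outdir, dirs_exist_ok=True)


root = tempfile.mkdtemp()
src = os.path.join(root, "src")
dst = os.path.join(root, "dst")
os.makedirs(src)
os.makedirs(dst)
open(os.path.join(src, "new.txt"), "w").close()
open(os.path.join(dst, "old.txt"), "w").close()

copy_dir_sketch(src, dst, overwrite=False)
merged = sorted(os.listdir(dst))      # old.txt is kept alongside new.txt

copy_dir_sketch(src, dst, overwrite=True)
replaced = sorted(os.listdir(dst))    # only the copied contents remain

shutil.rmtree(root)
```
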
def copy_file(inpath, outpath): (source)

Copies the input file to the output location.

Parameters
inpath: the input path
outpath: the output path
def copy_files(inpaths, outpaths, skip_failures=False, progress=None): (source)

Copies the files to the given locations.

Parameters
inpaths: a list of input paths
outpaths: a list of output paths
skip_failures (False): whether to gracefully continue without raising an error if an operation fails
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
def delete_dir(dirpath): (source)

Deletes the given directory and recursively deletes any empty directories from the resulting directory tree.

Parameters
dirpath: the directory path
def delete_file(path): (source)

Deletes the file at the given path.

Any empty directories are also recursively deleted from the resulting directory tree.

Parameters
path: the filepath
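
The "delete, then prune empty parents" behavior described above can be approximated for local paths with os.remove and os.removedirs. This is a sketch under that assumption; delete_file_sketch is a hypothetical name and is not claimed to be the library's code.

```python
import os
import shutil
import tempfile


def delete_file_sketch(path):
    """Hypothetical helper: delete a file, then prune empty parent dirs."""
    os.remove(path)
    try:
        # Removes the parent directory and each ancestor in turn,
        # stopping silently at the first one that is not empty
        os.removedirs(os.path.dirname(path))
    except OSError:
        # The immediate parent still has other contents; leave it alone
        pass


root = tempfile.mkdtemp()
open(os.path.join(root, "keep.txt"), "w").close()
nested = os.path.join(root, "a", "b")
os.makedirs(nested)
target = os.path.join(nested, "data.txt")
open(target, "w").close()

delete_file_sketch(target)
pruned = not os.path.exists(os.path.join(root, "a"))   # a/ and a/b/ are gone
kept = os.path.exists(os.path.join(root, "keep.txt"))  # root is non-empty, so it stays

shutil.rmtree(root)
```
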
def delete_files(paths, skip_failures=False, progress=None): (source)

Deletes the files from the given locations.

Any empty directories are also recursively deleted from the resulting directory tree.

Parameters
paths: a list of paths
skip_failures (False): whether to gracefully continue without raising an error if an operation fails
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
def ensure_basedir(path): (source)

Makes the base directory of the given path, if necessary.

Parameters
path: the filepath
def ensure_dir(dirpath): (source)

Makes the given directory, if necessary.

Parameters
dirpath: the directory path
def ensure_empty_dir(dirpath, cleanup=False): (source)

Ensures that the given directory exists and is empty.

Parameters
dirpath: the directory path
cleanup (False): whether to delete any existing directory contents
Raises
ValueError: if the directory is not empty and cleanup is False
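
The documented contract (create if missing, raise ValueError on a non-empty directory unless cleanup is set) can be sketched with the standard library. Here ensure_empty_dir_sketch is a hypothetical stand-in for local directories only, not the library's implementation.

```python
import os
import shutil
import tempfile


def ensure_empty_dir_sketch(dirpath, cleanup=False):
    """Hypothetical stand-in that mimics the documented contract."""
    if os.path.isdir(dirpath) and os.listdir(dirpath):
        if not cleanup:
            raise ValueError("%s is not empty" % dirpath)
        # Remove the directory and everything in it before recreating it
        shutil.rmtree(dirpath)
    os.makedirs(dirpath, exist_ok=True)


workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "out")

ensure_empty_dir_sketch(target)  # created, since it did not exist yet
open(os.path.join(target, "stale.txt"), "w").close()

try:
    ensure_empty_dir_sketch(target)  # non-empty and cleanup=False
    raised = False
except ValueError:
    raised = True

ensure_empty_dir_sketch(target, cleanup=True)  # existing contents are removed
remaining = os.listdir(target)

shutil.rmtree(workdir)
```
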
def ensure_local(path): (source)

Ensures that the given path is local.

Parameters
path: a path
def exists(path): (source)

Determines whether the given file or directory exists.

Parameters
path: the file or directory path
Returns
True/False
def extract_archive(archive_path, outdir=None, cleanup=False): (source)

Extracts the contents of an archive.

The following formats are guaranteed to work: .zip, .tar, .tar.gz, .tgz, .tar.bz, .tbz.

If an archive not in the above list is found, extraction will be attempted via the patool package, which supports many formats but may require that additional system packages be installed.

Parameters
archive_path: the archive path
outdir (None): the directory into which to extract the archive. By default, the directory containing the archive is used
cleanup (False): whether to delete the archive after extraction
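
For readers who want to see the archive round trip in isolation, the following sketch uses Python's shutil as a stand-in. It is an illustration under the assumption of a local .zip archive and does not show how this module performs extraction internally; all paths are temporary and made up for the example.

```python
import os
import shutil
import tempfile

root = tempfile.mkdtemp()
src = os.path.join(root, "data")
os.makedirs(src)
with open(os.path.join(src, "a.txt"), "w") as f:
    f.write("hello")

# Write data.zip next to (not inside) the directory being archived
archive_path = shutil.make_archive(os.path.join(root, "data"), "zip", root_dir=src)

# Extract into an explicit output directory, analogous to passing outdir
outdir = os.path.join(root, "extracted")
shutil.unpack_archive(archive_path, outdir)
extracted = os.listdir(outdir)

shutil.rmtree(root)
```
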
def get_bucket_name(path): (source)

Gets the bucket name from the given path.

The bucket name for local paths is "".

Example usages:

import fiftyone.core.storage as fos

fos.get_bucket_name("/path/to/file")       # ''
fos.get_bucket_name("a/file")              # ''
Parameters
path: a path
Returns
the bucket name string
def get_file_system(path): (source)

Returns the file system enum for the given path.

Parameters
path: a path
Returns
a FileSystem value
def get_glob_matches(glob_patt): (source)

Returns a list of file paths matching the given glob pattern.

The matches are returned in sorted order.

Parameters
glob_patt: a glob pattern like /path/to/files-*.jpg
Returns
a list of file paths
def get_glob_root(glob_patt): (source)

Finds the root directory of the given glob pattern, i.e., the deepest subdirectory that contains no glob characters.

Parameters
glob_patt: a glob pattern like /path/to/files-*.jpg
Returns
a tuple of
  • the root
  • True/False whether the pattern contains any special characters
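
One way to compute such a root is to cut the pattern at its first glob character and take the enclosing directory. The sketch below does this for POSIX-style paths; glob_root_sketch is a hypothetical helper, and its handling of edge cases (for example escaped glob characters) is a simplifying assumption rather than the library's behavior.

```python
import posixpath
import re


def glob_root_sketch(glob_patt):
    """Hypothetical helper returning (root, found_special)."""
    # Keep only the text before the first glob character
    head = re.split(r"[*?\[\]]", glob_patt, maxsplit=1)[0]
    found_special = head != glob_patt
    # The root is the directory that contains the first glob component
    return posixpath.dirname(head), found_special


root, found_special = glob_root_sketch("/path/to/files-*.jpg")
nested_root, _ = glob_root_sketch("/data/*/images/*.png")
```
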
def is_local(path): (source)

Determines whether the given path is local.

Parameters
path: a path
Returns
True/False
def isabs(path): (source)

Determines whether the given path is absolute.

Parameters
path: the filepath
Returns
True/False
def isdir(dirpath): (source)

Determines whether the given directory exists.

Cloud "folders" are deemed to exist only if they are non-empty.

Parameters
dirpath: the directory path
Returns
True/False
def isfile(path): (source)

Determines whether the given file exists.

Parameters
path: the filepath
Returns
True/False
def join(a, *p): (source)

Joins the given path components into a single path.

Parameters
a: the root
*p: additional path components
Returns
the joined path
def list_available_file_systems(): (source)

Lists the file systems that are currently available for use with methods like list_files and list_buckets.

Returns
a list of FileSystem values
def list_buckets(fs, abs_paths=False): (source)

Lists the available buckets in the given file system.

This method returns subdirectories of / (or the current drive on Windows).

Parameters
fs: a FileSystem value
abs_paths (False): whether to return absolute paths
Returns
a list of buckets
def list_files(dirpath, abs_paths=False, recursive=False, include_hidden_files=False, return_metadata=False, sort=True): (source)

Lists the files in the given directory.

If the directory does not exist, an empty list is returned.

Parameters
dirpath: the path to the directory to list
abs_paths (False): whether to return the absolute paths to the files
recursive (False): whether to recursively traverse subdirectories
include_hidden_files (False): whether to include dot files
return_metadata (False): whether to return metadata dicts for each file instead of filepaths
sort (True): whether to sort the list of files
Returns
a list of filepaths or metadata dicts
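
To make the interaction of these flags concrete, here is a local-only sketch built on os.walk. list_files_sketch is a hypothetical helper that covers only a subset of the parameters (return_metadata is omitted), and it is not the library's implementation.

```python
import os
import shutil
import tempfile


def list_files_sketch(dirpath, abs_paths=False, recursive=False,
                      include_hidden_files=False, sort=True):
    """Hypothetical helper mirroring a subset of the documented flags."""
    if not os.path.isdir(dirpath):
        return []  # a missing directory yields an empty list
    paths = []
    for root, _, files in os.walk(dirpath):
        for name in files:
            if name.startswith(".") and not include_hidden_files:
                continue  # skip dot files
            full = os.path.join(root, name)
            if abs_paths:
                paths.append(os.path.abspath(full))
            else:
                paths.append(os.path.relpath(full, dirpath))
        if not recursive:
            break  # only the top-level directory was requested
    return sorted(paths) if sort else paths


root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "sub"))
for rel in ["b.txt", "a.txt", ".hidden", os.path.join("sub", "c.txt")]:
    open(os.path.join(root, rel), "w").close()

top_level = list_files_sketch(root)
everything = list_files_sketch(root, recursive=True)
with_hidden = list_files_sketch(root, include_hidden_files=True)
missing = list_files_sketch(os.path.join(root, "does-not-exist"))

shutil.rmtree(root)
```
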
def list_subdirs(dirpath, abs_paths=False, recursive=False): (source)

Lists the subdirectories in the given directory, sorted alphabetically and excluding hidden directories.

Parameters
dirpath: the path to the directory to list
abs_paths (False): whether to return absolute paths
recursive (False): whether to recursively traverse subdirectories
Returns
a list of subdirectories
def load_json(path_or_str): (source)

Loads JSON from the input argument.

Parameters
path_or_str: the filepath or JSON string
Returns
the loaded JSON
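
Accepting "a filepath or a JSON string" requires deciding which one was passed. One plausible approach, sketched below, is to try parsing the argument as JSON first and fall back to reading it as a file. load_json_sketch is a hypothetical helper; the order of the two attempts is an assumption, not a documented guarantee.

```python
import json
import os
import tempfile


def load_json_sketch(path_or_str):
    """Hypothetical helper: parse as JSON, else treat as a filepath."""
    try:
        return json.loads(path_or_str)
    except ValueError:
        # Not valid JSON text; fall back to reading it as a local file
        if os.path.isfile(path_or_str):
            with open(path_or_str, "r") as f:
                return json.load(f)
        raise


from_string = load_json_sketch('{"label": "cat", "confidence": 0.9}')

fd, json_path = tempfile.mkstemp(suffix=".json")
with os.fdopen(fd, "w") as f:
    json.dump({"label": "dog"}, f)

from_file = load_json_sketch(json_path)
os.remove(json_path)
```
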
def load_ndjson(path_or_str): (source)

Loads NDJSON from the input argument.

Parameters
path_or_str: the filepath or NDJSON string
Returns
a list of JSON dicts
def make_archive(dirpath, archive_path, cleanup=False): (source)

Makes an archive containing the given directory.

Supported formats include .zip, .tar, .tar.gz, .tgz, .tar.bz and .tbz.

Parameters
dirpath: the directory to archive
archive_path: the archive path to write
cleanup (False): whether to delete the directory after archiving it
def make_temp_dir(basedir=None): (source)

Makes a temporary directory.

Parameters
basedir (None): an optional directory in which to create the new directory. The default is fiftyone.config.default_dataset_dir
Returns
the temporary directory path
def move_dir(indir, outdir, overwrite=True, skip_failures=False, progress=None): (source)

Moves the contents of the given directory into the given output directory.

Parameters
indir: the input directory
outdir: the output directory
overwrite (True): whether to delete an existing output directory (True) or merge its contents (False)
skip_failures (False): whether to gracefully continue without raising an error if an operation fails
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
def move_file(inpath, outpath): (source)

Moves the given file to a new location.

Parameters
inpath: the input path
outpath: the output path
def move_files(inpaths, outpaths, skip_failures=False, progress=None): (source)

Moves the files to the given locations.

Parameters
inpaths: a list of input paths
outpaths: a list of output paths
skip_failures (False): whether to gracefully continue without raising an error if an operation fails
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
def normalize_path(path): (source)

Normalizes the given path by converting it to an absolute path and expanding the user directory, if necessary.

Parameters
path: a path
Returns
the normalized path
def normpath(path): (source)

Normalizes the given path by converting all slashes to forward slashes on Unix and backslashes on Windows and removing duplicate slashes.

Use this function when you need a version of os.path.normpath that converts \ to / on Unix.

Parameters
path: a path
Returns
the normalized path
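
The Unix half of this behavior can be sketched by replacing backslashes before delegating to posixpath.normpath. normpath_unix_sketch is a hypothetical helper for illustration; it deliberately ignores the Windows branch described above.

```python
import posixpath


def normpath_unix_sketch(path):
    """Hypothetical helper: Unix-style normalization that also maps \\ to /."""
    # posixpath.normpath leaves backslashes alone, so convert them first
    return posixpath.normpath(path.replace("\\", "/"))


mixed = normpath_unix_sketch("data\\images//train/../val")
trailing = normpath_unix_sketch("/a//b/")
```
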
def open_file(path, mode='r'): (source)

Opens the given file for reading or writing.

Example usage:

import fiftyone.core.storage as fos

with fos.open_file("/tmp/file.txt", "w") as f:
    f.write("Hello, world!")

with fos.open_file("/tmp/file.txt", "r") as f:
    print(f.read())
Parameters
path: the path
mode ("r"): the mode. Supported values are ("r", "rb", "w", "wb")
Returns
an open file-like object
def open_files(paths, mode='r', skip_failures=False, progress=None): (source)

Opens the given files for reading or writing.

Parameters
paths: a list of paths
mode ("r"): the mode. Supported values are ("r", "rb", "w", "wb")
skip_failures (False): whether to gracefully continue without raising an error if an operation fails
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
Returns
a list of open file-like objects
def read_file(path, binary=False): (source)

Reads the file.

Parameters
path: the filepath
binary (False): whether to read the file in binary mode
Returns
the file contents
def read_files(paths, binary=False, skip_failures=False, progress=None): (source)

Reads the specified files into memory.

Parameters
paths: a list of filepaths
binary (False): whether to read the files in binary mode
skip_failures (False): whether to gracefully continue without raising an error if an operation fails
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
Returns
a list of file contents
def read_json(path): (source)

Reads a JSON file.

Parameters
path: the filepath
Returns
the JSON data
def read_ndjson(path): (source)

Reads an NDJSON file.

Parameters
path: the filepath
Returns
a list of JSON dicts
def read_yaml(path): (source)

Reads a YAML file.

Parameters
path: the filepath
Returns
the loaded YAML data
def realpath(path): (source)

Converts the given path to absolute, resolving symlinks and relative path indicators such as . and ...

Parameters
path: the filepath
Returns
the resolved path
def run(fcn, tasks, return_results=True, num_workers=None, progress=None): (source)

Applies the given function to each element of the given tasks.

Parameters
fcn: a function that accepts a single argument
tasks: an iterable of function arguments
return_results (True): whether to return the function results
num_workers (None): a suggested number of threads to use
progress (None): whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
Returns
the list of function outputs, or None if return_results == False
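
The pattern of mapping one function over many tasks with a pool of threads can be sketched with concurrent.futures. This is an illustrative stand-in: run_sketch is a hypothetical name, the progress parameter is omitted, and the thread pool used here is a swapped-in technique, not necessarily what this module uses.

```python
from concurrent.futures import ThreadPoolExecutor


def run_sketch(fcn, tasks, return_results=True, num_workers=None):
    """Hypothetical helper: apply fcn to each task using worker threads."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # pool.map yields results in the same order as the input tasks
        results = list(pool.map(fcn, tasks))
    return results if return_results else None


def square(x):
    return x * x


squares = run_sketch(square, [1, 2, 3], num_workers=2)
nothing = run_sketch(square, [1, 2, 3], return_results=False)
```
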
def sep(path): (source)

Returns the path separator for the given path.

Parameters
path: the filepath
Returns
the path separator
def split_prefix(path): (source)

Splits the file system prefix from the given path.

The prefix for local paths is "".

Example usages:

import fiftyone.core.storage as fos

fos.split_prefix("/path/to/file")       # ('', '/path/to/file')
fos.split_prefix("a/file")              # ('', 'a/file')
Parameters
path: a path
Returns
a (prefix, path) tuple
def write_file(str_or_bytes, path): (source)

Writes the given string/bytes to a file.

If a string is provided, it is encoded via .encode().

Parameters
str_or_bytes: the string or bytes
path: the filepath
def write_json(d, path, pretty_print=False): (source)

Writes a JSON object to a file.

Parameters
d: JSON data
path: the filepath
pretty_print (False): whether to render the JSON in human readable format with newlines and indentations
def write_ndjson(obj, path): (source)

Writes the list of JSON dicts in NDJSON format.

Parameters
obj: a list of JSON dicts
path: the filepath
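
NDJSON (newline-delimited JSON) stores one JSON document per line. The sketch below shows the format with in-memory strings; dumps_ndjson and loads_ndjson are hypothetical helpers for illustration and are not functions of this module.

```python
import json


def dumps_ndjson(dicts):
    """Hypothetical helper: serialize each dict on its own line."""
    return "".join(json.dumps(d) + "\n" for d in dicts)


def loads_ndjson(text):
    """Hypothetical helper: parse one JSON document per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


records = [{"id": 1}, {"id": 2}]
text = dumps_ndjson(records)
round_trip = loads_ndjson(text)
```
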
def write_yaml(obj, path, **kwargs): (source)

Writes the object to a YAML file.

Parameters
obj: a Python object
path: the filepath
**kwargs: optional arguments for yaml.dump(..., **kwargs)

Variable logger

Undocumented

def _copy_file(inpath, outpath, cleanup=False): (source)

Undocumented

def _copy_files(inpaths, outpaths, skip_failures, progress): (source)

Undocumented

def _delete_file(filepath): (source)

Undocumented

def _do_copy_file(arg): (source)

Undocumented

def _do_delete_file(arg): (source)

Undocumented

def _do_move_file(arg): (source)

Undocumented

def _do_open_file(arg): (source)

Undocumented

def _do_read_file(arg): (source)

Undocumented

def _get_local_metadata(filepath): (source)

Undocumented

def _open_file(path, mode): (source)

Undocumented

def _read_file(filepath, binary=False): (source)

Undocumented

def _run(fcn, tasks, return_results=True, num_workers=None, progress=None): (source)

Undocumented

def _to_bytes(val, encoding='utf-8'): (source)

Undocumented