Core utilities.
Class | add |
Context manager that temporarily inserts a path to sys.path. |
Class |
|
Base class for iterating over the elements of an iterable in chunks. |
Class |
|
Class for iterating over the elements of an iterable with a dynamic batch size to achieve a desired target measurement. |
Class |
|
Base class for iterating over the elements of an iterable in batches. |
Class |
|
Class for iterating over the elements of an iterable with a dynamic batch size to achieve a desired content size. |
Class |
|
Class for iterating over the elements of an iterable with a dynamic batch size to achieve a desired content size. |
Class |
|
Class for iterating over the elements of an iterable with a dynamic batch size to achieve a desired latency. |
Class |
|
Proxy module that lazily imports the underlying module the first time it is actually used. |
Class |
|
Context manager that allows for a temporary change to the level of a logging.Logger. |
Class |
|
Context manager that temporarily monkey patches the given function. |
Class |
|
A class that generates unique output paths in a directory. This is multiprocess safe and uses a shared temporary directory structure organized by parent process ID and configuration hash. The approach is robust and handles edge cases like idempotency and file extensions. |
Class |
|
No summary |
Class |
|
Context manager that allows for a temporary change to a resource limit exposed by the resource package. |
Class |
|
Wrapper around a requests.Response that provides a file-like object interface with read(), seek(), and tell() methods. |
Class |
|
Context manager that temporarily sets the attributes of a class to new values. |
Class |
|
Class for iterating over the elements of an iterable with a static batch size. |
Class |
|
Context manager that temporarily disables system-wide logging. |
Class |
|
A class that generates unique output paths in a directory. |
Function | available |
Returns the available patterns that can be used by fill_patterns . |
Function | call |
Registers the given callback function so that it will be called when the process exits for (almost) any reason. |
Function | compute |
Computes the hash of the given file. |
Function | datetime |
Converts a datetime.date or datetime.datetime to milliseconds since epoch. |
Function | default |
Undocumented |
Function | deserialize |
Loads a serialized numpy array generated by serialize_numpy_array . |
Function | disable |
Context manager that temporarily disables all progress bars. |
Function | ensure |
Verifies that the given requirement is installed and importable. |
Function | ensure |
Verifies that the given package is installed. |
Function | ensure |
Verifies that the package requirements from a requirements.txt file on disk are installed. |
Function | ensure |
Verifies that tensorflow is installed and importable. |
Function | ensure |
Verifies that tensorflow_datasets is installed and importable. |
Function | ensure |
Verifies that torch and torchvision are installed and importable. |
Function | extract |
Extracts keyword arguments for the given class's constructor from the given kwargs. |
Function | extract |
Extracts keyword arguments for the given function from the given kwargs. |
Function | fill |
Fills the patterns in the given string. |
Function | find |
Finds all files in the given root directory whose filename matches the given glob pattern(s). |
Function | get |
Returns a Batcher over iterable using defaults from your FiftyOne config. |
Function | get |
Returns the preferred multiprocessing context for the current OS. |
Function | handle |
Handles the error at the specified error level. |
Function | indent |
Indents the lines in the given string. |
Function | install |
Installs the given package via pip. |
Function | install |
Installs the package requirements from a requirements.txt file on disk. |
Function | is |
Determines whether the system is 32-bit. |
Function | is |
Determines whether the system is an ARM-based Mac (Apple Silicon). |
Function | is |
Determines whether we're currently running in a container. |
Function | iter |
Iterates over the given iterable in batches. |
Function | iter |
Iterates over batches of the given object via slicing. |
Function | justify |
Justifies the headings in a list of (heading, content) string tuples by appending whitespace as necessary to each heading. |
Function | lazy |
Returns a proxy module object that will lazily import the given module the first time it is used. |
Function | load |
Loads the package requirements from a requirements.txt file on disk. |
Function | load |
Loads the XML file as a JSON dictionary. |
Function | parse |
Parses the given batching strategy configuration, applying any default config settings as necessary. |
Function | parse |
Parses the given object as an instance of the given eta.core.serial.Serializable class. |
Function | pformat |
Returns a pretty string representation of the Python object. |
Function | pprint |
Pretty-prints the Python object. |
Function | recommend |
Computes a recommended batch size for the given value type such that a request involving a list of values of this size will be less than alpha * fo.config.batcher_target_size_bytes bytes. |
Function | recommend |
Recommends a number of workers for a process pool. |
Function | recommend |
Recommends a number of workers for a thread pool. |
Function | report |
Wraps the provided progress function such that it will only be called at the specified increments or time intervals. |
Async Function | run |
Run a synchronous function as an async background task. |
Function | safe |
A safe version of os.path.relpath that returns a configurable default value if the given path does not lie within the given relative start. |
Function | serialize |
Serializes a numpy array. |
Function | set |
Uses the resource package to change a resource limit for the current process. |
Function | split |
Splits the given fields into sample and frame fields. |
Function | stream |
Streams the iterable of objects to stdout via less. |
Function | timedelta |
Converts a datetime.timedelta to milliseconds. |
Function | timestamp |
Converts a timestamp (number of milliseconds since epoch) to a datetime.datetime . |
Function | to |
Returns the URL-friendly slug for the given string. |
Function | validate |
Validates that the given value is a valid css color name. |
Function | validate |
Validates that the given value is a hex color string or css name. |
Variable | fos |
Undocumented |
Variable | logger |
Undocumented |
Variable | sync |
Undocumented |
Function | __rm |
Undocumented |
Function | _extract |
Undocumented |
Function | _get |
Undocumented |
Function | _is |
Undocumented |
Function | _is |
Undocumented |
Function | _report |
Undocumented |
Function | _report |
Undocumented |
Function | _sanitize |
Undocumented |
Function | _split |
Undocumented |
Function | _strip |
Undocumented |
Constant | _HYPHEN |
Undocumented |
Constant | _NAME |
Undocumented |
Constant | _REQUIREMENT |
Undocumented |
Constant | _SAFE |
Undocumented |
Returns the available patterns that can be used by fill_patterns.
Returns | |
a dict mapping patterns to their replacements |
Registers the given callback function so that it will be called when the process exits for (almost) any reason.
Note that this should only be used from non-interactive scripts because it intercepts ctrl + c.
Covers the following cases:
- normal program termination
- a Python exception is raised
- a SIGTERM signal is received
Parameters | |
callback | the function to execute upon termination |
Computes the hash of the given file.
Parameters | |
filepath | the path to the file |
method:None | an optional hashlib method to use. If not specified, the builtin str.__hash__ will be used |
chunkNone | an optional chunk size to use to read the file, in bytes. Only applicable when a method is provided. The default is 64kB. If negative, the entire file is read at once |
Returns | |
the hash |
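As a rough illustration of the chunked reading described above, the following sketch hashes a file with a hashlib method in fixed-size chunks. The name `sketch_file_hash`, its parameter names, and the choice to return a hex digest are assumptions made for illustration; this is not the library's implementation, and the fallback to the builtin hash when no method is given is omitted.

```python
import hashlib
import os
import tempfile


def sketch_file_hash(filepath, method="md5", chunk_size=64 * 1024):
    """Hypothetical helper: hash a file with a hashlib method, in chunks."""
    hasher = hashlib.new(method)
    with open(filepath, "rb") as f:
        if chunk_size < 0:
            # A negative chunk size means the entire file is read at once
            hasher.update(f.read())
        else:
            # Read fixed-size chunks until an empty bytes object signals EOF
            for chunk in iter(lambda: f.read(chunk_size), b""):
                hasher.update(chunk)
    return hasher.hexdigest()


# Usage: chunked and whole-file reads yield the same digest
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello world")
demo_path = tmp.name
chunked = sketch_file_hash(demo_path, chunk_size=4)
whole = sketch_file_hash(demo_path, chunk_size=-1)
os.remove(demo_path)
```

Reading in chunks keeps memory use bounded for large files, which is presumably why a chunk size is exposed at all.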
Converts a datetime.date or datetime.datetime to milliseconds since epoch.
Parameters | |
dt | a datetime.date or datetime.datetime |
Returns | |
the float number of milliseconds since epoch |
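The conversion can be sketched with the standard library alone. The helper name `sketch_datetime_to_ms` is hypothetical, and treating naive values as UTC is an assumption of this sketch rather than documented behavior.

```python
import datetime


def sketch_datetime_to_ms(dt):
    """Hypothetical helper: milliseconds since epoch for a date or datetime."""
    if not isinstance(dt, datetime.datetime):
        # Promote a plain date to midnight of that day
        dt = datetime.datetime(dt.year, dt.month, dt.day)
    if dt.tzinfo is None:
        # Assumption: naive values are interpreted as UTC
        dt = dt.replace(tzinfo=datetime.timezone.utc)
    epoch = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)
    return (dt - epoch).total_seconds() * 1000.0


one_day_ms = sketch_datetime_to_ms(datetime.date(1970, 1, 2))
one_second_ms = sketch_datetime_to_ms(datetime.datetime(1970, 1, 1, 0, 0, 1))
```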
Loads a serialized numpy array generated by serialize_numpy_array.
Parameters | |
numpy | the serialized numpy array bytes |
ascii:False | whether the bytes were generated with the ascii == True parameter of serialize_numpy_array |
Returns | |
the numpy array |
Context manager that temporarily disables all progress bars.
Example usage:
    import fiftyone as fo
    import fiftyone.zoo as foz

    with fo.disable_progress_bars():
        dataset = foz.load_zoo_dataset("quickstart")
Verifies that the given requirement is installed and importable.
This function imports the specified module and optionally enforces any version requirements included in requirement_str.
Therefore, unlike ensure_package, requirement_str should refer to the module name (e.g., "tensorflow"), not the package name (e.g., "tensorflow-gpu").
Parameters | |
requirement | a PEP 440-like module requirement, like "tensorflow", "tensorflow<2", "tensorflow==2.3.0", or "tensorflow>=1.13,<1.15". This can also be an iterable of multiple requirements, all of which must be installed, or this can be a single "|"-delimited string specifying multiple requirements, at least one of which must be installed |
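errorNone | the error level to use, defined as: 0 (raise an error if the requirement is not satisfied), 1 (log a warning if the requirement is not satisfied), 2 (ignore unsatisfied requirements). By default, fiftyone.config.requirement_error_level is used |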
errorNone | an optional custom error message to use |
logFalse | whether to generate a log message if the requirement is satisfied |
Returns | |
True/False whether the requirement is satisfied |
Verifies that the given package is installed.
This function uses importlib.metadata to locate the package by its pip name and does not actually import the module.
Therefore, unlike ensure_import, requirement_str should refer to the package name (e.g., "tensorflow-gpu"), not the module name (e.g., "tensorflow").
Parameters | |
requirement | a PEP 440 compliant package requirement, like "tensorflow", "tensorflow<2", "tensorflow==2.3.0", or "tensorflow>=1.13,<1.15". This can also be an iterable of multiple requirements, all of which must be installed, or this can be a single "|"-delimited string specifying multiple requirements, at least one of which must be installed |
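errorNone | the error level to use, defined as: 0 (raise an error if the requirement is not satisfied), 1 (log a warning if the requirement is not satisfied), 2 (ignore unsatisfied requirements). By default, fiftyone.config.requirement_error_level is used |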
errorNone | an optional custom error message to use |
logFalse | whether to generate a log message if the requirement is satisfied |
Returns | |
True/False whether the requirement is satisfied |
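The difference between checking a package by its pip name and checking a module by importing it can be sketched as follows. Both helper names are hypothetical, version specifiers and the "|"-delimited syntax are ignored, and this is only an illustration of the distinction drawn above, not the library's code.

```python
import importlib
import importlib.metadata


def sketch_package_installed(package_name):
    """Hypothetical helper: is a distribution with this pip name installed?

    Looks up installed metadata only; the module itself is never imported.
    """
    try:
        importlib.metadata.version(package_name)
    except importlib.metadata.PackageNotFoundError:
        return False
    return True


def sketch_module_importable(module_name):
    """Hypothetical helper: can this module actually be imported?"""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


missing_package = sketch_package_installed("no-such-distribution-xyz-123")
stdlib_module = sketch_module_importable("json")
missing_module = sketch_module_importable("no_such_module_xyz_123")
```

A metadata lookup is cheap and has no import side effects, while an actual import also catches packages that are installed but broken.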
Verifies that the package requirements from a requirements.txt file on disk are installed.
Parameters | |
requirements | the path to a requirements file |
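errorNone | the error level to use, defined as: 0 (raise an error if a requirement is not satisfied), 1 (log a warning if a requirement is not satisfied), 2 (ignore unsatisfied requirements). By default, fiftyone.config.requirement_error_level is used |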
logFalse | whether to generate a log message if a requirement is satisfied |
Verifies that tensorflow is installed and importable.
Parameters | |
eager:False | whether to require that TF is executing eagerly. If True and TF is not currently executing eagerly, this method will attempt to enable it |
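errorNone | the error level to use, defined as: 0 (raise an error if the requirement is not satisfied), 1 (log a warning if the requirement is not satisfied), 2 (ignore unsatisfied requirements). By default, fiftyone.config.requirement_error_level is used |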
errorNone | an optional custom error message to print |
Returns | |
True/False whether the requirement is satisfied |
Verifies that tensorflow_datasets is installed and importable.
Parameters | |
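errorNone | the error level to use, defined as: 0 (raise an error if the requirement is not satisfied), 1 (log a warning if the requirement is not satisfied), 2 (ignore unsatisfied requirements). By default, fiftyone.config.requirement_error_level is used |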
errorNone | an optional custom error message to print |
Returns | |
True/False whether the requirement is satisfied |
Verifies that torch and torchvision are installed and importable.
Parameters | |
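errorNone | the error level to use, defined as: 0 (raise an error if the requirement is not satisfied), 1 (log a warning if the requirement is not satisfied), 2 (ignore unsatisfied requirements). By default, fiftyone.config.requirement_error_level is used |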
errorNone | an optional custom error message to print |
Returns | |
True/False whether the requirement is satisfied |
Extracts keyword arguments for the given class's constructor from the given kwargs.
Parameters | |
cls | a class |
kwargs | a dictionary of keyword arguments |
Returns | |
a tuple of |
|
Extracts keyword arguments for the given function from the given kwargs.
Parameters | |
fcn | a function |
kwargs | a dictionary of keyword arguments |
Returns | |
a tuple of |
|
Fills the patterns in the given string.
Use available_patterns to see the available patterns that can be used.
Parameters | |
string | a string |
Returns | |
a copy of string with any patterns replaced |
Finds all files in the given root directory whose filename matches the given glob pattern(s).
Both root_dir and patt may contain glob patterns.
Examples:
    import fiftyone.core.utils as fou

    # Find .txt files in `/tmp`
    fou.find_files("/tmp", "*.txt")

    # Find .txt files in subdirectories of `/tmp` that begin with `foo-`
    fou.find_files("/tmp/foo-*", "*.txt")

    # Find .txt files in `/tmp` or its immediate subdirectories
    fou.find_files("/tmp", "*.txt", max_depth=2)
Parameters | |
root | the root directory |
patt | a glob pattern or list of patterns |
max | a maximum depth to search. 1 means root_dir only, 2 means root_dir and its immediate subdirectories, etc |
Returns | |
a list of matching paths |
Returns a Batcher over iterable using defaults from your FiftyOne config.
If no batcher is provided, this method uses fiftyone.config.default_batcher to determine the implementation to use and related configuration values as needed for each.
Parameters | |
iterable | an iterable to batch over. If None, the batcher is an infinite iterator and each call to next() returns a batch size instead of a batch |
batcher:None | a specific Batcher subclass to use, or False to disable batching |
transformNone | a transform function to apply to each item |
sizeNone | a function that calculates the size of each item. This is applied after transform_fn if both are provided. Only applicable when fiftyone.config.default_batcher="size" |
progress:False | whether to render a progress bar tracking the consumption of the batches (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
total:None | the length of iterable. Only applicable when progress=True. If not provided, it is computed via len(iterable), if possible |
Returns | |
a Batcher instance |
Returns the preferred multiprocessing context for the current OS.
When running on macOS or Linux with no start method configured, this method will set the default start method to "fork".
Returns | |
a multiprocessing context |
Handles the error at the specified error level.
Parameters | |
error | an Exception instance |
error | the error level to use, defined as: 0 (raise the error), 1 (log the error as a warning), 2 (ignore the error) |
base | (optional) a base Exception from which to raise error |
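The three error levels can be sketched as a small dispatch function. The name `sketch_handle_error` and its parameter names are hypothetical, and the use of the standard logging module is an assumption of this sketch.

```python
import logging

logger = logging.getLogger(__name__)


def sketch_handle_error(error, error_level, base_error=None):
    """Hypothetical helper: raise, log, or ignore an error by level."""
    if error_level <= 0:
        # Level 0: raise the error, optionally chained from a base exception
        if base_error is not None:
            raise error from base_error
        raise error

    if error_level == 1:
        # Level 1: log the error as a warning and continue
        logger.warning(error)

    # Level 2 (or higher): ignore the error


ignored = sketch_handle_error(ValueError("ignored"), 2)

raised = False
try:
    sketch_handle_error(ValueError("fatal"), 0)
except ValueError:
    raised = True
```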
Indents the lines in the given string.
Parameters | |
s | the string |
indent:4 | the number of spaces to indent |
skip:0 | the number of lines to skip before indenting |
Returns | |
the indented string |
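A minimal sketch of this behavior, under the assumption that lines are separated by "\n" and that skipped lines are left untouched; `sketch_indent_lines` is a hypothetical name, not the library function.

```python
def sketch_indent_lines(s, indent=4, skip=0):
    """Hypothetical helper: indent every line after the first `skip` lines."""
    prefix = " " * indent
    lines = s.split("\n")
    return "\n".join(
        line if i < skip else prefix + line for i, line in enumerate(lines)
    )


indented = sketch_indent_lines("first\nsecond\nthird", indent=2, skip=1)
```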
Installs the given package via pip.
Installation is performed via:
python -m pip install <requirement_str>
Parameters | |
requirement | a PEP 440 compliant package requirement, like "tensorflow", "tensorflow<2", "tensorflow==2.3.0", or "tensorflow>=1.13,<1.15" |
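errorNone | the error level to use, defined as: 0 (raise an error if the install fails), 1 (log a warning if the install fails), 2 (ignore install failures) |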
errorNone | an optional custom error message to use |
Installs the package requirements from a requirements.txt file on disk.
Parameters | |
requirements | the path to a requirements file |
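errorNone | the error level to use, defined as: 0 (raise an error if the install fails), 1 (log a warning if the install fails), 2 (ignore install failures). By default, fiftyone.config.requirement_error_level is used |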
Iterates over the given iterable in batches.
Parameters | |
iterable | an iterable |
batch | the desired batch size, or None to return the contents in a single batch |
Returns | |
a generator that emits tuples of elements of the requested batch size from the input |
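One way such a generator could be written is with itertools.islice. This is a sketch under that assumption; the name `sketch_iter_batches` is hypothetical.

```python
import itertools


def sketch_iter_batches(iterable, batch_size):
    """Hypothetical helper: yield tuples of up to batch_size elements."""
    it = iter(iterable)
    if batch_size is None:
        # No batch size: emit the full contents as a single batch
        yield tuple(it)
        return

    while True:
        batch = tuple(itertools.islice(it, batch_size))
        if not batch:
            return
        yield batch


batches = list(sketch_iter_batches(range(7), 3))
single = list(sketch_iter_batches(range(3), None))
```

Because only the iterator protocol is used, this works for generators and other inputs that do not support len() or slicing; the final batch may be shorter than the requested size.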
Iterates over batches of the given object via slicing.
Parameters | |
sliceable | an object that supports slicing |
batch | the desired batch size, or None to return the contents in a single batch |
Returns | |
a generator that emits batches of elements of the requested batch size from the input |
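For objects that support slicing, batching can be sketched without consuming an iterator. The name `sketch_iter_slices` is hypothetical, and stopping at the first empty slice is an assumption of this sketch.

```python
def sketch_iter_slices(sliceable, batch_size):
    """Hypothetical helper: yield consecutive slices of up to batch_size."""
    if batch_size is None:
        # No batch size: emit the full contents as a single batch
        yield sliceable
        return

    start = 0
    while True:
        chunk = sliceable[start : start + batch_size]
        if len(chunk) == 0:
            return
        start += batch_size
        yield chunk


slices = list(sketch_iter_slices("abcdefg", 3))
```

Each batch has the same type as the input (here, strings), since it is produced by slicing rather than by collecting elements into tuples.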
Justifies the headings in a list of (heading, content) string tuples by appending whitespace as necessary to each heading.
Parameters | |
elements | a list of (heading, content) tuples |
width:None | an optional justification width. By default, the maximum heading length is used |
Returns | |
a list of justified (heading, content) tuples |
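The padding step can be sketched with str.ljust. The name `sketch_justify_headings` is hypothetical; the default width follows the description above (the maximum heading length).

```python
def sketch_justify_headings(elements, width=None):
    """Hypothetical helper: right-pad each heading to a common width."""
    if width is None:
        # Default: the length of the longest heading
        width = max((len(heading) for heading, _ in elements), default=0)
    return [(heading.ljust(width), content) for heading, content in elements]


justified = sketch_justify_headings([("Name:", "foo"), ("Media type:", "image")])
```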
Returns a proxy module object that will lazily import the given module the first time it is used.
Example usage:
    # Lazy version of `import tensorflow as tf`
    tf = lazy_import("tensorflow")

    # Other commands

    # Now the module is loaded
    tf.__version__
Parameters | |
module | the fully-qualified module name to import |
callback:None | a callback function to call before importing the module |
Returns | |
a proxy module object that will be lazily imported when first used |
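The general technique behind a lazy module proxy can be sketched with a types.ModuleType subclass whose `__getattr__` triggers the real import. The class `SketchLazyModule` and its private attribute names are hypothetical; the library's own proxy may differ in its details.

```python
import importlib
import types


class SketchLazyModule(types.ModuleType):
    """Hypothetical proxy that imports the real module on first attribute use."""

    def __init__(self, module_name, callback=None):
        super().__init__(module_name)
        self.__dict__["_sketch_module"] = None
        self.__dict__["_sketch_callback"] = callback

    def __getattr__(self, name):
        # Only reached when normal attribute lookup on the proxy fails
        module = self.__dict__["_sketch_module"]
        if module is None:
            callback = self.__dict__["_sketch_callback"]
            if callback is not None:
                # Invoke the callback just before the real import happens
                callback()
            module = importlib.import_module(self.__name__)
            self.__dict__["_sketch_module"] = module
        return getattr(module, name)


calls = []
lazy_json = SketchLazyModule("json", callback=lambda: calls.append("imported"))
calls_before = list(calls)       # nothing has been imported via the proxy yet
encoded = lazy_json.dumps([1])   # first use triggers the import
lazy_json.loads("[1]")           # later uses reuse the loaded module
```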
Loads the package requirements from a requirements.txt file on disk.
Comments and extra whitespace are automatically stripped.
Parameters | |
requirements | the path to a requirements file |
Returns | |
a list of requirement strings |
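The stripping of comments and whitespace can be sketched as below. For simplicity the hypothetical `sketch_parse_requirements` takes the file contents rather than a path, and it treats every "#" as the start of a comment, which is a simplification (real requirement lines can contain "#" inside URLs).

```python
def sketch_parse_requirements(text):
    """Hypothetical helper: extract requirement strings from file contents."""
    requirements = []
    for line in text.splitlines():
        # Drop trailing comments, then surrounding whitespace
        line = line.split("#", 1)[0].strip()
        if line:
            requirements.append(line)
    return requirements


parsed = sketch_parse_requirements(
    "# core deps\nnumpy>=1.0  # pinned\n\n   torch   \n"
)
```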
Loads the XML file as a JSON dictionary.
Parameters | |
xml | the path to the XML file |
Returns | |
a JSON dict |
Parses the given batching strategy configuration, applying any default config settings as necessary.
Parameters | |
batchNone | the batch size to use. If a batching_strategy is provided, this parameter configures that strategy as described below. If no batching_strategy is provided, this can either be an integer specifying the number of samples to save in a batch (in which case batching_strategy is implicitly set to "static") or a float number of seconds between batched saves (in which case batching_strategy is implicitly set to "latency") |
batchingNone | the batching strategy to use for each save operation. Supported values are:
By default, fo.config.default_batcher is used |
Returns | |
a tuple of (batch_size, batching_strategy) |
Parses the given object as an instance of the given eta.core.serial.Serializable class.
Parameters | |
obj | an instance of cls, or a serialized string or dictionary representation of one |
cls | a eta.core.serial.Serializable class |
Returns | |
an instance of cls |
Returns a pretty string representation of the Python object.
Parameters | |
obj | the Python object |
indent:4 | the number of spaces to use when indenting |
width:80 | the max width of each line in the pretty representation |
depth:None | the maximum depth at which to pretty render nested dicts |
Returns | |
the pretty-formatted string |
Pretty-prints the Python object.
Parameters | |
obj | the Python object |
stream:None | the stream to write to. The default is sys.stdout |
indent:4 | the number of spaces to use when indenting |
width:80 | the max width of each line in the pretty representation |
depth:None | the maximum depth at which to pretty render nested dicts |
Computes a recommended batch size for the given value type such that a request involving a list of values of this size will be less than alpha * fo.config.batcher_target_size_bytes bytes.
Parameters | |
value | a value |
alpha:0.9 | a safety factor |
maxNone | an optional max batch size |
Returns | |
a recommended batch size |
Recommends a number of workers for a process pool.
If fo.config.max_process_pool_workers is set, this limit is applied.
Parameters | |
numNone | a suggested number of workers |
Returns | |
a number of workers |
Recommends a number of workers for a thread pool.
If fo.config.max_thread_pool_workers is set, this limit is applied.
Parameters | |
numNone | a suggested number of workers |
Returns | |
a number of workers |
Wraps the provided progress function such that it will only be called at the specified increments or time intervals.
Example usage:
    import fiftyone as fo
    import fiftyone.zoo as foz

    def print_progress(pb):
        if pb.complete:
            print("COMPLETE")
        else:
            print("PROGRESS: %0.3f" % pb.progress)

    dataset = foz.load_zoo_dataset("cifar10", split="test")

    # Print progress at 10 equally-spaced increments
    progress = fo.report_progress(print_progress, n=10)
    dataset.compute_metadata(progress=progress)

    # Print progress every 0.5 seconds
    progress = fo.report_progress(print_progress, dt=0.5)
    dataset.compute_metadata(progress=progress, overwrite=True)
Parameters | |
progress | a function that accepts a ProgressBar as input |
n:None | a number of equally-spaced increments to invoke progress |
dt:None | a number of seconds between progress calls |
Returns | |
a function that accepts a ProgressBar as input |
Run a synchronous function as an async background task.
Parameters | |
func | a synchronous callable |
*args | function arguments |
Returns | |
the function's return value(s) |
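A common way to run a blocking function from async code is to hand it to the event loop's default executor. The sketch below assumes that approach; `sketch_run_sync_task` is a hypothetical name, and the library may schedule the work differently.

```python
import asyncio


async def sketch_run_sync_task(func, *args):
    """Hypothetical helper: await a synchronous callable in a worker thread."""
    loop = asyncio.get_running_loop()
    # None selects the loop's default ThreadPoolExecutor
    return await loop.run_in_executor(None, func, *args)


total = asyncio.run(sketch_run_sync_task(sum, [1, 2, 3]))
```

Offloading to an executor keeps the event loop responsive while the synchronous function runs.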
A safe version of os.path.relpath that returns a configurable default value if the given path does not lie within the given relative start.
Parameters | |
path | a path |
start:None | the relative prefix to strip from path |
default:None | a default value to return if path does not lie within start. By default, the basename of the path is returned |
Returns | |
the relative path |
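The fallback logic can be sketched on top of os.path.relpath by checking whether the result escapes the start directory. The name `sketch_safe_relpath` is hypothetical, and the ".." check is one possible way to detect an outside path; the example paths assume a POSIX-style filesystem.

```python
import os


def sketch_safe_relpath(path, start=None, default=None):
    """Hypothetical helper: relpath with a fallback for paths outside start."""
    relpath = os.path.relpath(path, start)
    outside = relpath == os.pardir or relpath.startswith(os.pardir + os.sep)
    if outside:
        # path does not lie within start
        if default is not None:
            return default
        # By default, fall back to the basename of the path
        return os.path.basename(path)
    return relpath


inside = sketch_safe_relpath("/data/images/cat.jpg", "/data")
fallback = sketch_safe_relpath("/other/cat.jpg", "/data")
custom = sketch_safe_relpath("/other/cat.jpg", "/data", default="unknown")
```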
Serializes a numpy array.
Parameters | |
array | a numpy array-like |
ascii:False | whether to return a base64-encoded ASCII string instead of raw bytes |
Returns | |
the serialized bytes |
Uses the resource package to change a resource limit for the current process.
If the resource package cannot be imported, this command does nothing.
Parameters | |
limit | the name of the resource to limit. Must be the name of a constant in the resource module starting with RLIMIT. See the documentation of the resource module for supported values |
soft:None | a new soft limit to apply, which cannot exceed the hard limit. If omitted, the current soft limit is maintained |
hard:None | a new hard limit to apply. If omitted, the current hard limit is maintained |
warnFalse | whether to issue a warning rather than an error if the resource limit change is not successful |
Splits the given fields into sample and frame fields.
Frame fields are those prefixed by "frames.", and this prefix is removed from the returned frame fields.
Parameters | |
fields | a field, iterable of fields, or dict mapping field names to new field names |
Returns | |
a tuple of |
|
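The prefix handling described above can be sketched for the string and iterable cases. The name `sketch_split_frame_fields`, the omission of the dict case, and the return order (sample fields first, then frame fields) are all assumptions of this sketch, since the return description is not spelled out here.

```python
def sketch_split_frame_fields(fields, prefix="frames."):
    """Hypothetical helper: separate sample fields from frame fields."""
    if isinstance(fields, str):
        # A single field name is treated as a one-element list
        fields = [fields]

    sample_fields, frame_fields = [], []
    for field in fields:
        if field.startswith(prefix):
            # Frame fields are returned with the "frames." prefix removed
            frame_fields.append(field[len(prefix):])
        else:
            sample_fields.append(field)
    return sample_fields, frame_fields


sample_fields, frame_fields = sketch_split_frame_fields(
    ["filepath", "frames.detections", "tags"]
)
```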
Streams the iterable of objects to stdout via less.
The output can be interactively traversed via scrolling and can be terminated via keyboard interrupt.
Parameters | |
objects | an iterable of objects that can be printed via str(obj) |
Converts a datetime.timedelta to milliseconds.
Parameters | |
td | a datetime.timedelta |
Returns | |
the float number of milliseconds |
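With the standard library, this conversion amounts to dividing by a one-millisecond timedelta. A minimal sketch, using the hypothetical name `sketch_timedelta_to_ms`:

```python
import datetime


def sketch_timedelta_to_ms(td):
    """Hypothetical helper: float number of milliseconds in a timedelta."""
    return td / datetime.timedelta(milliseconds=1)


ms = sketch_timedelta_to_ms(datetime.timedelta(seconds=1.5))
```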
Converts a timestamp (number of milliseconds since epoch) to a datetime.datetime.
Parameters | |
ts | a number of milliseconds since epoch |
Returns | |
a datetime.datetime |
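The inverse conversion can be sketched by adding the milliseconds to the epoch. The name `sketch_ms_to_datetime` is hypothetical, and returning a naive UTC datetime is an assumption of this sketch.

```python
import datetime


def sketch_ms_to_datetime(ts):
    """Hypothetical helper: milliseconds since epoch -> naive UTC datetime."""
    epoch = datetime.datetime(1970, 1, 1)
    return epoch + datetime.timedelta(milliseconds=ts)


converted = sketch_ms_to_datetime(86400000)
```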
Returns the URL-friendly slug for the given string.
The following strategy is used to generate slugs:
- The characters A-Za-z0-9 are converted to lowercase
- Whitespace and +_.- are converted to -
- All other characters are omitted
- All consecutive - characters are reduced to a single -
- All leading and trailing - are stripped
- Both the input name and the resulting string must be [1, 100] characters in length
Examples:
name | slug |
coco_2017 | coco-2017 |
c+o+c+o 2-0-1-7 | c-o-c-o-2-0-1-7 |
cat.DOG | cat-dog |
---name---- | name |
Brian's #$&@ (Awesome?) Dataset! | brians-awesome-dataset |
sPaM aNd EgGs | spam-and-eggs |
Parameters | |
name | a string |
Returns | |
the slug string | |
Raises | |
ValueError | if the name is invalid |
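The slug strategy listed above can be sketched with a few regular expressions. The name `sketch_to_slug` is hypothetical and the regexes are one possible reading of the rules, not the library's code.

```python
import re


def sketch_to_slug(name):
    """Hypothetical helper: URL-friendly slug following the rules above."""
    if not isinstance(name, str) or not 1 <= len(name) <= 100:
        raise ValueError("name must be a string of 1 to 100 characters")

    # Whitespace and +_.- are converted to -
    slug = re.sub(r"[\s+_.\-]", "-", name)
    # All other non-alphanumeric characters are omitted
    slug = re.sub(r"[^A-Za-z0-9-]", "", slug)
    # Alphanumeric characters are converted to lowercase
    slug = slug.lower()
    # Consecutive - are collapsed; leading and trailing - are stripped
    slug = re.sub(r"-+", "-", slug).strip("-")

    if not 1 <= len(slug) <= 100:
        raise ValueError("slug must be 1 to 100 characters")
    return slug


slug = sketch_to_slug("Brian's #$&@ (Awesome?) Dataset!")
```

A name made up entirely of omitted characters, such as "!!!", yields an empty slug and therefore raises ValueError in this sketch.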
Validates that the given value is a valid css color name.
Parameters | |
value | a value |
Raises | |
ValueError | if value is not a valid css color name. |
Validates that the given value is a hex color string or css name.
Parameters | |
value | a value |
Raises | |
ValueError | if value is not a hex color string or css color name |