class documentation

Class for iterating over the elements of an iterable with a dynamic batch size to achieve a desired content size.

The batch sizes emitted when iterating over this object are dynamically scaled such that the total content size of the batch is as close as possible to a specified target size.

This batcher requires that backpressure feedback be provided, either by providing a BSON-able batch from which the content size can be computed, or by manually providing the content size.
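When the content size is provided manually, the caller needs some way to measure a batch in bytes. The following is a minimal sketch, not part of FiftyOne: `estimate_content_size` is a hypothetical helper, and JSON is used only as a stand-in for BSON, so the numbers it produces will differ from true BSON sizes.

```python
import json


def estimate_content_size(batch):
    """Hypothetical helper: approximate the serialized size of a batch, in bytes."""
    # JSON is only a stand-in for BSON here; the two encodings differ in size
    return sum(len(json.dumps(doc).encode("utf-8")) for doc in batch)


batch = [
    {"filepath": "/tmp/a.jpg", "label": "cat"},
    {"filepath": "/tmp/b.jpg", "label": "dog"},
]

size = estimate_content_size(batch)

# The resulting integer is the kind of value that would be passed as the
# manual measurement, e.g. batcher.apply_backpressure(size)
print("estimated content size: %d bytes" % size)
```

If the batch itself is BSON-able, passing the batch directly avoids the need for a helper like this.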

This class is often used in conjunction with a ProgressBar to keep the user apprised of the status of a long-running task.

Example usage:

import fiftyone.core.utils as fou

elements = range(int(1e7))

batcher = fou.ContentSizeDynamicBatcher(
    elements, target_size=2**20, max_batch_beta=2.0
)

# Raises ValueError after the first batch because backpressure was never applied
for batch in batcher:
    print("batch size: %d" % len(batch))

# Now it works
for batch in batcher:
    print("batch size: %d" % len(batch))
    batcher.apply_backpressure(batch)

batcher = fou.ContentSizeDynamicBatcher(
    elements,
    target_size=2**20,
    max_batch_beta=2.0,
    progress=True
)

with batcher:
    for batch in batcher:
        print("batch size: %d" % len(batch))
        batcher.apply_backpressure(batch)
Parameters
iterable: an iterable to batch over. If None, the batcher is an infinite iterator and next() returns a batch size instead of a batch
target_size: the target batch BSON content size, in bytes
init_batch_size: the initial batch size to use
min_batch_size: the minimum allowed batch size
max_batch_size: an optional maximum allowed batch size
max_batch_beta: an optional lower/upper bound on the ratio between successive batch sizes
return_views: whether to return each batch as a fiftyone.core.view.DatasetView. Only applicable when the iterable is a fiftyone.core.collections.SampleCollection
progress: whether to render a progress bar tracking the consumption of the batches (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
total: the length of iterable. Only applicable when progress=True. If not provided, it is computed via len(iterable), if possible
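The way these parameters could interact can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: `next_batch_size` is a hypothetical function, and it assumes that `max_batch_beta` bounds the ratio between successive batch sizes to the interval [1 / max_batch_beta, max_batch_beta].

```python
def next_batch_size(
    last_batch_size,
    last_content_size,
    target_size=2**20,
    min_batch_size=1,
    max_batch_size=None,
    max_batch_beta=None,
):
    """Hypothetical helper: scale the last batch size toward the target size."""
    if last_content_size <= 0:
        # No usable measurement; keep the previous size (an assumption)
        beta = 1.0
    else:
        # Ratio by which the last batch missed the target
        beta = target_size / last_content_size

    if max_batch_beta is not None:
        # Assumed interpretation: clamp the ratio to [1 / beta_max, beta_max]
        beta = min(max(beta, 1.0 / max_batch_beta), max_batch_beta)

    batch_size = int(round(beta * last_batch_size))

    # Enforce the absolute bounds on the batch size
    batch_size = max(batch_size, min_batch_size)
    if max_batch_size is not None:
        batch_size = min(batch_size, max_batch_size)

    return batch_size


# The last batch of 100 elements was half the target size, so the size doubles
print(next_batch_size(100, 2**19))

# The last batch was far too small, but growth is capped by max_batch_beta
print(next_batch_size(100, 2**10, max_batch_beta=2.0))
```

Bounding the ratio keeps a single unusually small or large batch from causing a drastic jump in the next batch size.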
Method __init__ Undocumented
Method apply_backpressure Apply backpressure needed to rightsize the next batch.
Class Variable manual_backpressure Undocumented
Method _get_measurement Get backpressure measurement for current batch.
Instance Variable _last_batch_content_size Undocumented
Instance Variable _manually_applied_backpressure Undocumented

Inherited from BaseDynamicBatcher:

Instance Variable target_measurement Undocumented
Method _compute_batch_size Return next batch size. Concrete classes must implement.
Instance Variable _last_batch_size Undocumented

Inherited from Batcher (via BaseDynamicBatcher):

Method __enter__ Undocumented
Method __exit__ Undocumented
Method __iter__ Undocumented
Method __next__ Undocumented
Instance Variable iterable Undocumented
Instance Variable _in_context Undocumented
Instance Variable _iter Undocumented
Instance Variable _last_offset Undocumented
Instance Variable _num_samples Undocumented
Instance Variable _pb Undocumented
Instance Variable _render_progress Undocumented
def __init__(self, iterable, target_size=2 ** 20, init_batch_size=1, min_batch_size=1, max_batch_size=None, max_batch_beta=None, return_views=False, progress=False, total=None):
def apply_backpressure(self, batch_or_size):

Apply backpressure needed to rightsize the next batch.

Must be implemented, and called on every iteration, when self.manual_backpressure is True.

Subclasses define the arguments and behavior of this method.
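The contract described above, where backpressure must be applied once per iteration, can be illustrated with a toy class. This is a sketch only: `ToyBatcher` and its `_applied` flag are hypothetical stand-ins and do not reproduce the FiftyOne implementation.

```python
class ToyBatcher:
    """Hypothetical stand-in that enforces one backpressure call per batch."""

    manual_backpressure = True

    def __init__(self, iterable, batch_size=2):
        self._iter = iter(iterable)
        self._batch_size = batch_size
        # Nothing to account for before the first batch is emitted
        self._applied = True

    def __iter__(self):
        return self

    def __next__(self):
        if self.manual_backpressure and not self._applied:
            raise ValueError(
                "apply_backpressure() was not called for the previous batch"
            )

        batch = []
        for item in self._iter:
            batch.append(item)
            if len(batch) >= self._batch_size:
                break

        if not batch:
            raise StopIteration

        self._applied = False
        return batch

    def apply_backpressure(self, batch_or_size):
        # A real subclass would record a measurement from batch_or_size here
        self._applied = True


batcher = ToyBatcher(range(5))
for batch in batcher:
    print("batch size: %d" % len(batch))
    batcher.apply_backpressure(batch)
```

Omitting the `apply_backpressure()` call in the loop makes the second call to `next()` raise a ValueError in this sketch.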

manual_backpressure: bool
def _get_measurement(self):

Get backpressure measurement for current batch.

_last_batch_content_size

Undocumented

_manually_applied_backpressure: bool