class SampleCollection(object): (source)
Known subclasses: fiftyone.core.dataset.Dataset, fiftyone.core.view.DatasetView
Abstract class representing an ordered collection of fiftyone.core.sample.Sample instances in a fiftyone.core.dataset.Dataset.
Class Method | list |
Returns a list of all available methods on this collection that apply fiftyone.core.aggregations.Aggregation operations to this collection. |
Class Method | list |
Returns a list of all available methods on this collection that apply fiftyone.core.stages.ViewStage operations to this collection. |
Method | __add__ |
Undocumented |
Method | __bool__ |
Undocumented |
Method | __contains__ |
Undocumented |
Method | __getitem__ |
Undocumented |
Method | __iter__ |
Undocumented |
Method | __len__ |
Undocumented |
Method | __repr__ |
Undocumented |
Method | __str__ |
Undocumented |
Method | add |
Applies the given fiftyone.core.stages.ViewStage to the collection. |
Method | aggregate |
Aggregates one or more fiftyone.core.aggregations.Aggregation instances. |
Method | annotate |
Exports the samples and optional label field(s) in this collection to the given annotation backend. |
Method | apply |
Applies the model to the samples in the collection. |
Method | bounds |
Computes the bounds of a numeric field of the collection. |
Method | compute |
Computes embeddings for the samples in the collection using the given model. |
Method | compute |
Populates the metadata field of all samples in the collection. |
Method | compute |
Computes embeddings for the image patches defined by patches_field of the samples in the collection using the given model. |
Method | concat |
Concatenates the contents of the given SampleCollection to this collection. |
Method | count |
Counts the number of field values in the collection. |
Method | count |
Counts the occurrences of all label tags in the specified label field(s) of this collection. |
Method | count |
Counts the occurrences of sample tags in this collection. |
Method | count |
Counts the occurrences of field values in the collection. |
Method | create |
Creates an index on the given field or with the given specification, if necessary. |
Method | delete |
Deletes the annotation run with the given key from this collection. |
Method | delete |
Deletes all annotation runs from this collection. |
Method | delete |
Deletes the brain method run with the given key from this collection. |
Method | delete |
Deletes all brain method runs from this collection. |
Method | delete |
Deletes the evaluation results associated with the given evaluation key from this collection. |
Method | delete |
Deletes all evaluation results from this collection. |
Method | delete |
Deletes the run with the given key from this collection. |
Method | delete |
Deletes all runs from this collection. |
Method | description |
Undocumented |
Method | distinct |
Computes the distinct values of a field in the collection. |
Method | draw |
Renders annotated versions of the media in the collection with the specified label data overlaid to the given directory. |
Method | drop |
Drops the index for the given field or name, if necessary. |
Method | evaluate |
Evaluates the classification predictions in this collection with respect to the specified ground truth labels. |
Method | evaluate |
Evaluates the specified predicted detections in this collection with respect to the specified ground truth detections. |
Method | evaluate |
Evaluates the regression predictions in this collection with respect to the specified ground truth values. |
Method | evaluate |
Evaluates the specified semantic segmentation masks in this collection with respect to the specified ground truth masks. |
Method | exclude |
Excludes the samples with the given IDs from the collection. |
Method | exclude |
Excludes the samples with the given field values from the collection. |
Method | exclude |
Excludes the fields with the given names from the samples in the collection. |
Method | exclude |
Excludes the frames with the given IDs from the video collection. |
Method | exclude |
Excludes the groups with the given IDs from the grouped collection. |
Method | exclude |
Excludes the specified labels from the collection. |
Method | exists |
Returns a view containing the samples in the collection that have (or do not have) a non-None value for the given field or embedded field. |
Method | export |
Exports the samples in the collection to disk. |
Method | filter |
Filters the values of a field or embedded field of each sample in the collection. |
Method | filter |
Filters the individual fiftyone.core.labels.Keypoint.points elements in the specified keypoints field of each sample in the collection. |
Method | filter |
Filters the fiftyone.core.labels.Label field of each sample in the collection. |
Method | first |
Returns the first sample in the collection. |
Method | flatten |
Returns a flattened view that contains all samples in the dynamic grouped collection. |
Method | geo |
Sorts the samples in the collection by their proximity to a specified geolocation. |
Method | geo |
Filters the samples in this collection to only include samples whose geolocation is within a specified boundary. |
Method | get |
Returns information about the annotation run with the given key on this collection. |
Method | get |
Returns information about the brain method run with the given key on this collection. |
Method | get |
Gets the classes list for the given field, or None if no classes are available. |
Method | get |
Returns a schema dictionary describing the dynamic fields of the samples in the collection. |
Method | get |
Returns a schema dictionary describing the dynamic fields of the frames in the collection. |
Method | get |
Returns information about the evaluation with the given key on this collection. |
Method | get |
Returns the field instance of the provided path, or None if one does not exist. |
Method | get |
Returns a schema dictionary describing the fields of the samples in the collection. |
Method | get |
Returns a schema dictionary describing the fields of the frames in the collection. |
Method | get |
Returns a dict containing the samples for the given group ID. |
Method | get |
Returns a dictionary of information about the indexes on this collection. |
Method | get |
Gets the mask targets for the given field, or None if no mask targets are available. |
Method | get |
Returns information about the run with the given key on this collection. |
Method | get |
Gets the keypoint skeleton for the given field, or None if no skeleton is available. |
Method | group |
Creates a view that groups the samples in the collection by a specified field or expression. |
Method | has |
Whether this collection has an annotation run with the given key. |
Method | has |
Whether this collection has a brain method run with the given key. |
Method | has |
Determines whether this collection has a classes list for the given field. |
Method | has |
Whether this collection has an evaluation with the given key. |
Method | has |
Determines whether the collection has a field with the given name. |
Method | has |
Determines whether the collection has a frame-level field with the given name. |
Method | has |
Determines whether this collection has mask targets for the given field. |
Method | has |
Whether this collection has a run with the given key. |
Method | has |
Determines whether the collection has a sample field with the given name. |
Method | has |
Determines whether this collection has a keypoint skeleton for the given field. |
Method | head |
Returns a list of the first few samples in the collection. |
Method | histogram |
Computes a histogram of the field values in the collection. |
Method | init |
Initializes a config instance for a new run. |
Method | init |
Initializes a results instance for the run with the given key. |
Method | iter |
Returns an iterator over the groups in the collection. |
Method | iter |
Returns an iterator over the samples in the collection. |
Method | last |
Returns the last sample in the collection. |
Method | limit |
Returns a view with at most the given number of samples. |
Method | limit |
Limits the number of fiftyone.core.labels.Label instances in the specified labels list field of each sample in the collection. |
Method | list |
Returns a list of annotation keys on this collection. |
Method | list |
Returns a list of brain keys on this collection. |
Method | list |
Returns a list of evaluation keys on this collection. |
Method | list |
Returns the list of index names on this collection. |
Method | list |
Returns a list of run keys on this collection. |
Method | list |
Extracts the value type(s) in a specified list field across all samples in the collection. |
Method | load |
Loads the results for the annotation run with the given key on this collection. |
Method | load |
Loads the fiftyone.core.view.DatasetView on which the specified annotation run was performed on this collection. |
Method | load |
Downloads the labels from the given annotation run from the annotation backend and merges them into this collection. |
Method | load |
Loads the results for the brain method run with the given key on this collection. |
Method | load |
Loads the fiftyone.core.view.DatasetView on which the specified brain method run was performed on this collection. |
Method | load |
Loads the results for the evaluation with the given key on this collection. |
Method | load |
Loads the fiftyone.core.view.DatasetView on which the specified evaluation was performed on this collection. |
Method | load |
Loads the results for the run with the given key on this collection. |
Method | load |
Loads the fiftyone.core.view.DatasetView on which the specified run was performed on this collection. |
Method | make |
Makes a unique field name with the given root name for the collection. |
Method | map |
Maps the label values of a fiftyone.core.labels.Label field to new values for each sample in the collection. |
Method | match |
Filters the samples in the collection by the given filter. |
Method | match |
Filters the frames in the video collection by the given filter. |
Method | match |
Selects the samples from the collection that contain (or do not contain) at least one label that matches the specified criteria. |
Method | match |
Returns a view containing the samples in the collection that have or don't have any/all of the given tag(s). |
Method | max |
Computes the maximum of a numeric field of the collection. |
Method | mean |
Computes the arithmetic mean of the field values of the collection. |
Method | merge |
Merges the labels from the given input field into the given output field of the collection. |
Method | min |
Computes the minimum of a numeric field of the collection. |
Method | mongo |
Adds a view stage defined by a raw MongoDB aggregation pipeline. |
Method | one |
Returns a single sample in this collection matching the expression. |
Method | quantiles |
Computes the quantile(s) of the field values of a collection. |
Method | register |
Registers a run under the given key on this collection. |
Method | reload |
Reloads the collection from the database. |
Method | rename |
Replaces the key for the given annotation run with a new key. |
Method | rename |
Replaces the key for the given brain run with a new key. |
Method | rename |
Replaces the key for the given evaluation with a new key. |
Method | rename |
Replaces the key for the given run with a new key. |
Method | save |
Returns a context that can be used to save samples from this collection according to a configurable batching strategy. |
Method | save |
Saves run results for the run with the given key. |
Method | schema |
Extracts the names and types of the attributes of a specified embedded document field across all samples in the collection. |
Method | select |
Selects the samples with the given IDs from the collection. |
Method | select |
Selects the samples with the given field values from the collection. |
Method | select |
Selects only the fields with the given names from the samples in the collection. All other fields are excluded. |
Method | select |
Selects the frames with the given IDs from the video collection. |
Method | select |
Selects the samples in the group collection from the given slice(s). |
Method | select |
Selects the groups with the given IDs from the grouped collection. |
Method | select |
Selects only the specified labels from the collection. |
Method | set |
Sets a field or embedded field on each sample in a collection by evaluating the given expression. |
Method | set |
Sets the fields of the specified labels in the collection to the given values. |
Method | set |
Sets the field or embedded field on each sample or frame in the collection to the given values. |
Method | shuffle |
Randomly shuffles the samples in the collection. |
Method | skip |
Omits the given number of samples from the head of the collection. |
Method | sort |
Sorts the samples in the collection by the given field(s) or expression(s). |
Method | sort |
Sorts the collection by similarity to a specified query. |
Method | split |
Splits the labels from the given input field into the given output field of the collection. |
Method | stats |
Returns stats about the collection on disk. |
Method | std |
Computes the standard deviation of the field values of the collection. |
Method | sum |
Computes the sum of the field values of the collection. |
Method | summary |
Returns a string summary of the collection. |
Method | sync |
Syncs the last_modified_at property(s) of the dataset. |
Method | tag |
Adds the tag(s) to all labels in the specified label field(s) of this collection, if necessary. |
Method | tag |
Adds the tag(s) to all samples in this collection, if necessary. |
Method | tags |
Undocumented |
Method | tail |
Returns a list of the last few samples in the collection. |
Method | take |
Randomly samples the given number of samples from the collection. |
Method | to |
Creates a view that contains one sample per clip defined by the given field or expression in the video collection. |
Method | to |
Returns a JSON dictionary representation of the collection. |
Method | to |
Creates a view based on the results of the evaluation with the given key that contains one sample for each true positive, false positive, and false negative example in the collection, respectively. |
Method | to |
Creates a view that contains one sample per frame in the video collection. |
Method | to |
Returns a JSON string representation of the collection. |
Method | to |
Creates a view that contains one sample per object patch in the specified field of the collection. |
Method | to |
Creates a view that contains one clip for each unique object trajectory defined by their (label, index) in a frame-level field of a video collection. |
Method | untag |
Removes the tag from all labels in the specified label field(s) of this collection, if necessary. |
Method | untag |
Removes the tag(s) from all samples in this collection, if necessary. |
Method | update |
Updates the run config for the run with the given key. |
Method | validate |
Validates that the collection has a field of the given type. |
Method | validate |
Validates that the collection has field(s) with the given name(s). |
Method | values |
Extracts the values of a field from all samples in the collection. |
Method | view |
Returns a fiftyone.core.view.DatasetView containing the collection. |
Method | write |
Writes the collection to disk in JSON format. |
Class Variable | __slots__ |
Undocumented |
Property | app |
Dataset-specific settings that customize how this collection is visualized in the :ref:`FiftyOne App <fiftyone-app>`. |
Property | classes |
The classes of the underlying dataset. |
Property | default |
The default classes of the underlying dataset. |
Property | default |
The default group slice of the collection, or None if the collection is not grouped. |
Property | default |
The default mask targets of the underlying dataset. |
Property | default |
The default keypoint skeleton of the underlying dataset. |
Property | description |
A description of the underlying dataset. |
Property | group |
The group field of the collection, or None if the collection is not grouped. |
Property | group |
A dict mapping group slices to media types, or None if the collection is not grouped. |
Property | group |
The current group slice of the collection, or None if the collection is not grouped. |
Property | group |
The list of group slices of the collection, or None if the collection is not grouped. |
Property | has |
Whether this collection has any annotation runs. |
Property | has |
Whether this collection has any brain runs. |
Property | has |
Whether this collection has any evaluation results. |
Property | has |
Whether this collection has any runs. |
Property | info |
The info dict of the underlying dataset. |
Property | mask |
The mask targets of the underlying dataset. |
Property | media |
The media type of the collection. |
Property | name |
The name of the collection. |
Property | skeletons |
The keypoint skeletons of the underlying dataset. |
Property | tags |
The list of tags of the underlying dataset. |
Method | _add |
Returns a fiftyone.core.view.DatasetView containing the contents of the collection with the given fiftyone.core.stages.ViewStage` appended to its aggregation pipeline. |
Method | _aggregate |
Runs the MongoDB aggregation pipeline on the collection and returns the result. |
Async Method | _async |
Undocumented |
Method | _build |
Undocumented |
Method | _build |
Undocumented |
Method | _build |
Undocumented |
Method | _build |
Undocumented |
Method | _contains |
Undocumented |
Method | _contains |
Undocumented |
Method | _delete |
Undocumented |
Method | _do |
Undocumented |
Method | _edit |
Undocumented |
Method | _edit |
Undocumented |
Method | _expand |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Computes the total size of the frame documents in the collection. |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Returns a dictionary mapping frame IDs to document sizes (in bytes) for each frame in the video collection. |
Method | _get |
Returns a dictionary mapping sample IDs to document sizes (in bytes) for each sample in the collection. |
Method | _get |
Returns a dictionary mapping sample IDs to total frame document sizes (in bytes) for each sample in the video collection. |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Computes the total size of the sample documents in the collection. |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _handle |
Undocumented |
Method | _handle |
Undocumented |
Method | _handle |
Undocumented |
Method | _handle |
Undocumented |
Method | _handle |
Undocumented |
Method | _has |
Undocumented |
Method | _has |
Undocumented |
Method | _has |
Undocumented |
Method | _is |
Undocumented |
Method | _is |
Undocumented |
Method | _is |
Undocumented |
Method | _is |
Undocumented |
Method | _is |
Undocumented |
Method | _is |
Undocumented |
Method | _list |
Undocumented |
Method | _make |
Undocumented |
Method | _make |
Undocumented |
Method | _max |
Undocumented |
Method | _min |
Undocumented |
Method | _parse |
Undocumented |
Method | _parse |
Undocumented |
Method | _parse |
Undocumented |
Method | _parse |
Undocumented |
Method | _parse |
Undocumented |
Method | _parse |
Undocumented |
Method | _parse |
Undocumented |
Method | _parse |
Undocumented |
Method | _parse |
Undocumented |
Method | _parse |
Undocumented |
Method | _parse |
Undocumented |
Method | _parse |
Undocumented |
Method | _pipeline |
Returns the MongoDB aggregation pipeline for the collection. |
Method | _process |
Undocumented |
Method | _serialize |
Undocumented |
Method | _serialize |
Undocumented |
Method | _serialize |
Undocumented |
Method | _serialize |
Undocumented |
Method | _serialize |
Undocumented |
Method | _serialize |
Undocumented |
Method | _serialize |
Undocumented |
Method | _serialize |
Undocumented |
Method | _set |
Undocumented |
Method | _set |
Undocumented |
Method | _set |
Undocumented |
Method | _set |
Undocumented |
Method | _set |
Undocumented |
Method | _set |
Undocumented |
Method | _set |
Undocumented |
Method | _split |
Undocumented |
Method | _sync |
Undocumented |
Method | _sync |
Undocumented |
Method | _tag |
Undocumented |
Method | _to |
Undocumented |
Method | _untag |
Undocumented |
Method | _unwind |
Undocumented |
Method | _validate |
Undocumented |
Constant | _FRAMES |
Undocumented |
Constant | _GROUPS |
Undocumented |
Property | _dataset |
The fiftyone.core.dataset.Dataset that serves the samples in this collection. |
Property | _element |
Undocumented |
Property | _elements |
Undocumented |
Property | _is |
Whether this collection contains clips. |
Property | _is |
Whether this collection contains dynamic groups. |
Property | _is |
Whether this collection contains frames of a video dataset. |
Property | _is |
Whether this collection's contents are generated from another collection. |
Property | _is |
Whether this collection contains patches. |
Property | _root |
The root fiftyone.core.dataset.Dataset from which this collection is derived. |
Returns a list of all available methods on this collection that
apply fiftyone.core.aggregations.Aggregation
operations to
this collection.
Returns | |
a list of SampleCollection method names |
Returns a list of all available methods on this collection that
apply fiftyone.core.stages.ViewStage
operations to this
collection.
Returns | |
a list of SampleCollection method names |
Applies the given fiftyone.core.stages.ViewStage
to the
collection.
Parameters | |
stage | a fiftyone.core.stages.ViewStage |
Returns | |
a fiftyone.core.view.DatasetView |
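For illustration, a minimal sketch of applying a stage directly. The add_stage method name and the top-level fo.Limit stage are assumptions based on the public FiftyOne API, since names are truncated in this extract:

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

# Equivalent to calling dataset.limit(10); returns a DatasetView
view = dataset.add_stage(fo.Limit(10))
print(len(view))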
Aggregates one or more
fiftyone.core.aggregations.Aggregation
instances.
Note that it is best practice to group aggregations into a single call
to aggregate
, as this will be more efficient than performing
multiple aggregations in series.
Parameters | |
aggregations | an fiftyone.core.aggregations.Aggregation or
iterable of fiftyone.core.aggregations.Aggregation
instances |
Returns | |
an aggregation result or list of aggregation results corresponding to the input aggregation(s) |
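As a hedged sketch of the batching best practice above, the following assumes the top-level fo.Count, fo.Distinct, and fo.Bounds aggregation classes and the quickstart zoo dataset:

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

# One call (and one database pass) for all three aggregations
count, labels, bounds = dataset.aggregate(
    [
        fo.Count(),
        fo.Distinct("ground_truth.detections.label"),
        fo.Bounds("uniqueness"),
    ]
)
print(count, len(labels), bounds)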
Exports the samples and optional label field(s) in this collection to the given annotation backend.
The backend parameter controls which annotation backend to use.
Depending on the backend you use, you may want/need to provide extra
keyword arguments to this function for the constructor of the backend's
fiftyone.utils.annotations.AnnotationBackendConfig
class.
The natively provided backends and their associated config classes are:
- "cvat":
fiftyone.utils.cvat.CVATBackendConfig
- "labelstudio":
fiftyone.utils.labelstudio.LabelStudioBackendConfig
- "labelbox":
fiftyone.utils.labelbox.LabelboxBackendConfig
See :ref:`this page <requesting-annotations>` for more information about using this method, including how to define label schemas and how to configure login credentials for your annotation provider.
Parameters | |
anno | a string key to use to refer to this annotation run |
labelNone | a dictionary defining the label schema to use. If this argument is provided, it takes precedence over the other schema-related arguments |
labelNone | a string indicating a new or existing label field to annotate |
labelNone | a string indicating the type of labels to annotate. The possible values are:
All new label fields must have their type specified via this argument or in label_schema. Note that annotation backends may not support all label types |
classes:None | a list of strings indicating the class options for
label_field or all fields in label_schema without
classes specified. All new label fields must have a class list
provided via one of the supported methods. For existing label
fields, if classes are not provided by this argument nor
label_schema, they are retrieved from get_classes
if possible, or else the observed labels on your dataset are
used |
attributes:True | specifies the label attributes of each label field to include (other than their label, which is always included) in the annotation export. Can be any of the following:
If a label_schema is also provided, this parameter determines which attributes are included for all fields that do not explicitly define their per-field attributes (in addition to any per-class attributes) |
maskNone | a dict mapping pixel values to semantic label strings. Only applicable when annotating semantic segmentations |
allowTrue | whether to allow new labels to be added. Only applicable when editing existing label fields |
allowTrue | whether to allow labels to be deleted. Only applicable when editing existing label fields |
allowTrue | whether to allow the label attribute of existing labels to be modified. Only applicable when editing existing fields with label attributes |
allowTrue | whether to allow the index attribute of existing video tracks to be modified. Only applicable when editing existing frame fields with index attributes |
allowTrue | whether to allow edits to the spatial properties (bounding boxes, vertices, keypoints, masks, etc) of labels. Only applicable when editing existing spatial label fields |
media | the field containing the paths to the media files to upload |
backend:None | the annotation backend to use. The supported values are fiftyone.annotation_config.backends.keys() and the default is fiftyone.annotation_config.default_backend |
launchFalse | whether to launch the annotation backend's editor after uploading the samples |
**kwargs | keyword arguments for the
fiftyone.utils.annotations.AnnotationBackendConfig |
Returns | |
a fiftyone.utils.annotations.AnnotationResults |
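A minimal sketch of a CVAT round trip. The anno_key, label_field, backend, and launch_editor names are assumptions based on the public FiftyOne API (this extract truncates them), and a configured CVAT backend is assumed:

import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")
view = dataset.take(5)

# Upload the samples and their ground truth labels for editing
view.annotate(
    "anno_run_1",
    label_field="ground_truth",
    backend="cvat",
    launch_editor=True,
)

# ... edit the labels in CVAT ...

# Merge the edited labels back into the dataset
view.load_annotations("anno_run_1")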
Applies the model to the samples in the collection.
This method supports all of the following cases:
- Applying an image model to an image collection
- Applying an image model to the frames of a video collection
- Applying a video model to a video collection
Parameters | |
model | a fiftyone.core.models.Model , Hugging Face
transformers model, Ultralytics model, SuperGradients model, or
Lightning Flash model |
label | the name of the field in which to store the model predictions. When performing inference on video frames, the "frames." prefix is optional |
confidenceNone | an optional confidence threshold to apply to any applicable labels generated by the model |
storeFalse | whether to store logits for the model predictions. This is only supported when the provided model has logits, model.has_logits == True |
batchNone | an optional batch size to use, if the model supports batching |
numNone | the number of workers for the
torch:torch.utils.data.DataLoader to use. Only
applicable for Torch-based models |
skipTrue | whether to gracefully continue without
raising an error if predictions cannot be generated for a
sample. Only applicable to fiftyone.core.models.Model
instances |
outputNone | an optional output directory in which to write segmentation images. Only applicable if the model generates segmentations. If none is provided, the segmentations are stored in the database |
relNone | an optional relative directory to strip from each
input filepath to generate a unique identifier that is joined
with output_dir to generate an output path for each
segmentation image. This argument allows for populating nested
subdirectories in output_dir that match the shape of the
input paths. The path is converted to an absolute path (if
necessary) via fiftyone.core.storage.normalize_path |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
**kwargs | optional model-specific keyword arguments passed through to the underlying inference implementation |
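For concreteness, a hedged sketch using a detection model from the FiftyOne model zoo. The apply_model, label_field, and confidence_thresh names are assumptions based on the public API:

import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")
model = foz.load_zoo_model("faster-rcnn-resnet50-fpn-coco-torch")

dataset.apply_model(model, label_field="faster_rcnn", confidence_thresh=0.5)
print(dataset.count("faster_rcnn.detections"))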
Computes the bounds of a numeric field of the collection.
None-valued fields are ignored.
This aggregation is typically applied to numeric or date field types (or lists of such types):
fiftyone.core.fields.IntField
fiftyone.core.fields.FloatField
fiftyone.core.fields.DateField
fiftyone.core.fields.DateTimeField
Examples:
import fiftyone as fo
from fiftyone import ViewField as F

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(filepath="/path/to/image1.png", numeric_field=1.0, numeric_list_field=[1, 2, 3]),
        fo.Sample(filepath="/path/to/image2.png", numeric_field=4.0, numeric_list_field=[1, 2]),
        fo.Sample(filepath="/path/to/image3.png", numeric_field=None, numeric_list_field=None),
    ]
)

#
# Compute the bounds of a numeric field
#

bounds = dataset.bounds("numeric_field")
print(bounds)  # (min, max)

#
# Compute the bounds of a numeric list field
#

bounds = dataset.bounds("numeric_list_field")
print(bounds)  # (min, max)

#
# Compute the bounds of a transformation of a numeric field
#

bounds = dataset.bounds(2 * (F("numeric_field") + 1))
print(bounds)  # (min, max)
Parameters | |
field | a field name, embedded.field.name,
fiftyone.core.expressions.ViewExpression , or
MongoDB expression
defining the field or expression to aggregate. This can also
be a list or tuple of such arguments, in which case a tuple of
corresponding aggregation results (each receiving the same
additional keyword arguments, if any) will be returned |
expr:None | a fiftyone.core.expressions.ViewExpression or
MongoDB expression
to apply to field_or_expr (which must be a field) before
aggregating |
safe:False | whether to ignore nan/inf values when dealing with floating point values |
Returns | |
the (min, max) bounds |
Computes embeddings for the samples in the collection using the given model.
This method supports all the following cases:
- Using an image model to compute embeddings for an image collection
- Using an image model to compute frame embeddings for a video collection
- Using a video model to compute embeddings for a video collection
The model must expose embeddings, i.e.,
fiftyone.core.models.Model.has_embeddings
must return True.
If an embeddings_field is provided, the embeddings are saved to the samples; otherwise, the embeddings are returned in-memory.
Parameters | |
model | a fiftyone.core.models.Model , Hugging Face
Transformers model, Ultralytics model, SuperGradients model, or
Lightning Flash model |
embeddingsNone | the name of a field in which to store the embeddings. When computing video frame embeddings, the "frames." prefix is optional |
batchNone | an optional batch size to use, if the model supports batching |
numNone | the number of workers for the
torch:torch.utils.data.DataLoader to use. Only
applicable for Torch-based models |
skipTrue | whether to gracefully continue without
raising an error if embeddings cannot be generated for a
sample. Only applicable to fiftyone.core.models.Model
instances |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
**kwargs | optional model-specific keyword arguments passed through to the underlying inference implementation |
Returns | |
one of the following |
|
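A hedged sketch, assuming the compute_embeddings and embeddings_field names and the clip-vit-base32-torch zoo model:

import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")
model = foz.load_zoo_model("clip-vit-base32-torch")

# Returned in-memory as a num_samples x num_dims array when no field is given
embeddings = dataset.compute_embeddings(model)
print(embeddings.shape)

# Or persist them on the samples instead
dataset.compute_embeddings(model, embeddings_field="clip_embedding")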
Populates the metadata field of all samples in the collection.
Any samples with existing metadata are skipped, unless overwrite == True.
Parameters | |
overwrite:False | whether to overwrite existing metadata |
numNone | a suggested number of threads to use |
skipTrue | whether to gracefully continue without raising an error if metadata cannot be computed for a sample |
warnFalse | whether to log a warning if metadata cannot be computed for a sample |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
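A minimal usage sketch (the compute_metadata name is assumed from the public API, since names are truncated in this extract):

import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

dataset.compute_metadata()
print(dataset.first().metadata)  # e.g. ImageMetadata(width=..., height=..., ...)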
Computes embeddings for the image patches defined by patches_field of the samples in the collection using the given model.
This method supports all the following cases:
- Using an image model to compute patch embeddings for an image collection
- Using an image model to compute frame patch embeddings for a video collection
The model must expose embeddings, i.e.,
fiftyone.core.models.Model.has_embeddings
must return True.
If an embeddings_field is provided, the embeddings are saved to the samples; otherwise, the embeddings are returned in-memory.
Parameters | |
model | a fiftyone.core.models.Model , Hugging Face
Transformers model, Ultralytics model, SuperGradients model,
or Lightning Flash model |
patches | the name of the field defining the image patches in
each sample to embed. Must be of type
fiftyone.core.labels.Detection ,
fiftyone.core.labels.Detections ,
fiftyone.core.labels.Polyline , or
fiftyone.core.labels.Polylines . When computing video
frame embeddings, the "frames." prefix is optional |
embeddingsNone | the name of a label attribute in which to store the embeddings |
forceFalse | whether to minimally manipulate the patch bounding boxes into squares prior to extraction |
alpha:None | an optional expansion/contraction to apply to the patches before extracting them, in [-1, inf). If provided, the length and width of the box are expanded (or contracted, when alpha < 0) by (100 * alpha)%. For example, set alpha = 0.1 to expand the boxes by 10%, and set alpha = -0.1 to contract the boxes by 10% |
handle | how to handle images with no patches. Supported values are:
|
batchNone | an optional batch size to use, if the model supports batching |
numNone | the number of workers for the
torch:torch.utils.data.DataLoader to use. Only
applicable for Torch-based models |
skipTrue | whether to gracefully continue without raising an error if embeddings cannot be generated for a sample |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
Returns | |
one of the following |
|
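A hedged sketch of embedding the objects in a detections field, assuming a compute_patch_embeddings(model, patches_field, embeddings_field=...) signature and a zoo model that exposes embeddings:

import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")
model = foz.load_zoo_model("clip-vit-base32-torch")

# Store one embedding per `ground_truth` detection; omitting
# `embeddings_field` would return the embeddings in-memory instead
dataset.compute_patch_embeddings(model, "ground_truth", embeddings_field="embedding")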
Concatenates the contents of the given SampleCollection
to
this collection.
Examples:
import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("quickstart")

#
# Concatenate two views
#

view1 = dataset.match(F("uniqueness") < 0.2)
view2 = dataset.match(F("uniqueness") > 0.7)

view = view1.concat(view2)

print(view1)
print(view2)
print(view)

#
# Concatenate two patches views
#

gt_objects = dataset.to_patches("ground_truth")

patches1 = gt_objects[:50]
patches2 = gt_objects[-50:]

patches = patches1.concat(patches2)

print(patches1)
print(patches2)
print(patches)
Parameters | |
samples | a SampleCollection whose contents to append to
this collection |
Returns | |
a fiftyone.core.view.DatasetView |
Counts the number of field values in the collection.
None-valued fields are ignored.
If no field is provided, the samples themselves are counted.
Examples:
import fiftyone as fo
from fiftyone import ViewField as F

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            predictions=fo.Detections(
                detections=[fo.Detection(label="cat"), fo.Detection(label="dog")]
            ),
        ),
        fo.Sample(
            filepath="/path/to/image2.png",
            predictions=fo.Detections(
                detections=[
                    fo.Detection(label="cat"),
                    fo.Detection(label="rabbit"),
                    fo.Detection(label="squirrel"),
                ]
            ),
        ),
        fo.Sample(filepath="/path/to/image3.png", predictions=None),
    ]
)

#
# Count the number of samples in the dataset
#

count = dataset.count()
print(count)  # the count

#
# Count the number of samples with `predictions`
#

count = dataset.count("predictions")
print(count)  # the count

#
# Count the number of objects in the `predictions` field
#

count = dataset.count("predictions.detections")
print(count)  # the count

#
# Count the number of objects in samples with > 2 predictions
#

count = dataset.count(
    (F("predictions.detections").length() > 2).if_else(
        F("predictions.detections"), None
    )
)
print(count)  # the count
Parameters | |
fieldNone | a field name, embedded.field.name,
fiftyone.core.expressions.ViewExpression , or
MongoDB expression
defining the field or expression to aggregate. If neither
field_or_expr or expr is provided, the samples
themselves are counted. This can also be a list or tuple of
such arguments, in which case a tuple of corresponding
aggregation results (each receiving the same additional keyword
arguments, if any) will be returned |
expr:None | a fiftyone.core.expressions.ViewExpression or
MongoDB expression
to apply to field_or_expr (which must be a field) before
aggregating |
safe:False | whether to ignore nan/inf values when dealing with floating point values |
Returns | |
the count |
Counts the occurrences of all label tags in the specified label field(s) of this collection.
Parameters | |
labelNone | an optional name or iterable of names of
fiftyone.core.labels.Label fields. By default, all
label fields are used |
Returns | |
a dict mapping tags to counts |
Counts the occurrences of field values in the collection.
This aggregation is typically applied to countable field types (or lists of such types):
Examples:
import fiftyone as fo
from fiftyone import ViewField as F

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            tags=["sunny"],
            predictions=fo.Detections(
                detections=[fo.Detection(label="cat"), fo.Detection(label="dog")]
            ),
        ),
        fo.Sample(
            filepath="/path/to/image2.png",
            tags=["cloudy"],
            predictions=fo.Detections(
                detections=[fo.Detection(label="cat"), fo.Detection(label="rabbit")]
            ),
        ),
        fo.Sample(filepath="/path/to/image3.png", predictions=None),
    ]
)

#
# Compute the tag counts in the dataset
#

counts = dataset.count_values("tags")
print(counts)  # dict mapping values to counts

#
# Compute the predicted label counts in the dataset
#

counts = dataset.count_values("predictions.detections.label")
print(counts)  # dict mapping values to counts

#
# Compute the predicted label counts after some normalization
#

counts = dataset.count_values(
    F("predictions.detections.label").map_values(
        {"cat": "pet", "dog": "pet"}
    ).upper()
)
print(counts)  # dict mapping values to counts
Parameters | |
field | a field name, embedded.field.name,
fiftyone.core.expressions.ViewExpression , or
MongoDB expression
defining the field or expression to aggregate. This can also
be a list or tuple of such arguments, in which case a tuple of
corresponding aggregation results (each receiving the same
additional keyword arguments, if any) will be returned |
expr:None | a fiftyone.core.expressions.ViewExpression or
MongoDB expression
to apply to field_or_expr (which must be a field) before
aggregating |
safe:False | whether to treat nan/inf values as None when dealing with floating point values |
Returns | |
a dict mapping values to counts |
Creates an index on the given field or with the given specification, if necessary.
Indexes enable efficient sorting, merging, and other such operations.
Frame-level fields can be indexed by prepending "frames." to the field name.
Note
If an index with the same field(s) but different order(s) already exists, no new index will be created.
Use drop_index
to drop an existing index first if you wish
to replace an existing index with new properties.
Note
If you are indexing a single field and it already has a unique constraint, it will be retained regardless of the unique value you specify. Conversely, if the given field already has a non-unique index but you requested a unique index, the existing index will be replaced with a unique index.
Use drop_index
to drop an existing index first if you wish
to replace an existing index with new properties.
Parameters | |
field | the field name, embedded.field.name, or index
specification list. See
pymongo:pymongo.collection.Collection.create_index for
supported values |
unique:False | whether to add a uniqueness constraint to the index |
wait:True | whether to wait for index creation to finish |
**kwargs | optional keyword arguments for
pymongo:pymongo.collection.Collection.create_index |
Returns | |
the name of the index |
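For illustration, a sketch assuming the create_index and list_indexes names from the public API:

import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

# Single-field index on an embedded label attribute
dataset.create_index("ground_truth.detections.label")

# Compound index via an explicit pymongo-style specification
dataset.create_index([("filepath", 1), ("ground_truth.detections.label", 1)])

print(dataset.list_indexes())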
Deletes the annotation run with the given key from this collection.
Calling this method only deletes the record of the annotation run
from the collection; it will not delete any annotations loaded onto
your dataset via load_annotations
, nor will it delete any
associated information from the annotation backend.
Use load_annotation_results
to programmatically manage/delete
a run from the annotation backend.
Parameters | |
anno | an annotation key |
Deletes all annotation runs from this collection.
Calling this method only deletes the records of the annotation runs
from this collection; it will not delete any annotations loaded onto
your dataset via load_annotations
, nor will it delete any
associated information from the annotation backend.
Use load_annotation_results
to programmatically manage/delete
runs in the annotation backend.
Deletes the evaluation results associated with the given evaluation key from this collection.
Parameters | |
eval | an evaluation key |
Computes the distinct values of a field in the collection.
None-valued fields are ignored.
This aggregation is typically applied to countable field types (or lists of such types):
Examples:
import fiftyone as fo
from fiftyone import ViewField as F

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            tags=["sunny"],
            predictions=fo.Detections(
                detections=[fo.Detection(label="cat"), fo.Detection(label="dog")]
            ),
        ),
        fo.Sample(
            filepath="/path/to/image2.png",
            tags=["sunny", "cloudy"],
            predictions=fo.Detections(
                detections=[fo.Detection(label="cat"), fo.Detection(label="rabbit")]
            ),
        ),
        fo.Sample(filepath="/path/to/image3.png", predictions=None),
    ]
)

#
# Get the distinct tags in a dataset
#

values = dataset.distinct("tags")
print(values)  # list of distinct values

#
# Get the distinct predicted labels in a dataset
#

values = dataset.distinct("predictions.detections.label")
print(values)  # list of distinct values

#
# Get the distinct predicted labels after some normalization
#

values = dataset.distinct(
    F("predictions.detections.label").map_values(
        {"cat": "pet", "dog": "pet"}
    ).upper()
)
print(values)  # list of distinct values
Parameters | |
field | a field name, embedded.field.name,
fiftyone.core.expressions.ViewExpression , or
MongoDB expression
defining the field or expression to aggregate. This can also
be a list or tuple of such arguments, in which case a tuple of
corresponding aggregation results (each receiving the same
additional keyword arguments, if any) will be returned |
expr:None | a fiftyone.core.expressions.ViewExpression or
MongoDB expression
to apply to field_or_expr (which must be a field) before
aggregating |
safe:False | whether to ignore nan/inf values when dealing with floating point values |
Returns | |
a sorted list of distinct values |
Renders annotated versions of the media in the collection with the specified label data overlaid to the given directory.
The filenames of the sample media are maintained, unless a name conflict would occur in output_dir, in which case an index of the form "-%d" % count is appended to the base filename.
Images are written in format fo.config.default_image_ext, and videos are written in format fo.config.default_video_ext.
Parameters | |
output | the directory to write the annotated media |
relNone | an optional relative directory to strip from each
input filepath to generate a unique identifier that is joined
with output_dir to generate an output path for each
annotated media. This argument allows for populating nested
subdirectories in output_dir that match the shape of the
input paths. The path is converted to an absolute path (if
necessary) via fiftyone.core.storage.normalize_path |
labelNone | a label field or list of label fields to
render. By default, all fiftyone.core.labels.Label
fields are drawn |
overwrite:False | whether to delete output_dir if it exists before rendering |
config:None | an optional
fiftyone.utils.annotations.DrawConfig configuring how
to draw the labels |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
**kwargs | optional keyword arguments specifying parameters of the
default fiftyone.utils.annotations.DrawConfig to
override |
Returns | |
the list of paths to the rendered media |
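A minimal sketch, assuming the draw_labels and label_fields names and a writable /tmp directory:

import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart").limit(5)

paths = dataset.draw_labels(
    "/tmp/quickstart-annotated",
    label_fields=["ground_truth", "predictions"],
    overwrite=True,
)
print(paths)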
Drops the index for the given field or name, if necessary.
Parameters | |
field | a field name, embedded.field.name, or compound
index name. Use list_indexes to see the available
indexes |
Evaluates the classification predictions in this collection with respect to the specified ground truth labels.
By default, this method simply compares the ground truth and prediction for each sample, but other strategies such as binary evaluation and top-k matching can be configured via the method parameter.
You can customize the evaluation method by passing additional parameters for the method's config class as kwargs.
The natively provided method values and their associated configs are:
- "simple":
fiftyone.utils.eval.classification.SimpleEvaluationConfig
- "top-k":
fiftyone.utils.eval.classification.TopKEvaluationConfig
- "binary":
fiftyone.utils.eval.classification.BinaryEvaluationConfig
If an eval_key is specified, then this method will record some statistics on each sample:
- When evaluating sample-level fields, an eval_key field will be populated on each sample recording whether that sample's prediction is correct.
- When evaluating frame-level fields, an eval_key field will be populated on each frame recording whether that frame's prediction is correct. In addition, an eval_key field will be populated on each sample that records the average accuracy of the frame predictions of the sample.
Parameters | |
pred | the name of the field containing the predicted
fiftyone.core.labels.Classification instances |
gt | the name of the field containing the
ground truth fiftyone.core.labels.Classification
instances |
evalNone | a string key to use to refer to this evaluation |
classes:None | the list of possible classes. If not provided, the observed ground truth/predicted labels are used |
missing:None | a missing label string. Any None-valued labels are given this label for results purposes |
method:None | a string specifying the evaluation method to use. The supported values are fo.evaluation_config.classification_backends.keys() and the default is fo.evaluation_config.classification_default_backend |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
**kwargs | optional keyword arguments for the constructor of the
fiftyone.utils.eval.classification.ClassificationEvaluationConfig
being used |
Returns | |
a fiftyone.utils.eval.classification.ClassificationResults |
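A hedged sketch, assuming the evaluate_classifications signature shown above and a `predictions` field populated by some classifier of your choosing:

import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("cifar10", split="test")
# ... populate a `predictions` fo.Classification field with a model ...

results = dataset.evaluate_classifications(
    "predictions",
    gt_field="ground_truth",
    eval_key="eval_simple",
)

results.print_report()
print(dataset.count_values("eval_simple"))  # per-sample correct/incorrect counts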
Evaluates the specified predicted detections in this collection with respect to the specified ground truth detections.
This method supports evaluating the following spatial data types:
- Object detections in
fiftyone.core.labels.Detections
format - Instance segmentations in
fiftyone.core.labels.Detections
format with their mask attributes populated - Polygons in
fiftyone.core.labels.Polylines
format - Keypoints in
fiftyone.core.labels.Keypoints
format - Temporal detections in
fiftyone.core.labels.TemporalDetections
format
For spatial object detection evaluation, this method uses COCO-style evaluation by default.
When evaluating keypoints, "IoUs" are computed via object keypoint similarity.
For temporal segment detection, this method uses ActivityNet-style evaluation by default.
You can use the method parameter to select a different method, and you can optionally customize the method by passing additional parameters for the method's config class as kwargs.
The natively provided method values and their associated configs are:
- "coco":
fiftyone.utils.eval.coco.COCOEvaluationConfig
- "open-images":
fiftyone.utils.eval.openimages.OpenImagesEvaluationConfig
- "activitynet":
fiftyone.utils.eval.activitynet.ActivityNetEvaluationConfig
If an eval_key is provided, a number of fields are populated at the object- and sample-level recording the results of the evaluation:
True positive (TP), false positive (FP), and false negative (FN) counts for each sample are saved in top-level fields of each sample:
TP: sample.<eval_key>_tp
FP: sample.<eval_key>_fp
FN: sample.<eval_key>_fn
In addition, when evaluating frame-level objects, TP/FP/FN counts are recorded for each frame:
TP: frame.<eval_key>_tp
FP: frame.<eval_key>_fp
FN: frame.<eval_key>_fn
The fields listed below are populated on each individual object; these fields tabulate the TP/FP/FN status of the object, the ID of the matching object (if any), and the matching IoU:
TP/FP/FN: object.<eval_key>
ID: object.<eval_key>_id
IoU: object.<eval_key>_iou
Parameters | |
pred | the name of the field containing the predicted
fiftyone.core.labels.Detections ,
fiftyone.core.labels.Polylines ,
fiftyone.core.labels.Keypoints ,
or fiftyone.core.labels.TemporalDetections |
gt | the name of the field containing the
ground truth fiftyone.core.labels.Detections ,
fiftyone.core.labels.Polylines ,
fiftyone.core.labels.Keypoints ,
or fiftyone.core.labels.TemporalDetections |
evalNone | a string key to use to refer to this evaluation |
classes:None | the list of possible classes. If not provided, the observed ground truth/predicted labels are used |
missing:None | a missing label string. Any unmatched objects are given this label for results purposes |
method:None | a string specifying the evaluation method to use. The supported values are fo.evaluation_config.detection_backends.keys() and the default is fo.evaluation_config.detection_default_backend |
iou:0.50 | the IoU threshold to use to determine matches |
useFalse | whether to compute IoUs using the instance
masks in the mask attribute of the provided objects, which
must be fiftyone.core.labels.Detection instances |
useFalse | whether to compute IoUs using the bounding boxes
of the provided fiftyone.core.labels.Polyline
instances rather than using their actual geometries |
classwise:True | whether to only match objects with the same class label (True) or allow matches between classes (False) |
dynamic:True | whether to declare the dynamic object-level attributes that are populated on the dataset's schema |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
**kwargs | optional keyword arguments for the constructor of the
fiftyone.utils.eval.detection.DetectionEvaluationConfig
being used |
Returns | |
a fiftyone.utils.eval.detection.DetectionResults |
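A hedged sketch on the quickstart dataset, assuming the evaluate_detections signature shown above and the COCO-style compute_mAP option:

import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

results = dataset.evaluate_detections(
    "predictions",
    gt_field="ground_truth",
    eval_key="eval",
    compute_mAP=True,
)

results.print_report()
print(results.mAP())

# Per-sample TP/FP/FN counts populated via `eval_key`
print(dataset.bounds("eval_tp"))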
Evaluates the regression predictions in this collection with respect to the specified ground truth values.
You can customize the evaluation method by passing additional parameters for the method's config class as kwargs.
The natively provided method values and their associated configs are:
If an eval_key is specified, then this method will record some statistics on each sample:
- When evaluating sample-level fields, an eval_key field will be populated on each sample recording the error of that sample's prediction.
- When evaluating frame-level fields, an eval_key field will be populated on each frame recording the error of that frame's prediction. In addition, an eval_key field will be populated on each sample that records the average error of the frame predictions of the sample.
Parameters | |
pred | the name of the field containing the predicted
fiftyone.core.labels.Regression instances |
gt | the name of the field containing the
ground truth fiftyone.core.labels.Regression instances |
evalNone | a string key to use to refer to this evaluation |
missing:None | a missing value. Any None-valued regressions are given this value for results purposes |
method:None | a string specifying the evaluation method to use. The supported values are fo.evaluation_config.regression_backends.keys() and the default is fo.evaluation_config.regression_default_backend |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
**kwargs | optional keyword arguments for the constructor of the
fiftyone.utils.eval.regression.RegressionEvaluationConfig
being used |
Returns | |
a fiftyone.utils.eval.regression.RegressionResults |
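A hedged sketch; the dataset name is hypothetical, and the evaluate_regressions and print_metrics names are assumptions based on the public API:

import fiftyone as fo

# Assumes a dataset whose samples already have `ground_truth` and
# `predictions` fiftyone.core.labels.Regression fields
dataset = fo.load_dataset("my-regression-dataset")  # hypothetical name

results = dataset.evaluate_regressions(
    "predictions",
    gt_field="ground_truth",
    eval_key="eval_reg",
)

results.print_metrics()
print(dataset.bounds("eval_reg"))  # range of per-sample errors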
Evaluates the specified semantic segmentation masks in this collection with respect to the specified ground truth masks.
If the size of a predicted mask does not match the ground truth mask, it is resized to match the ground truth.
By default, this method simply performs pixelwise evaluation of the full masks, but other strategies such as boundary-only evaluation can be configured by passing additional parameters for the method's config class as kwargs.
The natively provided method values and their associated configs are:
If an eval_key is provided, the accuracy, precision, and recall of each sample are recorded in top-level fields of each sample:
Accuracy: sample.<eval_key>_accuracy
Precision: sample.<eval_key>_precision
Recall: sample.<eval_key>_recall
In addition, when evaluating frame-level masks, the accuracy, precision, and recall of each frame are recorded in the following frame-level fields:
Accuracy: frame.<eval_key>_accuracy
Precision: frame.<eval_key>_precision
Recall: frame.<eval_key>_recall
Note
The mask values 0 and #000000 are treated as a background class for the purposes of computing evaluation metrics like precision and recall.
Parameters | |
pred | the name of the field containing the predicted
fiftyone.core.labels.Segmentation instances |
gt | the name of the field containing the
ground truth fiftyone.core.labels.Segmentation
instances |
evalNone | a string key to use to refer to this evaluation |
maskNone | a dict mapping pixel values or RGB hex strings to labels. If not provided, the observed values are used as labels |
method:None | a string specifying the evaluation method to use. The supported values are fo.evaluation_config.segmentation_backends.keys() and the default is fo.evaluation_config.segmentation_default_backend |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
**kwargs | optional keyword arguments for the constructor of the
fiftyone.utils.eval.segmentation.SegmentationEvaluationConfig
being used |
Returns | |
a fiftyone.utils.eval.segmentation.SegmentationResults |
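A hedged sketch; the dataset name is hypothetical, and the evaluate_segmentations name is assumed from the public API:

import fiftyone as fo

# Assumes a dataset whose samples already have `ground_truth` and
# `predictions` fiftyone.core.labels.Segmentation fields
dataset = fo.load_dataset("my-segmentation-dataset")  # hypothetical name

results = dataset.evaluate_segmentations(
    "predictions",
    gt_field="ground_truth",
    eval_key="eval_seg",
)

results.print_report()
print(dataset.bounds("eval_seg_accuracy"))  # range of per-sample accuracies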
Excludes the samples with the given IDs from the collection.
Examples:
import fiftyone as fo

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(filepath="/path/to/image1.png"),
        fo.Sample(filepath="/path/to/image2.png"),
        fo.Sample(filepath="/path/to/image3.png"),
    ]
)

#
# Exclude the first sample from the dataset
#

sample_id = dataset.first().id
view = dataset.exclude(sample_id)

#
# Exclude the first and last samples from the dataset
#

sample_ids = [dataset.first().id, dataset.last().id]
view = dataset.exclude(sample_ids)
Parameters | |
sample | the samples to exclude. Can be any of the following:
|
Returns | |
a fiftyone.core.view.DatasetView |
Excludes the samples with the given field values from the collection.
This stage is typically used to work with categorical fields (strings,
ints, and bools). If you want to exclude samples based on floating
point fields, use match
.
Examples:
import fiftyone as fo

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(filepath="image%d.jpg" % i, int=i, str=str(i))
        for i in range(10)
    ]
)

#
# Create a view excluding samples whose `int` field have the given
# values
#

view = dataset.exclude_by("int", [1, 9, 3, 7, 5])
print(view.head(5))

#
# Create a view excluding samples whose `str` field have the given
# values
#

view = dataset.exclude_by("str", ["1", "9", "3", "7", "5"])
print(view.head(5))
Parameters | |
field | a field or embedded.field.name |
values | a value or iterable of values to exclude by |
Returns | |
a fiftyone.core.view.DatasetView |
def exclude_fields(self, field_names=None, meta_filter=None, _allow_missing=False): (source)
Excludes the fields with the given names from the samples in the collection.
Note that default fields cannot be excluded.
Examples:
import fiftyone as fo dataset = fo.Dataset() dataset.add_samples( [ fo.Sample( filepath="/path/to/image1.png", ground_truth=fo.Classification(label="cat"), predictions=fo.Classification( label="cat", confidence=0.9, mood="surly", ), ), fo.Sample( filepath="/path/to/image2.png", ground_truth=fo.Classification(label="dog"), predictions=fo.Classification( label="dog", confidence=0.8, mood="happy", ), ), fo.Sample( filepath="/path/to/image3.png", ), ] ) # # Exclude the `predictions` field from all samples # view = dataset.exclude_fields("predictions") # # Exclude the `mood` attribute from all classifications in the # `predictions` field # view = dataset.exclude_fields("predictions.mood")
Parameters | |
field_names:None | a field name or iterable of field names to exclude. May contain embedded.field.name as well |
meta_filter:None | a filter that dynamically excludes fields in the collection's schema according to the specified rule, which can be matched against the field's name, type, description, and/or info. For example:
|
_allow_missing | Undocumented |
Returns | |
a fiftyone.core.view.DatasetView |
Excludes the frames with the given IDs from the video collection.
Examples:
import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart-video")

#
# Exclude some specific frames
#

frame_ids = [
    dataset.first().frames.first().id,
    dataset.last().frames.last().id,
]

view = dataset.exclude_frames(frame_ids)

print(dataset.count("frames"))
print(view.count("frames"))
Parameters | |
frame_ids | the frames to exclude. Can be any of the following:
|
omit_empty:True | whether to omit samples that have no frames after excluding the specified frames |
Returns | |
a fiftyone.core.view.DatasetView |
Excludes the groups with the given IDs from the grouped collection.
Examples:
import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart-groups")

#
# Exclude some specific groups by ID
#

view = dataset.take(2)
group_ids = view.values("group.id")

other_groups = dataset.exclude_groups(group_ids)
assert len(set(group_ids) & set(other_groups.values("group.id"))) == 0
Parameters | |
group_ids | the groups to exclude. Can be any of the following:
|
Returns | |
a fiftyone.core.view.DatasetView |
def exclude_labels(self, labels=None, ids=None, tags=None, fields=None, omit_empty=True): (source) ¶
Excludes the specified labels from the collection.
The returned view will omit the samples, sample fields, and individual labels that match the specified exclusion criteria.
You can perform an exclusion via one or more of the following methods:
- Provide the labels argument, which should contain a list of
dicts in the format returned by
fiftyone.core.session.Session.selected_labels
, to exclude specific labels - Provide the ids argument to exclude labels with specific IDs
- Provide the tags argument to exclude labels with specific tags
If multiple criteria are specified, labels must match all of them in order to be excluded.
By default, the exclusion is applied to all
fiftyone.core.labels.Label
fields, but you can provide the
fields argument to explicitly define the field(s) in which to
exclude.
Examples:
import fiftyone as fo import fiftyone.zoo as foz dataset = foz.load_zoo_dataset("quickstart") # # Exclude the labels currently selected in the App # session = fo.launch_app(dataset) # Select some labels in the App... view = dataset.exclude_labels(labels=session.selected_labels) # # Exclude labels with the specified IDs # # Grab some label IDs ids = [ dataset.first().ground_truth.detections[0].id, dataset.last().predictions.detections[0].id, ] view = dataset.exclude_labels(ids=ids) print(dataset.count("ground_truth.detections")) print(view.count("ground_truth.detections")) print(dataset.count("predictions.detections")) print(view.count("predictions.detections")) # # Exclude labels with the specified tags # # Grab some label IDs ids = [ dataset.first().ground_truth.detections[0].id, dataset.last().predictions.detections[0].id, ] # Give the labels a "test" tag dataset = dataset.clone() # create copy since we're modifying data dataset.select_labels(ids=ids).tag_labels("test") print(dataset.count_values("ground_truth.detections.tags")) print(dataset.count_values("predictions.detections.tags")) # Exclude the labels via their tag view = dataset.exclude_labels(tags="test") print(dataset.count("ground_truth.detections")) print(view.count("ground_truth.detections")) print(dataset.count("predictions.detections")) print(view.count("predictions.detections"))
Parameters | |
labels:None | a list of dicts specifying the labels to exclude in
the format returned by
fiftyone.core.session.Session.selected_labels |
ids:None | an ID or iterable of IDs of the labels to exclude |
tags:None | a tag or iterable of tags of labels to exclude |
fields:None | a field or iterable of fields from which to exclude |
omit_empty:True | whether to omit samples that have no labels after filtering |
Returns | |
a fiftyone.core.view.DatasetView |
Returns a view containing the samples in the collection that have (or do not have) a non-None value for the given field or embedded field.
Examples:
import fiftyone as fo dataset = fo.Dataset() dataset.add_samples( [ fo.Sample( filepath="/path/to/image1.png", ground_truth=fo.Classification(label="cat"), predictions=fo.Classification(label="cat", confidence=0.9), ), fo.Sample( filepath="/path/to/image2.png", ground_truth=fo.Classification(label="dog"), predictions=fo.Classification(label="dog", confidence=0.8), ), fo.Sample( filepath="/path/to/image3.png", ground_truth=fo.Classification(label="dog"), predictions=fo.Classification(label="dog"), ), fo.Sample( filepath="/path/to/image4.png", ground_truth=None, predictions=None, ), fo.Sample(filepath="/path/to/image5.png"), ] ) # # Only include samples that have a value in their `predictions` # field # view = dataset.exists("predictions") # # Only include samples that do NOT have a value in their # `predictions` field # view = dataset.exists("predictions", False) # # Only include samples that have prediction confidences # view = dataset.exists("predictions.confidence")
Parameters | |
field | the field name or embedded.field.name |
bool:None | whether to check if the field exists (None or True) or does not exist (False) |
Returns | |
a fiftyone.core.view.DatasetView |
Exports the samples in the collection to disk.
You can perform exports with this method via the following basic patterns:
- Provide export_dir and dataset_type to export the content to a directory in the default layout for the specified format, as documented in :ref:`this page <exporting-datasets>`
- Provide dataset_type along with data_path, labels_path, and/or export_media to directly specify where to export the source media and/or labels (if applicable) in your desired format. This syntax provides the flexibility to, for example, perform workflows like labels-only exports
- Provide a dataset_exporter to which to feed samples to perform a fully-customized export
In all workflows, the remaining parameters of this method can be provided to further configure the export.
See :ref:`this page <exporting-datasets>` for more information about the available export formats and examples of using this method.
See :ref:`this guide <custom-dataset-exporter>` for more details about
exporting datasets in custom formats by defining your own
fiftyone.utils.data.exporters.DatasetExporter
.
This method will automatically coerce the data to match the requested export in the following cases:
- When exporting in either an unlabeled image or image classification
format, if a spatial label field is provided
(
fiftyone.core.labels.Detection
,fiftyone.core.labels.Detections
,fiftyone.core.labels.Polyline
, orfiftyone.core.labels.Polylines
), then the image patches of the provided samples will be exported - When exporting in labeled image dataset formats that expect
list-type labels (
fiftyone.core.labels.Classifications
,fiftyone.core.labels.Detections
,fiftyone.core.labels.Keypoints
, orfiftyone.core.labels.Polylines
), if a label field contains labels in non-list format (e.g.,fiftyone.core.labels.Classification
), the labels will be automatically upgraded to single-label lists - When exporting in labeled image dataset formats that expect
fiftyone.core.labels.Detections
labels, if afiftyone.core.labels.Classification
field is provided, the labels will be automatically upgraded to detections that span the entire images
Parameters | |
export_dir:None | the directory to which to export the samples in format dataset_type. This parameter may be omitted if you have provided appropriate values for the data_path and/or labels_path parameters. Alternatively, this can also be an archive path with one of the following extensions: .zip, .tar, .tar.gz, .tgz, .tar.bz, .tbz. If an archive path is specified, the export is performed in a directory of the same name (minus extension), which is then automatically archived and deleted |
dataset_type:None | the fiftyone.types.Dataset type to
write. If not specified, the default type for label_field
is used |
data_path:None | an optional parameter that enables explicit control over the location of the exported media for certain export formats. Can be any of the following:
If None, a default value of this parameter will be chosen based on the value of the export_media parameter. Note that this parameter is not applicable to certain export formats such as binary types like TF records |
labels_path:None | an optional parameter that enables explicit control over the location of the exported labels. Only applicable when exporting in certain labeled dataset formats. Can be any of the following:
For labeled datasets, the default value of this parameter will be chosen based on the export format so that the labels will be exported into export_dir |
export_media:None | controls how to export the raw media. The supported values are:
If None, an appropriate default value of this parameter will be chosen based on the value of the data_path parameter. Note that some dataset formats may not support certain values for this parameter (e.g., when exporting in binary formats such as TF records, "symlink" is not an option) |
rel_dir:None | an optional relative directory to strip from each
input filepath to generate a unique identifier for each media.
When exporting media, this identifier is joined with
data_path to generate an output path for each exported
media. This argument allows for populating nested
subdirectories that match the shape of the input paths. The
path is converted to an absolute path (if necessary) via
fiftyone.core.storage.normalize_path |
dataset_exporter:None | a
fiftyone.utils.data.exporters.DatasetExporter to use
to export the samples. When provided, parameters such as
export_dir, dataset_type, data_path, and
labels_path have no effect |
label_field:None | controls the label field(s) to export. Only applicable to labeled datasets. Can be any of the following:
Note that multiple fields can only be specified when the exporter used can handle dictionaries of labels. By default, the first field of compatible type for the exporter is used. When exporting labeled video datasets, this argument may contain frame fields prefixed by "frames." |
frame_labels_field:None | controls the frame label field(s) to export. The "frames." prefix is optional. Only applicable to labeled video datasets. Can be any of the following:
Note that multiple fields can only be specified when the exporter used can handle dictionaries of frame labels. By default, the first field of compatible type for the exporter is used |
overwrite:False | whether to delete existing directories before performing the export (True) or to merge the export with existing files and directories (False) |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
**kwargs | optional keyword arguments to pass to the dataset
exporter's constructor. If you are exporting image patches,
this can also contain keyword arguments for
fiftyone.utils.patches.ImagePatchesExtractor |
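As a sketch of the labels-only and full-export patterns described above, using the zoo quickstart dataset; the output paths are placeholders:

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

# Labels-only export: no export_dir, just a labels_path
dataset.export(
    dataset_type=fo.types.COCODetectionDataset,
    labels_path="/tmp/coco-labels.json",
    label_field="ground_truth",
)

# Full export of media + labels in the default directory layout
dataset.export(
    export_dir="/tmp/quickstart-coco",
    dataset_type=fo.types.COCODetectionDataset,
    label_field="ground_truth",
)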
Filters the values of a field or embedded field of each sample in the collection.
Values of field for which filter returns False are replaced with None.
Examples:
import fiftyone as fo from fiftyone import ViewField as F dataset = fo.Dataset() dataset.add_samples( [ fo.Sample( filepath="/path/to/image1.png", ground_truth=fo.Classification(label="cat"), predictions=fo.Classification(label="cat", confidence=0.9), numeric_field=1.0, ), fo.Sample( filepath="/path/to/image2.png", ground_truth=fo.Classification(label="dog"), predictions=fo.Classification(label="dog", confidence=0.8), numeric_field=-1.0, ), fo.Sample( filepath="/path/to/image3.png", ground_truth=None, predictions=None, numeric_field=None, ), ] ) # # Only include classifications in the `predictions` field # whose `label` is "cat" # view = dataset.filter_field("predictions", F("label") == "cat") # # Only include samples whose `numeric_field` value is positive # view = dataset.filter_field("numeric_field", F() > 0)
Parameters | |
field | the field name or embedded.field.name |
filter | a fiftyone.core.expressions.ViewExpression or
MongoDB expression
that returns a boolean describing the filter to apply |
only_matches:True | whether to only include samples that match the filter (True) or include all samples (False) |
Returns | |
a fiftyone.core.view.DatasetView |
def filter_keypoints(self, field, filter=None, labels=None, only_matches=True): (source) ¶
Filters the individual fiftyone.core.labels.Keypoint.points
elements in the specified keypoints field of each sample in the
collection.
Note
Use filter_labels
if you simply want to filter entire
fiftyone.core.labels.Keypoint
objects in a field.
Examples:
import fiftyone as fo from fiftyone import ViewField as F dataset = fo.Dataset() dataset.add_samples( [ fo.Sample( filepath="/path/to/image1.png", predictions=fo.Keypoints( keypoints=[ fo.Keypoint( label="person", points=[(0.1, 0.1), (0.1, 0.9), (0.9, 0.9), (0.9, 0.1)], confidence=[0.7, 0.8, 0.95, 0.99], ) ] ) ), fo.Sample(filepath="/path/to/image2.png"), ] ) dataset.default_skeleton = fo.KeypointSkeleton( labels=["nose", "left eye", "right eye", "left ear", "right ear"], edges=[[0, 1, 2, 0], [0, 3], [0, 4]], ) # # Only include keypoints in the `predictions` field whose # `confidence` is greater than 0.9 # view = dataset.filter_keypoints( "predictions", filter=F("confidence") > 0.9 ) # # Only include keypoints in the `predictions` field with less than # four points # view = dataset.filter_keypoints( "predictions", labels=["left eye", "right eye"] )
Parameters | |
field | the fiftyone.core.labels.Keypoint or
fiftyone.core.labels.Keypoints field to filter |
filter:None | a fiftyone.core.expressions.ViewExpression
or MongoDB expression
that returns a boolean, like F("confidence") > 0.5 or
F("occluded") == False, to apply elementwise to the
specified field, which must be a list of same length as
fiftyone.core.labels.Keypoint.points |
labels:None | a label or iterable of keypoint skeleton labels to keep |
only_matches:True | whether to only include keypoints/samples with at least one point after filtering (True) or include all keypoints/samples (False) |
Returns | |
a fiftyone.core.view.DatasetView |
def filter_labels(self, field, filter, only_matches=True, trajectories=False): (source) ¶
Filters the fiftyone.core.labels.Label
field of each
sample in the collection.
If the specified field is a single
fiftyone.core.labels.Label
type, fields for which filter
returns False are replaced with None:
fiftyone.core.labels.Classification
fiftyone.core.labels.Detection
fiftyone.core.labels.Polyline
fiftyone.core.labels.Keypoint
If the specified field is a fiftyone.core.labels.Label
list type, the label elements for which filter returns False
are omitted from the view:
fiftyone.core.labels.Classifications
fiftyone.core.labels.Detections
fiftyone.core.labels.Polylines
fiftyone.core.labels.Keypoints
Classifications Examples:
import fiftyone as fo from fiftyone import ViewField as F dataset = fo.Dataset() dataset.add_samples( [ fo.Sample( filepath="/path/to/image1.png", predictions=fo.Classification(label="cat", confidence=0.9), ), fo.Sample( filepath="/path/to/image2.png", predictions=fo.Classification(label="dog", confidence=0.8), ), fo.Sample( filepath="/path/to/image3.png", predictions=fo.Classification(label="rabbit"), ), fo.Sample( filepath="/path/to/image4.png", predictions=None, ), ] ) # # Only include classifications in the `predictions` field whose # `confidence` is greater than 0.8 # view = dataset.filter_labels("predictions", F("confidence") > 0.8) # # Only include classifications in the `predictions` field whose # `label` is "cat" or "dog" # view = dataset.filter_labels( "predictions", F("label").is_in(["cat", "dog"]) )
Detections Examples:
import fiftyone as fo from fiftyone import ViewField as F dataset = fo.Dataset() dataset.add_samples( [ fo.Sample( filepath="/path/to/image1.png", predictions=fo.Detections( detections=[ fo.Detection( label="cat", bounding_box=[0.1, 0.1, 0.5, 0.5], confidence=0.9, ), fo.Detection( label="dog", bounding_box=[0.2, 0.2, 0.3, 0.3], confidence=0.8, ), ] ), ), fo.Sample( filepath="/path/to/image2.png", predictions=fo.Detections( detections=[ fo.Detection( label="cat", bounding_box=[0.5, 0.5, 0.4, 0.4], confidence=0.95, ), fo.Detection(label="rabbit"), ] ), ), fo.Sample( filepath="/path/to/image3.png", predictions=fo.Detections( detections=[ fo.Detection( label="squirrel", bounding_box=[0.25, 0.25, 0.5, 0.5], confidence=0.5, ), ] ), ), fo.Sample( filepath="/path/to/image4.png", predictions=None, ), ] ) # # Only include detections in the `predictions` field whose # `confidence` is greater than 0.8 # view = dataset.filter_labels("predictions", F("confidence") > 0.8) # # Only include detections in the `predictions` field whose `label` # is "cat" or "dog" # view = dataset.filter_labels( "predictions", F("label").is_in(["cat", "dog"]) ) # # Only include detections in the `predictions` field whose bounding # box area is smaller than 0.2 # # Bboxes are in [top-left-x, top-left-y, width, height] format bbox_area = F("bounding_box")[2] * F("bounding_box")[3] view = dataset.filter_labels("predictions", bbox_area < 0.2)
Polylines Examples:
import fiftyone as fo from fiftyone import ViewField as F dataset = fo.Dataset() dataset.add_samples( [ fo.Sample( filepath="/path/to/image1.png", predictions=fo.Polylines( polylines=[ fo.Polyline( label="lane", points=[[(0.1, 0.1), (0.1, 0.6)]], filled=False, ), fo.Polyline( label="road", points=[[(0.2, 0.2), (0.5, 0.5), (0.2, 0.5)]], filled=True, ), ] ), ), fo.Sample( filepath="/path/to/image2.png", predictions=fo.Polylines( polylines=[ fo.Polyline( label="lane", points=[[(0.4, 0.4), (0.9, 0.4)]], filled=False, ), fo.Polyline( label="road", points=[[(0.6, 0.6), (0.9, 0.9), (0.6, 0.9)]], filled=True, ), ] ), ), fo.Sample( filepath="/path/to/image3.png", predictions=None, ), ] ) # # Only include polylines in the `predictions` field that are filled # view = dataset.filter_labels("predictions", F("filled") == True) # # Only include polylines in the `predictions` field whose `label` # is "lane" # view = dataset.filter_labels("predictions", F("label") == "lane") # # Only include polylines in the `predictions` field with at least # 3 vertices # num_vertices = F("points").map(F().length()).sum() view = dataset.filter_labels("predictions", num_vertices >= 3)
Keypoints Examples:
import fiftyone as fo from fiftyone import ViewField as F dataset = fo.Dataset() dataset.add_samples( [ fo.Sample( filepath="/path/to/image1.png", predictions=fo.Keypoint( label="house", points=[(0.1, 0.1), (0.1, 0.9), (0.9, 0.9), (0.9, 0.1)], ), ), fo.Sample( filepath="/path/to/image2.png", predictions=fo.Keypoint( label="window", points=[(0.4, 0.4), (0.5, 0.5), (0.6, 0.6)], ), ), fo.Sample( filepath="/path/to/image3.png", predictions=None, ), ] ) # # Only include keypoints in the `predictions` field whose `label` # is "house" # view = dataset.filter_labels("predictions", F("label") == "house") # # Only include keypoints in the `predictions` field with less than # four points # view = dataset.filter_labels("predictions", F("points").length() < 4)
Parameters | |
field | the label field to filter |
filter | a fiftyone.core.expressions.ViewExpression or
MongoDB expression
that returns a boolean describing the filter to apply |
only_matches:True | whether to only include samples with at least one label after filtering (True) or include all samples (False) |
trajectories:False | whether to match entire object trajectories for which the object matches the given filter on at least one frame. Only applicable to datasets that contain videos and frame-level label fields whose objects have their index attributes populated |
Returns | |
a fiftyone.core.view.DatasetView |
fiftyone.core.dataset.Dataset
Returns the first sample in the collection.
Returns | |
a fiftyone.core.sample.Sample or
fiftyone.core.sample.SampleView |
Returns a flattened view that contains all samples in the dynamic grouped collection.
Examples:
import fiftyone as fo import fiftyone.zoo as foz from fiftyone import ViewField as F dataset = foz.load_zoo_dataset("cifar10", split="test") # Group samples by ground truth label grouped_view = dataset.take(1000).group_by("ground_truth.label") print(len(grouped_view)) # 10 # Return a flat view that contains 10 samples from each class flat_view = grouped_view.flatten(fo.Limit(10)) print(len(flat_view)) # 100
Parameters | |
stages:None | a fiftyone.core.stages.ViewStage or list of
fiftyone.core.stages.ViewStage instances to apply to
each group's samples while flattening |
Returns | |
a fiftyone.core.view.DatasetView |
def geo_near(self, point, location_field=None, min_distance=None, max_distance=None, query=None, create_index=True): (source) ¶
Sorts the samples in the collection by their proximity to a specified geolocation.
Note
This stage must be the first stage in any
fiftyone.core.view.DatasetView
in which it appears.
Examples:
import fiftyone as fo import fiftyone.zoo as foz TIMES_SQUARE = [-73.9855, 40.7580] dataset = foz.load_zoo_dataset("quickstart-geo") # # Sort the samples by their proximity to Times Square # view = dataset.geo_near(TIMES_SQUARE) # # Sort the samples by their proximity to Times Square, and only # include samples within 5km # view = dataset.geo_near(TIMES_SQUARE, max_distance=5000) # # Sort the samples by their proximity to Times Square, and only # include samples that are in Manhattan # import fiftyone.utils.geojson as foug in_manhattan = foug.geo_within( "location.point", [ [ [-73.949701, 40.834487], [-73.896611, 40.815076], [-73.998083, 40.696534], [-74.031751, 40.715273], [-73.949701, 40.834487], ] ] ) view = dataset.geo_near( TIMES_SQUARE, location_field="location", query=in_manhattan )
Parameters | |
point | the reference point to compute distances to. Can be any of the following:
|
location_field:None | the location data of each sample to use. Can be any of the following:
|
min_distance:None | filter samples that are less than this distance (in meters) from point |
max_distance:None | filter samples that are greater than this distance (in meters) from point |
query:None | an optional dict defining a MongoDB read query that samples must match in order to be included in this view |
create_index:True | whether to create the required spherical index, if necessary |
Returns | |
a fiftyone.core.view.DatasetView |
def geo_within(self, boundary, location_field=None, strict=True, create_index=True): (source) ¶
Filters the samples in this collection to only include samples whose geolocation is within a specified boundary.
Examples:
import fiftyone as fo import fiftyone.zoo as foz MANHATTAN = [ [ [-73.949701, 40.834487], [-73.896611, 40.815076], [-73.998083, 40.696534], [-74.031751, 40.715273], [-73.949701, 40.834487], ] ] dataset = foz.load_zoo_dataset("quickstart-geo") # # Create a view that only contains samples in Manhattan # view = dataset.geo_within(MANHATTAN)
Parameters | |
boundary | a fiftyone.core.labels.GeoLocation ,
fiftyone.core.labels.GeoLocations , GeoJSON dict, or
list of coordinates that define a Polygon or
MultiPolygon to search within |
location_field:None | the location data of each sample to use. Can be any of the following:
|
strict:True | whether a sample's location data must strictly fall within boundary (True) in order to match, or whether any intersection suffices (False) |
create_index:True | whether to create the required spherical index, if necessary |
Returns | |
a fiftyone.core.view.DatasetView |
Returns information about the annotation run with the given key on this collection.
Parameters | |
anno_key | an annotation key |
Returns | |
a fiftyone.core.annotation.AnnotationInfo |
Returns information about the brain method run with the given key on this collection.
Parameters | |
brain_key | a brain key |
Returns | |
a fiftyone.core.brain.BrainInfo |
Gets the classes list for the given field, or None if no classes are available.
Classes are first retrieved from classes
if they exist,
otherwise from default_classes
.
Parameters | |
field | a field name |
Returns | |
a list of classes, or None |
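A short sketch, assuming classes are declared for a "ground_truth" field (the field names are illustrative):

import fiftyone as fo

dataset = fo.Dataset()
dataset.classes["ground_truth"] = ["cat", "dog"]
dataset.save()  # persist the declared classes

print(dataset.get_classes("ground_truth"))  # ['cat', 'dog']
print(dataset.get_classes("predictions"))   # None (nothing declared)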
Returns a schema dictionary describing the dynamic fields of the samples in the collection.
Dynamic fields are embedded document fields with at least one non-None value that have not been declared on the dataset's schema.
Parameters | |
fields:None | an optional field or iterable of fields for which to return dynamic fields. By default, all fields are considered |
recursive:True | whether to recursively inspect nested lists and embedded documents |
Returns | |
a dict mapping field paths to fiftyone.core.fields.Field
instances or lists of them |
Returns a schema dictionary describing the dynamic fields of the frames in the collection.
Dynamic fields are embedded document fields with at least one non-None value that have not been declared on the dataset's schema.
Parameters | |
fields:None | an optional field or iterable of fields for which to return dynamic fields. By default, all fields are considered |
recursive:True | whether to recursively inspect nested lists and embedded documents |
Returns | |
a dict mapping field paths to fiftyone.core.fields.Field
instances or lists of them, or None if the collection does not
contain videos |
Returns information about the evaluation with the given key on this collection.
Parameters | |
eval_key | an evaluation key |
Returns | |
a fiftyone.core.evaluation.EvaluationInfo |
Returns the field instance of the provided path, or None if one does not exist.
Parameters | |
path | a field path |
ftype:None | an optional field type to enforce. Must be a subclass
of fiftyone.core.fields.Field |
embedded_doc_type:None | an optional embedded document type to
enforce. Must be a subclass of
fiftyone.core.odm.BaseEmbeddedDocument |
read_only:None | whether to optionally enforce that the field is read-only (True) or not read-only (False) |
include_private:False | whether to include fields that start with _ in the returned schema |
leaf:False | whether to return the subfield of list fields |
Returns | |
a fiftyone.core.fields.Field instance or None | |
Raises | |
ValueError | if the field does not match provided constraints |
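A brief sketch of looking up field instances by path; the field names are illustrative:

import fiftyone as fo

dataset = fo.Dataset()
dataset.add_sample(
    fo.Sample(
        filepath="/path/to/image.png",
        ground_truth=fo.Classification(label="cat"),
    )
)

print(dataset.get_field("ground_truth"))        # EmbeddedDocumentField
print(dataset.get_field("ground_truth.label"))  # StringField
print(dataset.get_field("nonexistent"))         # None

# Enforce a type constraint; raises ValueError on a mismatch
field = dataset.get_field("ground_truth.label", ftype=fo.StringField)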
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
Returns a schema dictionary describing the fields of the samples in the collection.
Parameters | |
ftype:None | an optional field type or iterable of types to which
to restrict the returned schema. Must be subclass(es) of
fiftyone.core.fields.Field |
embedded_doc_type:None | an optional embedded document type or
iterable of types to which to restrict the returned schema.
Must be subclass(es) of
fiftyone.core.odm.BaseEmbeddedDocument |
read_only:None | whether to restrict to (True) or exclude (False) read-only fields. By default, all fields are included |
info_keys:None | an optional key or list of keys that must be in the field's info dict |
created_after:None | an optional datetime specifying a minimum creation date |
include_private:False | whether to include fields that start with _ in the returned schema |
flat:False | whether to return a flattened schema where all embedded document fields are included as top-level keys |
mode:None | whether to apply the above constraints before and/or after flattening the schema. Only applicable when flat is True. Supported values are ("before", "after", "both"). The default is "after" |
Returns | |
a dict mapping field names to fiftyone.core.fields.Field
instances |
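For instance, a minimal sketch of inspecting and filtering the schema (field names are illustrative):

import fiftyone as fo

dataset = fo.Dataset()
dataset.add_sample(
    fo.Sample(
        filepath="/path/to/image.png",
        ground_truth=fo.Classification(label="cat"),
    )
)

# Print the full sample schema
for name, field in dataset.get_field_schema().items():
    print(name, type(field).__name__)

# Restrict to fields containing `Classification` labels
schema = dataset.get_field_schema(embedded_doc_type=fo.Classification)
print(list(schema.keys()))  # ['ground_truth']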
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
Returns a schema dictionary describing the fields of the frames in the collection.
Only applicable for collections that contain videos.
Parameters | |
ftype:None | an optional field type to which to restrict the
returned schema. Must be a subclass of
fiftyone.core.fields.Field |
embedded_doc_type:None | an optional embedded document type to
which to restrict the returned schema. Must be a subclass of
fiftyone.core.odm.BaseEmbeddedDocument |
read_only:None | whether to restrict to (True) or exclude (False) read-only fields. By default, all fields are included |
info_keys:None | an optional key or list of keys that must be in the field's info dict |
created_after:None | an optional datetime specifying a minimum creation date |
include_private:False | whether to include fields that start with _ in the returned schema |
flat:False | whether to return a flattened schema where all embedded document fields are included as top-level keys |
mode:None | whether to apply the above constraints before and/or after flattening the schema. Only applicable when flat is True. Supported values are ("before", "after", "both"). The default is "after" |
Returns | |
a dict mapping field names to fiftyone.core.fields.Field
instances, or None if the collection does not contain videos |
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
Returns a dict containing the samples for the given group ID.
Parameters | |
group_id | a group ID |
group_slices:None | an optional subset of group slices to load |
Returns | |
a dict mapping group names to fiftyone.core.sample.Sample
or fiftyone.core.sample.SampleView instances | |
Raises | |
KeyError | if the group ID is not found |
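A usage sketch against the zoo's grouped quickstart dataset; the slice names printed depend on the dataset:

import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart-groups")

group_id = dataset.first().group.id
group = dataset.get_group(group_id)

# `group` maps slice names (e.g. "left", "right", "pcd") to samples
for slice_name, sample in group.items():
    print(slice_name, sample.filepath)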
Returns a dictionary of information about the indexes on this collection.
See pymongo:pymongo.collection.Collection.index_information
for
details on the structure of this dictionary.
Parameters | |
include_stats:False | whether to include the size, usage, and build status of each index |
Returns | |
a dict mapping index names to info dicts |
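For example, a short sketch of inspecting the default indexes on a dataset:

import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

info = dataset.get_index_information()
for name, index_info in info.items():
    # Each value follows pymongo's index_information() format
    print(name, index_info.get("key"))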
Gets the mask targets for the given field, or None if no mask targets are available.
Mask targets are first retrieved from mask_targets
if they
exist, otherwise from default_mask_targets
.
Parameters | |
field | a field name |
Returns | |
the mask targets dict for the field, or None |
Returns information about the run with the given key on this collection.
Parameters | |
run_key | a run key |
Returns | |
a fiftyone.core.runs.RunInfo |
Gets the keypoint skeleton for the given field, or None if no skeleton is available.
Skeletons are first retrieved from skeletons
if they exist,
otherwise from default_skeleton
.
Parameters | |
field | a field name |
Returns | |
the keypoint skeleton for the field, or None |
def group_by(self, field_or_expr, order_by=None, reverse=False, flat=False, match_expr=None, sort_expr=None, create_index=True): (source) ¶
Creates a view that groups the samples in the collection by a specified field or expression.
Examples:
import fiftyone as fo import fiftyone.zoo as foz from fiftyone import ViewField as F dataset = foz.load_zoo_dataset("cifar10", split="test") # # Take 1000 samples at random and group them by ground truth label # view = dataset.take(1000).group_by("ground_truth.label") for group in view.iter_dynamic_groups(): group_value = group.first().ground_truth.label print("%s: %d" % (group_value, len(group))) # # Variation of above operation that arranges the groups in # decreasing order of size and immediately flattens them # from itertools import groupby view = dataset.take(1000).group_by( "ground_truth.label", flat=True, sort_expr=F().length(), reverse=True, ) rle = lambda v: [(k, len(list(g))) for k, g in groupby(v)] for label, count in rle(view.values("ground_truth.label")): print("%s: %d" % (label, count))
Parameters | |
field_or_expr | the field or embedded.field.name to group by, or
a list of field names defining a compound group key, or a
fiftyone.core.expressions.ViewExpression or
MongoDB aggregation expression
that defines the value to group by |
order_by:None | an optional field by which to order the samples in each group |
reverse:False | whether to return the results in descending order. Applies both to order_by and sort_expr |
flat:False | whether to return a grouped collection (False) or a flattened collection (True) |
match_expr:None | an optional
fiftyone.core.expressions.ViewExpression or
MongoDB aggregation expression
that defines which groups to include in the output view. If
provided, this expression will be evaluated on the list of
samples in each group. Only applicable when flat=True |
sort_expr:None | an optional
fiftyone.core.expressions.ViewExpression or
MongoDB aggregation expression
that defines how to sort the groups in the output view. If
provided, this expression will be evaluated on the list of
samples in each group. Only applicable when flat=True |
create_index:True | whether to create an index, if necessary, to optimize the grouping. Only applicable when grouping by field(s), not expressions |
Returns | |
a fiftyone.core.view.DatasetView |
Whether this collection has an annotation run with the given key.
Parameters | |
anno_key | an annotation key |
Returns | |
True/False |
Whether this collection has a brain method run with the given key.
Parameters | |
brain_key | a brain key |
Returns | |
True/False |
Determines whether this collection has a classes list for the given field.
Classes may be defined either in classes
or
default_classes
.
Parameters | |
field | a field name |
Returns | |
True/False |
Whether this collection has an evaluation with the given key.
Parameters | |
eval_key | an evaluation key |
Returns | |
True/False |
Determines whether the collection has a field with the given name.
Parameters | |
path | the field name or embedded.field.name |
Returns | |
True/False |
Determines whether the collection has a frame-level field with the given name.
Parameters | |
path | the field name or embedded.field.name |
Returns | |
True/False |
Determines whether this collection has mask targets for the given field.
Mask targets may be defined either in mask_targets
or
default_mask_targets
.
Parameters | |
field | a field name |
Returns | |
True/False |
Determines whether the collection has a sample field with the given name.
Parameters | |
path | the field name or embedded.field.name |
Returns | |
True/False |
Determines whether this collection has a keypoint skeleton for the given field.
Keypoint skeletons may be defined either in skeletons
or
default_skeleton
.
Parameters | |
field | a field name |
Returns | |
True/False |
fiftyone.core.dataset.Dataset
Returns a list of the first few samples in the collection.
If fewer than num_samples samples are in the collection, only the available samples are returned.
Parameters | |
num_samples | the number of samples |
Returns | |
a list of fiftyone.core.sample.Sample objects |
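A quick sketch:

import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

# Print the filepaths of the first three samples
for sample in dataset.head(3):
    print(sample.filepath)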
def histogram_values(self, field_or_expr, expr=None, bins=None, range=None, auto=False): (source) ¶
Computes a histogram of the field values in the collection.
This aggregation is typically applied to numeric field types (or lists of such types):
Examples:
import numpy as np import matplotlib.pyplot as plt import fiftyone as fo from fiftyone import ViewField as F samples = [] for idx in range(100): samples.append( fo.Sample( filepath="/path/to/image%d.png" % idx, numeric_field=np.random.randn(), numeric_list_field=list(np.random.randn(10)), ) ) dataset = fo.Dataset() dataset.add_samples(samples) def plot_hist(counts, edges): counts = np.asarray(counts) edges = np.asarray(edges) left_edges = edges[:-1] widths = edges[1:] - edges[:-1] plt.bar(left_edges, counts, width=widths, align="edge") # # Compute a histogram of a numeric field # counts, edges, other = dataset.histogram_values( "numeric_field", bins=50, range=(-4, 4) ) plot_hist(counts, edges) plt.show(block=False) # # Compute the histogram of a numeric list field # counts, edges, other = dataset.histogram_values( "numeric_list_field", bins=50 ) plot_hist(counts, edges) plt.show(block=False) # # Compute the histogram of a transformation of a numeric field # counts, edges, other = dataset.histogram_values( 2 * (F("numeric_field") + 1), bins=50 ) plot_hist(counts, edges) plt.show(block=False)
Parameters | |
field_or_expr | a field name, embedded.field.name,
fiftyone.core.expressions.ViewExpression , or
MongoDB expression
defining the field or expression to aggregate. This can also
be a list or tuple of such arguments, in which case a tuple of
corresponding aggregation results (each receiving the same
additional keyword arguments, if any) will be returned |
expr:None | a fiftyone.core.expressions.ViewExpression or
MongoDB expression
to apply to field_or_expr (which must be a field) before
aggregating |
bins:None | can be either an integer number of bins to generate or a monotonically increasing sequence specifying the bin edges to use. By default, 10 bins are created. If bins is an integer and no range is specified, bin edges are automatically distributed in an attempt to evenly distribute the counts in each bin |
range:None | a (lower, upper) tuple specifying a range in which to generate equal-width bins. Only applicable when bins is an integer |
auto:False | whether to automatically choose bin edges in an attempt to evenly distribute the counts in each bin. If this option is chosen, bins will only be used if it is an integer, and the range parameter is ignored |
Returns | |
a tuple of (counts, edges, other), where counts is a list of bin counts, edges is a list of bin edges that is one element longer than counts, and other is the number of values that fall outside the bins |
Initializes a config instance for a new run.
Parameters | |
**kwargs | JSON serializable config parameters |
Returns | |
a fiftyone.core.runs.RunConfig |
Initializes a results instance for the run with the given key.
Parameters | |
run_key | a run key |
**kwargs | JSON serializable data |
Returns | |
a fiftyone.core.runs.RunResults |
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
Returns an iterator over the groups in the collection.
Parameters | |
group_slices:None | an optional subset of group slices to load |
progress:False | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
autosave:False | whether to automatically save changes to samples emitted by this iterator |
batch_size:None | the batch size to use when autosaving samples. If a batching_strategy is provided, this parameter configures the strategy as described below. If no batching_strategy is provided, this can either be an integer specifying the number of samples to save in a batch (in which case batching_strategy is implicitly set to "static") or a float number of seconds between batched saves (in which case batching_strategy is implicitly set to "latency") |
batching_strategy:None | the batching strategy to use for each save operation when autosaving samples. Supported values are:
By default, fo.config.default_batcher is used |
Returns | |
an iterator that emits dicts mapping group slice names to
fiftyone.core.sample.Sample or
fiftyone.core.sample.SampleView instances, one per group |
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
Returns an iterator over the samples in the collection.
Parameters | |
progress:False | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
autosave:False | whether to automatically save changes to samples emitted by this iterator |
batch_size:None | the batch size to use when autosaving samples. If a batching_strategy is provided, this parameter configures the strategy as described below. If no batching_strategy is provided, this can either be an integer specifying the number of samples to save in a batch (in which case batching_strategy is implicitly set to "static") or a float number of seconds between batched saves (in which case batching_strategy is implicitly set to "latency") |
batching_strategy:None | the batching strategy to use for each save operation when autosaving samples. Supported values are:
By default, fo.config.default_batcher is used |
Returns | |
an iterator over fiftyone.core.sample.Sample or
fiftyone.core.sample.SampleView instances |
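A sketch of read-only iteration and batched autosaving; the "reviewed" field is a placeholder:

import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

# Read-only iteration with a progress bar
for sample in dataset.iter_samples(progress=True):
    print(sample.filepath)

# Modify samples and autosave them in batches of 50
for sample in dataset.iter_samples(autosave=True, batch_size=50):
    sample["reviewed"] = False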
fiftyone.core.dataset.Dataset
Returns the last sample in the collection.
Returns | |
a fiftyone.core.sample.Sample or
fiftyone.core.sample.SampleView |
Returns a view with at most the given number of samples.
Examples:
import fiftyone as fo dataset = fo.Dataset() dataset.add_samples( [ fo.Sample( filepath="/path/to/image1.png", ground_truth=fo.Classification(label="cat"), ), fo.Sample( filepath="/path/to/image2.png", ground_truth=fo.Classification(label="dog"), ), fo.Sample( filepath="/path/to/image3.png", ground_truth=None, ), ] ) # # Only include the first 2 samples in the view # view = dataset.limit(2)
Parameters | |
limit | the maximum number of samples to return. If a non-positive number is provided, an empty view is returned |
Returns | |
a fiftyone.core.view.DatasetView |
Limits the number of fiftyone.core.labels.Label
instances
in the specified labels list field of each sample in the collection.
The specified field must be one of the following types:
fiftyone.core.labels.Classifications
fiftyone.core.labels.Detections
fiftyone.core.labels.Keypoints
fiftyone.core.labels.Polylines
Examples:
import fiftyone as fo from fiftyone import ViewField as F dataset = fo.Dataset() dataset.add_samples( [ fo.Sample( filepath="/path/to/image1.png", predictions=fo.Detections( detections=[ fo.Detection( label="cat", bounding_box=[0.1, 0.1, 0.5, 0.5], confidence=0.9, ), fo.Detection( label="dog", bounding_box=[0.2, 0.2, 0.3, 0.3], confidence=0.8, ), ] ), ), fo.Sample( filepath="/path/to/image2.png", predictions=fo.Detections( detections=[ fo.Detection( label="cat", bounding_box=[0.5, 0.5, 0.4, 0.4], confidence=0.95, ), fo.Detection(label="rabbit"), ] ), ), fo.Sample( filepath="/path/to/image4.png", predictions=None, ), ] ) # # Only include the first detection in the `predictions` field of # each sample # view = dataset.limit_labels("predictions", 1)
Parameters | |
field | the labels list field to filter |
limit | the maximum number of labels to include in each labels list. If a non-positive number is provided, all lists will be empty |
Returns | |
a fiftyone.core.view.DatasetView |
Returns a list of annotation keys on this collection.
Parameters | |
type:None | a specific annotation run type to match, which can be:
|
method:None | a specific
fiftyone.core.annotations.AnnotationMethodConfig.method
string to match |
**kwargs | optional config parameters to match |
Returns | |
a list of annotation keys |
Returns a list of brain keys on this collection.
Parameters | |
type:None | a specific brain run type to match, which can be:
|
method:None | a specific
fiftyone.core.brain.BrainMethodConfig.method string to
match |
**kwargs | optional config parameters to match |
Returns | |
a list of brain keys |
Returns a list of evaluation keys on this collection.
Parameters | |
type:None | a specific evaluation type to match, which can be:
|
method:None | a specific
fiftyone.core.evaluations.EvaluationMethodConfig.method
string to match |
**kwargs | optional config parameters to match |
Returns | |
a list of evaluation keys |
Returns the list of index names on this collection.
Single-field indexes are referenced by their field name, while compound
indexes are referenced by more complicated strings. See
pymongo:pymongo.collection.Collection.index_information
for
details on the compound format.
Returns | |
the list of index names |
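For example, the sketch below lists the default indexes and adds one; the indexed field is illustrative:

import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

print(dataset.list_indexes())  # e.g. ['id', 'filepath', ...]

dataset.create_index("ground_truth.detections.label")
print(dataset.list_indexes())  # now includes the new index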
Returns a list of run keys on this collection.
Parameters | |
**kwargs | optional config parameters to match |
Returns | |
a list of run keys |
Extracts the value type(s) in a specified list field across all samples in the collection.
Examples:
from datetime import datetime import fiftyone as fo dataset = fo.Dataset() sample1 = fo.Sample( filepath="image1.png", ground_truth=fo.Classification( label="cat", info=[ fo.DynamicEmbeddedDocument( task="initial_annotation", author="Alice", timestamp=datetime(1970, 1, 1), notes=["foo", "bar"], ), fo.DynamicEmbeddedDocument( task="editing_pass", author="Bob", timestamp=datetime.utcnow(), ), ], ), ) sample2 = fo.Sample( filepath="image2.png", ground_truth=fo.Classification( label="dog", info=[ fo.DynamicEmbeddedDocument( task="initial_annotation", author="Bob", timestamp=datetime(2018, 10, 18), notes=["spam", "eggs"], ), ], ), ) dataset.add_samples([sample1, sample2]) # Determine that `ground_truth.info` contains embedded documents print(dataset.list_schema("ground_truth.info")) # fo.EmbeddedDocumentField # Determine the fields of the embedded documents in the list print(dataset.schema("ground_truth.info[]")) # {'task': StringField, ..., 'notes': ListField} # Determine the type of the values in the nested `notes` list field # Since `ground_truth.info` is not yet declared on the dataset's # schema, we must manually include `[]` to unwind the info lists print(dataset.list_schema("ground_truth.info[].notes")) # fo.StringField # Declare the `ground_truth.info` field dataset.add_sample_field( "ground_truth.info", fo.ListField, subfield=fo.EmbeddedDocumentField, embedded_doc_type=fo.DynamicEmbeddedDocument, ) # Now we can inspect the nested `notes` field without unwinding print(dataset.list_schema("ground_truth.info.notes")) # fo.StringField
Parameters | |
field_or_expr | a field name, embedded.field.name,
fiftyone.core.expressions.ViewExpression , or
MongoDB expression
defining the field or expression to aggregate |
expr:None | a fiftyone.core.expressions.ViewExpression or
MongoDB expression
to apply to field_or_expr (which must be a field) before
aggregating |
Returns | |
a fiftyone.core.fields.Field or list of
fiftyone.core.fields.Field instances describing the value
type(s) in the list |
Loads the results for the annotation run with the given key on this collection.
The fiftyone.utils.annotations.AnnotationResults
object
returned by this method will provide a variety of backend-specific
methods allowing you to perform actions such as checking the status and
deleting this run from the annotation backend.
Use load_annotations
to load the labels from an annotation
run onto your FiftyOne dataset.
Parameters | |
anno_key | an annotation key |
cache:True | whether to cache the results on the collection |
**kwargs | keyword arguments for run's
fiftyone.core.annotation.AnnotationMethodConfig.load_credentials
method |
Returns | |
a fiftyone.utils.annotations.AnnotationResults |
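A hedged sketch; the dataset name and the "my-annotation-run" key are placeholders, and the run is assumed to have been previously scheduled via annotate() with a backend such as CVAT:

import fiftyone as fo

dataset = fo.load_dataset("my-dataset")  # placeholder dataset name

results = dataset.load_annotation_results("my-annotation-run")

# Backend-specific helpers, e.g. status checks for CVAT-backed runs
results.print_status()

# Delete the run's information from the backend when finished
results.cleanup()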
Loads the fiftyone.core.view.DatasetView
on which the
specified annotation run was performed on this collection.
Parameters | |
anno_key | an annotation key |
select_fields:False | whether to exclude fields involved in other annotation runs |
Returns | |
a fiftyone.core.view.DatasetView |
Downloads the labels from the given annotation run from the annotation backend and merges them into this collection.
See :ref:`this page <loading-annotations>` for more information
about using this method to import annotations that you have scheduled
by calling annotate
.
Parameters | |
anno_key | an annotation key |
dest_field:None | an optional name of a new destination field into which to load the annotations, or a dict mapping field names in the run's label schema to new destination field names |
unexpected:"prompt" | how to deal with any unexpected labels that don't match the run's label schema when importing. The supported values are:
|
cleanup:False | whether to delete any information regarding this run from the annotation backend after loading the annotations |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
**kwargs | keyword arguments for the run's
fiftyone.core.annotation.AnnotationMethodConfig.load_credentials
method |
Returns | |
None, unless unexpected=="return" and unexpected labels are found, in which case a dict containing the extra labels is returned |
Loads the results for the brain method run with the given key on this collection.
Parameters | |
brain_key | a brain key |
cache:True | whether to cache the results on the collection |
load_view:True | whether to load the view on which the results were computed (True) or the full dataset (False) |
**kwargs | keyword arguments for the run's
fiftyone.core.brain.BrainMethodConfig.load_credentials
method |
Returns | |
a fiftyone.core.brain.BrainResults |
Loads the fiftyone.core.view.DatasetView
on which the
specified brain method run was performed on this collection.
Parameters | |
brain_key | a brain key |
select_fields:False | whether to exclude fields involved in other brain method runs |
Returns | |
a fiftyone.core.view.DatasetView |
Loads the results for the evaluation with the given key on this collection.
Parameters | |
eval_key | an evaluation key |
cache:True | whether to cache the results on the collection |
**kwargs | keyword arguments for the run's
fiftyone.core.evaluation.EvaluationMethodConfig.load_credentials
method |
Returns | |
a fiftyone.core.evaluation.EvaluationResults |
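A minimal sketch that runs a detection evaluation and then reloads its results; the "eval" key is illustrative:

import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

# Run an evaluation (see evaluate_detections) so there is something to load
dataset.evaluate_detections(
    "predictions", gt_field="ground_truth", eval_key="eval"
)

results = dataset.load_evaluation_results("eval")
results.print_report()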
Loads the fiftyone.core.view.DatasetView
on which the
specified evaluation was performed on this collection.
Parameters | |
eval_key | an evaluation key |
select_fields:False | whether to exclude fields involved in other evaluations |
Returns | |
a fiftyone.core.view.DatasetView |
Loads the results for the run with the given key on this collection.
Parameters | |
run_key | a run key |
cache:True | whether to cache the results on the collection |
load_view:True | whether to load the view on which the results were computed (True) or the full dataset (False) |
**kwargs | keyword arguments for the run's
fiftyone.core.runs.RunConfig.load_credentials method |
Returns | |
a fiftyone.core.runs.RunResults |
Loads the fiftyone.core.view.DatasetView
on which the
specified run was performed on this collection.
Parameters | |
run_key | a run key |
select_fields:False | whether to exclude fields involved in other runs |
Returns | |
a fiftyone.core.view.DatasetView |
Makes a unique field name with the given root name for the collection.
Parameters | |
root:"" | an optional root for the output field name |
Returns | |
the field name |
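A small sketch; the exact suffix appended to the root is an implementation detail and may vary:

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

# "predictions" already exists, so a unique variant is generated
field_name = dataset.make_unique_field_name(root="predictions")
print(field_name)  # e.g. "predictions_1" (exact suffix may vary)

dataset.add_sample_field(field_name, fo.FloatField)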
Maps the label values of a fiftyone.core.labels.Label
field to new values for each sample in the collection.
Examples:
import fiftyone as fo from fiftyone import ViewField as F dataset = fo.Dataset() dataset.add_samples( [ fo.Sample( filepath="/path/to/image1.png", weather=fo.Classification(label="sunny"), predictions=fo.Detections( detections=[ fo.Detection( label="cat", bounding_box=[0.1, 0.1, 0.5, 0.5], confidence=0.9, ), fo.Detection( label="dog", bounding_box=[0.2, 0.2, 0.3, 0.3], confidence=0.8, ), ] ), ), fo.Sample( filepath="/path/to/image2.png", weather=fo.Classification(label="cloudy"), predictions=fo.Detections( detections=[ fo.Detection( label="cat", bounding_box=[0.5, 0.5, 0.4, 0.4], confidence=0.95, ), fo.Detection(label="rabbit"), ] ), ), fo.Sample( filepath="/path/to/image3.png", weather=fo.Classification(label="partly cloudy"), predictions=fo.Detections( detections=[ fo.Detection( label="squirrel", bounding_box=[0.25, 0.25, 0.5, 0.5], confidence=0.5, ), ] ), ), fo.Sample( filepath="/path/to/image4.png", predictions=None, ), ] ) # # Map the "partly cloudy" weather label to "cloudy" # view = dataset.map_labels("weather", {"partly cloudy": "cloudy"}) # # Map "rabbit" and "squirrel" predictions to "other" # view = dataset.map_labels( "predictions", {"rabbit": "other", "squirrel": "other"} )
Parameters | |
field | the labels field to map |
map | a dict mapping label values to new label values |
Returns | |
a fiftyone.core.view.DatasetView |
Filters the samples in the collection by the given filter.
Examples:
import fiftyone as fo from fiftyone import ViewField as F dataset = fo.Dataset() dataset.add_samples( [ fo.Sample( filepath="/path/to/image1.png", weather=fo.Classification(label="sunny"), predictions=fo.Detections( detections=[ fo.Detection( label="cat", bounding_box=[0.1, 0.1, 0.5, 0.5], confidence=0.9, ), fo.Detection( label="dog", bounding_box=[0.2, 0.2, 0.3, 0.3], confidence=0.8, ), ] ), ), fo.Sample( filepath="/path/to/image2.jpg", weather=fo.Classification(label="cloudy"), predictions=fo.Detections( detections=[ fo.Detection( label="cat", bounding_box=[0.5, 0.5, 0.4, 0.4], confidence=0.95, ), fo.Detection(label="rabbit"), ] ), ), fo.Sample( filepath="/path/to/image3.png", weather=fo.Classification(label="partly cloudy"), predictions=fo.Detections( detections=[ fo.Detection( label="squirrel", bounding_box=[0.25, 0.25, 0.5, 0.5], confidence=0.5, ), ] ), ), fo.Sample( filepath="/path/to/image4.jpg", predictions=None, ), ] ) # # Only include samples whose `filepath` ends with ".jpg" # view = dataset.match(F("filepath").ends_with(".jpg")) # # Only include samples whose `weather` field is "sunny" # view = dataset.match(F("weather").label == "sunny") # # Only include samples with at least 2 objects in their # `predictions` field # view = dataset.match(F("predictions").detections.length() >= 2) # # Only include samples whose `predictions` field contains at least # one object with area smaller than 0.2 # # Bboxes are in [top-left-x, top-left-y, width, height] format bbox = F("bounding_box") bbox_area = bbox[2] * bbox[3] small_boxes = F("predictions.detections").filter(bbox_area < 0.2) view = dataset.match(small_boxes.length() > 0)
Parameters | |
filter | a fiftyone.core.expressions.ViewExpression or
MongoDB expression
that returns a boolean describing the filter to apply |
Returns | |
a fiftyone.core.view.DatasetView |
Filters the frames in the video collection by the given filter.
Examples:
import fiftyone as fo import fiftyone.zoo as foz from fiftyone import ViewField as F dataset = foz.load_zoo_dataset("quickstart-video") # # Match frames with at least 10 detections # num_objects = F("detections.detections").length() view = dataset.match_frames(num_objects > 10) print(dataset.count()) print(view.count()) print(dataset.count("frames")) print(view.count("frames"))
Parameters | |
filter | a fiftyone.core.expressions.ViewExpression or
MongoDB aggregation expression
that returns a boolean describing the filter to apply |
omit_empty:True | whether to omit samples with no frame labels after filtering |
Returns | |
a fiftyone.core.view.DatasetView |
def match_labels(self, labels=None, ids=None, tags=None, filter=None, fields=None, bool=None): (source) ¶
Selects the samples from the collection that contain (or do not contain) at least one label that matches the specified criteria.
Note that, unlike select_labels
and filter_labels
, this
stage will not filter the labels themselves; it only selects the
corresponding samples.
You can perform a selection via one or more of the following methods:
- Provide the labels argument, which should contain a list of
dicts in the format returned by
fiftyone.core.session.Session.selected_labels
, to match specific labels - Provide the ids argument to match labels with specific IDs
- Provide the tags argument to match labels with specific tags
- Provide the filter argument to match labels based on a boolean
fiftyone.core.expressions.ViewExpression
that is applied to each individualfiftyone.core.labels.Label
element - Pass bool=False to negate the operation and instead match samples that do not contain at least one label matching the specified criteria
If multiple criteria are specified, labels must match all of them in order to trigger a sample match.
By default, the selection is applied to all
fiftyone.core.labels.Label
fields, but you can provide the
fields argument to explicitly define the field(s) in which to
search.
Examples:
import fiftyone as fo import fiftyone.zoo as foz from fiftyone import ViewField as F dataset = foz.load_zoo_dataset("quickstart") # # Only show samples whose labels are currently selected in the App # session = fo.launch_app(dataset) # Select some labels in the App... view = dataset.match_labels(labels=session.selected_labels) # # Only include samples that contain labels with the specified IDs # # Grab some label IDs ids = [ dataset.first().ground_truth.detections[0].id, dataset.last().predictions.detections[0].id, ] view = dataset.match_labels(ids=ids) print(len(view)) print(view.count("ground_truth.detections")) print(view.count("predictions.detections")) # # Only include samples that contain labels with the specified tags # # Grab some label IDs ids = [ dataset.first().ground_truth.detections[0].id, dataset.last().predictions.detections[0].id, ] # Give the labels a "test" tag dataset = dataset.clone() # create copy since we're modifying data dataset.select_labels(ids=ids).tag_labels("test") print(dataset.count_values("ground_truth.detections.tags")) print(dataset.count_values("predictions.detections.tags")) # Retrieve the labels via their tag view = dataset.match_labels(tags="test") print(len(view)) print(view.count("ground_truth.detections")) print(view.count("predictions.detections")) # # Only include samples that contain labels matching a filter # filter = F("confidence") > 0.99 view = dataset.match_labels(filter=filter, fields="predictions") print(len(view)) print(view.count("ground_truth.detections")) print(view.count("predictions.detections"))
Parameters | |
labels:None | a list of dicts specifying the labels to select in
the format returned by
fiftyone.core.session.Session.selected_labels |
ids:None | an ID or iterable of IDs of the labels to select |
tags:None | a tag or iterable of tags of labels to select |
filter:None | a fiftyone.core.expressions.ViewExpression
or MongoDB aggregation expression
that returns a boolean describing whether to select a given
label. In the case of list fields like
fiftyone.core.labels.Detections , the filter is applied
to the list elements, not the root field |
fields:None | a field or iterable of fields from which to select |
bool:None | whether to match samples that have (None or True) or do not have (False) at least one label that matches the specified criteria |
Returns | |
a fiftyone.core.view.DatasetView |
Returns a view containing the samples in the collection that have or don't have any/all of the given tag(s).
Examples:
import fiftyone as fo dataset = fo.Dataset() dataset.add_samples( [ fo.Sample(filepath="image1.png", tags=["train"]), fo.Sample(filepath="image2.png", tags=["test"]), fo.Sample(filepath="image3.png", tags=["train", "test"]), fo.Sample(filepath="image4.png"), ] ) # # Only include samples that have the "test" tag # view = dataset.match_tags("test") # # Only include samples that do not have the "test" tag # view = dataset.match_tags("test", bool=False) # # Only include samples that have the "test" or "train" tags # view = dataset.match_tags(["test", "train"]) # # Only include samples that have the "test" and "train" tags # view = dataset.match_tags(["test", "train"], all=True) # # Only include samples that do not have the "test" or "train" tags # view = dataset.match_tags(["test", "train"], bool=False) # # Only include samples that do not have the "test" and "train" tags # view = dataset.match_tags(["test", "train"], bool=False, all=True)
Parameters | |
tags | the tag or iterable of tags to match |
bool:None | whether to match samples that have (None or True) or do not have (False) the given tags |
all:False | whether to match samples that have (or don't have) all (True) or any (False) of the given tags |
Returns | |
a fiftyone.core.view.DatasetView |
Computes the maximum of a numeric field of the collection.
None-valued fields are ignored.
This aggregation is typically applied to numeric or date field types (or lists of such types):
fiftyone.core.fields.IntField
fiftyone.core.fields.FloatField
fiftyone.core.fields.DateField
fiftyone.core.fields.DateTimeField
Examples:
import fiftyone as fo from fiftyone import ViewField as F dataset = fo.Dataset() dataset.add_samples( [ fo.Sample( filepath="/path/to/image1.png", numeric_field=1.0, numeric_list_field=[1, 2, 3], ), fo.Sample( filepath="/path/to/image2.png", numeric_field=4.0, numeric_list_field=[1, 2], ), fo.Sample( filepath="/path/to/image3.png", numeric_field=None, numeric_list_field=None, ), ] ) # # Compute the maximum of a numeric field # max = dataset.max("numeric_field") print(max) # the max # # Compute the maximum of a numeric list field # max = dataset.max("numeric_list_field") print(max) # the max # # Compute the maximum of a transformation of a numeric field # max = dataset.max(2 * (F("numeric_field") + 1)) print(max) # the max
Parameters | |
field_or_expr | a field name, embedded.field.name,
fiftyone.core.expressions.ViewExpression , or
MongoDB expression
defining the field or expression to aggregate. This can also
be a list or tuple of such arguments, in which case a tuple of
corresponding aggregation results (each receiving the same
additional keyword arguments, if any) will be returned |
expr:None | a fiftyone.core.expressions.ViewExpression or
MongoDB expression
to apply to field_or_expr (which must be a field) before
aggregating |
safe:False | whether to ignore nan/inf values when dealing with floating point values |
Returns | |
the maximum value |
Computes the arithmetic mean of the field values of the collection.
None-valued fields are ignored.
This aggregation is typically applied to numeric field types (or lists of such types):
Examples:
import fiftyone as fo from fiftyone import ViewField as F dataset = fo.Dataset() dataset.add_samples( [ fo.Sample( filepath="/path/to/image1.png", numeric_field=1.0, numeric_list_field=[1, 2, 3], ), fo.Sample( filepath="/path/to/image2.png", numeric_field=4.0, numeric_list_field=[1, 2], ), fo.Sample( filepath="/path/to/image3.png", numeric_field=None, numeric_list_field=None, ), ] ) # # Compute the mean of a numeric field # mean = dataset.mean("numeric_field") print(mean) # the mean # # Compute the mean of a numeric list field # mean = dataset.mean("numeric_list_field") print(mean) # the mean # # Compute the mean of a transformation of a numeric field # mean = dataset.mean(2 * (F("numeric_field") + 1)) print(mean) # the mean
Parameters | |
field_or_expr | a field name, embedded.field.name,
fiftyone.core.expressions.ViewExpression , or
MongoDB expression
defining the field or expression to aggregate. This can also
be a list or tuple of such arguments, in which case a tuple of
corresponding aggregation results (each receiving the same
additional keyword arguments, if any) will be returned |
expr:None | a fiftyone.core.expressions.ViewExpression or
MongoDB expression
to apply to field_or_expr (which must be a field) before
aggregating |
safe:False | whether to ignore nan/inf values when dealing with floating point values |
Returns | |
the mean |
Merges the labels from the given input field into the given output field of the collection.
If this collection is a dataset, the input field is deleted after the merge.
If this collection is a view, the input field will still exist on the underlying dataset but will only contain the labels not present in this view.
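Example (a minimal sketch, not part of the original reference; it assumes the quickstart zoo dataset and uses an illustrative output field name):

import fiftyone.zoo as foz
from fiftyone import ViewField as F

# Clone since we're modifying labels
dataset = foz.load_zoo_dataset("quickstart").clone()

# Move the high-confidence predictions into a separate label field; the
# remaining predictions stay in the original field on the dataset
view = dataset.filter_labels("predictions", F("confidence") > 0.9)
view.merge_labels("predictions", "high_conf_predictions")

print(dataset.count("predictions.detections"))
print(dataset.count("high_conf_predictions.detections"))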
Parameters | |
in_field | the name of the input label field |
out_field | the name of the output label field, which will be created if necessary |
Computes the minimum of a numeric field of the collection.
None-valued fields are ignored.
This aggregation is typically applied to numeric or date field types (or lists of such types):
fiftyone.core.fields.IntField
fiftyone.core.fields.FloatField
fiftyone.core.fields.DateField
fiftyone.core.fields.DateTimeField
Examples:
import fiftyone as fo from fiftyone import ViewField as F dataset = fo.Dataset() dataset.add_samples( [ fo.Sample( filepath="/path/to/image1.png", numeric_field=1.0, numeric_list_field=[1, 2, 3], ), fo.Sample( filepath="/path/to/image2.png", numeric_field=4.0, numeric_list_field=[1, 2], ), fo.Sample( filepath="/path/to/image3.png", numeric_field=None, numeric_list_field=None, ), ] ) # # Compute the minimum of a numeric field # min = dataset.min("numeric_field") print(min) # the min # # Compute the minimum of a numeric list field # min = dataset.min("numeric_list_field") print(min) # the min # # Compute the minimum of a transformation of a numeric field # min = dataset.min(2 * (F("numeric_field") + 1)) print(min) # the min
Parameters | |
field_or_expr | a field name, embedded.field.name,
fiftyone.core.expressions.ViewExpression , or
MongoDB expression
defining the field or expression to aggregate. This can also
be a list or tuple of such arguments, in which case a tuple of
corresponding aggregation results (each receiving the same
additional keyword arguments, if any) will be returned |
expr:None | a fiftyone.core.expressions.ViewExpression or
MongoDB expression
to apply to field_or_expr (which must be a field) before
aggregating |
safe:False | whether to ignore nan/inf values when dealing with floating point values |
Returns | |
the minimum value |
Adds a view stage defined by a raw MongoDB aggregation pipeline.
See MongoDB aggregation pipelines for more details.
Examples:
import fiftyone as fo

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            predictions=fo.Detections(
                detections=[
                    fo.Detection(
                        label="cat",
                        bounding_box=[0.1, 0.1, 0.5, 0.5],
                        confidence=0.9,
                    ),
                    fo.Detection(
                        label="dog",
                        bounding_box=[0.2, 0.2, 0.3, 0.3],
                        confidence=0.8,
                    ),
                ]
            ),
        ),
        fo.Sample(
            filepath="/path/to/image2.png",
            predictions=fo.Detections(
                detections=[
                    fo.Detection(
                        label="cat",
                        bounding_box=[0.5, 0.5, 0.4, 0.4],
                        confidence=0.95,
                    ),
                    fo.Detection(label="rabbit"),
                ]
            ),
        ),
        fo.Sample(
            filepath="/path/to/image3.png",
            predictions=fo.Detections(
                detections=[
                    fo.Detection(
                        label="squirrel",
                        bounding_box=[0.25, 0.25, 0.5, 0.5],
                        confidence=0.5,
                    ),
                ]
            ),
        ),
        fo.Sample(
            filepath="/path/to/image4.png",
            predictions=None,
        ),
    ]
)

#
# Extract a view containing the second and third samples in the
# dataset
#

view = dataset.mongo([{"$skip": 1}, {"$limit": 2}])

#
# Sort by the number of objects in the `predictions` field
#

view = dataset.mongo([
    {
        "$addFields": {
            "_sort_field": {
                "$size": {"$ifNull": ["$predictions.detections", []]}
            }
        }
    },
    {"$sort": {"_sort_field": -1}},
    {"$project": {"_sort_field": False}},
])
Parameters | |
pipeline | a MongoDB aggregation pipeline (list of dicts) |
_needs | Undocumented |
_group | Undocumented |
Returns | |
a fiftyone.core.view.DatasetView |
fiftyone.core.dataset.Dataset
Returns a single sample in this collection matching the expression.
Examples:
import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("quickstart")

#
# Get a sample by filepath
#

# A random filepath in the dataset
filepath = dataset.take(1).first().filepath

# Get sample by filepath
sample = dataset.one(F("filepath") == filepath)

#
# Dealing with multiple matches
#

# Get a sample whose image is JPEG
sample = dataset.one(F("filepath").ends_with(".jpg"))

# Raises an error since there are multiple JPEGs
dataset.one(F("filepath").ends_with(".jpg"), exact=True)
Parameters | |
expr | a fiftyone.core.expressions.ViewExpression or
MongoDB expression
that evaluates to True for the sample to match |
exact:False | whether to raise an error if multiple samples match the expression |
Returns | |
a fiftyone.core.sample.SampleView | |
Raises | |
ValueError | if no samples match the expression or if exact=True |
and multiple samples match the expression |
Computes the quantile(s) of the field values of a collection.
None-valued fields are ignored.
This aggregation is typically applied to numeric field types (or lists of such types):
Examples:
import fiftyone as fo from fiftyone import ViewField as F dataset = fo.Dataset() dataset.add_samples( [ fo.Sample( filepath="/path/to/image1.png", numeric_field=1.0, numeric_list_field=[1, 2, 3], ), fo.Sample( filepath="/path/to/image2.png", numeric_field=4.0, numeric_list_field=[1, 2], ), fo.Sample( filepath="/path/to/image3.png", numeric_field=None, numeric_list_field=None, ), ] ) # # Compute the quantiles of a numeric field # quantiles = dataset.quantiles("numeric_field", [0.1, 0.5, 0.9]) print(quantiles) # the quantiles # # Compute the quantiles of a numeric list field # quantiles = dataset.quantiles("numeric_list_field", [0.1, 0.5, 0.9]) print(quantiles) # the quantiles # # Compute the mean of a transformation of a numeric field # quantiles = dataset.quantiles(2 * (F("numeric_field") + 1), [0.1, 0.5, 0.9]) print(quantiles) # the quantiles
Parameters | |
field_or_expr | a field name, embedded.field.name,
fiftyone.core.expressions.ViewExpression , or
MongoDB expression
defining the field or expression to aggregate |
quantiles | the quantile or iterable of quantiles to compute. Each quantile must be a numeric value in [0, 1] |
expr:None | a fiftyone.core.expressions.ViewExpression or
MongoDB expression
to apply to field_or_expr (which must be a field) before
aggregating |
safe:False | whether to ignore nan/inf values when dealing with floating point values |
Returns | |
the quantile or list of quantiles |
Registers a run under the given key on this collection.
Parameters | |
run_key | a run key |
config | a fiftyone.core.runs.RunConfig |
results:None | an optional fiftyone.core.runs.RunResults |
overwrite:False | whether to allow overwriting an existing run of the same type |
cleanup:True | whether to execute an existing run's
fiftyone.core.runs.Run.cleanup method when overwriting
it |
cache:True | whether to cache the results on the collection |
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
Reloads the collection from the database.
Replaces the key for the given annotation run with a new key.
Parameters | |
anno_key | an annotation key |
new_anno_key | a new annotation key |
Replaces the key for the given brain run with a new key.
Parameters | |
brain_key | a brain key |
new_brain_key | a new brain key |
Replaces the key for the given evaluation with a new key.
Parameters | |
eval_key | an evaluation key |
new_eval_key | a new evaluation key |
Replaces the key for the given run with a new key.
Parameters | |
run_key | a run key |
new_run_key | a new run key |
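The rename methods above all follow the same pattern. A minimal sketch (not part of the original reference; the method names and keys shown are assumptions and the corresponding runs must already exist on your dataset):

# Rename an existing evaluation key
dataset.rename_evaluation("eval", "eval_v2")

# Rename an existing brain run key
dataset.rename_brain_run("img_sim", "img_sim_v2")

# Rename an existing annotation run key
dataset.rename_annotation_run("anno_run", "anno_run_v2")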
Returns a context that can be used to save samples from this collection according to a configurable batching strategy.
Examples:
import random as r
import string as s

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("cifar10", split="test")

def make_label():
    return "".join(r.choice(s.ascii_letters) for i in range(10))

# No save context
for sample in dataset.iter_samples(progress=True):
    sample.ground_truth.label = make_label()
    sample.save()

# Save using default batching strategy
with dataset.save_context() as context:
    for sample in dataset.iter_samples(progress=True):
        sample.ground_truth.label = make_label()
        context.save(sample)

# Save in batches of 10
with dataset.save_context(batch_size=10) as context:
    for sample in dataset.iter_samples(progress=True):
        sample.ground_truth.label = make_label()
        context.save(sample)

# Save every 0.5 seconds
with dataset.save_context(batch_size=0.5) as context:
    for sample in dataset.iter_samples(progress=True):
        sample.ground_truth.label = make_label()
        context.save(sample)
Parameters | |
batch_size:None | the batch size to use. If a batching_strategy is provided, this parameter configures the strategy as described below. If no batching_strategy is provided, this can either be an integer specifying the number of samples to save in a batch (in which case batching_strategy is implicitly set to "static") or a float number of seconds between batched saves (in which case batching_strategy is implicitly set to "latency") |
batching_strategy:None | the batching strategy to use for each save operation. Supported values are:
By default, fo.config.default_batcher is used |
Returns | |
a SaveContext |
Saves run results for the run with the given key.
Parameters | |
run_key | a run key |
results | a fiftyone.core.runs.RunResults |
overwrite:True | whether to overwrite an existing result with the same key |
cache:True | whether to cache the results on the collection |
def schema(self, field_or_expr, expr=None, dynamic_only=False, _doc_type=None, _include_private=False): (source) ¶
Extracts the names and types of the attributes of a specified embedded document field across all samples in the collection.
Schema aggregations are useful for detecting the presence and types of
dynamic attributes of fiftyone.core.labels.Label
fields
across a collection.
Examples:
import fiftyone as fo dataset = fo.Dataset() sample1 = fo.Sample( filepath="image1.png", ground_truth=fo.Detections( detections=[ fo.Detection( label="cat", bounding_box=[0.1, 0.1, 0.4, 0.4], foo="bar", hello=True, ), fo.Detection( label="dog", bounding_box=[0.5, 0.5, 0.4, 0.4], hello=None, ) ] ) ) sample2 = fo.Sample( filepath="image2.png", ground_truth=fo.Detections( detections=[ fo.Detection( label="rabbit", bounding_box=[0.1, 0.1, 0.4, 0.4], foo=None, ), fo.Detection( label="squirrel", bounding_box=[0.5, 0.5, 0.4, 0.4], hello="there", ), ] ) ) dataset.add_samples([sample1, sample2]) # # Get schema of all dynamic attributes on the detections in a # `Detections` field # print(dataset.schema("ground_truth.detections", dynamic_only=True)) # {'foo': StringField, 'hello': [BooleanField, StringField]}
Parameters | |
field_or_expr | a field name, embedded.field.name,
fiftyone.core.expressions.ViewExpression , or
MongoDB expression
defining the field or expression to aggregate |
expr:None | a fiftyone.core.expressions.ViewExpression or
MongoDB expression
to apply to field_or_expr (which must be a field) before
aggregating |
dynamic_only:False | whether to only include dynamically added attributes |
_doc_type | Undocumented |
_include_private | Undocumented |
Returns | |
a dict mapping field names to fiftyone.core.fields.Field
instances. If a field's values takes multiple non-None types, the
list of observed types will be returned |
Selects the samples with the given IDs from the collection.
Examples:
import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

#
# Create a view containing the currently selected samples in the App
#

session = fo.launch_app(dataset)

# Select samples in the App...

view = dataset.select(session.selected)
Parameters | |
sample_ids | the samples to select. Can be any of the following:
|
ordered:False | whether to sort the samples in the returned view to match the order of the provided IDs |
Returns | |
a fiftyone.core.view.DatasetView |
Selects the samples with the given field values from the collection.
This stage is typically used to work with categorical fields (strings,
ints, and bools). If you want to select samples based on floating point
fields, use match
.
Examples:
import fiftyone as fo dataset = fo.Dataset() dataset.add_samples( [ fo.Sample(filepath="image%d.jpg" % i, int=i, str=str(i)) for i in range(100) ] ) # # Create a view containing samples whose `int` field have the given # values # view = dataset.select_by("int", [1, 51, 11, 41, 21, 31]) print(view.head(6)) # # Create a view containing samples whose `str` field have the given # values, in order # view = dataset.select_by( "str", ["1", "51", "11", "41", "21", "31"], ordered=True ) print(view.head(6))
Parameters | |
field | a field or embedded.field.name |
values | a value or iterable of values to select by |
ordered:False | whether to sort the samples in the returned view to match the order of the provided values |
Returns | |
a fiftyone.core.view.DatasetView |
def select_fields(self, field_names=None, meta_filter=None, _allow_missing=False): (source) ¶
Selects only the fields with the given names from the samples in the collection. All other fields are excluded.
Note that default sample fields are always selected.
Examples:
import fiftyone as fo dataset = fo.Dataset() dataset.add_samples( [ fo.Sample( filepath="/path/to/image1.png", uniqueness=1.0, ground_truth=fo.Detections( detections=[ fo.Detection( label="cat", bounding_box=[0.1, 0.1, 0.5, 0.5], mood="surly", age=51, ), fo.Detection( label="dog", bounding_box=[0.2, 0.2, 0.3, 0.3], mood="happy", age=52, ), ] ) ), fo.Sample( filepath="/path/to/image2.png", uniqueness=0.0, ), fo.Sample( filepath="/path/to/image3.png", ), ] ) # # Include only the default fields on each sample # view = dataset.select_fields() # # Include only the `uniqueness` field (and the default fields) on # each sample # view = dataset.select_fields("uniqueness") # # Include only the `mood` attribute (and the default attributes) of # each `Detection` in the `ground_truth` field # view = dataset.select_fields("ground_truth.detections.mood")
Parameters | |
field_names:None | a field name or iterable of field names to select. May contain embedded.field.name as well |
meta_filter:None | a filter that dynamically selects fields in the collection's schema according to the specified rule, which can be matched against the field's name, type, description, and/or info. For example:
|
_allow_missing | Undocumented |
Returns | |
a fiftyone.core.view.DatasetView |
Selects the frames with the given IDs from the video collection.
Examples:
import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart-video")

#
# Select some specific frames
#

frame_ids = [
    dataset.first().frames.first().id,
    dataset.last().frames.last().id,
]

view = dataset.select_frames(frame_ids)

print(dataset.count())
print(view.count())

print(dataset.count("frames"))
print(view.count("frames"))
Parameters | |
frame_ids | the frames to select. Can be any of the following:
|
omit_empty:True | whether to omit samples that have no frames after selecting the specified frames |
Returns | |
a fiftyone.core.view.DatasetView |
def select_group_slices(self, slices=None, media_type=None, _allow_mixed=False, _force_mixed=False): (source) ¶
Selects the samples in the group collection from the given slice(s).
The returned view is a flattened non-grouped view containing only the slice(s) of interest.
Note
This stage performs a $lookup that pulls the requested slice(s) for each sample in the input collection from the source dataset. As a result, this stage always emits unfiltered samples.
Examples:
import fiftyone as fo dataset = fo.Dataset() dataset.add_group_field("group", default="ego") group1 = fo.Group() group2 = fo.Group() dataset.add_samples( [ fo.Sample( filepath="/path/to/left-image1.jpg", group=group1.element("left"), ), fo.Sample( filepath="/path/to/video1.mp4", group=group1.element("ego"), ), fo.Sample( filepath="/path/to/right-image1.jpg", group=group1.element("right"), ), fo.Sample( filepath="/path/to/left-image2.jpg", group=group2.element("left"), ), fo.Sample( filepath="/path/to/video2.mp4", group=group2.element("ego"), ), fo.Sample( filepath="/path/to/right-image2.jpg", group=group2.element("right"), ), ] ) # # Retrieve the samples from the "ego" group slice # view = dataset.select_group_slices("ego") # # Retrieve the samples from the "left" or "right" group slices # view = dataset.select_group_slices(["left", "right"]) # # Retrieve all image samples # view = dataset.select_group_slices(media_type="image")
Parameters | |
slices:None | a group slice or iterable of group slices to select. If neither argument is provided, a flattened list of all samples is returned |
media_type:None | a media type whose slice(s) to select |
_allow_mixed | Undocumented |
_force_mixed | Undocumented |
Returns | |
a fiftyone.core.view.DatasetView |
Selects the groups with the given IDs from the grouped collection.
Examples:
import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart-groups")

#
# Select some specific groups by ID
#

group_ids = dataset.take(10).values("group.id")

view = dataset.select_groups(group_ids)

assert set(view.values("group.id")) == set(group_ids)

view = dataset.select_groups(group_ids, ordered=True)

assert view.values("group.id") == group_ids
Parameters | |
ordered:False | whether to sort the groups in the returned view to match the order of the provided IDs |
group_ids | the groups to select. Can be any of the following:
|
Returns | |
a fiftyone.core.view.DatasetView |
def select_labels(self, labels=None, ids=None, tags=None, fields=None, omit_empty=True): (source) ¶
Selects only the specified labels from the collection.
The returned view will omit samples, sample fields, and individual labels that do not match the specified selection criteria.
You can perform a selection via one or more of the following methods:
- Provide the labels argument, which should contain a list of dicts in the format returned by fiftyone.core.session.Session.selected_labels, to select specific labels
- Provide the ids argument to select labels with specific IDs
- Provide the tags argument to select labels with specific tags
If multiple criteria are specified, labels must match all of them in order to be selected.
By default, the selection is applied to all
fiftyone.core.labels.Label
fields, but you can provide the
fields argument to explicitly define the field(s) in which to
select.
Examples:
import fiftyone as fo import fiftyone.zoo as foz dataset = foz.load_zoo_dataset("quickstart") # # Only include the labels currently selected in the App # session = fo.launch_app(dataset) # Select some labels in the App... view = dataset.select_labels(labels=session.selected_labels) # # Only include labels with the specified IDs # # Grab some label IDs ids = [ dataset.first().ground_truth.detections[0].id, dataset.last().predictions.detections[0].id, ] view = dataset.select_labels(ids=ids) print(view.count("ground_truth.detections")) print(view.count("predictions.detections")) # # Only include labels with the specified tags # # Grab some label IDs ids = [ dataset.first().ground_truth.detections[0].id, dataset.last().predictions.detections[0].id, ] # Give the labels a "test" tag dataset = dataset.clone() # create copy since we're modifying data dataset.select_labels(ids=ids).tag_labels("test") print(dataset.count_label_tags()) # Retrieve the labels via their tag view = dataset.select_labels(tags="test") print(view.count("ground_truth.detections")) print(view.count("predictions.detections"))
Parameters | |
labels:None | a list of dicts specifying the labels to select in
the format returned by
fiftyone.core.session.Session.selected_labels |
ids:None | an ID or iterable of IDs of the labels to select |
tags:None | a tag or iterable of tags of labels to select |
fields:None | a field or iterable of fields from which to select |
omit_empty:True | whether to omit samples that have no labels after filtering |
Returns | |
a fiftyone.core.view.DatasetView |
Sets a field or embedded field on each sample in a collection by evaluating the given expression.
This method can process embedded list fields. To do so, simply append [] to any list component(s) of the field path.
Note
There are two cases where FiftyOne will automatically unwind array fields without requiring you to explicitly specify this via the [] syntax:
- Top-level lists: when you specify a field path that refers to a top-level list field of a dataset; i.e., list_field is automatically coerced to list_field[], if necessary.
- List fields: when you specify a field path that refers to the list field of a fiftyone.core.labels.Label class, such as the Detections.detections attribute; i.e., ground_truth.detections.label is automatically coerced to ground_truth.detections[].label, if necessary.
See the examples below for demonstrations of this behavior.
The provided expr is interpreted relative to the document on which the embedded field is being set. For example, if you are setting a nested field field="embedded.document.field", then the expression expr you provide will be applied to the embedded.document document. Note that you can override this behavior by defining an expression that is bound to the root document by prepending "$" to any field name(s) in the expression.
See the examples below for more information.
Note
Note that you cannot set a non-existing top-level field using this
stage, since doing so would violate the dataset's schema. You can,
however, first declare a new field via
fiftyone.core.dataset.Dataset.add_sample_field
and then
populate it in a view via this stage.
Examples:
import fiftyone as fo import fiftyone.zoo as foz from fiftyone import ViewField as F dataset = foz.load_zoo_dataset("quickstart") # # Replace all values of the `uniqueness` field that are less than # 0.5 with `None` # view = dataset.set_field( "uniqueness", (F("uniqueness") >= 0.5).if_else(F("uniqueness"), None) ) print(view.bounds("uniqueness")) # # Lower bound all object confidences in the `predictions` field at # 0.5 # view = dataset.set_field( "predictions.detections.confidence", F("confidence").max(0.5) ) print(view.bounds("predictions.detections.confidence")) # # Add a `num_predictions` property to the `predictions` field that # contains the number of objects in the field # view = dataset.set_field( "predictions.num_predictions", F("$predictions.detections").length(), ) print(view.bounds("predictions.num_predictions")) # # Set an `is_animal` field on each object in the `predictions` field # that indicates whether the object is an animal # ANIMALS = [ "bear", "bird", "cat", "cow", "dog", "elephant", "giraffe", "horse", "sheep", "zebra" ] view = dataset.set_field( "predictions.detections.is_animal", F("label").is_in(ANIMALS) ) print(view.count_values("predictions.detections.is_animal"))
Parameters | |
field | the field or embedded.field.name to set |
expr | a fiftyone.core.expressions.ViewExpression or
MongoDB expression
that defines the field value to set |
_allow | Undocumented |
Returns | |
a fiftyone.core.view.DatasetView |
fiftyone.core.clips.ClipsView
, fiftyone.core.video.FramesView
, fiftyone.core.patches._PatchesView
Sets the fields of the specified labels in the collection to the given values.
Note
This method is appropriate when you have the IDs of the labels you
wish to modify. See set_values
and set_field
if
your updates are not keyed by label ID.
Examples:
import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("quickstart")

#
# Populate a new boolean attribute on all high confidence labels
#

view = dataset.filter_labels("predictions", F("confidence") > 0.99)

label_ids = view.values("predictions.detections.id", unwind=True)
values = {_id: True for _id in label_ids}

dataset.set_label_values("predictions.detections.high_conf", values)

print(dataset.count("predictions.detections"))
print(len(label_ids))
print(dataset.count_values("predictions.detections.high_conf"))
Parameters | |
field_name | a field or embedded.field.name |
values | a dict mapping label IDs to values |
dynamic:False | whether to declare dynamic attributes of embedded document fields that are encountered |
skip_none:False | whether to treat None data in values as missing data that should not be set |
validate:True | whether to validate that the values are compliant with the dataset schema before adding them |
progress:False | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
fiftyone.core.clips.ClipsView
, fiftyone.core.video.FramesView
, fiftyone.core.patches._PatchesView
Sets the field or embedded field on each sample or frame in the collection to the given values.
When setting a sample field embedded.field.name, this function is an efficient implementation of the following loop:
for sample, value in zip(sample_collection, values):
    sample.embedded.field.name = value
    sample.save()
When setting an embedded field that contains an array, say embedded.array.field.name, this function is an efficient implementation of the following loop:
for sample, array_values in zip(sample_collection, values):
    for doc, value in zip(sample.embedded.array, array_values):
        doc.field.name = value

    sample.save()
When setting a frame field frames.embedded.field.name, this function is an efficient implementation of the following loop:
for sample, frame_values in zip(sample_collection, values):
    for frame, value in zip(sample.frames.values(), frame_values):
        frame.embedded.field.name = value

    sample.save()
When setting an embedded frame field that contains an array, say frames.embedded.array.field.name, this function is an efficient implementation of the following loop:
for sample, frame_values in zip(sample_collection, values):
    for frame, array_values in zip(sample.frames.values(), frame_values):
        for doc, value in zip(frame.embedded.array, array_values):
            doc.field.name = value

    sample.save()
When values is a dict mapping keys in key_field to values, then this function is an efficient implementation of the following loop:
for key, value in values.items():
    sample = sample_collection.one(F(key_field) == key)
    sample.embedded.field.name = value
    sample.save()
When setting frame fields using the dict values syntax, each value in values may either be a list corresponding to the frames of the sample matching the given key, or each value may itself be a dict mapping frame numbers to values. In the latter case, this function is an efficient implementation of the following loop:
for key, frame_values in values.items():
    sample = sample_collection.one(F(key_field) == key)

    for frame_number, value in frame_values.items():
        frame = sample[frame_number]
        frame.embedded.field.name = value

    sample.save()
You can also update list fields using the dict values syntax, in which case this method is an efficient implementation of the natural nested list modifications of the above sample/frame loops.
The dual function of set_values
is values
, which can be
used to efficiently extract the values of a field or embedded field of
all samples in a collection as lists of values in the same structure
expected by this method.
Note
If the values you are setting can be described by a
fiftyone.core.expressions.ViewExpression
applied to the
existing dataset contents, then consider using set_field
+
save
for an even more efficient alternative to explicitly
iterating over the dataset or calling values
+
set_values
to perform the update in-memory.
Examples:
import random

import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("quickstart")

#
# Create a new sample field
#

values = [random.random() for _ in range(len(dataset))]
dataset.set_values("random", values)

print(dataset.bounds("random"))

#
# Add a tag to all low confidence labels
#

view = dataset.filter_labels("predictions", F("confidence") < 0.06)

detections = view.values("predictions.detections")
for sample_detections in detections:
    for detection in sample_detections:
        detection.tags.append("low_confidence")

view.set_values("predictions.detections", detections)

print(dataset.count_label_tags())
Parameters | |
field_name | a field or embedded.field.name |
values | an iterable of values, one for each sample in the collection. When setting frame fields, each element can either be an iterable of values (one for each existing frame of the sample) or a dict mapping frame numbers to values. If field_name contains array fields, the corresponding elements of values must be arrays of the same lengths. This argument can also be a dict mapping keys to values (each value as described previously), in which case the keys are used to match samples by their key_field |
key_field:None | a key field to use when choosing which samples to update when values is a dict |
skip_none:False | whether to treat None data in values as missing data that should not be set |
expand_schema:True | whether to dynamically add new sample/frame fields encountered to the dataset schema. If False, an error is raised if the root field_name does not exist |
dynamic:False | whether to declare dynamic attributes of embedded document fields that are encountered |
validate:True | whether to validate that the values are compliant with the dataset schema before adding them |
progress:False | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
_allow | Undocumented |
_sample | Undocumented |
_frame | Undocumented |
Randomly shuffles the samples in the collection.
Examples:
import fiftyone as fo dataset = fo.Dataset() dataset.add_samples( [ fo.Sample( filepath="/path/to/image1.png", ground_truth=fo.Classification(label="cat"), ), fo.Sample( filepath="/path/to/image2.png", ground_truth=fo.Classification(label="dog"), ), fo.Sample( filepath="/path/to/image3.png", ground_truth=None, ), ] ) # # Return a view that contains a randomly shuffled version of the # samples in the dataset # view = dataset.shuffle() # # Shuffle the samples with a fixed random seed # view = dataset.shuffle(seed=51)
Parameters | |
seed:None | an optional random seed to use when shuffling the samples |
Returns | |
a fiftyone.core.view.DatasetView |
Omits the given number of samples from the head of the collection.
Examples:
import fiftyone as fo dataset = fo.Dataset() dataset.add_samples( [ fo.Sample( filepath="/path/to/image1.png", ground_truth=fo.Classification(label="cat"), ), fo.Sample( filepath="/path/to/image2.png", ground_truth=fo.Classification(label="dog"), ), fo.Sample( filepath="/path/to/image3.png", ground_truth=fo.Classification(label="rabbit"), ), fo.Sample( filepath="/path/to/image4.png", ground_truth=None, ), ] ) # # Omit the first two samples from the dataset # view = dataset.skip(2)
Parameters | |
skip | the number of samples to skip. If a non-positive number is provided, no samples are omitted |
Returns | |
a fiftyone.core.view.DatasetView |
Sorts the samples in the collection by the given field(s) or expression(s).
Examples:
import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("quickstart")

#
# Sort the samples by their `uniqueness` field in ascending order
#

view = dataset.sort_by("uniqueness", reverse=False)

#
# Sorts the samples in descending order by the number of detections
# in their `predictions` field whose bounding box area is less than
# 0.2
#

# Bboxes are in [top-left-x, top-left-y, width, height] format
bbox = F("bounding_box")
bbox_area = bbox[2] * bbox[3]

small_boxes = F("predictions.detections").filter(bbox_area < 0.2)
view = dataset.sort_by(small_boxes.length(), reverse=True)

#
# Performs a compound sort where samples are first sorted in
# descending order by number of detections and then in ascending order
# of uniqueness for samples with the same number of predictions
#

view = dataset.sort_by(
    [
        (F("predictions.detections").length(), -1),
        ("uniqueness", 1),
    ]
)

num_objects, uniqueness = view[:5].values(
    [F("predictions.detections").length(), "uniqueness"]
)
print(list(zip(num_objects, uniqueness)))
Parameters | |
field | the field(s) or expression(s) to sort by. This can be any of the following:
|
reverse:False | whether to return the results in descending order |
create_index:True | whether to create an index, if necessary, to optimize the sort. Only applicable when sorting by field(s), not expressions |
Returns | |
a fiftyone.core.view.DatasetView |
def sort_by_similarity(self, query, k=None, reverse=False, dist_field=None, brain_key=None): (source) ¶
Sorts the collection by similarity to a specified query.
In order to use this stage, you must first use
fiftyone.brain.compute_similarity
to index your dataset by
similarity.
Examples:
import fiftyone as fo import fiftyone.brain as fob import fiftyone.zoo as foz dataset = foz.load_zoo_dataset("quickstart") fob.compute_similarity( dataset, model="clip-vit-base32-torch", brain_key="clip" ) # # Sort samples by their similarity to a sample by its ID # query_id = dataset.first().id view = dataset.sort_by_similarity(query_id, k=5) # # Sort samples by their similarity to a manually computed vector # model = foz.load_zoo_model("clip-vit-base32-torch") embeddings = dataset.take(2, seed=51).compute_embeddings(model) query = embeddings.mean(axis=0) view = dataset.sort_by_similarity(query, k=5) # # Sort samples by their similarity to a text prompt # query = "kites high in the air" view = dataset.sort_by_similarity(query, k=5)
Parameters | |
query | the query, which can be any of the following:
|
k:None | the number of matches to return. By default, the entire collection is sorted |
reverse:False | whether to sort by least similarity (True) or greatest similarity (False). Some backends may not support least similarity |
dist_field:None | the name of a float field in which to store the distance of each example to the specified query. The field is created if necessary |
brain_key:None | the brain key of an existing
fiftyone.brain.compute_similarity run on the dataset.
If not specified, the dataset must have an applicable run,
which will be used by default |
Returns | |
a fiftyone.core.view.DatasetView |
Splits the labels from the given input field into the given output field of the collection.
This method is typically invoked on a view that has filtered the contents of the specified input field, so that the labels in the view are moved to the output field and the remaining labels are left in-place.
Alternatively, you can provide a filter expression that selects the labels of interest to move in this collection.
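Example (a minimal sketch, not part of the original reference; it assumes the quickstart zoo dataset and uses an illustrative output field name and threshold):

import fiftyone.zoo as foz
from fiftyone import ViewField as F

# Clone since we're modifying labels
dataset = foz.load_zoo_dataset("quickstart").clone()

# Move low-confidence predictions into their own field via a filter
dataset.split_labels(
    "predictions", "low_conf_predictions", filter=F("confidence") < 0.1
)

print(dataset.count("predictions.detections"))
print(dataset.count("low_conf_predictions.detections"))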
Parameters | |
in_field | the name of the input label field |
out_field | the name of the output label field, which will be created if necessary |
filter:None | a boolean
fiftyone.core.expressions.ViewExpression to apply to
each label in the input field to determine whether to move it
(True) or leave it (False) |
fiftyone.core.dataset.Dataset
Returns stats about the collection on disk.
The samples keys refer to the sample documents stored in the database.
For video datasets, the frames keys refer to the frame documents stored in the database.
The media keys refer to the raw media associated with each sample on disk.
The index[es] keys refer to the indexes associated with the dataset.
Note that dataset-level metadata such as annotation runs are not included in this computation.
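Example (a minimal sketch; the include_media and include_indexes keyword names are assumptions based on the parameter descriptions below):

import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

# Storage stats for the collection, including raw media and indexes
print(dataset.stats(include_media=True, include_indexes=True))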
Parameters | |
include_media:False | whether to include stats about the size of the raw media in the collection |
include_indexes:False | whether to include stats on the dataset's indexes |
compressed:False | whether to return the sizes of collections in their compressed form on disk (True) or the logical uncompressed size of the collections (False). This option is only supported for datasets (not views) |
Returns | |
a stats dict |
Computes the standard deviation of the field values of the collection.
None-valued fields are ignored.
This aggregation is typically applied to numeric field types (or lists of such types):
Examples:
import fiftyone as fo from fiftyone import ViewField as F dataset = fo.Dataset() dataset.add_samples( [ fo.Sample( filepath="/path/to/image1.png", numeric_field=1.0, numeric_list_field=[1, 2, 3], ), fo.Sample( filepath="/path/to/image2.png", numeric_field=4.0, numeric_list_field=[1, 2], ), fo.Sample( filepath="/path/to/image3.png", numeric_field=None, numeric_list_field=None, ), ] ) # # Compute the standard deviation of a numeric field # std = dataset.std("numeric_field") print(std) # the standard deviation # # Compute the standard deviation of a numeric list field # std = dataset.std("numeric_list_field") print(std) # the standard deviation # # Compute the standard deviation of a transformation of a numeric field # std = dataset.std(2 * (F("numeric_field") + 1)) print(std) # the standard deviation
Parameters | |
field_or_expr | a field name, embedded.field.name,
fiftyone.core.expressions.ViewExpression , or
MongoDB expression
defining the field or expression to aggregate. This can also
be a list or tuple of such arguments, in which case a tuple of
corresponding aggregation results (each receiving the same
additional keyword arguments, if any) will be returned |
expr:None | a fiftyone.core.expressions.ViewExpression or
MongoDB expression
to apply to field_or_expr (which must be a field) before
aggregating |
safe:False | whether to ignore nan/inf values when dealing with floating point values |
sample:False | whether to compute the sample standard deviation rather than the population standard deviation |
Returns | |
the standard deviation |
Computes the sum of the field values of the collection.
None-valued fields are ignored.
This aggregation is typically applied to numeric field types (or lists of such types):
Examples:
import fiftyone as fo from fiftyone import ViewField as F dataset = fo.Dataset() dataset.add_samples( [ fo.Sample( filepath="/path/to/image1.png", numeric_field=1.0, numeric_list_field=[1, 2, 3], ), fo.Sample( filepath="/path/to/image2.png", numeric_field=4.0, numeric_list_field=[1, 2], ), fo.Sample( filepath="/path/to/image3.png", numeric_field=None, numeric_list_field=None, ), ] ) # # Compute the sum of a numeric field # total = dataset.sum("numeric_field") print(total) # the sum # # Compute the sum of a numeric list field # total = dataset.sum("numeric_list_field") print(total) # the sum # # Compute the sum of a transformation of a numeric field # total = dataset.sum(2 * (F("numeric_field") + 1)) print(total) # the sum
Parameters | |
field_or_expr | a field name, embedded.field.name,
fiftyone.core.expressions.ViewExpression , or
MongoDB expression
defining the field or expression to aggregate. This can also
be a list or tuple of such arguments, in which case a tuple of
corresponding aggregation results (each receiving the same
additional keyword arguments, if any) will be returned |
expr:None | a fiftyone.core.expressions.ViewExpression or
MongoDB expression
to apply to field_or_expr (which must be a field) before
aggregating |
safe:False | whether to ignore nan/inf values when dealing with floating point values |
Returns | |
the sum |
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
Returns a string summary of the collection.
Returns | |
a string summary |
Syncs the last_modified_at property(s) of the dataset.
Updates the last_modified_at
property of the dataset if
necessary to incorporate any modification timestamps to its samples.
If include_frames==True, the last_modified_at property of each video sample is first updated if necessary to incorporate any modification timestamps to its frames.
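Example (a minimal sketch; the include_frames keyword name is an assumption based on the parameter description below):

import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart-video")

# Incorporate any recent sample/frame edits into the dataset's timestamp
dataset.sync_last_modified_at(include_frames=True)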
Parameters | |
include_frames:True | whether to update the last_modified_at property of video samples. Only applicable to datasets that contain videos |
Adds the tag(s) to all labels in the specified label field(s) of this collection, if necessary.
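Example (a minimal sketch, not part of the original reference; the tag, the threshold, and the label_fields keyword name are assumptions):

import fiftyone.zoo as foz
from fiftyone import ViewField as F

# Clone since we're modifying labels
dataset = foz.load_zoo_dataset("quickstart").clone()

# Tag the low-confidence predictions
view = dataset.filter_labels("predictions", F("confidence") < 0.1)
view.tag_labels("low_conf", label_fields="predictions")

print(dataset.count_label_tags())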
Parameters | |
tags | a tag or iterable of tags |
label_fields:None | an optional name or iterable of names of
fiftyone.core.labels.Label fields. By default, all
label fields are used |
Adds the tag(s) to all samples in this collection, if necessary.
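Example (a minimal sketch; the tag is illustrative):

import fiftyone.zoo as foz

# Clone since we're modifying sample tags
dataset = foz.load_zoo_dataset("quickstart").clone()

# Tag a random subset of samples
dataset.take(25, seed=51).tag_samples("validation")

print(dataset.count_sample_tags())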
Parameters | |
tags | a tag or iterable of tags |
fiftyone.core.dataset.Dataset
Returns a list of the last few samples in the collection.
If fewer than num_samples samples are in the collection, only the available samples are returned.
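Example (a minimal sketch):

import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

# The last five samples in the collection
samples = dataset.tail(5)
print(len(samples))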
Parameters | |
num_samples | the number of samples |
Returns | |
a list of fiftyone.core.sample.Sample objects |
Randomly samples the given number of samples from the collection.
Examples:
import fiftyone as fo dataset = fo.Dataset() dataset.add_samples( [ fo.Sample( filepath="/path/to/image1.png", ground_truth=fo.Classification(label="cat"), ), fo.Sample( filepath="/path/to/image2.png", ground_truth=fo.Classification(label="dog"), ), fo.Sample( filepath="/path/to/image3.png", ground_truth=fo.Classification(label="rabbit"), ), fo.Sample( filepath="/path/to/image4.png", ground_truth=None, ), ] ) # # Take two random samples from the dataset # view = dataset.take(2) # # Take two random samples from the dataset with a fixed seed # view = dataset.take(2, seed=51)
Parameters | |
size | the number of samples to return. If a non-positive number is provided, an empty view is returned |
seed:None | an optional random seed to use when selecting the samples |
Returns | |
a fiftyone.core.view.DatasetView |
Creates a view that contains one sample per clip defined by the given field or expression in the video collection.
The returned view will contain:
- A sample_id field that records the sample ID from which each clip was taken
- A support field that records the [first, last] frame support of each clip
- All frame-level information from the underlying dataset of the input collection
Refer to fiftyone.core.clips.make_clips_dataset
to see the
available configuration options for generating clips.
Note
The clip generation logic will respect any frame-level modifications defined in the input collection, but the output clips will always contain all frame-level labels.
Examples:
import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("quickstart-video")

#
# Create a clips view that contains one clip for each contiguous
# segment that contains at least one road sign in every frame
#

clips = (
    dataset
    .filter_labels("frames.detections", F("label") == "road sign")
    .to_clips("frames.detections")
)
print(clips)

#
# Create a clips view that contains one clip for each contiguous
# segment that contains at least two road signs in every frame
#

signs = F("detections.detections").filter(F("label") == "road sign")
clips = dataset.to_clips(signs.length() >= 2)
print(clips)
Parameters | |
field_or_expr | can be any of the following:
|
other_fields:None | controls whether sample fields other than the default sample fields are included. Can be any of the following:
|
tol:0 | the maximum number of false frames that can be overlooked when generating clips. Only applicable when field_or_expr is a frame-level list field or expression |
min_len | the minimum allowable length of a clip, in frames. Only applicable when field_or_expr is a frame-level list field or an expression |
trajectories:False | whether to create clips for each unique object trajectory defined by their (label, index). Only applicable when field_or_expr is a frame-level field |
**kwargs | Undocumented |
Returns | |
a fiftyone.core.clips.ClipsView |
fiftyone.core.view.DatasetView
Returns a JSON dictionary representation of the collection.
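Example (a minimal sketch, assuming this section documents the to_dict() method; the rel_dir path is illustrative):

import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

# Serialize the collection, storing filepaths relative to the given directory
d = dataset.to_dict(rel_dir="/path/to/media")
print(list(d.keys()))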
Parameters | |
rel_dir:None | a relative directory to remove from the
filepath of each sample, if possible. The path is converted
to an absolute path (if necessary) via
fiftyone.core.storage.normalize_path . The typical use
case for this argument is that your source data lives in a
single directory and you wish to serialize relative, rather
than absolute, paths to the data within that directory |
include_private:False | whether to include private fields |
include_frames:False | whether to include the frame labels for video samples |
frame_labels_dir:None | a directory in which to write per-sample JSON files containing the frame labels for video samples. If omitted, frame labels will be included directly in the returned JSON dict (which can be quite large for video datasets containing many frames). Only applicable to datasets that contain videos when include_frames is True |
pretty_print:False | whether to render frame labels JSON in human readable format with newlines and indentations. Only applicable to datasets that contain videos when a frame_labels_dir is provided |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
Returns | |
a JSON dict |
Creates a view based on the results of the evaluation with the given key that contains one sample for each true positive, false positive, and false negative example in the collection, respectively.
True positive examples will result in samples with both their ground truth and predicted fields populated, while false positive/negative examples will only have one of their corresponding predicted/ground truth fields populated, respectively.
If multiple predictions are matched to a ground truth object (e.g., if the evaluation protocol includes a crowd attribute), then all matched predictions will be stored in the single sample along with the ground truth object.
The returned dataset will also have top-level type and iou fields populated based on the evaluation results for that example, as well as a sample_id field recording the sample ID of the example, and a crowd field if the evaluation protocol defines a crowd attribute.
Note
The returned view will contain patches for the contents of this collection, which may differ from the view on which the eval_key evaluation was performed. This may exclude some labels that were evaluated and/or include labels that were not evaluated.
If you would like to see patches for the exact view on which an
evaluation was performed, first call load_evaluation_view
to load the view and then convert to patches.
Examples:
import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")
dataset.evaluate_detections("predictions", eval_key="eval")

session = fo.launch_app(dataset)

#
# Create a patches view for the evaluation results
#

view = dataset.to_evaluation_patches("eval")
print(view)

session.view = view
Parameters | |
eval_key | an evaluation key that corresponds to the evaluation of
ground truth/predicted fields that are of type
fiftyone.core.labels.Detections ,
fiftyone.core.labels.Polylines , or
fiftyone.core.labels.Keypoints |
other_fields:None | controls whether fields other than the ground truth/predicted fields and the default sample fields are included. Can be any of the following:
|
**kwargs | Undocumented |
Returns | |
a fiftyone.core.patches.EvaluationPatchesView |
Creates a view that contains one sample per frame in the video collection.
The returned view will contain all frame-level fields and the tags of each video as sample-level fields, as well as a sample_id field that records the IDs of the parent sample for each frame.
By default, sample_frames is False and this method assumes that the frames of the input collection have filepath fields populated pointing to each frame image. Any frames without a filepath populated will be omitted from the returned view.
When sample_frames is True, this method samples each video in the collection into a directory of per-frame images and stores the filepaths in the filepath frame field of the source dataset. By default, each folder of images is written using the same basename as the input video. For example, if frames_patt = "%%06d.jpg", then videos with the following paths:
/path/to/video1.mp4
/path/to/video2.mp4
...
would be sampled as follows:
/path/to/video1/
    000001.jpg
    000002.jpg
    ...
/path/to/video2/
    000001.jpg
    000002.jpg
    ...
However, you can use the optional output_dir and rel_dir parameters to customize the location and shape of the sampled frame folders. For example, if output_dir = "/tmp" and rel_dir = "/path/to", then videos with the following paths:
/path/to/folderA/video1.mp4
/path/to/folderA/video2.mp4
/path/to/folderB/video3.mp4
...
would be sampled as follows:
/tmp/folderA/
    video1/
        000001.jpg
        000002.jpg
        ...
    video2/
        000001.jpg
        000002.jpg
        ...
/tmp/folderB/
    video3/
        000001.jpg
        000002.jpg
        ...
By default, samples will be generated for every video frame at full resolution, but this method provides a variety of parameters that can be used to customize the sampling behavior.
Note
If this method is run multiple times with sample_frames set to True, existing frames will not be resampled unless you set force_sample to True.
Examples:
import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("quickstart-video")

session = fo.launch_app(dataset)

#
# Create a frames view for an entire video dataset
#

frames = dataset.to_frames(sample_frames=True)
print(frames)

session.view = frames

#
# Create a frames view that only contains frames with at least 10
# objects, sampled at a maximum frame rate of 1fps
#

num_objects = F("detections.detections").length()
view = dataset.match_frames(num_objects > 10)

frames = view.to_frames(max_fps=1)
print(frames)

session.view = frames
Parameters | |
sample_frames:False | whether to assume that the frame images have already been sampled at locations stored in the filepath field of each frame (False), or whether to sample the video frames now according to the specified parameters (True) |
fps:None | an optional frame rate at which to sample each video's frames |
max_fps:None | an optional maximum frame rate at which to sample. Videos with frame rate exceeding this value are downsampled |
size:None | an optional (width, height) at which to sample frames. A dimension can be -1, in which case the aspect ratio is preserved. Only applicable when sample_frames=True |
min_size:None | an optional minimum (width, height) for each frame. A dimension can be -1 if no constraint should be applied. The frames are resized (aspect-preserving) if necessary to meet this constraint. Only applicable when sample_frames=True |
max_size:None | an optional maximum (width, height) for each frame. A dimension can be -1 if no constraint should be applied. The frames are resized (aspect-preserving) if necessary to meet this constraint. Only applicable when sample_frames=True |
sparse:False | whether to only sample frame images for frame
numbers for which fiftyone.core.frame.Frame instances
exist in the input collection. This parameter has no effect
when sample_frames==False since frames must always exist in
order to have filepath information used |
output_dir:None | an optional output directory in which to write the sampled frames. By default, the frames are written in folders with the same basename of each video |
rel_dir:None | a relative directory to remove from the filepath of
each video, if possible. The path is converted to an absolute
path (if necessary) via
fiftyone.core.storage.normalize_path . This argument can
be used in conjunction with output_dir to cause the sampled
frames to be written in a nested directory structure within
output_dir matching the shape of the input video's folder
structure |
frames_patt:None | a pattern specifying the filename/format to use to write or check for existing sampled frames, e.g., "%%06d.jpg". The default value is fiftyone.config.default_sequence_idx + fiftyone.config.default_image_ext |
force_sample:False | whether to resample videos whose sampled frames already exist. Only applicable when sample_frames=True |
skip_failures:True | whether to gracefully continue without raising an error if a video cannot be sampled |
verbose:False | whether to log information about the frames that will be sampled, if any |
**kwargs | Undocumented |
Returns | |
a fiftyone.core.video.FramesView |
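For instance, the sampling parameters above could be combined as follows (a minimal sketch; the output path is illustrative):

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart-video")

# Sample at most one frame per second and write the frame images to a
# custom output directory; pass `rel_dir` as well if you want the output
# to mirror the folder structure of your source videos
frames = dataset.to_frames(
    sample_frames=True,
    max_fps=1,
    output_dir="/tmp/quickstart-video-frames",  # illustrative path
)

print(frames)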
Returns a JSON string representation of the collection.
The samples will be written as a list in a top-level samples field of the returned dictionary.
Parameters | |
rel_dir:None | a relative directory to remove from the filepath of each sample, if possible. The path is converted to an absolute path (if necessary) via fiftyone.core.storage.normalize_path. The typical use case for this argument is that your source data lives in a single directory and you wish to serialize relative, rather than absolute, paths to the data within that directory |
include_private:False | whether to include private fields |
include_frames:False | whether to include the frame labels for video samples |
frame_labels_dir:None | a directory in which to write per-sample JSON files containing the frame labels for video samples. If omitted, frame labels will be included directly in the returned JSON dict (which can be quite large for video datasets containing many frames). Only applicable to datasets that contain videos when include_frames is True |
pretty_print:False | whether to render the JSON in human readable format with newlines and indentations |
Returns | |
a JSON string |
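For instance, rel_dir can be used to serialize relative paths (a minimal sketch; paths are illustrative):

import fiftyone as fo

dataset = fo.Dataset()
dataset.add_sample(fo.Sample(filepath="/data/images/image1.png"))  # illustrative path

# Serialize the collection with filepaths written relative to `/data`
json_str = dataset.to_json(rel_dir="/data", pretty_print=True)
print(json_str)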
Creates a view that contains one sample per object patch in the specified field of the collection.
Fields other than field and the default sample fields will not be included in the returned view. A sample_id field will be added that records the sample ID from which each patch was taken.
Examples:
import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

session = fo.launch_app(dataset)

#
# Create a view containing the ground truth patches
#

view = dataset.to_patches("ground_truth")
print(view)

session.view = view
Parameters | |
field | the patches field, which must be of type fiftyone.core.labels.Detections, fiftyone.core.labels.Polylines, or fiftyone.core.labels.Keypoints |
other_fields:None | controls whether fields other than field and the default sample fields are included. Can be any of the following: a field or list of fields to include; True to include all other fields; or None/False to include no other fields |
keep_label_lists:False | whether to store the patches in label list fields of the same type as the input collection rather than using their single label variants |
**kwargs | Undocumented |
Returns | |
a fiftyone.core.patches.PatchesView |
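For instance, other_fields can be used to carry additional sample-level fields onto each patch (a minimal sketch, assuming the quickstart dataset's uniqueness field is present):

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

# Include the sample-level `uniqueness` field on each patch sample
patches = dataset.to_patches("ground_truth", other_fields=["uniqueness"])

print(patches)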
Creates a view that contains one clip for each unique object trajectory defined by their (label, index) in a frame-level field of a video collection.
The returned view will contain:
- A sample_id field that records the sample ID from which each clip was taken
- A support field that records the [first, last] frame support of each clip
- A sample-level label field that records the label and index of each trajectory
Examples:
import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("quickstart-video")

#
# Create a trajectories view for the vehicles in the dataset
#

trajectories = (
    dataset
    .filter_labels("frames.detections", F("label") == "vehicle")
    .to_trajectories("frames.detections")
)

print(trajectories)
Parameters | |
field | a frame-level label list field of any of the following types: |
**kwargs | optional keyword arguments for
fiftyone.core.clips.make_clips_dataset specifying how
to perform the conversion |
Returns | |
a fiftyone.core.clips.TrajectoriesView |
Removes the tag(s) from all labels in the specified label field(s) of this collection, if necessary.
Parameters | |
tags | a tag or iterable of tags |
label_fields:None | an optional name or iterable of names of fiftyone.core.labels.Label fields. By default, all label fields are used |
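For instance (a minimal sketch; the tag name is illustrative):

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

# Tag all ground truth labels, then remove the tag again
dataset.tag_labels("review", label_fields="ground_truth")
print(dataset.count_label_tags())  # {'review': ...}

dataset.untag_labels("review", label_fields="ground_truth")
print(dataset.count_label_tags())  # {}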
Removes the tag(s) from all samples in this collection, if necessary.
Parameters | |
tags | a tag or iterable of tags |
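For instance (a minimal sketch; the tag name is illustrative):

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

# Tag every sample, then remove the tag again
dataset.tag_samples("needs-review")
print(dataset.count_sample_tags())

dataset.untag_samples("needs-review")
print(dataset.count_sample_tags())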
Updates the run config for the run with the given key.
Parameters | |
run_key | a run key |
config | a fiftyone.core.runs.RunConfig |
Validates that the collection has a field of the given type.
Parameters | |
path | a field name or embedded.field.name |
ftype:None | an optional field type to enforce. Must be a subclass of fiftyone.core.fields.Field |
embedded_doc_type:None | an optional embedded document type or iterable of types to enforce. Must be a subclass(es) of fiftyone.core.odm.BaseEmbeddedDocument |
Raises | |
ValueError | if the field does not exist or does not have the expected type |
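For instance (a minimal sketch; the confidence field and filepath are illustrative, and fo.FloatField/fo.StringField are assumed to be the usual fiftyone field classes):

import fiftyone as fo

dataset = fo.Dataset()
dataset.add_sample(fo.Sample(filepath="/path/to/image.png", confidence=0.9))

# Succeeds: `confidence` was inferred as a FloatField
dataset.validate_field_type("confidence", ftype=fo.FloatField)

# Raises ValueError: `confidence` is not a StringField
try:
    dataset.validate_field_type("confidence", ftype=fo.StringField)
except ValueError as e:
    print(e)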
Validates that the collection has field(s) with the given name(s).
If embedded field names are provided, only the root field is checked.
Parameters | |
fields | a field name or iterable of field names |
include_private:False | whether to include private fields when checking for existence |
Raises | |
ValueError | if one or more of the fields do not exist |
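For instance (a minimal sketch; the filepath and field names are illustrative):

import fiftyone as fo

dataset = fo.Dataset()
dataset.add_sample(fo.Sample(filepath="/path/to/image.png"))

# Succeeds: `filepath` is a default sample field
dataset.validate_fields_exist("filepath")

# Raises ValueError: the second field does not exist
try:
    dataset.validate_fields_exist(["filepath", "nonexistent_field"])
except ValueError as e:
    print(e)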
def values(self, field_or_expr, expr=None, missing_value=None, unwind=False, _allow_missing=False, _big_result=True, _raw=False, _field=None): (source)
Extracts the values of a field from all samples in the collection.
Values aggregations are useful for efficiently extracting a slice of field or embedded field values across all samples in a collection. See the examples below for more details.
The dual function of values is set_values, which can be used to efficiently set a field or embedded field of all samples in a collection by providing lists of values of the same structure returned by this aggregation.
Note
Unlike other aggregations, values does not automatically unwind list fields, which ensures that the returned values match the potentially-nested structure of the documents.
You can opt-in to unwinding specific list fields using the [] syntax, or you can pass the optional unwind=True parameter to unwind all supported list fields. See the aggregations documentation on list fields for more information.
Examples:
import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(
            filepath="/path/to/image1.png",
            numeric_field=1.0,
            numeric_list_field=[1, 2, 3],
        ),
        fo.Sample(
            filepath="/path/to/image2.png",
            numeric_field=4.0,
            numeric_list_field=[1, 2],
        ),
        fo.Sample(
            filepath="/path/to/image3.png",
            numeric_field=None,
            numeric_list_field=None,
        ),
    ]
)

#
# Get all values of a field
#

values = dataset.values("numeric_field")
print(values)  # [1.0, 4.0, None]

#
# Get all values of a list field
#

values = dataset.values("numeric_list_field")
print(values)  # [[1, 2, 3], [1, 2], None]

#
# Get all values of a transformed field
#

values = dataset.values(2 * (F("numeric_field") + 1))
print(values)  # [4.0, 10.0, None]

#
# Get values from a label list field
#

dataset = foz.load_zoo_dataset("quickstart")

# list of `Detections`
detections = dataset.values("ground_truth")

# list of lists of `Detection` instances
detections = dataset.values("ground_truth.detections")

# list of lists of detection labels
labels = dataset.values("ground_truth.detections.label")
Parameters | |
field_or_expr | a field name, embedded.field.name, fiftyone.core.expressions.ViewExpression, or MongoDB expression defining the field or expression to aggregate. This can also be a list or tuple of such arguments, in which case a tuple of corresponding aggregation results (each receiving the same additional keyword arguments, if any) will be returned |
expr:None | a fiftyone.core.expressions.ViewExpression or MongoDB expression to apply to field_or_expr (which must be a field) before aggregating |
missing_value:None | a value to insert for missing or None-valued fields |
unwind:False | whether to automatically unwind all recognized list fields (True) or unwind all list fields except the top-level sample field (-1) |
_allow_missing | Undocumented |
_big_result | Undocumented |
_raw | Undocumented |
_field | Undocumented |
Returns | |
the list of values |
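The round trip with set_values mentioned above can be sketched as follows (a minimal sketch; the field name and filepaths are illustrative):

import fiftyone as fo

dataset = fo.Dataset()
dataset.add_samples(
    [
        fo.Sample(filepath="/path/to/image1.png", numeric_field=1.0),
        fo.Sample(filepath="/path/to/image2.png", numeric_field=4.0),
    ]
)

# Extract a slice of values, transform it, and write it back via the
# dual `set_values` method
values = dataset.values("numeric_field")
dataset.set_values("numeric_field", [2 * v for v in values])

print(dataset.values("numeric_field"))  # [2.0, 8.0]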
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
Returns a fiftyone.core.view.DatasetView
containing the
collection.
Returns | |
a fiftyone.core.view.DatasetView |
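For instance (a minimal sketch; the filepath is illustrative):

import fiftyone as fo

dataset = fo.Dataset()
dataset.add_sample(fo.Sample(filepath="/path/to/image.png"))

# An (unfiltered) DatasetView over the dataset
view = dataset.view()

print(type(view))  # <class 'fiftyone.core.view.DatasetView'>
print(len(view))   # 1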
Writes the collection to disk in JSON format.
Parameters | |
json_path | the path to write the JSON |
rel_dir:None | a relative directory to remove from the filepath of each sample, if possible. The path is converted to an absolute path (if necessary) via fiftyone.core.storage.normalize_path. The typical use case for this argument is that your source data lives in a single directory and you wish to serialize relative, rather than absolute, paths to the data within that directory |
include_private:False | whether to include private fields |
include_frames:False | whether to include the frame labels for video samples |
frame_labels_dir:None | a directory in which to write per-sample JSON files containing the frame labels for video samples. If omitted, frame labels will be included directly in the returned JSON dict (which can be quite large for video datasets containing many frames). Only applicable to datasets that contain videos when include_frames is True |
pretty_print:False | whether to render the JSON in human readable format with newlines and indentations |
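For instance (a minimal sketch; paths are illustrative):

import fiftyone as fo

dataset = fo.Dataset()
dataset.add_sample(fo.Sample(filepath="/data/images/image1.png"))  # illustrative path

# Write the collection to disk with filepaths relative to `/data`
dataset.write_json("/tmp/dataset.json", rel_dir="/data", pretty_print=True)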
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
Dataset-specific settings that customize how this collection is visualized in the FiftyOne App.
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
The classes of the underlying dataset.
See fiftyone.core.dataset.Dataset.classes
for more information.
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
The default classes of the underlying dataset.
See fiftyone.core.dataset.Dataset.default_classes
for more
information.
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
The default group slice of the collection, or None if the collection is not grouped.
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
The default mask targets of the underlying dataset.
See fiftyone.core.dataset.Dataset.default_mask_targets
for more
information.
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
The default keypoint skeleton of the underlying dataset.
See fiftyone.core.dataset.Dataset.default_skeleton
for more
information.
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
A description of the underlying dataset.
See fiftyone.core.dataset.Dataset.description
for more
information.
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
The group field of the collection, or None if the collection is not grouped.
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
A dict mapping group slices to media types, or None if the collection is not grouped.
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
The current group slice of the collection, or None if the collection is not grouped.
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
The list of group slices of the collection, or None if the collection is not grouped.
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
The info dict of the underlying dataset.
See fiftyone.core.dataset.Dataset.info
for more information.
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
The mask targets of the underlying dataset.
See fiftyone.core.dataset.Dataset.mask_targets
for more
information.
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
The media type of the collection.
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
The name of the collection.
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
The keypoint skeletons of the underlying dataset.
See fiftyone.core.dataset.Dataset.skeletons
for more
information.
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
The list of tags of the underlying dataset.
See fiftyone.core.dataset.Dataset.tags
for more information.
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
Returns a fiftyone.core.view.DatasetView containing the contents of the collection with the given fiftyone.core.stages.ViewStage appended to its aggregation pipeline.
Subclasses are responsible for performing any validation on the view stage to ensure that it is a valid stage to add to this collection.
Parameters | |
stage | a fiftyone.core.stages.ViewStage |
Returns | |
a fiftyone.core.view.DatasetView |
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
Runs the MongoDB aggregation pipeline on the collection and returns the result.
Parameters | |
pipeline:None | a MongoDB aggregation pipeline (list of dicts) to append to the current pipeline |
media_type:None | the media type of the collection, if different than the source dataset's media type |
attach_frames:False | whether to attach the frame documents immediately prior to executing pipeline. Only applicable to datasets that contain videos |
detach_frames:False | whether to detach the frame documents at the end of the pipeline. Only applicable to datasets that contain videos |
frames_only:False | whether to generate a pipeline that contains only the frames in the collection |
support:None | an optional [first, last] range of frames to attach. Only applicable when attaching frames |
group_slice:None | the current group slice of the collection, if different than the source dataset's group slice. Only applicable for grouped collections |
group_slices:None | an optional list of group slices to attach when groups_only is True |
detach_groups:False | whether to detach the group documents at the end of the pipeline. Only applicable to grouped collections |
groups_only:False | whether to generate a pipeline that contains only the flattened group documents for the collection |
manual_group_select:False | whether the pipeline has manually handled the initial group selection. Only applicable to grouped collections |
post_pipeline:None | a MongoDB aggregation pipeline (list of dicts) to append to the very end of the pipeline, after all other arguments are applied |
Returns | |
the aggregation result dict |
Undocumented
Undocumented
Undocumented
Undocumented
Returns a dictionary mapping frame IDs to document sizes (in bytes) for each frame in the video collection.
Returns a dictionary mapping sample IDs to document sizes (in bytes) for each sample in the collection.
Returns a dictionary mapping sample IDs to total frame document sizes (in bytes) for each sample in the video collection.
Undocumented
Undocumented
Undocumented
Undocumented
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
Returns the MongoDB aggregation pipeline for the collection.
Parameters | |
pipeline:None | a MongoDB aggregation pipeline (list of dicts) to append to the current pipeline |
media_type:None | the media type of the collection, if different than the source dataset's media type |
attach_frames:False | whether to attach the frame documents immediately prior to executing pipeline. Only applicable to datasets that contain videos |
detach_frames:False | whether to detach the frame documents at the end of the pipeline. Only applicable to datasets that contain videos |
frames_only:False | whether to generate a pipeline that contains only the frames in the collection |
support:None | an optional [first, last] range of frames to attach. Only applicable when attaching frames |
group_slice:None | the current group slice of the collection, if different than the source dataset's group slice. Only applicable for grouped collections |
group_slices:None | an optional list of group slices to attach when groups_only is True |
detach_groups:False | whether to detach the group documents at the end of the pipeline. Only applicable to grouped collections |
groups_only:False | whether to generate a pipeline that contains only the flattened group documents for the collection |
manual_group_select:False | whether the pipeline has manually handled the initial group selection. Only applicable to grouped collections |
post_pipeline:None | a MongoDB aggregation pipeline (list of dicts) to append to the very end of the pipeline, after all other arguments are applied |
Returns | |
the aggregation pipeline |
Undocumented
Undocumented
Undocumented
Undocumented
Undocumented
Undocumented
fiftyone.core.clips.ClipsView
, fiftyone.core.video.FramesView
, fiftyone.core.patches._PatchesView
Undocumented
fiftyone.core.clips.ClipsView
, fiftyone.core.video.FramesView
, fiftyone.core.patches._PatchesView
Undocumented
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
The fiftyone.core.dataset.Dataset
that serves the samples
in this collection.
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
Whether this collection contains clips.
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
Whether this collection contains dynamic groups.
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
Whether this collection contains frames of a video dataset.
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
Whether this collection's contents are generated from another collection.
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
Whether this collection contains patches.
fiftyone.core.dataset.Dataset
, fiftyone.core.view.DatasetView
The root fiftyone.core.dataset.Dataset
from which this
collection is derived.
This is typically the same as _dataset
but may differ in cases
such as patches views.