class Dataset(foc.SampleCollection)
Constructor: Dataset(name, persistent, overwrite, _create, ...)
A FiftyOne dataset.
Datasets represent an ordered collection of fiftyone.core.sample.Sample instances that describe a particular type of raw media (e.g., images or videos) together with a user-defined set of fields.
FiftyOne datasets ingest and store the labels for all samples internally; raw media is stored on disk, and the dataset provides paths to the data.
See :ref:`this page <using-datasets>` for an overview of working with FiftyOne datasets.
Parameters | |
name | the name of the dataset. By default, get_default_dataset_name is used |
persistent | whether the dataset should persist in the database after the session terminates |
overwrite | whether to overwrite an existing dataset of the same name |
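A minimal constructor sketch (assumes a working FiftyOne installation; the dataset name is a placeholder):

```python
import fiftyone as fo

# Non-persistent datasets are cleaned up when the database is next wiped;
# pass persistent=True to keep the dataset across sessions
dataset = fo.Dataset("my-dataset", persistent=True)

# overwrite=True replaces any existing dataset with the same name
# dataset = fo.Dataset("my-dataset", overwrite=True)
```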
Class Method | from_archive |
Creates a Dataset from the contents of the given archive. |
Class Method | from_dict |
Loads a Dataset from a JSON dictionary generated by fiftyone.core.collections.SampleCollection.to_dict . |
Class Method | from_dir |
Creates a Dataset from the contents of the given directory. |
Class Method | from_images |
Creates a Dataset from the given images. |
Class Method | from_images_dir |
Creates a Dataset from the given directory of images. |
Class Method | from_images_patt |
Creates a Dataset from the given glob pattern of images. |
Class Method | from_importer |
Creates a Dataset by importing the samples in the given fiftyone.utils.data.importers.DatasetImporter . |
Class Method | from_json |
Loads a Dataset from JSON generated by fiftyone.core.collections.SampleCollection.write_json or fiftyone.core.collections.SampleCollection.to_json . |
Class Method | from_labeled_images |
Creates a Dataset from the given labeled images. |
Class Method | from_labeled_videos |
Creates a Dataset from the given labeled videos. |
Class Method | from_videos |
Creates a Dataset from the given videos. |
Class Method | from_videos_dir |
Creates a Dataset from the given directory of videos. |
Class Method | from_videos_patt |
Creates a Dataset from the given glob pattern of videos. |
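For example, the directory-based factories above can be used as follows (paths are placeholders):

```python
import fiftyone as fo
import fiftyone.types as fot

# Build a dataset directly from a directory of images
dataset = fo.Dataset.from_images_dir("/path/to/images")

# Import a dataset stored on disk in a supported format
coco = fo.Dataset.from_dir(
    dataset_dir="/path/to/coco",
    dataset_type=fot.COCODetectionDataset,
)
```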
Method | __copy__ |
Undocumented |
Method | __deepcopy__ |
Undocumented |
Method | __delitem__ |
Undocumented |
Method | __eq__ |
Undocumented |
Method | __getattribute__ |
Undocumented |
Method | __getitem__ |
Undocumented |
Method | __init__ |
Undocumented |
Method | __len__ |
Undocumented |
Method | add_archive |
Adds the contents of the given archive to the dataset. |
Method | add_collection |
Adds the contents of the given collection to the dataset. |
Method | add_dir |
Adds the contents of the given directory to the dataset. |
Method | add_dynamic_frame_fields |
Adds all dynamic frame fields to the dataset's schema. |
Method | add_dynamic_sample_fields |
Adds all dynamic sample fields to the dataset's schema. |
Method | add_frame_field |
Adds a new frame-level field or embedded field to the dataset, if necessary. |
Method | add_group_field |
Adds a group field to the dataset, if necessary. |
Method | add_group_slice |
Adds a group slice with the given media type to the dataset, if necessary. |
Method | add_images |
Adds the given images to the dataset. |
Method | add_images_dir |
Adds the given directory of images to the dataset. |
Method | add_images_patt |
Adds the given glob pattern of images to the dataset. |
Method | add_importer |
Adds the samples from the given fiftyone.utils.data.importers.DatasetImporter to the dataset. |
Method | add_labeled_images |
Adds the given labeled images to the dataset. |
Method | add_labeled_videos |
Adds the given labeled videos to the dataset. |
Method | add_sample |
Adds the given sample to the dataset. |
Method | add_sample_field |
Adds a new sample field or embedded field to the dataset, if necessary. |
Method | add_samples |
Adds the given samples to the dataset. |
Method | add_videos |
Adds the given videos to the dataset. |
Method | add_videos_dir |
Adds the given directory of videos to the dataset. |
Method | add_videos_patt |
Adds the given glob pattern of videos to the dataset. |
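A sketch of the sample-addition workflow (filepaths are placeholders):

```python
import fiftyone as fo

dataset = fo.Dataset()

# add_sample() returns the ID assigned to the sample
sample = fo.Sample(filepath="/path/to/image.png", tags=["train"])
sample_id = dataset.add_sample(sample)

# add_samples() accepts any iterable of samples and returns the list of IDs
ids = dataset.add_samples(
    [fo.Sample(filepath="/path/to/a.png"), fo.Sample(filepath="/path/to/b.png")]
)
```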
Method | app_config |
Undocumented |
Method | check_summary_fields |
Returns a list of summary fields that may need to be updated. |
Method | classes |
Undocumented |
Method | clear |
Removes all samples from the dataset. |
Method | clear_cache |
Clears the dataset's in-memory cache. |
Method | clear_frame_field |
Clears the values of the frame-level field from all samples in the dataset. |
Method | clear_frame_fields |
Clears the values of the frame-level fields from all samples in the dataset. |
Method | clear_frames |
Removes all frame labels from the dataset. |
Method | clear_sample_field |
Clears the values of the field from all samples in the dataset. |
Method | clear_sample_fields |
Clears the values of the fields from all samples in the dataset. |
Method | clone |
Creates a copy of the dataset. |
Method | clone_frame_field |
Clones the frame-level field into a new field. |
Method | clone_frame_fields |
Clones the frame-level fields into new fields. |
Method | clone_sample_field |
Clones the given sample field into a new field of the dataset. |
Method | clone_sample_fields |
Clones the given sample fields into new fields of the dataset. |
Method | create_summary_field |
Populates a sample-level field that records the unique values or numeric ranges that appear in the specified field on each sample in the dataset. |
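The summary-field workflow can be sketched as follows (assumes the dataset has a "ground_truth" detections field):

```python
# Index the label values that appear in each sample into a summary field
dataset.create_summary_field("ground_truth.detections.label")

# After editing labels, see which summaries are stale and refresh them
for name in dataset.check_summary_fields():
    dataset.update_summary_field(name)
```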
Method | default_classes |
Undocumented |
Method | default_group_slice |
Undocumented |
Method | default_mask_targets |
Undocumented |
Method | default_skeleton |
Undocumented |
Method | delete |
Deletes the dataset. |
Method | delete_frame_field |
Deletes the frame-level field from all samples in the dataset. |
Method | delete_frame_fields |
Deletes the frame-level fields from all samples in the dataset. |
Method | delete_frames |
Deletes the given frame(s) from the dataset. |
Method | delete_group_slice |
Deletes all samples in the given group slice from the dataset. |
Method | delete_groups |
Deletes the given group(s) from the dataset. |
Method | delete_labels |
Deletes the specified labels from the dataset. |
Method | delete_sample_field |
Deletes the field from all samples in the dataset. |
Method | delete_sample_fields |
Deletes the fields from all samples in the dataset. |
Method | delete_samples |
Deletes the given sample(s) from the dataset. |
Method | delete_saved_view |
Deletes the saved view with the given name. |
Method | delete_saved_views |
Deletes all saved views from this dataset. |
Method | delete_summary_field |
Deletes the summary field from all samples in the dataset. |
Method | delete_summary_fields |
Deletes the summary fields from all samples in the dataset. |
Method | delete_workspace |
Deletes the saved workspace with the given name. |
Method | delete_workspaces |
Deletes all saved workspaces from this dataset. |
Method | description |
Undocumented |
Method | ensure_frames |
Ensures that the video dataset contains frame instances for every frame of each sample's source video. |
Method | first |
Returns the first sample in the dataset. |
Method | get_field_schema |
Returns a schema dictionary describing the fields of the samples in the dataset. |
Method | get_frame_field_schema |
Returns a schema dictionary describing the fields of the frames of the samples in the dataset. |
Method | get_group |
Returns a dict containing the samples for the given group ID. |
Method | get_saved_view_info |
Loads the editable information about the saved view with the given name. |
Method | get_workspace_info |
Gets the information about the workspace with the given name. |
Method | group_slice |
Undocumented |
Method | has_saved_view |
Whether this dataset has a saved view with the given name. |
Method | has_workspace |
Whether this dataset has a saved workspace with the given name. |
Method | head |
Returns a list of the first few samples in the dataset. |
Method | info |
Undocumented |
Method | ingest_images |
Ingests the given iterable of images into the dataset. |
Method | ingest_labeled_images |
Ingests the given iterable of labeled image samples into the dataset. |
Method | ingest_labeled_videos |
Ingests the given iterable of labeled video samples into the dataset. |
Method | ingest_videos |
Ingests the given iterable of videos into the dataset. |
Method | iter_groups |
Returns an iterator over the groups in the dataset. |
Method | iter_samples |
Returns an iterator over the samples in the dataset. |
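Iteration can be sketched as follows (assumes an existing dataset; the "reviewed" field is illustrative):

```python
# autosave=True batches any per-sample edits back to the database
for sample in dataset.iter_samples(progress=True, autosave=True):
    sample["reviewed"] = False
```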
Method | last |
Returns the last sample in the dataset. |
Method | list_saved_views |
Lists the saved views on this dataset. |
Method | list_summary_fields |
Lists the summary fields on the dataset. |
Method | list_workspaces |
Lists the saved workspaces on this dataset. |
Method | load_saved_view |
Loads the saved view with the given name. |
Method | load_workspace |
Loads the saved workspace with the given name. |
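A sketch of the saved-view workflow (the view contents and name are illustrative):

```python
view = dataset.match_tags("train").limit(100)
dataset.save_view("train-preview", view)

# In a later session
view = dataset.load_saved_view("train-preview")
dataset.delete_saved_view("train-preview")
```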
Method | mask_targets |
Undocumented |
Method | media_type |
Undocumented |
Method | merge_archive |
Merges the contents of the given archive into the dataset. |
Method | merge_dir |
Merges the contents of the given directory into the dataset. |
Method | merge_importer |
Merges the samples from the given fiftyone.utils.data.importers.DatasetImporter into the dataset. |
Method | merge_sample |
Merges the fields of the given sample into this dataset. |
Method | merge_samples |
Merges the given samples into this dataset. |
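For example, merging model predictions stored in a second dataset (the `predictions` dataset is an assumption and must share filepaths with this one):

```python
# Samples are joined on key_field; matching samples have their fields
# merged, and unmatched samples are added as new samples
dataset.merge_samples(predictions, key_field="filepath")
```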
Method | name |
Undocumented |
Method | one |
Returns a single sample in this dataset matching the expression. |
Method | persistent |
Undocumented |
Method | reload |
Reloads the dataset and any in-memory samples from the database. |
Method | remove_dynamic_frame_field |
Removes the dynamic embedded frame field from the dataset's schema. |
Method | remove_dynamic_frame_fields |
Removes the dynamic embedded frame fields from the dataset's schema. |
Method | remove_dynamic_sample_field |
Removes the dynamic embedded sample field from the dataset's schema. |
Method | remove_dynamic_sample_fields |
Removes the dynamic embedded sample fields from the dataset's schema. |
Method | rename_frame_field |
Renames the frame-level field to the given new name. |
Method | rename_frame_fields |
Renames the frame-level fields to the given new names. |
Method | rename_group_slice |
Renames the group slice with the given name. |
Method | rename_sample_field |
Renames the sample field to the given new name. |
Method | rename_sample_fields |
Renames the sample fields to the given new names. |
Method | save |
Saves the dataset to the database. |
Method | save_view |
Saves the given view into this dataset under the given name so it can be loaded later via load_saved_view . |
Method | save_workspace |
Saves a workspace into this dataset under the given name so it can be loaded later via load_workspace . |
Method | skeletons |
Undocumented |
Method | stats |
Returns stats about the dataset on disk. |
Method | summary |
Returns a string summary of the dataset. |
Method | tags |
Undocumented |
Method | tail |
Returns a list of the last few samples in the dataset. |
Method | update_saved_view_info |
Updates the editable information for the saved view with the given name. |
Method | update_summary_field |
Updates the summary field based on the current values of its source field. |
Method | update_workspace_info |
Updates the editable information for the saved workspace with the given name. |
Method | view |
Returns a fiftyone.core.view.DatasetView containing the entire dataset. |
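Views are chainable and never modify the underlying dataset; a sketch (the "confidence" field is an assumption):

```python
from fiftyone import ViewField as F

view = (
    dataset.view()
    .match(F("confidence") > 0.9)
    .sort_by("filepath")
    .limit(10)
)
print(len(view), len(dataset))  # the dataset itself is unchanged
```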
Class Variable | __slots__ |
Undocumented |
Instance Variable | group_slice |
The current group slice of the dataset, or None if the dataset is not grouped. |
Instance Variable | media_type |
The media type of the dataset. |
Property | app_config |
A fiftyone.core.odm.dataset.DatasetAppConfig that customizes how this dataset is visualized in the :ref:`FiftyOne App <fiftyone-app>`. |
Property | classes |
A dict mapping field names to lists of class label strings for the corresponding fields of the dataset. |
Property | created_at |
The datetime that the dataset was created. |
Property | default_classes |
A list of class label strings for all fiftyone.core.labels.Label fields of this dataset that do not have customized classes defined in classes . |
Property | default_group_slice |
The default group slice of the dataset, or None if the dataset is not grouped. |
Property | default_mask_targets |
A dict defining a default mapping between pixel values (2D masks) or RGB hex strings (3D masks) and label strings for the segmentation masks of all fiftyone.core.labels.Segmentation fields of this dataset that do not have customized mask targets defined in ... |
Property | default_skeleton |
A default fiftyone.core.odm.dataset.KeypointSkeleton defining the semantic labels and point connectivity for all fiftyone.core.labels.Keypoint fields of this dataset that do not have customized skeletons defined in ... |
Property | deleted |
Whether the dataset is deleted. |
Property | description |
A string description of the dataset. |
Property | group_field |
The group field of the dataset, or None if the dataset is not grouped. |
Property | group_media_types |
A dict mapping group slices to media types, or None if the dataset is not grouped. |
Property | group_slices |
The list of group slices of the dataset, or None if the dataset is not grouped. |
Property | has_saved_views |
Whether this dataset has any saved views. |
Property | has_workspaces |
Whether this dataset has any saved workspaces. |
Property | info |
A user-facing dictionary of information about the dataset. |
Property | last_loaded_at |
The datetime that the dataset was last loaded. |
Property | last_modified_at |
The datetime that the dataset was last modified. |
Property | mask_targets |
A dict mapping field names to mask target dicts, each of which defines a mapping between pixel values (2D masks) or RGB hex strings (3D masks) and label strings for the segmentation masks in the corresponding field of the dataset. |
Property | name |
The name of the dataset. |
Property | persistent |
Whether the dataset persists in the database after a session is terminated. |
Property | skeletons |
A dict mapping field names to fiftyone.core.odm.dataset.KeypointSkeleton instances, each of which defines the semantic labels and point connectivity for the fiftyone.core.labels.Keypoint instances in the corresponding field of the dataset. |
Property | slug |
The slug of the dataset. |
Property | tags |
A list of tags on the dataset. |
Property | version |
The version of the fiftyone package for which the dataset is formatted. |
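Mutable dataset-level properties can be edited directly; a sketch (field and class names are illustrative):

```python
dataset.persistent = True
dataset.tags = ["curated"]
dataset.classes = {"ground_truth": ["cat", "dog"]}

# In-place edits to info (and similar dict-valued properties) must be
# followed by save() to persist them to the database
dataset.info["source"] = "internal"
dataset.save()
```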
Method | _add |
Undocumented |
Method | _add |
Undocumented |
Method | _add |
Undocumented |
Method | _add |
Undocumented |
Method | _add |
Returns a fiftyone.core.view.DatasetView containing the contents of the collection with the given fiftyone.core.stages.ViewStage appended to its aggregation pipeline. |
Method | _aggregate |
Runs the MongoDB aggregation pipeline on the collection and returns the result. |
Method | _apply |
Undocumented |
Method | _apply |
Undocumented |
Method | _attach |
A pipeline that attaches the frame documents for each document. |
Method | _attach |
A pipeline that attaches the requested group slice(s) for each document and stores them under groups.<slice> keys. |
Method | _bulk |
Undocumented |
Method | _clear |
Undocumented |
Method | _clear |
Undocumented |
Method | _clear |
Undocumented |
Method | _clear |
Undocumented |
Method | _clear |
Undocumented |
Method | _clone |
Undocumented |
Method | _clone |
Undocumented |
Method | _clone |
Undocumented |
Method | _delete |
Undocumented |
Method | _delete |
Undocumented |
Method | _delete |
Undocumented |
Method | _delete |
Undocumented |
Method | _delete |
Undocumented |
Method | _delete |
Undocumented |
Method | _delete |
Undocumented |
Method | _ensure |
Undocumented |
Method | _ensure |
Undocumented |
Method | _estimated |
Undocumented |
Method | _expand |
Undocumented |
Method | _expand |
Undocumented |
Method | _expand |
Undocumented |
Method | _frame |
Undocumented |
Method | _frame |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _group |
A pipeline that selects only the given slice's documents from the pipeline. |
Method | _groups |
A pipeline that looks up the requested group slices for each document and returns (only) the unwound group slices. |
Method | _init |
Undocumented |
Method | _iter |
Undocumented |
Method | _iter |
Undocumented |
Method | _keep |
Undocumented |
Method | _keep |
Undocumented |
Method | _keep |
Undocumented |
Method | _load |
Undocumented |
Method | _make |
Undocumented |
Method | _make |
Undocumented |
Method | _make |
Undocumented |
Method | _merge |
Undocumented |
Method | _merge |
Undocumented |
Method | _merge |
Undocumented |
Method | _pipeline |
Returns the MongoDB aggregation pipeline for the collection. |
Method | _populate |
Undocumented |
Method | _reload |
Undocumented |
Method | _reload |
Undocumented |
Method | _remove |
Undocumented |
Method | _remove |
Undocumented |
Method | _rename |
Undocumented |
Method | _rename |
Undocumented |
Method | _sample |
Undocumented |
Method | _sample |
Undocumented |
Method | _save |
Undocumented |
Method | _save |
Undocumented |
Method | _serialize |
Undocumented |
Method | _set |
Undocumented |
Method | _unwind |
A pipeline that returns (only) the unwound frames documents. |
Method | _unwind |
A pipeline that returns (only) the unwound groups documents. |
Method | _update |
Undocumented |
Method | _update |
Undocumented |
Method | _upsert |
Undocumented |
Method | _upsert |
Undocumented |
Method | _validate |
Undocumented |
Method | _validate |
Undocumented |
Method | _validate |
Undocumented |
Instance Variable | _annotation |
Undocumented |
Instance Variable | _brain |
Undocumented |
Instance Variable | _deleted |
Undocumented |
Instance Variable | _doc |
Undocumented |
Instance Variable | _evaluation |
Undocumented |
Instance Variable | _frame |
Undocumented |
Instance Variable | _group |
Undocumented |
Instance Variable | _run |
Undocumented |
Instance Variable | _sample |
Undocumented |
Property | _dataset |
The fiftyone.core.dataset.Dataset that serves the samples in this collection. |
Property | _frame |
Undocumented |
Property | _frame |
Undocumented |
Property | _is |
Whether this collection contains clips. |
Property | _is |
Whether this collection contains dynamic groups. |
Property | _is |
Whether this collection contains frames of a video dataset. |
Property | _is |
Whether this collection's contents are generated from another collection. |
Property | _is |
Whether this collection contains patches. |
Property | _root |
The root fiftyone.core.dataset.Dataset from which this collection is derived. |
Property | _sample |
Undocumented |
Property | _sample |
Undocumented |
Inherited from SampleCollection:
Class Method | list_aggregations |
Returns a list of all available methods on this collection that apply fiftyone.core.aggregations.Aggregation operations to this collection. |
Class Method | list_view_stages |
Returns a list of all available methods on this collection that apply fiftyone.core.stages.ViewStage operations to this collection. |
Method | __add__ |
Undocumented |
Method | __bool__ |
Undocumented |
Method | __contains__ |
Undocumented |
Method | __iter__ |
Undocumented |
Method | __repr__ |
Undocumented |
Method | __str__ |
Undocumented |
Method | add_stage |
Applies the given fiftyone.core.stages.ViewStage to the collection. |
Method | aggregate |
Aggregates one or more fiftyone.core.aggregations.Aggregation instances. |
Method | annotate |
Exports the samples and optional label field(s) in this collection to the given annotation backend. |
Method | apply_model |
Applies the model to the samples in the collection. |
Method | bounds |
Computes the bounds of a numeric field of the collection. |
Method | compute_embeddings |
Computes embeddings for the samples in the collection using the given model. |
Method | compute_metadata |
Populates the metadata field of all samples in the collection. |
Method | compute_patch_embeddings |
Computes embeddings for the image patches defined by patches_field of the samples in the collection using the given model. |
Method | concat |
Concatenates the contents of the given SampleCollection to this collection. |
Method | count |
Counts the number of field values in the collection. |
Method | count_label_tags |
Counts the occurrences of all label tags in the specified label field(s) of this collection. |
Method | count_sample_tags |
Counts the occurrences of sample tags in this collection. |
Method | count_values |
Counts the occurrences of field values in the collection. |
Method | create_index |
Creates an index on the given field or with the given specification, if necessary. |
Method | delete_annotation_run |
Deletes the annotation run with the given key from this collection. |
Method | delete_annotation_runs |
Deletes all annotation runs from this collection. |
Method | delete_brain_run |
Deletes the brain method run with the given key from this collection. |
Method | delete_brain_runs |
Deletes all brain method runs from this collection. |
Method | delete_evaluation |
Deletes the evaluation results associated with the given evaluation key from this collection. |
Method | delete_evaluations |
Deletes all evaluation results from this collection. |
Method | delete_run |
Deletes the run with the given key from this collection. |
Method | delete_runs |
Deletes all runs from this collection. |
Method | distinct |
Computes the distinct values of a field in the collection. |
Method | draw_labels |
Renders annotated versions of the media in the collection with the specified label data overlaid to the given directory. |
Method | drop_index |
Drops the index for the given field or name, if necessary. |
Method | evaluate_classifications |
Evaluates the classification predictions in this collection with respect to the specified ground truth labels. |
Method | evaluate_detections |
Evaluates the specified predicted detections in this collection with respect to the specified ground truth detections. |
Method | evaluate_regressions |
Evaluates the regression predictions in this collection with respect to the specified ground truth values. |
Method | evaluate_segmentations |
Evaluates the specified semantic segmentation masks in this collection with respect to the specified ground truth masks. |
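For example, evaluating object detections (the "predictions" and "ground_truth" field names are assumptions):

```python
results = dataset.evaluate_detections(
    "predictions",
    gt_field="ground_truth",
    eval_key="eval",
)
results.print_report()
```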
Method | exclude |
Excludes the samples with the given IDs from the collection. |
Method | exclude_by |
Excludes the samples with the given field values from the collection. |
Method | exclude_fields |
Excludes the fields with the given names from the samples in the collection. |
Method | exclude_frames |
Excludes the frames with the given IDs from the video collection. |
Method | exclude_groups |
Excludes the groups with the given IDs from the grouped collection. |
Method | exclude_labels |
Excludes the specified labels from the collection. |
Method | exists |
Returns a view containing the samples in the collection that have (or do not have) a non-None value for the given field or embedded field. |
Method | export |
Exports the samples in the collection to disk. |
Method | filter_field |
Filters the values of a field or embedded field of each sample in the collection. |
Method | filter_keypoints |
Filters the individual fiftyone.core.labels.Keypoint.points elements in the specified keypoints field of each sample in the collection. |
Method | filter_labels |
Filters the fiftyone.core.labels.Label field of each sample in the collection. |
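A sketch of label filtering (the "predictions" field is an assumption):

```python
from fiftyone import ViewField as F

# Keep only confident "person" detections
view = dataset.filter_labels(
    "predictions",
    (F("label") == "person") & (F("confidence") > 0.5),
)
```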
Method | flatten |
Returns a flattened view that contains all samples in the dynamic grouped collection. |
Method | geo_near |
Sorts the samples in the collection by their proximity to a specified geolocation. |
Method | geo_within |
Filters the samples in this collection to only include samples whose geolocation is within a specified boundary. |
Method | get_annotation_info |
Returns information about the annotation run with the given key on this collection. |
Method | get_brain_info |
Returns information about the brain method run with the given key on this collection. |
Method | get_classes |
Gets the classes list for the given field, or None if no classes are available. |
Method | get_dynamic_field_schema |
Returns a schema dictionary describing the dynamic fields of the samples in the collection. |
Method | get_dynamic_frame_field_schema |
Returns a schema dictionary describing the dynamic fields of the frames in the collection. |
Method | get_evaluation_info |
Returns information about the evaluation with the given key on this collection. |
Method | get_field |
Returns the field instance of the provided path, or None if one does not exist. |
Method | get_index_information |
Returns a dictionary of information about the indexes on this collection. |
Method | get_mask_targets |
Gets the mask targets for the given field, or None if no mask targets are available. |
Method | get_run_info |
Returns information about the run with the given key on this collection. |
Method | get_skeleton |
Gets the keypoint skeleton for the given field, or None if no skeleton is available. |
Method | group_by |
Creates a view that groups the samples in the collection by a specified field or expression. |
Method | has_annotation_run |
Whether this collection has an annotation run with the given key. |
Method | has_brain_run |
Whether this collection has a brain method run with the given key. |
Method | has_classes |
Determines whether this collection has a classes list for the given field. |
Method | has_evaluation |
Whether this collection has an evaluation with the given key. |
Method | has_field |
Determines whether the collection has a field with the given name. |
Method | has_frame_field |
Determines whether the collection has a frame-level field with the given name. |
Method | has_mask_targets |
Determines whether this collection has mask targets for the given field. |
Method | has_run |
Whether this collection has a run with the given key. |
Method | has_sample_field |
Determines whether the collection has a sample field with the given name. |
Method | has_skeleton |
Determines whether this collection has a keypoint skeleton for the given field. |
Method | histogram_values |
Computes a histogram of the field values in the collection. |
Method | init_run |
Initializes a config instance for a new run. |
Method | init_run_results |
Initializes a results instance for the run with the given key. |
Method | limit |
Returns a view with at most the given number of samples. |
Method | limit_labels |
Limits the number of fiftyone.core.labels.Label instances in the specified labels list field of each sample in the collection. |
Method | list_annotation_runs |
Returns a list of annotation keys on this collection. |
Method | list_brain_runs |
Returns a list of brain keys on this collection. |
Method | list_evaluations |
Returns a list of evaluation keys on this collection. |
Method | list_indexes |
Returns the list of index names on this collection. |
Method | list_runs |
Returns a list of run keys on this collection. |
Method | list_schema |
Extracts the value type(s) in a specified list field across all samples in the collection. |
Method | load_annotation_results |
Loads the results for the annotation run with the given key on this collection. |
Method | load_annotation_view |
Loads the fiftyone.core.view.DatasetView on which the specified annotation run was performed on this collection. |
Method | load_annotations |
Downloads the labels from the given annotation run from the annotation backend and merges them into this collection. |
Method | load_brain_results |
Loads the results for the brain method run with the given key on this collection. |
Method | load_brain_view |
Loads the fiftyone.core.view.DatasetView on which the specified brain method run was performed on this collection. |
Method | load_evaluation_results |
Loads the results for the evaluation with the given key on this collection. |
Method | load_evaluation_view |
Loads the fiftyone.core.view.DatasetView on which the specified evaluation was performed on this collection. |
Method | load_run_results |
Loads the results for the run with the given key on this collection. |
Method | load_run_view |
Loads the fiftyone.core.view.DatasetView on which the specified run was performed on this collection. |
Method | make_unique_field_name |
Makes a unique field name with the given root name for the collection. |
Method | map_labels |
Maps the label values of a fiftyone.core.labels.Label field to new values for each sample in the collection. |
Method | match |
Filters the samples in the collection by the given filter. |
Method | match_frames |
Filters the frames in the video collection by the given filter. |
Method | match_labels |
Selects the samples from the collection that contain (or do not contain) at least one label that matches the specified criteria. |
Method | match_tags |
Returns a view containing the samples in the collection that have or don't have any/all of the given tag(s). |
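The matching stages can be sketched as follows (the metadata fields assume compute_metadata() has been run):

```python
from fiftyone import ViewField as F

train = dataset.match_tags("train")
small = dataset.match(F("metadata.size_bytes") < 1024 * 1024)
```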
Method | max |
Computes the maximum of a numeric field of the collection. |
Method | mean |
Computes the arithmetic mean of the field values of the collection. |
Method | merge_labels |
Merges the labels from the given input field into the given output field of the collection. |
Method | min |
Computes the minimum of a numeric field of the collection. |
Method | mongo |
Adds a view stage defined by a raw MongoDB aggregation pipeline. |
Method | quantiles |
Computes the quantile(s) of the field values of a collection. |
Method | register_run |
Registers a run under the given key on this collection. |
Method | rename_annotation_run |
Replaces the key for the given annotation run with a new key. |
Method | rename_brain_run |
Replaces the key for the given brain run with a new key. |
Method | rename_evaluation |
Replaces the key for the given evaluation with a new key. |
Method | rename_run |
Replaces the key for the given run with a new key. |
Method | save_context |
Returns a context that can be used to save samples from this collection according to a configurable batching strategy. |
Method | save_run_results |
Saves run results for the run with the given key. |
Method | schema |
Extracts the names and types of the attributes of a specified embedded document field across all samples in the collection. |
Method | select |
Selects the samples with the given IDs from the collection. |
Method | select_by |
Selects the samples with the given field values from the collection. |
Method | select_fields |
Selects only the fields with the given names from the samples in the collection. All other fields are excluded. |
Method | select_frames |
Selects the frames with the given IDs from the video collection. |
Method | select_group_slices |
Selects the samples in the group collection from the given slice(s). |
Method | select_groups |
Selects the groups with the given IDs from the grouped collection. |
Method | select_labels |
Selects only the specified labels from the collection. |
Method | set_field |
Sets a field or embedded field on each sample in a collection by evaluating the given expression. |
Method | set_label_values |
Sets the fields of the specified labels in the collection to the given values. |
Method | set_values |
Sets the field or embedded field on each sample or frame in the collection to the given values. |
Method | shuffle |
Randomly shuffles the samples in the collection. |
Method | skip |
Omits the given number of samples from the head of the collection. |
Method | sort_by |
Sorts the samples in the collection by the given field(s) or expression(s). |
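Ordering and sampling stages compose like any other view stages; for example:

```python
shuffled = dataset.shuffle(seed=51)
by_path = dataset.sort_by("filepath", reverse=True)
subset = dataset.take(25, seed=51)
```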
Method | sort_by_similarity |
Sorts the collection by similarity to a specified query. |
Method | split_labels |
Splits the labels from the given input field into the given output field of the collection. |
Method | std |
Computes the standard deviation of the field values of the collection. |
Method | sum |
Computes the sum of the field values of the collection. |
Method | sync_last_modified_at |
Syncs the last_modified_at property(s) of the dataset. |
Method | tag_labels |
Adds the tag(s) to all labels in the specified label field(s) of this collection, if necessary. |
Method | tag_samples |
Adds the tag(s) to all samples in this collection, if necessary. |
Method | take |
Randomly samples the given number of samples from the collection. |
Method | to_clips |
Creates a view that contains one sample per clip defined by the given field or expression in the video collection. |
Method | to_dict |
Returns a JSON dictionary representation of the collection. |
Method | to_evaluation_patches |
Creates a view based on the results of the evaluation with the given key that contains one sample for each true positive, false positive, and false negative example in the collection, respectively. |
Method | to_frames |
Creates a view that contains one sample per frame in the video collection. |
Method | to_json |
Returns a JSON string representation of the collection. |
Method | to_patches |
Creates a view that contains one sample per object patch in the specified field of the collection. |
Method | to_trajectories |
Creates a view that contains one clip for each unique object trajectory defined by their (label, index) in a frame-level field of a video collection. |
Method | untag_labels |
Removes the tag(s) from all labels in the specified label field(s) of this collection, if necessary. |
Method | untag_samples |
Removes the tag(s) from all samples in this collection, if necessary. |
Method | update_run_config |
Updates the run config for the run with the given key. |
Method | validate_field_type |
Validates that the collection has a field of the given type. |
Method | validate_fields_exist |
Validates that the collection has field(s) with the given name(s). |
Method | values |
Extracts the values of a field from all samples in the collection. |
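Bulk reads and writes pair naturally (the "reviewed" field is an illustrative new field):

```python
# Read one value per sample, in dataset order
filepaths = dataset.values("filepath")

# Write one value per sample, in the same order
dataset.set_values("reviewed", [False] * len(dataset))
```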
Method | write_json |
Writes the collection to disk in JSON format. |
Property | has_annotation_runs |
Whether this collection has any annotation runs. |
Property | has_brain_runs |
Whether this collection has any brain runs. |
Property | has_evaluations |
Whether this collection has any evaluation results. |
Property | has_runs |
Whether this collection has any runs. |
Async Method | _async |
Undocumented |
Method | _build |
Undocumented |
Method | _build |
Undocumented |
Method | _build |
Undocumented |
Method | _build |
Undocumented |
Method | _contains |
Undocumented |
Method | _contains |
Undocumented |
Method | _do |
Undocumented |
Method | _edit |
Undocumented |
Method | _edit |
Undocumented |
Method | _expand |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Computes the total size of the frame documents in the collection. |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Returns a dictionary mapping frame IDs to document sizes (in bytes) for each frame in the video collection. |
Method | _get |
Returns a dictionary mapping sample IDs to document sizes (in bytes) for each sample in the collection. |
Method | _get |
Returns a dictionary mapping sample IDs to total frame document sizes (in bytes) for each sample in the video collection. |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Computes the total size of the sample documents in the collection. |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _handle |
Undocumented |
Method | _handle |
Undocumented |
Method | _handle |
Undocumented |
Method | _handle |
Undocumented |
Method | _handle |
Undocumented |
Method | _has |
Undocumented |
Method | _has |
Undocumented |
Method | _has |
Undocumented |
Method | _is |
Undocumented |
Method | _is |
Undocumented |
Method | _is |
Undocumented |
Method | _is |
Undocumented |
Method | _is |
Undocumented |
Method | _is |
Undocumented |
Method | _list |
Undocumented |
Method | _make |
Undocumented |
Method | _make |
Undocumented |
Method | _max |
Undocumented |
Method | _min |
Undocumented |
Method | _parse |
Undocumented |
Method | _parse |
Undocumented |
Method | _parse |
Undocumented |
Method | _parse |
Undocumented |
Method | _parse |
Undocumented |
Method | _parse |
Undocumented |
Method | _parse |
Undocumented |
Method | _parse |
Undocumented |
Method | _parse |
Undocumented |
Method | _parse |
Undocumented |
Method | _parse |
Undocumented |
Method | _parse |
Undocumented |
Method | _process |
Undocumented |
Method | _serialize |
Undocumented |
Method | _serialize |
Undocumented |
Method | _serialize |
Undocumented |
Method | _serialize |
Undocumented |
Method | _serialize |
Undocumented |
Method | _serialize |
Undocumented |
Method | _serialize |
Undocumented |
Method | _set |
Undocumented |
Method | _set |
Undocumented |
Method | _set |
Undocumented |
Method | _set |
Undocumented |
Method | _set |
Undocumented |
Method | _set |
Undocumented |
Method | _set |
Undocumented |
Method | _split |
Undocumented |
Method | _sync |
Undocumented |
Method | _sync |
Undocumented |
Method | _tag |
Undocumented |
Method | _to |
Undocumented |
Method | _untag |
Undocumented |
Method | _unwind |
Undocumented |
Method | _validate |
Undocumented |
Constant | _FRAMES |
Undocumented |
Constant | _GROUPS |
Undocumented |
Property | _element |
Undocumented |
Property | _elements |
Undocumented |
def from_archive(cls, archive_path, dataset_type=None, data_path=None, labels_path=None, name=None, persistent=False, overwrite=False, label_field=None, tags=None, dynamic=False, cleanup=True, progress=None, **kwargs): (source) ¶
Creates a Dataset
from the contents of the given archive.
If a directory with the same root name as archive_path exists, it is assumed that this directory contains the extracted contents of the archive, and thus the archive is not re-extracted.
See :ref:`this guide <loading-datasets-from-disk>` for example usages of this method and descriptions of the available dataset types.
Note
The following archive formats are explicitly supported:
.zip, .tar, .tar.gz, .tgz, .tar.bz, .tbz
If an archive not in the above list is found, extraction will be attempted via the patool package, which supports many formats but may require that additional system packages be installed.
Parameters | |
archive_path | the path to an archive of a dataset directory |
dataset_type: None | the fiftyone.types.Dataset type of the dataset in archive_path |
data_path: None | an optional parameter that enables explicit control over the location of the media for certain dataset types. Can be any of the following:
By default, it is assumed that the data can be located in the default location within archive_path for the dataset type |
labels_path: None | an optional parameter that enables explicit control over the location of the labels. Only applicable when importing certain labeled dataset formats. Can be any of the following:
For labeled datasets, this parameter defaults to the location in archive_path of the labels for the default layout of the dataset type being imported |
name:None | a name for the dataset. By default,
get_default_dataset_name is used |
persistent:False | whether the dataset should persist in the database after the session terminates |
overwrite:False | whether to overwrite an existing dataset of the same name |
label_field: None | controls the field(s) in which imported labels
are stored. Only applicable if dataset_importer is a
fiftyone.utils.data.importers.LabeledImageDatasetImporter or
fiftyone.utils.data.importers.LabeledVideoDatasetImporter .
If the importer produces a single
fiftyone.core.labels.Label instance per sample/frame,
this argument specifies the name of the field to use; the
default is "ground_truth". If the importer produces a
dictionary of labels per sample, this argument can be either a
string prefix to prepend to each label key or a dict mapping
label keys to field names; the default in this case is to
directly use the keys of the imported label dictionaries as
field names |
tags:None | an optional tag or iterable of tags to attach to each sample |
dynamic:False | whether to declare dynamic attributes of embedded document fields that are encountered |
cleanup:True | whether to delete the archive after extracting it |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
**kwargs | optional keyword arguments to pass to the constructor of
the fiftyone.utils.data.importers.DatasetImporter for
the specified dataset_type |
Returns | |
a Dataset |
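An illustrative sketch of a typical call (the archive path is a placeholder, and fiftyone.types.ImageDirectory is just one of the supported dataset types):

```python
import fiftyone as fo
import fiftyone.types as fot

# Hypothetical archive of a dataset directory in ImageDirectory layout;
# the path below is a placeholder for illustration only
dataset = fo.Dataset.from_archive(
    "/path/to/dataset.tar.gz",
    dataset_type=fot.ImageDirectory,
    name="archived-images",
    cleanup=True,  # delete the archive after extracting it
)
```

If a sibling directory from a prior extraction exists, the archive is not re-extracted.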
def from_dict(cls, d, name=None, persistent=False, overwrite=False, rel_dir=None, frame_labels_dir=None, progress=None): (source) ¶
Loads a Dataset
from a JSON dictionary generated by
fiftyone.core.collections.SampleCollection.to_dict
.
The JSON dictionary can contain an export of any
fiftyone.core.collections.SampleCollection
, e.g.,
Dataset
or fiftyone.core.view.DatasetView
.
Parameters | |
d | a JSON dictionary |
name:None | a name for the new dataset |
persistent:False | whether the dataset should persist in the database after the session terminates |
overwrite:False | whether to overwrite an existing dataset of the same name |
rel_dir: None | a relative directory to prepend to the filepath
of each sample if the filepath is not absolute (begins with a
path separator). The path is converted to an absolute path
(if necessary) via fiftyone.core.storage.normalize_path |
frame_labels_dir: None | a directory of per-sample JSON files containing the frame labels for video samples. If omitted, it is assumed that the frame labels are included directly in the provided JSON dict. Only applicable to datasets that contain videos |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
Returns | |
a Dataset |
def from_dir(cls, dataset_dir=None, dataset_type=None, data_path=None, labels_path=None, name=None, persistent=False, overwrite=False, label_field=None, tags=None, dynamic=False, progress=None, **kwargs): (source) ¶
Creates a Dataset
from the contents of the given directory.
You can create datasets with this method via the following basic patterns:
- Provide dataset_dir and dataset_type to import the contents of a directory that is organized in the default layout for the dataset type as documented in :ref:`this guide <loading-datasets-from-disk>`
- Provide dataset_type along with data_path, labels_path, or other type-specific parameters to perform a customized import. This syntax provides the flexibility to, for example, perform labels-only imports or imports where the source media lies in a different location than the labels
In either workflow, the remaining parameters of this method can be provided to further configure the import.
See :ref:`this guide <loading-datasets-from-disk>` for example usages of this method and descriptions of the available dataset types.
Parameters | |
dataset_dir: None | the dataset directory. This can be omitted if you provide arguments such as data_path and labels_path |
dataset_type: None | the fiftyone.types.Dataset type of
the dataset |
data_path: None | an optional parameter that enables explicit control over the location of the media for certain dataset types. Can be any of the following:
By default, it is assumed that the data can be located in the default location within dataset_dir for the dataset type |
labels_path: None | an optional parameter that enables explicit control over the location of the labels. Only applicable when importing certain labeled dataset formats. Can be any of the following:
For labeled datasets, this parameter defaults to the location in dataset_dir of the labels for the default layout of the dataset type being imported |
name:None | a name for the dataset. By default,
get_default_dataset_name is used |
persistent:False | whether the dataset should persist in the database after the session terminates |
overwrite:False | whether to overwrite an existing dataset of the same name |
label_field: None | controls the field(s) in which imported labels
are stored. Only applicable if dataset_importer is a
fiftyone.utils.data.importers.LabeledImageDatasetImporter or
fiftyone.utils.data.importers.LabeledVideoDatasetImporter .
If the importer produces a single
fiftyone.core.labels.Label instance per sample/frame,
this argument specifies the name of the field to use; the
default is "ground_truth". If the importer produces a
dictionary of labels per sample, this argument can be either a
string prefix to prepend to each label key or a dict mapping
label keys to field names; the default in this case is to
directly use the keys of the imported label dictionaries as
field names |
tags:None | an optional tag or iterable of tags to attach to each sample |
dynamic:False | whether to declare dynamic attributes of embedded document fields that are encountered |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
**kwargs | optional keyword arguments to pass to the constructor of
the fiftyone.utils.data.importers.DatasetImporter for
the specified dataset_type |
Returns | |
a Dataset |
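A sketch of the two basic patterns described above (paths are placeholders, and fiftyone.types.COCODetectionDataset is used as an example type):

```python
import fiftyone as fo
import fiftyone.types as fot

# Pattern 1: import a directory organized in the default layout
# for the dataset type
dataset = fo.Dataset.from_dir(
    dataset_dir="/path/to/coco-dataset",
    dataset_type=fot.COCODetectionDataset,
    name="coco-example",
)

# Pattern 2: customized import where the media and labels live in
# different locations
dataset2 = fo.Dataset.from_dir(
    dataset_type=fot.COCODetectionDataset,
    data_path="/path/to/images",
    labels_path="/path/to/coco-labels.json",
    label_field="detections",
    name="coco-custom",
)
```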
def from_images(cls, paths_or_samples, sample_parser=None, name=None, persistent=False, overwrite=False, tags=None, progress=None): (source) ¶
Creates a Dataset
from the given images.
This operation does not read the images.
See :ref:`this guide <custom-sample-parser>` for more details about
providing a custom
UnlabeledImageSampleParser
to load image samples into FiftyOne.
Parameters | |
paths_or_samples | an iterable of data. If no sample_parser is provided, this must be an iterable of image paths. If a sample_parser is provided, this can be an arbitrary iterable whose elements can be parsed by the sample parser |
sample_parser: None | a
fiftyone.utils.data.parsers.UnlabeledImageSampleParser
instance to use to parse the samples |
name:None | a name for the dataset. By default,
get_default_dataset_name is used |
persistent:False | whether the dataset should persist in the database after the session terminates |
overwrite:False | whether to overwrite an existing dataset of the same name |
tags:None | an optional tag or iterable of tags to attach to each sample |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
Returns | |
a Dataset |
def from_images_dir(cls, images_dir, name=None, persistent=False, overwrite=False, tags=None, recursive=True, progress=None): (source) ¶
Creates a Dataset
from the given directory of images.
This operation does not read the images.
Parameters | |
images_dir | a directory of images |
name:None | a name for the dataset. By default,
get_default_dataset_name is used |
persistent:False | whether the dataset should persist in the database after the session terminates |
overwrite:False | whether to overwrite an existing dataset of the same name |
tags:None | an optional tag or iterable of tags to attach to each sample |
recursive:True | whether to recursively traverse subdirectories |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
Returns | |
a Dataset |
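A minimal sketch, assuming a directory of images at a placeholder path:

```python
import fiftyone as fo

# Recursively index every image under the directory; media is not read
dataset = fo.Dataset.from_images_dir(
    "/path/to/images",
    name="images-example",
    tags="unprocessed",  # attach a tag to each sample
    recursive=True,
)
```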
def from_images_patt(cls, images_patt, name=None, persistent=False, overwrite=False, tags=None, progress=None): (source) ¶
Creates a Dataset
from the given glob pattern of images.
This operation does not read the images.
Parameters | |
images_patt | a glob pattern of images like /path/to/images/*.jpg |
name:None | a name for the dataset. By default,
get_default_dataset_name is used |
persistent:False | whether the dataset should persist in the database after the session terminates |
overwrite:False | whether to overwrite an existing dataset of the same name |
tags:None | an optional tag or iterable of tags to attach to each sample |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
Returns | |
a Dataset |
def from_importer(cls, dataset_importer, name=None, persistent=False, overwrite=False, label_field=None, tags=None, dynamic=False, progress=None): (source) ¶
Creates a Dataset
by importing the samples in the given
fiftyone.utils.data.importers.DatasetImporter
.
See :ref:`this guide <custom-dataset-importer>` for more details about
providing a custom
DatasetImporter
to import datasets into FiftyOne.
Parameters | |
dataset_importer | a
fiftyone.utils.data.importers.DatasetImporter |
name:None | a name for the dataset. By default,
get_default_dataset_name is used |
persistent:False | whether the dataset should persist in the database after the session terminates |
overwrite:False | whether to overwrite an existing dataset of the same name |
label_field: None | controls the field(s) in which imported labels
are stored. Only applicable if dataset_importer is a
fiftyone.utils.data.importers.LabeledImageDatasetImporter or
fiftyone.utils.data.importers.LabeledVideoDatasetImporter .
If the importer produces a single
fiftyone.core.labels.Label instance per sample/frame,
this argument specifies the name of the field to use; the
default is "ground_truth". If the importer produces a
dictionary of labels per sample, this argument can be either a
string prefix to prepend to each label key or a dict mapping
label keys to field names; the default in this case is to
directly use the keys of the imported label dictionaries as
field names |
tags:None | an optional tag or iterable of tags to attach to each sample |
dynamic:False | whether to declare dynamic attributes of embedded document fields that are encountered |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
Returns | |
a Dataset |
def from_json(cls, path_or_str, name=None, persistent=False, overwrite=False, rel_dir=None, frame_labels_dir=None, progress=None): (source) ¶
Loads a Dataset
from JSON generated by
fiftyone.core.collections.SampleCollection.write_json
or
fiftyone.core.collections.SampleCollection.to_json
.
The JSON file can contain an export of any
fiftyone.core.collections.SampleCollection
, e.g.,
Dataset
or fiftyone.core.view.DatasetView
.
Parameters | |
path_or_str | the path to a JSON file on disk or a JSON string |
name:None | a name for the new dataset |
persistent:False | whether the dataset should persist in the database after the session terminates |
overwrite:False | whether to overwrite an existing dataset of the same name |
rel_dir: None | a relative directory to prepend to the filepath
of each sample, if the filepath is not absolute (begins with a
path separator). The path is converted to an absolute path
(if necessary) via fiftyone.core.storage.normalize_path |
frame_labels_dir: None | a directory of per-sample JSON files containing the frame labels for video samples. If omitted, it is assumed that the frame labels are included directly in the provided JSON. Only applicable to datasets that contain videos |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
Returns | |
a Dataset |
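A sketch of a JSON round trip (the dataset name and media directory are placeholders; "my-dataset" is assumed to already exist):

```python
import fiftyone as fo

# Export an existing dataset to JSON on disk
dataset = fo.load_dataset("my-dataset")
dataset.write_json("/tmp/my-dataset.json", rel_dir="/path/to/media")

# Reload it under a new name, resolving relative filepaths
restored = fo.Dataset.from_json(
    "/tmp/my-dataset.json",
    name="my-dataset-copy",
    rel_dir="/path/to/media",  # prepended to any non-absolute filepaths
)
```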
def from_labeled_images(cls, samples, sample_parser, name=None, persistent=False, overwrite=False, label_field=None, tags=None, dynamic=False, progress=None): (source) ¶
Creates a Dataset
from the given labeled images.
This operation will iterate over all provided samples, but the images will not be read.
See :ref:`this guide <custom-sample-parser>` for more details about
providing a custom
LabeledImageSampleParser
to load labeled image samples into FiftyOne.
Parameters | |
samples | an iterable of data |
sample_parser | a
fiftyone.utils.data.parsers.LabeledImageSampleParser
instance to use to parse the samples |
name:None | a name for the dataset. By default,
get_default_dataset_name is used |
persistent:False | whether the dataset should persist in the database after the session terminates |
overwrite:False | whether to overwrite an existing dataset of the same name |
label_field: None | controls the field(s) in which imported labels
are stored. If the parser produces a single
fiftyone.core.labels.Label instance per sample, this
argument specifies the name of the field to use; the default is
"ground_truth". If the parser produces a dictionary of
labels per sample, this argument can be either a string prefix
to prepend to each label key or a dict mapping label keys to
field names; the default in this case is to directly use the
keys of the imported label dictionaries as field names |
tags:None | an optional tag or iterable of tags to attach to each sample |
dynamic:False | whether to declare dynamic attributes of embedded document fields that are encountered |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
Returns | |
a Dataset |
def from_labeled_videos(cls, samples, sample_parser, name=None, persistent=False, overwrite=False, label_field=None, tags=None, dynamic=False, progress=None): (source) ¶
Creates a Dataset
from the given labeled videos.
This operation will iterate over all provided samples, but the videos will not be read/decoded/etc.
See :ref:`this guide <custom-sample-parser>` for more details about
providing a custom
LabeledVideoSampleParser
to load labeled video samples into FiftyOne.
Parameters | |
samples | an iterable of data |
sample_parser | a
fiftyone.utils.data.parsers.LabeledVideoSampleParser
instance to use to parse the samples |
name:None | a name for the dataset. By default,
get_default_dataset_name is used |
persistent:False | whether the dataset should persist in the database after the session terminates |
overwrite:False | whether to overwrite an existing dataset of the same name |
label_field: None | controls the field(s) in which imported labels
are stored. If the parser produces a single
fiftyone.core.labels.Label instance per sample/frame,
this argument specifies the name of the field to use; the
default is "ground_truth". If the parser produces a
dictionary of labels per sample/frame, this argument can be
either a string prefix to prepend to each label key or a dict
mapping label keys to field names; the default in this case is
to directly use the keys of the imported label dictionaries as
field names |
tags:None | an optional tag or iterable of tags to attach to each sample |
dynamic:False | whether to declare dynamic attributes of embedded document fields that are encountered |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
Returns | |
a Dataset |
def from_videos(cls, paths_or_samples, sample_parser=None, name=None, persistent=False, overwrite=False, tags=None, progress=None): (source) ¶
Creates a Dataset
from the given videos.
This operation does not read/decode the videos.
See :ref:`this guide <custom-sample-parser>` for more details about
providing a custom
UnlabeledVideoSampleParser
to load video samples into FiftyOne.
Parameters | |
paths_or_samples | an iterable of data. If no sample_parser is provided, this must be an iterable of video paths. If a sample_parser is provided, this can be an arbitrary iterable whose elements can be parsed by the sample parser |
sample_parser: None | a
fiftyone.utils.data.parsers.UnlabeledVideoSampleParser
instance to use to parse the samples |
name:None | a name for the dataset. By default,
get_default_dataset_name is used |
persistent:False | whether the dataset should persist in the database after the session terminates |
overwrite:False | whether to overwrite an existing dataset of the same name |
tags:None | an optional tag or iterable of tags to attach to each sample |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
Returns | |
a Dataset |
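A minimal sketch using a plain list of video paths (the paths are placeholders; no parser is needed for this form):

```python
import fiftyone as fo

video_paths = [
    "/path/to/videos/clip1.mp4",
    "/path/to/videos/clip2.mp4",
]

# Videos are indexed by path only; they are not read or decoded
dataset = fo.Dataset.from_videos(video_paths, name="videos-example")
```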
def from_videos_dir(cls, videos_dir, name=None, persistent=False, overwrite=False, tags=None, recursive=True, progress=None): (source) ¶
Creates a Dataset
from the given directory of videos.
This operation does not read/decode the videos.
Parameters | |
videos_dir | a directory of videos |
name:None | a name for the dataset. By default,
get_default_dataset_name is used |
persistent:False | whether the dataset should persist in the database after the session terminates |
overwrite:False | whether to overwrite an existing dataset of the same name |
tags:None | an optional tag or iterable of tags to attach to each sample |
recursive:True | whether to recursively traverse subdirectories |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
Returns | |
a Dataset |
def from_videos_patt(cls, videos_patt, name=None, persistent=False, overwrite=False, tags=None, progress=None): (source) ¶
Creates a Dataset
from the given glob pattern of videos.
This operation does not read/decode the videos.
Parameters | |
videos_patt | a glob pattern of videos like /path/to/videos/*.mp4 |
name:None | a name for the dataset. By default,
get_default_dataset_name is used |
persistent:False | whether the dataset should persist in the database after the session terminates |
overwrite:False | whether to overwrite an existing dataset of the same name |
tags:None | an optional tag or iterable of tags to attach to each sample |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
Returns | |
a Dataset |
Adds the contents of the given archive to the dataset.
If a directory with the same root name as archive_path exists, it is assumed that this directory contains the extracted contents of the archive, and thus the archive is not re-extracted.
See :ref:`this guide <loading-datasets-from-disk>` for example usages of this method and descriptions of the available dataset types.
Note
The following archive formats are explicitly supported:
.zip, .tar, .tar.gz, .tgz, .tar.bz, .tbz
If an archive not in the above list is found, extraction will be attempted via the patool package, which supports many formats but may require that additional system packages be installed.
Parameters | |
archive_path | the path to an archive of a dataset directory |
dataset_type: None | the fiftyone.types.Dataset type of
the dataset in archive_path |
data_path: None | an optional parameter that enables explicit control over the location of the media for certain dataset types. Can be any of the following:
By default, it is assumed that the data can be located in the default location within archive_path for the dataset type |
labels_path: None | an optional parameter that enables explicit control over the location of the labels. Only applicable when importing certain labeled dataset formats. Can be any of the following:
For labeled datasets, this parameter defaults to the location in archive_path of the labels for the default layout of the dataset type being imported |
label_field: None | controls the field(s) in which imported labels
are stored. Only applicable if dataset_importer is a
fiftyone.utils.data.importers.LabeledImageDatasetImporter or
fiftyone.utils.data.importers.LabeledVideoDatasetImporter .
If the importer produces a single
fiftyone.core.labels.Label instance per sample/frame,
this argument specifies the name of the field to use; the
default is "ground_truth". If the importer produces a
dictionary of labels per sample, this argument can be either a
string prefix to prepend to each label key or a dict mapping
label keys to field names; the default in this case is to
directly use the keys of the imported label dictionaries as
field names |
tags:None | an optional tag or iterable of tags to attach to each sample |
expand_schema: True | whether to dynamically add new sample fields encountered to the dataset schema. If False, an error is raised if a sample's schema is not a subset of the dataset schema |
dynamic:False | whether to declare dynamic attributes of embedded document fields that are encountered |
add_info: True | whether to add dataset info from the importer (if any) to the dataset's info |
cleanup:True | whether to delete the archive after extracting it |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
**kwargs | optional keyword arguments to pass to the constructor of
the fiftyone.utils.data.importers.DatasetImporter for
the specified dataset_type |
Returns | |
a list of IDs of the samples that were added to the dataset |
Adds the contents of the given collection to the dataset.
This method is a special case of Dataset.merge_samples
that
adds samples with new IDs to this dataset and omits any samples with
existing IDs (the latter would only happen in rare cases).
Use Dataset.merge_samples
if you have multiple datasets whose
samples refer to the same source media.
Parameters | |
sample_collection | a fiftyone.core.collections.SampleCollection |
include_info: True | whether to merge dataset-level information such as info and classes |
overwrite_info: False | whether to overwrite existing dataset-level information. Only applicable when include_info is True |
new_ids: False | whether to generate new sample/frame/group IDs. By default, the IDs of the input collection are retained |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
Returns | |
a list of IDs of the samples that were added to this dataset |
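A sketch of the two modes described above (both dataset names are placeholders and are assumed to already exist):

```python
import fiftyone as fo

dataset = fo.load_dataset("my-dataset")
other = fo.load_dataset("other-dataset")

# Add every sample from `other`; samples whose IDs already exist
# in `dataset` are omitted
sample_ids = dataset.add_collection(other)

# Or generate fresh IDs for the added samples instead of retaining
# the IDs of the input collection
sample_ids = dataset.add_collection(other, new_ids=True)
```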
Adds the contents of the given directory to the dataset.
You can perform imports with this method via the following basic patterns:
- Provide dataset_dir and dataset_type to import the contents of a directory that is organized in the default layout for the dataset type as documented in :ref:`this guide <loading-datasets-from-disk>`
- Provide dataset_type along with data_path, labels_path, or other type-specific parameters to perform a customized import. This syntax provides the flexibility to, for example, perform labels-only imports or imports where the source media lies in a different location than the labels
In either workflow, the remaining parameters of this method can be provided to further configure the import.
See :ref:`this guide <loading-datasets-from-disk>` for example usages of this method and descriptions of the available dataset types.
Parameters | |
dataset_dir: None | the dataset directory. This can be omitted for certain dataset formats if you provide arguments such as data_path and labels_path |
dataset_type: None | the fiftyone.types.Dataset type of
the dataset |
data_path: None | an optional parameter that enables explicit control over the location of the media for certain dataset types. Can be any of the following:
By default, it is assumed that the data can be located in the default location within dataset_dir for the dataset type |
labels_path: None | an optional parameter that enables explicit control over the location of the labels. Only applicable when importing certain labeled dataset formats. Can be any of the following:
For labeled datasets, this parameter defaults to the location in dataset_dir of the labels for the default layout of the dataset type being imported |
label_field: None | controls the field(s) in which imported labels
are stored. Only applicable if dataset_importer is a
fiftyone.utils.data.importers.LabeledImageDatasetImporter or
fiftyone.utils.data.importers.LabeledVideoDatasetImporter .
If the importer produces a single
fiftyone.core.labels.Label instance per sample/frame,
this argument specifies the name of the field to use; the
default is "ground_truth". If the importer produces a
dictionary of labels per sample, this argument can be either a
string prefix to prepend to each label key or a dict mapping
label keys to field names; the default in this case is to
directly use the keys of the imported label dictionaries as
field names |
tags:None | an optional tag or iterable of tags to attach to each sample |
expand_schema: True | whether to dynamically add new sample fields encountered to the dataset schema. If False, an error is raised if a sample's schema is not a subset of the dataset schema |
dynamic:False | whether to declare dynamic attributes of embedded document fields that are encountered |
add_info: True | whether to add dataset info from the importer (if any) to the dataset's info |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
**kwargs | optional keyword arguments to pass to the constructor of
the fiftyone.utils.data.importers.DatasetImporter for
the specified dataset_type |
Returns | |
a list of IDs of the samples that were added to the dataset |
Adds all dynamic frame fields to the dataset's schema.
Dynamic fields are embedded document fields with at least one non-None value that have not been declared on the dataset's schema.
Parameters | |
fields:None | an optional field or iterable of fields for which to add dynamic fields. By default, all fields are considered |
recursive:True | whether to recursively inspect nested lists and embedded documents for dynamic fields |
add_mixed: False | whether to declare fields that contain values
of mixed types as generic fiftyone.core.fields.Field
instances (True) or to skip such fields (False) |
Adds all dynamic sample fields to the dataset's schema.
Dynamic fields are embedded document fields with at least one non-None value that have not been declared on the dataset's schema.
Parameters | |
fields:None | an optional field or iterable of fields for which to add dynamic fields. By default, all fields are considered |
recursive:True | whether to recursively inspect nested lists and embedded documents for dynamic fields |
add_mixed:False | whether to declare fields that contain values
of mixed types as generic fiftyone.core.fields.Field
instances (True) or to skip such fields (False) |
Adds a new frame-level field or embedded field to the dataset, if necessary.
Only applicable to datasets that contain videos.
Parameters | |
field_name | the field name or embedded.field.name |
ftype | the field type to create. Must be a subclass of
fiftyone.core.fields.Field |
embedded_doc_type:None | the
fiftyone.core.odm.BaseEmbeddedDocument type of the
field. Only applicable when ftype is
fiftyone.core.fields.EmbeddedDocumentField |
subfield:None | the fiftyone.core.fields.Field type of
the contained field. Only applicable when ftype is
fiftyone.core.fields.ListField or
fiftyone.core.fields.DictField |
fields:None | a list of fiftyone.core.fields.Field
instances defining embedded document attributes. Only
applicable when ftype is
fiftyone.core.fields.EmbeddedDocumentField |
description:None | an optional description |
info:None | an optional info dict |
read_only:False | whether the field should be read-only |
**kwargs | Undocumented |
Raises | |
ValueError | if a field of the same name already exists and it is not compliant with the specified values |
Adds a group field to the dataset, if necessary.
Parameters | |
field_name | the field name |
default:None | a default group slice for the field |
description:None | an optional description |
info:None | an optional info dict |
read_only:False | whether the field should be read-only |
Raises | |
ValueError | if a group field with another name already exists |
Adds a group slice with the given media type to the dataset, if necessary.
Parameters | |
name | a group slice name |
media_type | the media type of the slice |
Adds the given images to the dataset.
This operation does not read the images.
See :ref:`this guide <custom-sample-parser>` for more details about
adding images to a dataset by defining your own
UnlabeledImageSampleParser
.
Parameters | |
paths_or_samples | an iterable of data. If no sample_parser is provided, this must be an iterable of image paths. If a sample_parser is provided, this can be an arbitrary iterable whose elements can be parsed by the sample parser |
sample_parser:None | a
fiftyone.utils.data.parsers.UnlabeledImageSampleParser
instance to use to parse the samples |
tags:None | an optional tag or iterable of tags to attach to each sample |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
Returns | |
a list of IDs of the samples that were added to the dataset |
Adds the given directory of images to the dataset.
See fiftyone.types.ImageDirectory
for format details. In
particular, note that files with non-image MIME types are omitted.
This operation does not read the images.
Parameters | |
images_dir | a directory of images |
tags:None | an optional tag or iterable of tags to attach to each sample |
recursive:True | whether to recursively traverse subdirectories |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
Returns | |
a list of IDs of the samples in the dataset |
Adds the given glob pattern of images to the dataset.
This operation does not read the images.
Parameters | |
images_patt | a glob pattern of images like /path/to/images/*.jpg |
tags:None | an optional tag or iterable of tags to attach to each sample |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
Returns | |
a list of IDs of the samples in the dataset |
Adds the samples from the given
fiftyone.utils.data.importers.DatasetImporter
to the dataset.
See :ref:`this guide <custom-dataset-importer>` for more details about
importing datasets in custom formats by defining your own
DatasetImporter
.
Parameters | |
dataset_importer | a
fiftyone.utils.data.importers.DatasetImporter |
label_field:None | controls the field(s) in which imported labels
are stored. Only applicable if dataset_importer is a
fiftyone.utils.data.importers.LabeledImageDatasetImporter or
fiftyone.utils.data.importers.LabeledVideoDatasetImporter .
If the importer produces a single
fiftyone.core.labels.Label instance per sample/frame,
this argument specifies the name of the field to use; the
default is "ground_truth". If the importer produces a
dictionary of labels per sample, this argument can be either a
string prefix to prepend to each label key or a dict mapping
label keys to field names; the default in this case is to
directly use the keys of the imported label dictionaries as
field names |
tags:None | an optional tag or iterable of tags to attach to each sample |
expand_schema:True | whether to dynamically add new sample fields encountered to the dataset schema. If False, an error is raised if a sample's schema is not a subset of the dataset schema |
dynamic:False | whether to declare dynamic attributes of embedded document fields that are encountered |
add_info:True | whether to add dataset info from the importer (if any) to the dataset's info |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
Returns | |
a list of IDs of the samples that were added to the dataset |
Adds the given labeled images to the dataset.
This operation will iterate over all provided samples, but the images will not be read (unless the sample parser requires it in order to compute image metadata).
See :ref:`this guide <custom-sample-parser>` for more details about
adding labeled images to a dataset by defining your own
LabeledImageSampleParser
.
Parameters | |
samples | an iterable of data |
sample_parser | a
fiftyone.utils.data.parsers.LabeledImageSampleParser
instance to use to parse the samples |
label_field:None | controls the field(s) in which imported labels
are stored. If the parser produces a single
fiftyone.core.labels.Label instance per sample, this
argument specifies the name of the field to use; the default is
"ground_truth". If the parser produces a dictionary of
labels per sample, this argument can be either a string prefix
to prepend to each label key or a dict mapping label keys to
field names; the default in this case is to directly use the
keys of the imported label dictionaries as field names |
tags:None | an optional tag or iterable of tags to attach to each sample |
expand_schema:True | whether to dynamically add new sample fields encountered to the dataset schema. If False, an error is raised if a sample's schema is not a subset of the dataset schema |
dynamic:False | whether to declare dynamic attributes of embedded document fields that are encountered |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
Returns | |
a list of IDs of the samples that were added to the dataset |
Adds the given labeled videos to the dataset.
This operation will iterate over all provided samples, but the videos will not be read/decoded/etc.
See :ref:`this guide <custom-sample-parser>` for more details about
adding labeled videos to a dataset by defining your own
LabeledVideoSampleParser
.
Parameters | |
samples | an iterable of data |
sample_parser | a
fiftyone.utils.data.parsers.LabeledVideoSampleParser
instance to use to parse the samples |
label_field | the name (or root name) of the frame field(s) to use for the labels |
tags:None | an optional tag or iterable of tags to attach to each sample |
expand_schema:True | whether to dynamically add new sample fields encountered to the dataset schema. If False, an error is raised if a sample's schema is not a subset of the dataset schema |
dynamic:False | whether to declare dynamic attributes of embedded document fields that are encountered |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
Returns | |
a list of IDs of the samples that were added to the dataset |
Adds the given sample to the dataset.
If the sample instance does not belong to a dataset, it is updated in-place to reflect its membership in this dataset. If the sample instance belongs to another dataset, it is not modified.
Parameters | |
sample | a fiftyone.core.sample.Sample |
expand_schema:True | whether to dynamically add new sample fields encountered to the dataset schema. If False, an error is raised if the sample's schema is not a subset of the dataset schema |
dynamic:False | whether to declare dynamic attributes of embedded document fields that are encountered |
validate:True | whether to validate that the fields of the sample are compliant with the dataset schema before adding it |
Returns | |
the ID of the sample in the dataset |
Adds a new sample field or embedded field to the dataset, if necessary.
Parameters | |
field_name | the field name or embedded.field.name |
ftype | the field type to create. Must be a subclass of
fiftyone.core.fields.Field |
embedded_doc_type:None | the
fiftyone.core.odm.BaseEmbeddedDocument type of the
field. Only applicable when ftype is
fiftyone.core.fields.EmbeddedDocumentField |
subfield:None | the fiftyone.core.fields.Field type of
the contained field. Only applicable when ftype is
fiftyone.core.fields.ListField or
fiftyone.core.fields.DictField |
fields:None | a list of fiftyone.core.fields.Field
instances defining embedded document attributes. Only
applicable when ftype is
fiftyone.core.fields.EmbeddedDocumentField |
description:None | an optional description |
info:None | an optional info dict |
read_only:False | whether the field should be read-only |
**kwargs | Undocumented |
Raises | |
ValueError | if a field of the same name already exists and it is not compliant with the specified values |
Adds the given samples to the dataset.
Any sample instances that do not belong to a dataset are updated in-place to reflect membership in this dataset. Any sample instances that belong to other datasets are not modified.
Parameters | |
samples | an iterable of fiftyone.core.sample.Sample
instances or a
fiftyone.core.collections.SampleCollection |
expand_schema:True | whether to dynamically add new sample fields encountered to the dataset schema. If False, an error is raised if a sample's schema is not a subset of the dataset schema |
dynamic:False | whether to declare dynamic attributes of embedded document fields that are encountered |
validate:True | whether to validate that the fields of each sample are compliant with the dataset schema before adding it |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
num_samples:None | the number of samples in samples. If not provided, this is computed (if possible) via len(samples) if needed for progress tracking |
Returns | |
a list of IDs of the samples in the dataset |
Adds the given videos to the dataset.
This operation does not read the videos.
See :ref:`this guide <custom-sample-parser>` for more details about
adding videos to a dataset by defining your own
UnlabeledVideoSampleParser
.
Parameters | |
paths_or_samples | an iterable of data. If no sample_parser is provided, this must be an iterable of video paths. If a sample_parser is provided, this can be an arbitrary iterable whose elements can be parsed by the sample parser |
sample_parser:None | a
fiftyone.utils.data.parsers.UnlabeledVideoSampleParser
instance to use to parse the samples |
tags:None | an optional tag or iterable of tags to attach to each sample |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
Returns | |
a list of IDs of the samples that were added to the dataset |
Adds the given directory of videos to the dataset.
See fiftyone.types.VideoDirectory
for format details. In
particular, note that files with non-video MIME types are omitted.
This operation does not read/decode the videos.
Parameters | |
videos_dir | a directory of videos |
tags:None | an optional tag or iterable of tags to attach to each sample |
recursive:True | whether to recursively traverse subdirectories |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
Returns | |
a list of IDs of the samples in the dataset |
Adds the given glob pattern of videos to the dataset.
This operation does not read/decode the videos.
Parameters | |
videos_patt | a glob pattern of videos like /path/to/videos/*.mp4 |
tags:None | an optional tag or iterable of tags to attach to each sample |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
Returns | |
a list of IDs of the samples in the dataset |
Returns a list of summary fields that may need to be updated.
Summary fields may need to be updated whenever there have been modifications to the dataset's samples since the summaries were last generated.
Note that inclusion in this list is only a heuristic, as any sample modifications may not have affected the summary's source field.
Returns | |
list of summary field names |
Removes all samples from the dataset.
If reference to a sample exists in memory, the sample will be updated such that sample.in_dataset is False.
Clears the dataset's in-memory cache.
Dataset caches may contain sample/frame singletons and annotation/brain/evaluation/custom runs.
Clears the values of the frame-level field from all samples in the dataset.
The field will remain in the dataset's frame schema, and all frames will have the value None for the field.
You can use dot notation (embedded.field.name) to clear embedded frame fields.
Only applicable to datasets that contain videos.
Parameters | |
field_name | the field name or embedded.field.name |
Clears the values of the frame-level fields from all samples in the dataset.
The fields will remain in the dataset's frame schema, and all frames will have the value None for the fields.
You can use dot notation (embedded.field.name) to clear embedded frame fields.
Only applicable to datasets that contain videos.
Parameters | |
field_names | a field name or iterable of field names |
Removes all frame labels from the dataset.
If reference to a frame exists in memory, the frame will be updated such that frame.in_dataset is False.
Clears the values of the field from all samples in the dataset.
The field will remain in the dataset's schema, and all samples will have the value None for the field.
You can use dot notation (embedded.field.name) to clear embedded fields.
Parameters | |
field_name | the field name or embedded.field.name |
Clears the values of the fields from all samples in the dataset.
The fields will remain in the dataset's schema, and all samples will have the value None for the fields.
You can use dot notation (embedded.field.name) to clear embedded fields.
Parameters | |
field_names | a field name or iterable of field names |
Creates a copy of the dataset.
Dataset clones contain deep copies of all samples and dataset-level information in the source dataset. The source media files, however, are not copied.
Parameters | |
name:None | a name for the cloned dataset. By default,
get_default_dataset_name is used |
persistent:False | whether the cloned dataset should be persistent |
Returns | |
the new Dataset |
Clones the frame-level field into a new field.
You can use dot notation (embedded.field.name) to clone embedded frame fields.
Only applicable to datasets that contain videos.
Parameters | |
field_name | the field name or embedded.field.name |
new_field_name | the new field name or embedded.field.name |
Clones the frame-level fields into new fields.
You can use dot notation (embedded.field.name) to clone embedded frame fields.
Only applicable to datasets that contain videos.
Parameters | |
field_mapping | a dict mapping field names to new field names into which to clone each field |
Clones the given sample field into a new field of the dataset.
You can use dot notation (embedded.field.name) to clone embedded fields.
Parameters | |
field_name | the field name or embedded.field.name |
new_field_name | the new field name or embedded.field.name |
Clones the given sample fields into new fields of the dataset.
You can use dot notation (embedded.field.name) to clone embedded fields.
Parameters | |
field_mapping | a dict mapping field names to new field names into which to clone each field |
Populates a sample-level field that records the unique values or numeric ranges that appear in the specified field on each sample in the dataset.
This method is particularly useful for summarizing frame-level fields of video datasets, in which case the sample-level field records the unique values or numeric ranges that appear in the specified frame-level field across all frames of that sample. This summary field can then be efficiently queried to retrieve samples that contain specific values of interest in at least one frame.
Examples:
import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("quickstart-video")
dataset.set_field(
    "frames.detections.detections.confidence", F.rand()
).save()

# Generate a summary field for object labels
dataset.create_summary_field("frames.detections.detections.label")

# Generate a summary field for [min, max] confidences
dataset.create_summary_field("frames.detections.detections.confidence")

# Generate a summary field for object labels and counts
dataset.create_summary_field(
    "frames.detections.detections.label",
    field_name="frames_detections_label2",
    include_counts=True,
)

# Generate a summary field for per-label [min, max] confidences
dataset.create_summary_field(
    "frames.detections.detections.confidence",
    field_name="frames_detections_confidence2",
    group_by="label",
)

print(dataset.list_summary_fields())
Parameters | |
path | an input field path |
field_name:None | the sample-level field in which to store the summary data. By default, a suitable name is derived from the given path |
sidebar_group:None | the name of a :ref:`App sidebar group <app-sidebar-groups>` to which to add the summary field. By default, all summary fields are added to a "summaries" group. You can pass False to skip sidebar group modification |
include_counts:False | whether to include per-value counts when summarizing categorical fields |
group_by:None | an optional attribute to group by when path is a numeric field to generate per-attribute [min, max] ranges. This may either be an absolute path or an attribute name that is interpreted relative to path |
read_only:True | whether to mark the summary field as read-only |
create_index:True | whether to create database index(es) for the summary field |
Returns | |
the summary field name |
Deletes the dataset.
Once deleted, only the name and deleted attributes of a dataset may be accessed.
If reference to a sample exists in memory, the sample will be updated such that sample.in_dataset is False.
Deletes the frame-level field from all samples in the dataset.
You can use dot notation (embedded.field.name) to delete embedded frame fields.
Only applicable to datasets that contain videos.
Parameters | |
field_name | the field name or embedded.field.name |
error_level:0 | the error level to use. Valid values are: |
- 0 | raise error if a top-level field cannot be deleted |
- 1 | log warning if a top-level field cannot be deleted |
- 2 | ignore top-level fields that cannot be deleted |
Deletes the frame-level fields from all samples in the dataset.
You can use dot notation (embedded.field.name) to delete embedded frame fields.
Only applicable to datasets that contain videos.
Parameters | |
field_names | a field name or iterable of field names |
error_level:0 | the error level to use. Valid values are: |
- 0 | raise error if a top-level field cannot be deleted |
- 1 | log warning if a top-level field cannot be deleted |
- 2 | ignore top-level fields that cannot be deleted |
Deletes the given frame(s) from the dataset.
If reference to a frame exists in memory, the frame will be updated such that frame.in_dataset is False.
Parameters | |
frames_or_ids | the frame(s) to delete. Can be any of the following:
- a frame ID
- an iterable of frame IDs
- a fiftyone.core.frame.Frame or fiftyone.core.frame.FrameView
- an iterable of fiftyone.core.frame.Frame or fiftyone.core.frame.FrameView instances
- a fiftyone.core.collections.SampleCollection whose frames to delete |
Deletes the given group(s) from the dataset.
If reference to a sample exists in memory, the sample will be updated such that sample.in_dataset is False.
Parameters | |
groups_or_ids | the group(s) to delete. Can be any of the following:
- a group ID
- an iterable of group IDs
- a fiftyone.core.sample.Sample or fiftyone.core.sample.SampleView
- a group dict returned by get_group
- an iterable of any of the above |
Deletes the specified labels from the dataset.
You can specify the labels to delete via any of the following methods:
- Provide the labels argument, which should contain a list of
dicts in the format returned by
fiftyone.core.session.Session.selected_labels
- Provide the ids or tags arguments to specify the labels to delete via their IDs and/or tags
- Provide the view argument to delete all of the labels in a view
into this dataset. This syntax is useful if you have constructed a
fiftyone.core.view.DatasetView
defining the labels to delete
Additionally, you can specify the fields argument to restrict deletion to specific field(s), either for efficiency or to ensure that labels from other fields are not deleted if their contents are included in the other arguments.
Parameters | |
labels:None | a list of dicts specifying the labels to delete in
the format returned by
fiftyone.core.session.Session.selected_labels |
ids:None | an ID or iterable of IDs of the labels to delete |
tags:None | a tag or iterable of tags of the labels to delete |
view:None | a fiftyone.core.view.DatasetView into this
dataset containing the labels to delete |
fields:None | a field or iterable of fields from which to delete labels |
Deletes the field from all samples in the dataset.
You can use dot notation (embedded.field.name) to delete embedded fields.
Parameters | |
field_name | the field name or embedded.field.name |
error_level:0 | the error level to use. Valid values are: |
- 0 | raise error if a top-level field cannot be deleted |
- 1 | log warning if a top-level field cannot be deleted |
- 2 | ignore top-level fields that cannot be deleted |
Deletes the fields from all samples in the dataset.
You can use dot notation (embedded.field.name) to delete embedded fields.
Parameters | |
field_names | the field name or iterable of field names |
error_level:0 | the error level to use. Valid values are: |
- 0 | raise error if a top-level field cannot be deleted |
- 1 | log warning if a top-level field cannot be deleted |
- 2 | ignore top-level fields that cannot be deleted |
Deletes the given sample(s) from the dataset.
If reference to a sample exists in memory, the sample will be updated such that sample.in_dataset is False.
Parameters | |
samples_or_ids | the sample(s) to delete. Can be any of the following:
- a sample ID
- an iterable of sample IDs
- a fiftyone.core.sample.Sample or fiftyone.core.sample.SampleView
- an iterable of fiftyone.core.sample.Sample or fiftyone.core.sample.SampleView instances
- a fiftyone.core.collections.SampleCollection |
Deletes the summary field from all samples in the dataset.
Parameters | |
field_name | the summary field |
error_level:0 | the error level to use. Valid values are: |
- 0 | raise error if a summary field cannot be deleted |
- 1 | log warning if a summary field cannot be deleted |
- 2 | ignore summary fields that cannot be deleted |
Deletes the summary fields from all samples in the dataset.
Parameters | |
field_names | the summary field or iterable of summary fields |
error_level:0 | the error level to use. Valid values are: |
- 0 | raise error if a summary field cannot be deleted |
- 1 | log warning if a summary field cannot be deleted |
- 2 | ignore summary fields that cannot be deleted |
Deletes the saved workspace with the given name.
Parameters | |
name | the name of a saved workspace |
Raises | |
ValueError | if name is not a saved workspace |
Ensures that the video dataset contains frame instances for every frame of each sample's source video.
Empty frames will be inserted for missing frames, and already existing frames are left unchanged.
Returns a schema dictionary describing the fields of the samples in the dataset.
Parameters | |
ftype:None | an optional field type or iterable of types to which
to restrict the returned schema. Must be subclass(es) of
fiftyone.core.fields.Field |
embedded_doc_type:None | an optional embedded document type or
iterable of types to which to restrict the returned schema.
Must be subclass(es) of
fiftyone.core.odm.BaseEmbeddedDocument |
read_only:None | whether to restrict to (True) or exclude (False) read-only fields. By default, all fields are included |
info_keys:None | an optional key or list of keys that must be in the field's info dict |
created_after:None | an optional datetime specifying a minimum creation date |
include_private:False | whether to include fields that start with _ in the returned schema |
flat:False | whether to return a flattened schema where all embedded document fields are included as top-level keys |
mode:None | whether to apply the above constraints before and/or after flattening the schema. Only applicable when flat is True. Supported values are ("before", "after", "both"). The default is "after" |
Returns | |
a dict mapping field names to fiftyone.core.fields.Field
instances |
Returns a schema dictionary describing the fields of the frames of the samples in the dataset.
Only applicable for datasets that contain videos.
Parameters | |
ftype:None | an optional field type or iterable of types to which
to restrict the returned schema. Must be subclass(es) of
fiftyone.core.fields.Field |
embedded_doc_type:None | an optional embedded document type or
iterable of types to which to restrict the returned schema.
Must be subclass(es) of
fiftyone.core.odm.BaseEmbeddedDocument |
read_only:None | whether to restrict to (True) or exclude (False) read-only fields. By default, all fields are included |
info_keys:None | an optional key or list of keys that must be in the field's info dict |
created_after:None | an optional datetime specifying a minimum creation date |
include_private:False | whether to include fields that start with _ in the returned schema |
flat:False | whether to return a flattened schema where all embedded document fields are included as top-level keys |
mode:None | whether to apply the above constraints before and/or after flattening the schema. Only applicable when flat is True. Supported values are ("before", "after", "both"). The default is "after" |
Returns | |
a dict mapping field names to fiftyone.core.fields.Field
instances, or None if the dataset does not contain videos |
Returns a dict containing the samples for the given group ID.
Examples:
import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart-groups")

group_id = dataset.take(1).first().group.id

group = dataset.get_group(group_id)
print(group.keys())  # ['left', 'right', 'pcd']
Parameters | |
group_id | a group ID |
group_slices:None | an optional subset of group slices to load |
Returns | |
a dict mapping group names to fiftyone.core.sample.Sample
instances |
Raises | |
KeyError | if the group ID is not found |
Loads the editable information about the saved view with the given name.
Examples:
import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

view = dataset.limit(10)
dataset.save_view("test", view)

print(dataset.get_saved_view_info("test"))
Parameters | |
name | the name of a saved view |
Returns | |
a dict of editable info |
Gets the information about the workspace with the given name.
Examples:
import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

workspace = fo.Space()
description = "A really cool (apparently empty?) workspace"
dataset.save_workspace("test", workspace, description=description)

print(dataset.get_workspace_info("test"))
Parameters | |
name | the name of a saved workspace |
Returns | |
a dict of editable info |
Whether this dataset has a saved view with the given name.
Parameters | |
name | a saved view name |
Returns | |
True/False |
Whether this dataset has a saved workspace with the given name.
Parameters | |
name | a saved workspace name |
Returns | |
True/False |
Returns a list of the first few samples in the dataset.
If fewer than num_samples samples are in the dataset, only the available samples are returned.
Parameters | |
num_samples:3 | the number of samples |
Returns | |
a list of fiftyone.core.sample.Sample objects |
Ingests the given iterable of images into the dataset.
The images are read in-memory and written to dataset_dir.
See :ref:`this guide <custom-sample-parser>` for more details about
ingesting images into a dataset by defining your own
UnlabeledImageSampleParser
.
Parameters | |
paths_or_samples | an iterable of data. If no sample_parser is provided, this must be an iterable of image paths. If a sample_parser is provided, this can be an arbitrary iterable whose elements can be parsed by the sample parser |
sample_parser:None | a
fiftyone.utils.data.parsers.UnlabeledImageSampleParser
instance to use to parse the samples |
tags:None | an optional tag or iterable of tags to attach to each sample |
dataset_dir:None | the directory in which the images will be
written. By default, get_default_dataset_dir is used |
image_format:None | the image format to use to write the images to disk. By default, fiftyone.config.default_image_ext is used |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
Returns | |
a list of IDs of the samples in the dataset |
Ingests the given iterable of labeled image samples into the dataset.
The images are read in-memory and written to dataset_dir.
See :ref:`this guide <custom-sample-parser>` for more details about
ingesting labeled images into a dataset by defining your own
LabeledImageSampleParser
.
Parameters | |
samples | an iterable of data |
sample_parser | a
fiftyone.utils.data.parsers.LabeledImageSampleParser
instance to use to parse the samples |
label_field:None | controls the field(s) in which imported labels
are stored. If the parser produces a single
fiftyone.core.labels.Label instance per sample, this
argument specifies the name of the field to use; the default is
"ground_truth". If the parser produces a dictionary of
labels per sample, this argument can be either a string prefix
to prepend to each label key or a dict mapping label keys to
field names; the default in this case is to directly use the
keys of the imported label dictionaries as field names |
tags:None | an optional tag or iterable of tags to attach to each sample |
expand_schema:True | whether to dynamically add new sample fields encountered to the dataset schema. If False, an error is raised if the sample's schema is not a subset of the dataset schema |
dynamic:False | whether to declare dynamic attributes of embedded document fields that are encountered |
dataset_dir:None | the directory in which the images will be
written. By default, get_default_dataset_dir is used |
image_format:None | the image format to use to write the images to disk. By default, fiftyone.config.default_image_ext is used |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
Returns | |
a list of IDs of the samples in the dataset |
Ingests the given iterable of labeled video samples into the dataset.
The videos are copied to dataset_dir.
See :ref:`this guide <custom-sample-parser>` for more details about
ingesting labeled videos into a dataset by defining your own
LabeledVideoSampleParser
.
Parameters | |
samples | an iterable of data |
sample_parser | a
fiftyone.utils.data.parsers.LabeledVideoSampleParser
instance to use to parse the samples |
tags:None | an optional tag or iterable of tags to attach to each sample |
expand_schema:True | whether to dynamically add new sample fields encountered to the dataset schema. If False, an error is raised if the sample's schema is not a subset of the dataset schema |
dynamic:False | whether to declare dynamic attributes of embedded document fields that are encountered |
dataset_dir:None | the directory in which the videos will be
written. By default, get_default_dataset_dir is used |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
Returns | |
a list of IDs of the samples in the dataset |
Ingests the given iterable of videos into the dataset.
The videos are copied to dataset_dir.
See :ref:`this guide <custom-sample-parser>` for more details about
ingesting videos into a dataset by defining your own
UnlabeledVideoSampleParser
.
Parameters | |
paths | an iterable of data. If no sample_parser is provided, this must be an iterable of video paths. If a sample_parser is provided, this can be an arbitrary iterable whose elements can be parsed by the sample parser |
sample_parser:None | a
fiftyone.utils.data.parsers.UnlabeledVideoSampleParser
instance to use to parse the samples |
tags:None | an optional tag or iterable of tags to attach to each sample |
dataset_dir:None | the directory in which the videos will be
written. By default, get_default_dataset_dir is used |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
Returns | |
a list of IDs of the samples in the dataset |
Returns an iterator over the groups in the dataset.
Examples:
import random as r
import string as s

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart-groups")

def make_label():
    return "".join(r.choice(s.ascii_letters) for i in range(10))

# No save context
for group in dataset.iter_groups(progress=True):
    for sample in group.values():
        sample["test"] = make_label()
        sample.save()

# Save using default batching strategy
for group in dataset.iter_groups(progress=True, autosave=True):
    for sample in group.values():
        sample["test"] = make_label()

# Save in batches of 10
for group in dataset.iter_groups(progress=True, autosave=True, batch_size=10):
    for sample in group.values():
        sample["test"] = make_label()

# Save every 0.5 seconds
for group in dataset.iter_groups(progress=True, autosave=True, batch_size=0.5):
    for sample in group.values():
        sample["test"] = make_label()
Parameters | |
group_slices:None | an optional subset of group slices to load |
progress:False | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
autosave:False | whether to automatically save changes to samples emitted by this iterator |
batch_size:None | the batch size to use when autosaving samples. If a batching_strategy is provided, this parameter configures the strategy as described below. If no batching_strategy is provided, this can either be an integer specifying the number of samples to save in a batch (in which case batching_strategy is implicitly set to "static") or a float number of seconds between batched saves (in which case batching_strategy is implicitly set to "latency") |
batching_strategy:None | the batching strategy to use for each save operation when autosaving samples. Supported values are:
- "static": a fixed sample batch size for each save
- "latency": a target latency, in seconds, between saves
- "size": a target batch size, in bytes, for each save
By default, fo.config.default_batcher is used |
Returns | |
an iterator that emits dicts mapping group slice names to
fiftyone.core.sample.Sample instances, one per group |
Returns an iterator over the samples in the dataset.
Examples:
import random as r
import string as s

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("cifar10", split="test")

def make_label():
    return "".join(r.choice(s.ascii_letters) for i in range(10))

# No save context
for sample in dataset.iter_samples(progress=True):
    sample.ground_truth.label = make_label()
    sample.save()

# Save using default batching strategy
for sample in dataset.iter_samples(progress=True, autosave=True):
    sample.ground_truth.label = make_label()

# Save in batches of 10
for sample in dataset.iter_samples(progress=True, autosave=True, batch_size=10):
    sample.ground_truth.label = make_label()

# Save every 0.5 seconds
for sample in dataset.iter_samples(progress=True, autosave=True, batch_size=0.5):
    sample.ground_truth.label = make_label()
Parameters | |
progress:False | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
autosave:False | whether to automatically save changes to samples emitted by this iterator |
batch_size:None | the batch size to use when autosaving samples. If a batching_strategy is provided, this parameter configures the strategy as described below. If no batching_strategy is provided, this can either be an integer specifying the number of samples to save in a batch (in which case batching_strategy is implicitly set to "static") or a float number of seconds between batched saves (in which case batching_strategy is implicitly set to "latency") |
batching_strategy:None | the batching strategy to use for each save operation when autosaving samples. Supported values are:
- "static": a fixed sample batch size for each save
- "latency": a target latency, in seconds, between saves
- "size": a target batch size, in bytes, for each save
By default, fo.config.default_batcher is used |
Returns | |
an iterator over fiftyone.core.sample.Sample instances |
Lists the saved views on this dataset.
Parameters | |
info:False | whether to return info dicts describing each saved view rather than just their names |
Returns | |
a list of saved view names or info dicts |
Lists the summary fields on the dataset.
Use create_summary_field
to create summary fields, and use
delete_summary_field
to delete them.
Returns | |
a list of summary field names |
Lists the saved workspaces on this dataset.
Parameters | |
info:False | whether to return info dicts describing each saved workspace rather than just their names |
Returns | |
a list of saved workspace names or info dicts |
Loads the saved view with the given name.
Examples:
import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("quickstart")

view = dataset.filter_labels("ground_truth", F("label") == "cat")
dataset.save_view("cats", view)

also_view = dataset.load_saved_view("cats")
assert view == also_view
Parameters | |
name | the name of a saved view |
Returns | |
a fiftyone.core.view.DatasetView |
Loads the saved workspace with the given name.
Examples:
import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

embeddings_panel = fo.Panel(
    type="Embeddings",
    state=dict(brainResult="img_viz", colorByField="metadata.size_bytes"),
)
workspace = fo.Space(children=[embeddings_panel])

workspace_name = "embeddings-workspace"
dataset.save_workspace(workspace_name, workspace)

# Some time later ... load the workspace
loaded_workspace = dataset.load_workspace(workspace_name)
assert workspace == loaded_workspace

# Launch app with the loaded workspace!
session = fo.launch_app(dataset, spaces=loaded_workspace)

# Or set via session later on
session.spaces = loaded_workspace
Parameters | |
name | the name of a saved workspace |
Returns | |
a fiftyone.core.odm.workspace.Space |
Raises | |
ValueError | if name is not a saved workspace |
Merges the contents of the given archive into the dataset.
Note
This method requires the ability to create unique indexes on the key_field of each collection.
See add_archive
if you want to add samples without a
uniqueness constraint.
If a directory with the same root name as archive_path exists, it is assumed that this directory contains the extracted contents of the archive, and thus the archive is not re-extracted.
See :ref:`this guide <loading-datasets-from-disk>` for example usages of this method and descriptions of the available dataset types.
Note
The following archive formats are explicitly supported:
.zip, .tar, .tar.gz, .tgz, .tar.bz, .tbz
If an archive not in the above list is found, extraction will be attempted via the patool package, which supports many formats but may require that additional system packages be installed.
By default, samples with the same absolute filepath are merged, but you can customize this behavior via the key_field and key_fcn parameters. For example, you could set key_fcn = lambda sample: os.path.basename(sample.filepath) to merge samples with the same base filename.
The behavior of this method is highly customizable. By default, all
top-level fields from the imported samples are merged in, overwriting
any existing values for those fields, with the exception of list fields
(e.g., tags) and label list fields (e.g.,
fiftyone.core.labels.Detections
fields), in which case the
elements of the lists themselves are merged. In the case of label list
fields, labels with the same id in both collections are updated
rather than duplicated.
To avoid confusion between missing fields and fields whose value is None, None-valued fields are always treated as missing while merging.
This method can be configured in numerous ways, including:
- Whether existing samples should be modified or skipped
- Whether new samples should be added or omitted
- Whether new fields can be added to the dataset schema
- Whether list fields should be treated as ordinary fields and merged as a whole rather than merging their elements
- Whether to merge only specific fields, or all but certain fields
- Mapping input fields to different field names of this dataset
Parameters | |
archive | the path to an archive of a dataset directory |
dataset_type:None | the fiftyone.types.Dataset type of
the dataset in archive_path |
data_path:None | an optional parameter that enables explicit control over the location of the media for certain dataset types. Can be a folder name or absolute directory path containing the media, or the filename or absolute filepath of a JSON manifest mapping media filenames to locations.
By default, it is assumed that the data can be located in the default location within archive_path for the dataset type |
labels_path:None | an optional parameter that enables explicit control over the location of the labels. Only applicable when importing certain labeled dataset formats. Can be a folder name, filename, or absolute path specifying the location of the labels within (or outside of) archive_path.
For labeled datasets, this parameter defaults to the location in archive_path of the labels for the default layout of the dataset type being imported |
label_field:None | controls the field(s) in which imported labels
are stored. Only applicable if dataset_importer is a
fiftyone.utils.data.importers.LabeledImageDatasetImporter or
fiftyone.utils.data.importers.LabeledVideoDatasetImporter .
If the importer produces a single
fiftyone.core.labels.Label instance per sample/frame,
this argument specifies the name of the field to use; the
default is "ground_truth". If the importer produces a
dictionary of labels per sample, this argument can be either a
string prefix to prepend to each label key or a dict mapping
label keys to field names; the default in this case is to
directly use the keys of the imported label dictionaries as
field names |
tags:None | an optional tag or iterable of tags to attach to each sample |
key_field:"filepath" | the sample field to use to decide whether to join with an existing sample |
key_fcn:None | a function that accepts a
fiftyone.core.sample.Sample instance and computes a
key to decide if two samples should be merged. If a key_fcn
is provided, key_field is ignored |
skip_existing:False | whether to skip existing samples (True) or merge them (False) |
insert_new:True | whether to insert new samples (True) or skip them (False) |
fields:None | an optional field or iterable of fields to which to restrict the merge. If provided, fields other than these are omitted from samples when merging or adding samples. One exception is that filepath is always included when adding new samples, since the field is required. This can also be a dict mapping field names of the input collection to field names of this dataset |
omit_fields:None | an optional field or iterable of fields to exclude from the merge. If provided, these fields are omitted from imported samples, if present. One exception is that filepath is always included when adding new samples, since the field is required |
merge_lists:True | whether to merge the elements of list fields
(e.g., tags) and label list fields (e.g.,
fiftyone.core.labels.Detections fields) rather than
merging the entire top-level field like other field types. For
label list fields, existing fiftyone.core.labels.Label
elements are either replaced (when overwrite is True) or
kept (when overwrite is False) when their id matches a
label from the provided samples |
overwrite:True | whether to overwrite (True) or skip (False) existing fields and label elements |
expand_schema:True | whether to dynamically add new fields encountered to the dataset schema. If False, an error is raised if a sample's schema is not a subset of the dataset schema |
dynamic:False | whether to declare dynamic attributes of embedded document fields that are encountered |
add_info:True | whether to add dataset info from the importer (if any) to the dataset |
cleanup:True | whether to delete the archive after extracting it |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
**kwargs | optional keyword arguments to pass to the constructor of
the fiftyone.utils.data.importers.DatasetImporter for
the specified dataset_type |
Merges the contents of the given directory into the dataset.
Note
This method requires the ability to create unique indexes on the key_field of each collection.
See add_dir
if you want to add samples without a uniqueness
constraint.
You can perform imports with this method via the following basic patterns:
- Provide dataset_dir and dataset_type to import the contents of a directory that is organized in the default layout for the dataset type as documented in :ref:`this guide <loading-datasets-from-disk>`
- Provide dataset_type along with data_path, labels_path, or other type-specific parameters to perform a customized import. This syntax provides the flexibility to, for example, perform labels-only imports or imports where the source media lies in a different location than the labels
In either workflow, the remaining parameters of this method can be provided to further configure the import.
See :ref:`this guide <loading-datasets-from-disk>` for example usages of this method and descriptions of the available dataset types.
By default, samples with the same absolute filepath are merged, but you can customize this behavior via the key_field and key_fcn parameters. For example, you could set key_fcn = lambda sample: os.path.basename(sample.filepath) to merge samples with the same base filename.
The behavior of this method is highly customizable. By default, all
top-level fields from the imported samples are merged in, overwriting
any existing values for those fields, with the exception of list fields
(e.g., tags) and label list fields (e.g.,
fiftyone.core.labels.Detections
fields), in which case the
elements of the lists themselves are merged. In the case of label list
fields, labels with the same id in both collections are updated
rather than duplicated.
To avoid confusion between missing fields and fields whose value is None, None-valued fields are always treated as missing while merging.
This method can be configured in numerous ways, including:
- Whether existing samples should be modified or skipped
- Whether new samples should be added or omitted
- Whether new fields can be added to the dataset schema
- Whether list fields should be treated as ordinary fields and merged as a whole rather than merging their elements
- Whether to merge only specific fields, or all but certain fields
- Mapping input fields to different field names of this dataset
Parameters | |
dataset_dir:None | the dataset directory. This can be omitted for certain dataset formats if you provide arguments such as data_path and labels_path |
dataset_type:None | the fiftyone.types.Dataset type of
the dataset |
data_path:None | an optional parameter that enables explicit control over the location of the media for certain dataset types. Can be a folder name or absolute directory path containing the media, or the filename or absolute filepath of a JSON manifest mapping media filenames to locations.
By default, it is assumed that the data can be located in the default location within dataset_dir for the dataset type |
labels_path:None | an optional parameter that enables explicit control over the location of the labels. Only applicable when importing certain labeled dataset formats. Can be a folder name, filename, or absolute path specifying the location of the labels within (or outside of) dataset_dir.
For labeled datasets, this parameter defaults to the location in dataset_dir of the labels for the default layout of the dataset type being imported |
label_field:None | controls the field(s) in which imported labels
are stored. Only applicable if dataset_importer is a
fiftyone.utils.data.importers.LabeledImageDatasetImporter or
fiftyone.utils.data.importers.LabeledVideoDatasetImporter .
If the importer produces a single
fiftyone.core.labels.Label instance per sample/frame,
this argument specifies the name of the field to use; the
default is "ground_truth". If the importer produces a
dictionary of labels per sample, this argument can be either a
string prefix to prepend to each label key or a dict mapping
label keys to field names; the default in this case is to
directly use the keys of the imported label dictionaries as
field names |
tags:None | an optional tag or iterable of tags to attach to each sample |
key_field:"filepath" | the sample field to use to decide whether to join with an existing sample |
key_fcn:None | a function that accepts a
fiftyone.core.sample.Sample instance and computes a
key to decide if two samples should be merged. If a key_fcn
is provided, key_field is ignored |
skip_existing:False | whether to skip existing samples (True) or merge them (False) |
insert_new:True | whether to insert new samples (True) or skip them (False) |
fields:None | an optional field or iterable of fields to which to restrict the merge. If provided, fields other than these are omitted from samples when merging or adding samples. One exception is that filepath is always included when adding new samples, since the field is required. This can also be a dict mapping field names of the input collection to field names of this dataset |
omit_fields:None | an optional field or iterable of fields to exclude from the merge. If provided, these fields are omitted from imported samples, if present. One exception is that filepath is always included when adding new samples, since the field is required |
merge_lists:True | whether to merge the elements of list fields
(e.g., tags) and label list fields (e.g.,
fiftyone.core.labels.Detections fields) rather than
merging the entire top-level field like other field types. For
label list fields, existing fiftyone.core.labels.Label
elements are either replaced (when overwrite is True) or
kept (when overwrite is False) when their id matches a
label from the provided samples |
overwrite:True | whether to overwrite (True) or skip (False) existing fields and label elements |
expand_schema:True | whether to dynamically add new fields encountered to the dataset schema. If False, an error is raised if a sample's schema is not a subset of the dataset schema |
dynamic:False | whether to declare dynamic attributes of embedded document fields that are encountered |
add_info:True | whether to add dataset info from the importer (if any) to the dataset |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
**kwargs | optional keyword arguments to pass to the constructor of
the fiftyone.utils.data.importers.DatasetImporter for
the specified dataset_type |
Merges the samples from the given
fiftyone.utils.data.importers.DatasetImporter
into the
dataset.
Note
This method requires the ability to create unique indexes on the key_field of each collection.
See add_importer
if you want to add samples without a
uniqueness constraint.
See :ref:`this guide <custom-dataset-importer>` for more details about
importing datasets in custom formats by defining your own
DatasetImporter
.
By default, samples with the same absolute filepath are merged, but you can customize this behavior via the key_field and key_fcn parameters. For example, you could set key_fcn = lambda sample: os.path.basename(sample.filepath) to merge samples with the same base filename.
The behavior of this method is highly customizable. By default, all
top-level fields from the imported samples are merged in, overwriting
any existing values for those fields, with the exception of list fields
(e.g., tags) and label list fields (e.g.,
fiftyone.core.labels.Detections
fields), in which case the
elements of the lists themselves are merged. In the case of label list
fields, labels with the same id in both collections are updated
rather than duplicated.
To avoid confusion between missing fields and fields whose value is None, None-valued fields are always treated as missing while merging.
This method can be configured in numerous ways, including:
- Whether existing samples should be modified or skipped
- Whether new samples should be added or omitted
- Whether new fields can be added to the dataset schema
- Whether list fields should be treated as ordinary fields and merged as a whole rather than merging their elements
- Whether to merge only specific fields, or all but certain fields
- Mapping input fields to different field names of this dataset
Parameters | |
dataset_importer | a
fiftyone.utils.data.importers.DatasetImporter |
label_field:None | controls the field(s) in which imported labels
are stored. Only applicable if dataset_importer is a
fiftyone.utils.data.importers.LabeledImageDatasetImporter or
fiftyone.utils.data.importers.LabeledVideoDatasetImporter .
If the importer produces a single
fiftyone.core.labels.Label instance per sample/frame,
this argument specifies the name of the field to use; the
default is "ground_truth". If the importer produces a
dictionary of labels per sample, this argument can be either a
string prefix to prepend to each label key or a dict mapping
label keys to field names; the default in this case is to
directly use the keys of the imported label dictionaries as
field names |
tags:None | an optional tag or iterable of tags to attach to each sample |
key_field:"filepath" | the sample field to use to decide whether to join with an existing sample |
key_fcn:None | a function that accepts a
fiftyone.core.sample.Sample instance and computes a
key to decide if two samples should be merged. If a key_fcn
is provided, key_field is ignored |
skip_existing:False | whether to skip existing samples (True) or merge them (False) |
insert_new:True | whether to insert new samples (True) or skip them (False) |
fields:None | an optional field or iterable of fields to which to restrict the merge. If provided, fields other than these are omitted from samples when merging or adding samples. One exception is that filepath is always included when adding new samples, since the field is required. This can also be a dict mapping field names of the input collection to field names of this dataset |
omit_fields:None | an optional field or iterable of fields to exclude from the merge. If provided, these fields are omitted from imported samples, if present. One exception is that filepath is always included when adding new samples, since the field is required |
merge_lists:True | whether to merge the elements of list fields
(e.g., tags) and label list fields (e.g.,
fiftyone.core.labels.Detections fields) rather than
merging the entire top-level field like other field types. For
label list fields, existing fiftyone.core.labels.Label
elements are either replaced (when overwrite is True) or
kept (when overwrite is False) when their id matches a
label from the provided samples |
overwrite:True | whether to overwrite (True) or skip (False) existing fields and label elements |
expand_schema:True | whether to dynamically add new fields encountered to the dataset schema. If False, an error is raised if a sample's schema is not a subset of the dataset schema |
dynamic:False | whether to declare dynamic attributes of embedded document fields that are encountered |
add_info:True | whether to add dataset info from the importer (if any) to the dataset |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
Merges the fields of the given sample into this dataset.
By default, the sample is merged with an existing sample with the same absolute filepath, if one exists. Otherwise a new sample is inserted. You can customize this behavior via the key_field, skip_existing, and insert_new parameters.
The behavior of this method is highly customizable. By default, all
top-level fields from the provided sample are merged in, overwriting
any existing values for those fields, with the exception of list fields
(e.g., tags) and label list fields (e.g.,
fiftyone.core.labels.Detections
fields), in which case the
elements of the lists themselves are merged. In the case of label list
fields, labels with the same id in both samples are updated rather
than duplicated.
To avoid confusion between missing fields and fields whose value is None, None-valued fields are always treated as missing while merging.
This method can be configured in numerous ways, including:
- Whether new fields can be added to the dataset schema
- Whether list fields should be treated as ordinary fields and merged as a whole rather than merging their elements
- Whether to merge only specific fields, or all but certain fields
- Mapping input sample fields to different field names of this sample
Parameters | |
sample | a fiftyone.core.sample.Sample |
key_field:"filepath" | the sample field to use to decide whether to join with an existing sample |
skip_existing:False | whether to skip existing samples (True) or merge them (False) |
insert_new:True | whether to insert new samples (True) or skip them (False) |
fields:None | an optional field or iterable of fields to which to restrict the merge. May contain frame fields for video samples. This can also be a dict mapping field names of the input sample to field names of this dataset |
omit_fields:None | an optional field or iterable of fields to exclude from the merge. May contain frame fields for video samples |
merge_lists:True | whether to merge the elements of list fields
(e.g., tags) and label list fields (e.g.,
fiftyone.core.labels.Detections fields) rather than
merging the entire top-level field like other field types.
For label list fields, existing
fiftyone.core.labels.Label elements are either replaced
(when overwrite is True) or kept (when overwrite is
False) when their id matches a label from the provided
sample |
overwrite:True | whether to overwrite (True) or skip (False) existing fields and label elements |
expand_schema:True | whether to dynamically add new fields encountered to the dataset schema. If False, an error is raised if any fields are not in the dataset schema |
validate:True | whether to validate values for existing fields |
dynamic:False | whether to declare dynamic embedded document fields |
Merges the given samples into this dataset.
Note
This method requires the ability to create unique indexes on the key_field of each collection.
See add_collection
if you want to add samples from one
collection to another dataset without a uniqueness constraint.
By default, samples with the same absolute filepath are merged, but you can customize this behavior via the key_field and key_fcn parameters. For example, you could set key_fcn = lambda sample: os.path.basename(sample.filepath) to merge samples with the same base filename.
The behavior of this method is highly customizable. By default, all
top-level fields from the provided samples are merged in, overwriting
any existing values for those fields, with the exception of list fields
(e.g., tags) and label list fields (e.g.,
fiftyone.core.labels.Detections
fields), in which case the
elements of the lists themselves are merged. In the case of label list
fields, labels with the same id in both collections are updated
rather than duplicated.
To avoid confusion between missing fields and fields whose value is None, None-valued fields are always treated as missing while merging.
This method can be configured in numerous ways, including:
- Whether existing samples should be modified or skipped
- Whether new samples should be added or omitted
- Whether new fields can be added to the dataset schema
- Whether list fields should be treated as ordinary fields and merged as a whole rather than merging their elements
- Whether to merge only specific fields, or all but certain fields
- Mapping input fields to different field names of this dataset
Parameters | |
samples | a fiftyone.core.collections.SampleCollection or
iterable of fiftyone.core.sample.Sample instances |
key_field:"filepath" | the sample field to use to decide whether to join with an existing sample |
key_fcn:None | a function that accepts a
fiftyone.core.sample.Sample instance and computes a
key to decide if two samples should be merged. If a key_fcn
is provided, key_field is ignored |
skip_existing:False | whether to skip existing samples (True) or merge them (False) |
insert_new:True | whether to insert new samples (True) or skip them (False) |
fields:None | an optional field or iterable of fields to which to restrict the merge. If provided, fields other than these are omitted from samples when merging or adding samples. One exception is that filepath is always included when adding new samples, since the field is required. This can also be a dict mapping field names of the input collection to field names of this dataset |
omit_fields:None | an optional field or iterable of fields to exclude from the merge. If provided, these fields are omitted from samples, if present, when merging or adding samples. One exception is that filepath is always included when adding new samples, since the field is required |
merge_lists:True | whether to merge the elements of list fields
(e.g., tags) and label list fields (e.g.,
fiftyone.core.labels.Detections fields) rather than
merging the entire top-level field like other field types.
For label list fields, existing
fiftyone.core.labels.Label elements are either replaced
(when overwrite is True) or kept (when overwrite is
False) when their id matches a label from the provided
samples |
overwrite:True | whether to overwrite (True) or skip (False) existing fields and label elements |
expand_schema:True | whether to dynamically add new fields encountered to the dataset schema. If False, an error is raised if a sample's schema is not a subset of the dataset schema |
dynamic:False | whether to declare dynamic attributes of embedded
document fields that are encountered. Only applicable when
samples is not a
fiftyone.core.collections.SampleCollection |
include_info:True | whether to merge dataset-level information
such as info and classes. Only applicable when
samples is a
fiftyone.core.collections.SampleCollection |
overwrite_info:False | whether to overwrite existing dataset-level
information. Only applicable when samples is a
fiftyone.core.collections.SampleCollection and
include_info is True |
progress:None | whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead |
num_samples:None | the number of samples in samples. If not provided, this is computed (if possible) via len(samples) if needed for progress tracking |
Returns a single sample in this dataset matching the expression.
Examples:
import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("quickstart")

#
# Get a sample by filepath
#

# A random filepath in the dataset
filepath = dataset.take(1).first().filepath

# Get sample by filepath
sample = dataset.one(F("filepath") == filepath)

#
# Dealing with multiple matches
#

# Get a sample whose image is JPEG
sample = dataset.one(F("filepath").ends_with(".jpg"))

# Raises an error since there are multiple JPEGs
dataset.one(F("filepath").ends_with(".jpg"), exact=True)
Parameters | |
expr | a fiftyone.core.expressions.ViewExpression or
MongoDB expression
that evaluates to True for the sample to match |
exact:False | whether to raise an error if multiple samples match the expression |
Returns | |
a fiftyone.core.sample.Sample |
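A minimal pure-Python sketch of the matching semantics described above (illustrative only, not the FiftyOne implementation):

```python
def one(samples, predicate, exact=False):
    """Return a single sample matching ``predicate``.

    Raises ValueError if nothing matches, or if ``exact`` is True
    and more than one sample matches.
    """
    matches = [s for s in samples if predicate(s)]
    if not matches:
        raise ValueError("no sample matched the expression")
    if exact and len(matches) > 1:
        raise ValueError("expected one match; found %d" % len(matches))
    return matches[0]

samples = [{"filepath": "a.jpg"}, {"filepath": "b.jpg"}, {"filepath": "c.png"}]

one(samples, lambda s: s["filepath"] == "c.png")
one(samples, lambda s: s["filepath"].endswith(".jpg"))  # first of two matches
# one(samples, lambda s: s["filepath"].endswith(".jpg"), exact=True)  # raises
```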
Removes the dynamic embedded frame field from the dataset's schema.
The underlying data is not deleted from the frames.
Parameters | |
field_name | the embedded.field.name |
error_level | the error level to use. Valid values are: |
- 0 | raise error if a top-level field cannot be removed |
- 1 | log warning if a top-level field cannot be removed |
- 2 | ignore top-level fields that cannot be removed |
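The error-level convention above (0 = raise, 1 = warn, 2 = ignore) can be sketched in plain Python; this dispatch helper is illustrative only, not FiftyOne's implementation:

```python
import logging

logger = logging.getLogger(__name__)

def handle_error(msg, error_level=0):
    """Dispatch a failure according to the error level:
    0 = raise an exception, 1 = log a warning, 2 = ignore silently.
    """
    if error_level <= 0:
        raise ValueError(msg)
    if error_level == 1:
        logger.warning(msg)
    # error_level >= 2: silently ignore

handle_error("cannot remove top-level field 'filepath'", error_level=2)  # no-op
```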
Removes the dynamic embedded frame fields from the dataset's schema.
The underlying data is not deleted from the frames.
Parameters | |
field_names | the embedded.field.name or iterable of field names |
error_level | the error level to use. Valid values are: |
- 0 | raise error if a top-level field cannot be removed |
- 1 | log warning if a top-level field cannot be removed |
- 2 | ignore top-level fields that cannot be removed |
Removes the dynamic embedded sample field from the dataset's schema.
The underlying data is not deleted from the samples.
Parameters | |
field_name | the embedded.field.name |
error_level | the error level to use. Valid values are: |
- 0 | raise error if a top-level field cannot be removed |
- 1 | log warning if a top-level field cannot be removed |
- 2 | ignore top-level fields that cannot be removed |
Removes the dynamic embedded sample fields from the dataset's schema.
The underlying data is not deleted from the samples.
Parameters | |
field_names | the embedded.field.name or iterable of field names |
error_level | the error level to use. Valid values are: |
- 0 | raise error if a top-level field cannot be removed |
- 1 | log warning if a top-level field cannot be removed |
- 2 | ignore top-level fields that cannot be removed |
Renames the frame-level field to the given new name.
You can use dot notation (embedded.field.name) to rename embedded frame fields.
Only applicable to datasets that contain videos.
Parameters | |
field_name | the field name or embedded.field.name |
new_field_name | the new field name or embedded.field.name |
Renames the frame-level fields to the given new names.
You can use dot notation (embedded.field.name) to rename embedded frame fields.
Parameters | |
field_mapping | a dict mapping field names to new field names |
Renames the group slice with the given name.
Parameters | |
name | the group slice name |
new_name | the new group slice name |
Renames the sample field to the given new name.
You can use dot notation (embedded.field.name) to rename embedded fields.
Parameters | |
field_name | the field name or embedded.field.name |
new_field_name | the new field name or embedded.field.name |
Renames the sample fields to the given new names.
You can use dot notation (embedded.field.name) to rename embedded fields.
Parameters | |
field_mapping | a dict mapping field names to new field names |
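Dot notation walks into embedded documents. A minimal dict-based sketch of the rename semantics (illustrative only; `rename_embedded` is a hypothetical helper, not a FiftyOne API):

```python
def rename_embedded(doc, field, new_field):
    """Rename ``field`` to ``new_field`` in a nested dict, where both
    names may use dot notation like 'embedded.field.name'."""
    *old_path, old_leaf = field.split(".")
    *new_path, new_leaf = new_field.split(".")

    # Walk to the parent of the old field and remove it
    src = doc
    for key in old_path:
        src = src[key]
    value = src.pop(old_leaf)

    # Walk to (or create) the parent of the new field and set it
    dst = doc
    for key in new_path:
        dst = dst.setdefault(key, {})
    dst[new_leaf] = value

sample = {"predictions": {"detections": [1, 2], "logits": [0.1]}}
rename_embedded(sample, "predictions.logits", "predictions.scores")
# sample is now {"predictions": {"detections": [1, 2], "scores": [0.1]}}
```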
Saves the dataset to the database.
This only needs to be called when dataset-level information such as its
Dataset.info
is modified.
Saves the given view into this dataset under the given name so it
can be loaded later via load_saved_view
.
Examples:
import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("quickstart")

view = dataset.filter_labels("ground_truth", F("label") == "cat")

dataset.save_view("cats", view)

also_view = dataset.load_saved_view("cats")
assert view == also_view
Parameters | |
name | a name for the saved view |
view | a fiftyone.core.view.DatasetView |
description:None | an optional string description |
color:None | an optional RGB hex string like '#FF6D04' |
overwrite:False | whether to overwrite an existing saved view with the same name |
Saves a workspace into this dataset under the given name so it
can be loaded later via load_workspace
.
Examples:
import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

embeddings_panel = fo.Panel(
    type="Embeddings",
    state=dict(brainResult="img_viz", colorByField="metadata.size_bytes"),
)
workspace = fo.Space(children=[embeddings_panel])

workspace_name = "embeddings-workspace"
description = "Show embeddings only"

dataset.save_workspace(workspace_name, workspace, description=description)
assert dataset.has_workspace(workspace_name)

also_workspace = dataset.load_workspace(workspace_name)
assert workspace == also_workspace
Parameters | |
name | a name for the saved workspace |
workspace | a fiftyone.core.odm.workspace.Space |
description:None | an optional string description |
color:None | an optional RGB hex string like '#FF6D04' |
overwrite:False | whether to overwrite an existing workspace with the same name |
Raises | |
ValueError | if overwrite is False and a workspace with the given name already exists |
Returns stats about the dataset on disk.
The samples keys refer to the sample documents stored in the database.
For video datasets, the frames keys refer to the frame documents stored in the database.
The media keys refer to the raw media associated with each sample on disk.
The indexes keys refer to the indexes associated with the dataset.
Note that dataset-level metadata such as annotation runs are not included in this computation.
Parameters | |
include_media:False | whether to include stats about the size of the raw media in the dataset |
include_indexes:False | whether to include stats on the dataset's indexes |
compressed:False | whether to return the sizes of collections in their compressed form on disk (True) or the logical uncompressed size of the collections (False) |
Returns | |
a stats dict |
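The media portion of such stats amounts to summing file sizes on disk. An illustrative sketch (the `media_stats` helper is hypothetical, not part of FiftyOne):

```python
import os
import tempfile

def media_stats(filepaths):
    """Sum the on-disk sizes of the given media files, skipping missing ones."""
    total = sum(os.path.getsize(p) for p in filepaths if os.path.isfile(p))
    return {"media_bytes": total, "media_count": len(filepaths)}

# Demo on a throwaway file
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "image.jpg")
with open(path, "wb") as f:
    f.write(b"\x00" * 1024)

stats = media_stats([path, os.path.join(tmpdir, "missing.jpg")])
# stats == {"media_bytes": 1024, "media_count": 2}
```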
Returns a list of the last few samples in the dataset.
If fewer than num_samples samples are in the dataset, only the available samples are returned.
Parameters | |
num_samples | the number of samples |
Returns | |
a list of fiftyone.core.sample.Sample objects |
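The tail semantics match Python's negative slicing; a trivial sketch (not the actual implementation):

```python
def tail(samples, num_samples=3):
    """Return the last ``num_samples`` items, or all of them if fewer exist."""
    return list(samples[-num_samples:])

tail(list(range(10)))  # [7, 8, 9]
tail([1, 2], 5)        # [1, 2]  (fewer than requested)
```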
Updates the editable information for the saved view with the given name.
Examples:
import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

view = dataset.limit(10)
dataset.save_view("test", view)

# Update the saved view's name and add a description
info = dict(
    name="a new name",
    description="a description",
)
dataset.update_saved_view_info("test", info)
Parameters | |
name | the name of a saved view |
info | a dict whose keys are a subset of the keys returned by
get_saved_view_info |
Updates the summary field based on the current values of its source field.
Parameters | |
field_name | the summary field |
Updates the editable information for the saved workspace with the given name.
Examples:
import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

workspace = fo.Space()
dataset.save_workspace("test", workspace)

# Update the workspace's name and add a description and color
info = dict(
    name="a new name",
    color="#FF6D04",
    description="a description",
)
dataset.update_workspace_info("test", info)
Parameters | |
name | the name of a saved workspace |
info | a dict whose keys are a subset of the keys returned by
get_workspace_info |
Returns a fiftyone.core.view.DatasetView
containing the
entire dataset.
Returns | |
a fiftyone.core.view.DatasetView |
The current group slice of the dataset, or None if the dataset is not grouped.
Examples:
import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart-groups")

print(dataset.group_slices)
# ['left', 'right', 'pcd']

print(dataset.group_slice)
# left

# Change the current group slice
dataset.group_slice = "right"

print(dataset.group_slice)
# right
A fiftyone.core.odm.dataset.DatasetAppConfig
that
customizes how this dataset is visualized in the
:ref:`FiftyOne App <fiftyone-app>`.
Examples:
import fiftyone as fo
import fiftyone.utils.image as foui
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

# View the dataset's current App config
print(dataset.app_config)

# Generate some thumbnail images
foui.transform_images(
    dataset,
    size=(-1, 32),
    output_field="thumbnail_path",
    output_dir="/tmp/thumbnails",
)

# Modify the dataset's App config
dataset.app_config.media_fields = ["filepath", "thumbnail_path"]
dataset.app_config.grid_media_field = "thumbnail_path"
dataset.save()  # must save after edits

session = fo.launch_app(dataset)
A dict mapping field names to list of class label strings for the corresponding fields of the dataset.
Examples:
import fiftyone as fo

dataset = fo.Dataset()

# Set classes for the `ground_truth` and `predictions` fields
dataset.classes = {
    "ground_truth": ["cat", "dog"],
    "predictions": ["cat", "dog", "other"],
}

# Edit an existing classes list
dataset.classes["ground_truth"].append("other")
dataset.save()  # must save after edits
A list of class label strings for all
fiftyone.core.labels.Label
fields of this dataset that do not
have customized classes defined in classes
.
Examples:
import fiftyone as fo

dataset = fo.Dataset()

# Set default classes
dataset.default_classes = ["cat", "dog"]

# Edit the default classes
dataset.default_classes.append("rabbit")
dataset.save()  # must save after edits
The default group slice of the dataset, or None if the dataset is not grouped.
Examples:
import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart-groups")

print(dataset.default_group_slice)
# left

# Change the default group slice
dataset.default_group_slice = "right"

print(dataset.default_group_slice)
# right
A dict defining a default mapping between pixel values (2D masks) or
RGB hex strings (3D masks) and label strings for the segmentation masks
of all fiftyone.core.labels.Segmentation
fields of this
dataset that do not have customized mask targets defined in
mask_targets
.
Examples:
import fiftyone as fo

#
# 2D masks
#

dataset = fo.Dataset()

# Set default mask targets
dataset.default_mask_targets = {1: "cat", 2: "dog"}

# Or, for RGB mask targets
dataset.default_mask_targets = {
    "#3f0a44": "road",
    "#eeffee": "building",
    "#ffffff": "other",
}

# Edit the default mask targets
dataset.default_mask_targets[255] = "other"
dataset.save()  # must save after edits

#
# 3D masks
#

dataset = fo.Dataset()

# Set default mask targets
dataset.default_mask_targets = {"#499CEF": "cat", "#6D04FF": "dog"}

# Edit the default mask targets
dataset.default_mask_targets["#FF6D04"] = "person"
dataset.save()  # must save after edits
A default fiftyone.core.odm.dataset.KeypointSkeleton
defining the semantic labels and point connectivity for all
fiftyone.core.labels.Keypoint
fields of this dataset that do
not have customized skeletons defined in skeleton
.
Examples:
import fiftyone as fo

dataset = fo.Dataset()

# Set default keypoint skeleton
dataset.default_skeleton = fo.KeypointSkeleton(
    labels=[
        "left hand", "left shoulder", "right shoulder", "right hand",
        "left eye", "right eye", "mouth",
    ],
    edges=[[0, 1, 2, 3], [4, 5, 6]],
)

# Edit the default skeleton
dataset.default_skeleton.labels[-1] = "lips"
dataset.save()  # must save after edits
A string description on the dataset.
Examples:
import fiftyone as fo

dataset = fo.Dataset()

# Store a description on the dataset
dataset.description = "Your description here"
The group field of the dataset, or None if the dataset is not grouped.
Examples:
import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart-groups")

print(dataset.group_field)
# group
A dict mapping group slices to media types, or None if the dataset is not grouped.
Examples:
import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart-groups")

print(dataset.group_media_types)
# {'left': 'image', 'right': 'image', 'pcd': 'point-cloud'}
The list of group slices of the dataset, or None if the dataset is not grouped.
Examples:
import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart-groups")

print(dataset.group_slices)
# ['left', 'right', 'pcd']
A user-facing dictionary of information about the dataset.
Examples:
import fiftyone as fo

dataset = fo.Dataset()

# Store a class list in the dataset's info
dataset.info = {"classes": ["cat", "dog"]}

# Edit the info
dataset.info["other_classes"] = ["bird", "plane"]
dataset.save()  # must save after edits
A dict mapping field names to mask target dicts, each of which defines a mapping between pixel values (2D masks) or RGB hex strings (3D masks) and label strings for the segmentation masks in the corresponding field of the dataset.
Examples:
import fiftyone as fo

#
# 2D masks
#

dataset = fo.Dataset()

# Set mask targets for the `ground_truth` and `predictions` fields
dataset.mask_targets = {
    "ground_truth": {1: "cat", 2: "dog"},
    "predictions": {1: "cat", 2: "dog", 255: "other"},
}

# Or, for RGB mask targets
dataset.mask_targets = {
    "segmentations": {
        "#3f0a44": "road",
        "#eeffee": "building",
        "#ffffff": "other",
    }
}

# Edit an existing mask target
dataset.mask_targets["ground_truth"][255] = "other"
dataset.save()  # must save after edits

#
# 3D masks
#

dataset = fo.Dataset()

# Set mask targets for the `ground_truth` and `predictions` fields
dataset.mask_targets = {
    "ground_truth": {"#499CEF": "cat", "#6D04FF": "dog"},
    "predictions": {
        "#499CEF": "cat",
        "#6D04FF": "dog",
        "#FF6D04": "person",
    },
}

# Edit an existing mask target
dataset.mask_targets["ground_truth"]["#FF6D04"] = "person"
dataset.save()  # must save after edits
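Whether a mask-targets dict describes 2D or 3D masks can be inferred from its key types (integer pixel values vs "#RRGGBB" hex strings). An illustrative validator, not a FiftyOne API:

```python
import re

_HEX = re.compile(r"^#[0-9a-fA-F]{6}$")

def mask_targets_kind(targets):
    """Classify a mask-targets dict as '2d' (integer pixel-value keys) or
    '3d' (RGB hex string keys); raise on mixed or invalid keys."""
    keys = list(targets)
    if all(isinstance(k, int) for k in keys):
        return "2d"
    if all(isinstance(k, str) and _HEX.match(k) for k in keys):
        return "3d"
    raise ValueError("mask target keys must be all ints or all RGB hex strings")

mask_targets_kind({1: "cat", 2: "dog"})                  # '2d'
mask_targets_kind({"#499CEF": "cat", "#6D04FF": "dog"})  # '3d'
```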
A dict mapping field names to
fiftyone.core.odm.dataset.KeypointSkeleton
instances, each of
which defines the semantic labels and point connectivity for the
fiftyone.core.labels.Keypoint
instances in the corresponding
field of the dataset.
Examples:
import fiftyone as fo

dataset = fo.Dataset()

# Set keypoint skeleton for the `ground_truth` field
dataset.skeletons = {
    "ground_truth": fo.KeypointSkeleton(
        labels=[
            "left hand", "left shoulder", "right shoulder", "right hand",
            "left eye", "right eye", "mouth",
        ],
        edges=[[0, 1, 2, 3], [4, 5, 6]],
    )
}

# Edit an existing skeleton
dataset.skeletons["ground_truth"].labels[-1] = "lips"
dataset.save()  # must save after edits
A list of tags on the dataset.
Examples:
import fiftyone as fo

dataset = fo.Dataset()

# Add some tags
dataset.tags = ["test", "projectA"]

# Edit the tags
dataset.tags.pop()
dataset.tags.append("projectB")
dataset.save()  # must save after edits
Returns a fiftyone.core.view.DatasetView
containing the
contents of the collection with the given
fiftyone.core.stages.ViewStage appended to its aggregation
pipeline.
Subclasses are responsible for performing any validation on the view stage to ensure that it is a valid stage to add to this collection.
Parameters | |
stage | a fiftyone.core.stages.ViewStage |
Returns | |
a fiftyone.core.view.DatasetView |
Runs the MongoDB aggregation pipeline on the collection and returns the result.
Parameters | |
pipeline:None | a MongoDB aggregation pipeline (list of dicts) to append to the current pipeline |
media_type:None | the media type of the collection, if different than the source dataset's media type |
attach_frames:False | whether to attach the frame documents immediately prior to executing pipeline. Only applicable to datasets that contain videos |
detach_frames:False | whether to detach the frame documents at the end of the pipeline. Only applicable to datasets that contain videos |
frames_only:False | whether to generate a pipeline that contains only the frames in the collection |
support:None | an optional [first, last] range of frames to attach. Only applicable when attaching frames |
group_slice:None | the current group slice of the collection, if different than the source dataset's group slice. Only applicable for grouped collections |
group_slices:None | an optional list of group slices to attach when groups_only is True |
detach_groups:False | whether to detach the group documents at the end of the pipeline. Only applicable to grouped collections |
groups_only:False | whether to generate a pipeline that contains only the flattened group documents for the collection |
manual_group_select:False | whether the pipeline has manually handled the initial group selection. Only applicable to grouped collections |
post_pipeline:None | a MongoDB aggregation pipeline (list of dicts) to append to the very end of the pipeline, after all other arguments are applied |
Returns | |
the aggregation result dict |
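The pipeline and post_pipeline arguments compose positionally: the caller's stages are appended after the collection's base stages, and post stages always go last. A minimal sketch of that ordering (illustrative only, not FiftyOne's internals):

```python
def build_pipeline(base, pipeline=None, post_pipeline=None):
    """Compose a MongoDB-style aggregation pipeline: base stages first,
    then the caller's stages, then any post stages at the very end."""
    stages = list(base)
    if pipeline is not None:
        stages += pipeline
    if post_pipeline is not None:
        stages += post_pipeline
    return stages

base = [{"$match": {"_dataset_id": 1}}]
stages = build_pipeline(
    base,
    pipeline=[{"$match": {"label": "cat"}}],
    post_pipeline=[{"$limit": 10}],
)
# [{'$match': {'_dataset_id': 1}}, {'$match': {'label': 'cat'}}, {'$limit': 10}]
```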
A pipeline that attaches the requested group slice(s) for each document and stores them under groups.<slice> keys.
A pipeline that looks up the requested group slices for each document and returns (only) the unwound group slices.
Undocumented
Undocumented
Undocumented
Returns the MongoDB aggregation pipeline for the collection.
Parameters | |
pipeline:None | a MongoDB aggregation pipeline (list of dicts) to append to the current pipeline |
media_type:None | the media type of the collection, if different than the source dataset's media type |
attach_frames:False | whether to attach the frame documents immediately prior to executing pipeline. Only applicable to datasets that contain videos |
detach_frames:False | whether to detach the frame documents at the end of the pipeline. Only applicable to datasets that contain videos |
frames_only:False | whether to generate a pipeline that contains only the frames in the collection |
support:None | an optional [first, last] range of frames to attach. Only applicable when attaching frames |
group_slice:None | the current group slice of the collection, if different than the source dataset's group slice. Only applicable for grouped collections |
group_slices:None | an optional list of group slices to attach when groups_only is True |
detach_groups:False | whether to detach the group documents at the end of the pipeline. Only applicable to grouped collections |
groups_only:False | whether to generate a pipeline that contains only the flattened group documents for the collection |
manual_group_select:False | whether the pipeline has manually handled the initial group selection. Only applicable to grouped collections |
post_pipeline:None | a MongoDB aggregation pipeline (list of dicts) to append to the very end of the pipeline, after all other arguments are applied |
Returns | |
the aggregation pipeline |
Undocumented
Undocumented
The root fiftyone.core.dataset.Dataset
from which this
collection is derived.
This is typically the same as _dataset
but may differ in cases
such as patches views.