fiftyone.utils.sam2.SegmentAnything2VideoModel

class documentation

class SegmentAnything2VideoModel(fom.SamplesMixin, fom.Model):

Constructor: SegmentAnything2VideoModel(config)

Wrapper for running Segment Anything 2 inference on videos.

Video prompt example:

import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("quickstart-video", max_samples=2)

# Only retain detections in the first frame
(
    dataset
    .match_frames(F("frame_number") > 1)
    .set_field("frames.detections", None)
    .save()
)

model = foz.load_zoo_model("segment-anything-2-hiera-tiny-video-torch")

# Segment inside boxes and propagate to all frames
dataset.apply_model(
    model,
    label_field="segmentations",
    prompt_field="frames.detections",  # can contain Detections or Keypoints
)

session = fo.launch_app(dataset)
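
The filtering step in the example keeps prompts only on frame 1 (FiftyOne frame numbers start at 1), so the model has to propagate masks to every later frame. The effect of that step can be sketched in plain Python. This is only an illustration: `keep_first_frame_prompts` and the dict layout below are hypothetical stand-ins, not FiftyOne data structures or API calls.

```python
def keep_first_frame_prompts(frames, field="detections"):
    """Return a copy of `frames` with `field` cleared on every frame after the first.

    `frames` is a hypothetical stand-in for a video sample's frames: a dict
    mapping 1-based frame numbers to dicts of field values.
    """
    cleared = {}
    for frame_number, fields in frames.items():
        fields = dict(fields)  # shallow copy so the input is left untouched
        if frame_number > 1:
            fields[field] = None  # mirrors set_field("frames.detections", None)
        cleared[frame_number] = fields
    return cleared


# Placeholder strings stand in for box prompts
frames = {
    1: {"detections": ["box_a", "box_b"]},
    2: {"detections": ["box_a"]},
    3: {"detections": []},
}
prompts = keep_first_frame_prompts(frames)
```

After this step only frame 1 still carries prompts, which matches the state of the dataset just before `apply_model` is called in the example.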

Parameters
config	a `SegmentAnything2VideoModelConfig`

Method	`__init__`	Undocumented
Method	`predict`	Performs prediction on the given data.
Instance Variable	`config`	Undocumented
Instance Variable	`ctx`	Undocumented
Instance Variable	`model`	Undocumented
Property	`media_type`	The media type processed by the model.
Method	`_download_model`	Undocumented
Method	`_forward_pass`	Undocumented
Method	`_forward_pass_boxes`	Undocumented
Method	`_forward_pass_points`	Undocumented
Method	`_get_field`	Undocumented
Method	`_get_prompt_type`	Undocumented
Method	`_get_prompts`	Undocumented
Method	`_load_model`	Undocumented
Instance Variable	`_curr_classes`	Undocumented
Instance Variable	`_curr_frame_height`	Undocumented
Instance Variable	`_curr_frame_width`	Undocumented
Instance Variable	`_curr_prompt_type`	Undocumented
Instance Variable	`_curr_prompts`	Undocumented
Instance Variable	`_device`	Undocumented
Instance Variable	`_fields`	Undocumented

Inherited from SamplesMixin:

Method	`needs_fields.setter`	Undocumented
Method	`predict_all`	Performs prediction on the given iterable of data.
Property	`needs_fields`	A dict mapping model-specific keys to sample field names.

Inherited from Model (via SamplesMixin):

Method	`__enter__`	Undocumented
Method	`__exit__`	Undocumented
Method	`preprocess.setter`	Undocumented
Property	`can_embed_prompts`	Whether this instance can generate prompt embeddings.
Property	`has_embeddings`	Whether this instance can generate embeddings.
Property	`has_logits`	Whether this instance can generate logits for its predictions.
Property	`preprocess`	Whether to apply `transforms` during inference (True) or to assume that they have already been applied (False).
Property	`ragged_batches`	Whether `transforms` may return tensors of different sizes. If True, passing ragged lists of data to `predict_all` is not allowed.
Property	`transforms`	The preprocessing function that will/must be applied to each input before prediction, or `None` if no preprocessing is performed.

def __init__(self, config):

overrides fiftyone.core.models.SamplesMixin.__init__

Undocumented

def predict(self, video_reader, sample):

overrides fiftyone.core.models.SamplesMixin.predict

Performs prediction on the given data.

Image models should support, at minimum, processing arg values that are uint8 numpy arrays (HWC).

Video models should support, at minimum, processing arg values that are eta.core.video.VideoReader instances.

Parameters
video_reader	the data, an `eta.core.video.VideoReader` for the video to process
sample:`None`	the `fiftyone.core.sample.Sample` associated with the data
Returns
a `fiftyone.core.labels.Label` instance or dict of `fiftyone.core.labels.Label` instances containing the predictions
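
Because the documented return value may be either a single label or a dict of labels, calling code that handles both shapes may want to normalize it first. A minimal sketch, assuming a hypothetical helper `as_label_dict` and a hypothetical default key `"predictions"` (neither is part of FiftyOne); plain strings stand in for `Label` instances:

```python
def as_label_dict(result, default_key="predictions"):
    """Normalize a prediction result to a dict.

    Hypothetical helper, not a FiftyOne function. A dict is copied as-is,
    a single label is wrapped under `default_key`, and None becomes {}.
    """
    if result is None:
        return {}
    if isinstance(result, dict):
        return dict(result)
    return {default_key: result}


single = as_label_dict("mask_label")             # one label
several = as_label_dict({1: "mask_a", 2: "mask_b"})  # dict of labels
empty = as_label_dict(None)                      # no predictions
```

Wrapping the single-label case keeps downstream loops uniform: they can always iterate over `.items()` regardless of which shape was returned.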