class documentation
class TorchImageModelConfig(foc.Config): (source)
Known subclasses: fiftyone.utils.clip.zoo.TorchCLIPModelConfig, fiftyone.utils.open_clip.TorchOpenClipModelConfig, fiftyone.utils.sam.SegmentAnythingModelConfig, fiftyone.utils.sam2.SegmentAnything2ImageModelConfig, fiftyone.utils.sam2.SegmentAnything2VideoModelConfig, fiftyone.utils.super_gradients.TorchYoloNasModelConfig, fiftyone.zoo.models.torch.TorchvisionImageModelConfig
Constructor: TorchImageModelConfig(d)
Configuration for running a TorchImageModel.
Models are represented by this class via the following three components:
Model:

    # Directly specify a model
    model

    # Load model from an entrypoint
    model = entrypoint_fcn(**entrypoint_args)

Transforms:

    # Directly provide transforms
    transforms

    # Load transforms from a function
    transforms = transforms_fcn(**transforms_args)

    # Use the `image_XXX` parameters defined below to build a transform
    transforms = build_transforms(image_XXX, ...)

OutputProcessor:

    # Directly provide an OutputProcessor
    output_processor

    # Load an OutputProcessor from a function
    output_processor = output_processor_cls(**output_processor_args)
Given these components, inference happens as follows:
    def predict_all(imgs):
        imgs = [transforms(img) for img in imgs]

        if not raw_inputs:
            imgs = torch.stack(imgs)

        output = model(imgs)
        return output_processor(output, ...)
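For example, here is a minimal sketch of wiring all three components together via a config dict; it reuses the inception_v3 entrypoint and transforms strings from the parameter descriptions below, and assumes the config is run via fiftyone.utils.torch.TorchImageModel:

    import fiftyone.utils.torch as fout

    # All three components are specified as importable strings, so the
    # model, transforms, and output processor are loaded lazily
    config = fout.TorchImageModelConfig(
        {
            # Model: loaded via an entrypoint function
            "entrypoint_fcn": "torchvision.models.inception_v3",
            "entrypoint_args": {"weights": "DEFAULT"},
            # Transforms: loaded from a function that returns a transform
            "transforms_fcn": "torchvision.models.Inception_V3_Weights.DEFAULT.transforms",
            # OutputProcessor: constructed as output_processor_cls(classes=classes, ...)
            "output_processor_cls": "fiftyone.utils.torch.ClassifierOutputProcessor",
            # A real config would also supply `classes`, `labels_string`, or
            # `labels_path` so that predictions can be labeled
        }
    )

    model = fout.TorchImageModel(config)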
Parameters | |
model | a torch.nn.Module instance to use |
entrypoint_fcn | a function or string like "torchvision.models.inception_v3" specifying the entrypoint function that loads the model |
entrypoint_args | a dictionary of arguments for entrypoint_fcn |
transforms | a preprocessing transform to apply |
transforms_fcn | a function or string like "torchvision.models.Inception_V3_Weights.DEFAULT.transforms" specifying a function that returns a preprocessing transform to apply |
transforms_args | a dictionary of arguments for transforms_fcn |
ragged_batches | whether the provided transforms or transforms_fcn may return tensors of different sizes. This must be set to False to enable batch inference, if it is desired |
raw_inputs | whether to feed the raw list of images to the model rather than stacking them as a Torch tensor |
output_processor | an OutputProcessor instance to use |
output_processor_cls | a class or string like "fiftyone.utils.torch.ClassifierOutputProcessor" specifying the OutputProcessor to use |
output_processor_args | a dictionary of arguments for output_processor_cls(classes=classes, **kwargs) |
confidence_thresh | an optional confidence threshold to apply to any applicable predictions generated by the model |
classes | a list of class names for the model, if applicable |
labels_string | a comma-separated list of the class names for the model, if applicable |
labels_path | the path to the labels map for the model, if applicable |
mask_targets | a mask targets dict for the model, if applicable |
mask_targets_path | the path to a mask targets map for the model, if applicable |
skeleton | a keypoint skeleton dict for the model, if applicable |
image_min_size | resize the input images during preprocessing, if necessary, so that the image dimensions are at least this (width, height) |
image_min_dim | resize input images during preprocessing, if necessary, so that the smaller image dimension is at least this value |
image_max_size | resize the input images during preprocessing, if necessary, so that the image dimensions are at most this (width, height) |
image_max_dim | resize input images during preprocessing, if necessary, so that the largest image dimension is at most this value |
image_size | a (width, height) to which to resize the input images during preprocessing |
image_dim | resize the smaller input dimension to this value during preprocessing |
image_patch_size | crop the input images during preprocessing, if necessary, so that the image dimensions are a multiple of this patch size |
image_mean | a 3-array of mean values in [0, 1] for preprocessing the input images |
image_std | a 3-array of std values in [0, 1] for preprocessing the input images |
embeddings_layer | the name of a layer whose output to expose as embeddings. Prepend "<" to save the input tensor instead |
as_feature_extractor | whether to operate the model as a feature extractor. If embeddings_layer is provided, this layer is passed to torchvision's create_feature_extractor() function. If no embeddings_layer is provided, the model's output is used as-is for feature extraction |
use_half_precision | whether to use half precision (only supported when using GPU) |
cudnn_benchmark | a value to use for torch.backends.cudnn.benchmark while the model is running |
device | a string specifying the device to use, e.g. "cuda:0", "mps", or "cpu". By default, CUDA is used if available, else CPU is used |
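As an illustration of the embeddings-related parameters, here is a hedged sketch of a config that operates a model as a feature extractor; the "avgpool" node name and the normalization constants are illustrative values for a ResNet-50, and layer names can be verified with torchvision's get_graph_node_names():

    import fiftyone.utils.torch as fout

    config = fout.TorchImageModelConfig(
        {
            "entrypoint_fcn": "torchvision.models.resnet50",
            "entrypoint_args": {"weights": "DEFAULT"},
            # Build transforms from the image_XXX parameters above
            "image_size": [224, 224],
            "image_mean": [0.485, 0.456, 0.406],  # standard ImageNet stats
            "image_std": [0.229, 0.224, 0.225],
            # Pass this layer to torchvision's create_feature_extractor()
            "embeddings_layer": "avgpool",
            "as_feature_extractor": True,
        }
    )

    model = fout.TorchImageModel(config)
    # embeddings = dataset.compute_embeddings(model)  # for example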
Method | __init__ | Undocumented |
Instance Variable | as_feature_extractor | Undocumented |
Instance Variable | classes | Undocumented |
Instance Variable | confidence_thresh | Undocumented |
Instance Variable | cudnn_benchmark | Undocumented |
Instance Variable | device | Undocumented |
Instance Variable | embeddings_layer | Undocumented |
Instance Variable | entrypoint_args | Undocumented |
Instance Variable | entrypoint_fcn | Undocumented |
Instance Variable | image_dim | Undocumented |
Instance Variable | image_max_dim | Undocumented |
Instance Variable | image_max_size | Undocumented |
Instance Variable | image_mean | Undocumented |
Instance Variable | image_min_dim | Undocumented |
Instance Variable | image_min_size | Undocumented |
Instance Variable | image_patch_size | Undocumented |
Instance Variable | image_size | Undocumented |
Instance Variable | image_std | Undocumented |
Instance Variable | labels_path | Undocumented |
Instance Variable | labels_string | Undocumented |
Instance Variable | mask_targets | Undocumented |
Instance Variable | mask_targets_path | Undocumented |
Instance Variable | model | Undocumented |
Instance Variable | output_processor | Undocumented |
Instance Variable | output_processor_args | Undocumented |
Instance Variable | output_processor_cls | Undocumented |
Instance Variable | ragged_batches | Undocumented |
Instance Variable | raw_inputs | Undocumented |
Instance Variable | skeleton | Undocumented |
Instance Variable | transforms | Undocumented |
Instance Variable | transforms_args | Undocumented |
Instance Variable | transforms_fcn | Undocumented |
Instance Variable | use_half_precision | Undocumented |
Inherited from Config:
Method | __repr__ | Undocumented |