Encoders#

PyTorch Encoders#

AlexNet#

class fusionlab.encoders.AlexNet(cin=3, spatial_dims=2)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Return type: Tensor
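The note above can be illustrated with a small pure-Python sketch. `TinyModule` is a hypothetical, heavily simplified stand-in for `torch.nn.Module`, not the real implementation; it only shows why calling the instance runs forward hooks while calling `forward` directly skips them.

```python
class TinyModule:
    """Hypothetical, simplified stand-in for torch.nn.Module (illustration only)."""

    def __init__(self):
        self._forward_hooks = []

    def register_forward_hook(self, hook):
        # Hooks are stored and later called as hook(module, inputs, output).
        self._forward_hooks.append(hook)

    def forward(self, x):
        # The "recipe" for the computation; subclasses would override this.
        return x * 2

    def __call__(self, *args):
        # Calling the instance runs forward() and then every registered hook.
        output = self.forward(*args)
        for hook in self._forward_hooks:
            hook(self, args, output)
        return output


seen = []
module = TinyModule()
module.register_forward_hook(lambda mod, inputs, output: seen.append(output))

module(3)          # goes through __call__, so the hook fires
module.forward(3)  # bypasses __call__, so the hook is skipped
print(seen)        # [6]
```

The same pattern applies to every encoder on this page: prefer `encoder(x)` over `encoder.forward(x)`.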

VGG#

class fusionlab.encoders.VGG16(cin=3, spatial_dims=2)[source]#

forward(x)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

class fusionlab.encoders.VGG19(cin=3, spatial_dims=2)[source]#

forward(x)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

InceptionNet#

class fusionlab.encoders.InceptionNetV1(cin=3, spatial_dims=2)[source]#

forward(x)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

ResNet#

class fusionlab.encoders.ResNet(block, layers, zero_init_residual=False, groups=1, width_per_group=64, replace_stride_with_dilation=None, norm_layer=None, cin=3, spatial_dims=2)[source]#

forward(x)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Return type: Tensor

class fusionlab.encoders.ResNet18(cin=3, spatial_dims=2)[source]#

class fusionlab.encoders.ResNet34(cin=3, spatial_dims=2)[source]#

class fusionlab.encoders.ResNet50(cin=3, spatial_dims=2)[source]#

class fusionlab.encoders.ResNet101(cin=3, spatial_dims=2)[source]#

class fusionlab.encoders.ResNet152(cin=3, spatial_dims=2)[source]#
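As a rough sketch of how the `block` and `layers` arguments relate to the variant names, the snippet below recomputes the nominal depth of each standard ResNet. The stage layouts are those of the original ResNet paper and torchvision; that fusionlab's `ResNet18` to `ResNet152` use the same layouts is an assumption, and the names `STANDARD_RESNETS` and `nominal_depth` are hypothetical helpers.

```python
# Standard stage layouts (block type, blocks per stage).
# Assumption: fusionlab's variants follow these same layouts.
STANDARD_RESNETS = {
    "ResNet18": ("basic", [2, 2, 2, 2]),
    "ResNet34": ("basic", [3, 4, 6, 3]),
    "ResNet50": ("bottleneck", [3, 4, 6, 3]),
    "ResNet101": ("bottleneck", [3, 4, 23, 3]),
    "ResNet152": ("bottleneck", [3, 8, 36, 3]),
}

# A basic block holds 2 convolutions, a bottleneck block holds 3.
CONVS_PER_BLOCK = {"basic": 2, "bottleneck": 3}


def nominal_depth(block, layers):
    # Stem convolution + convolutions in all residual blocks + final classifier layer.
    return 1 + CONVS_PER_BLOCK[block] * sum(layers) + 1


depths = {
    name: nominal_depth(block, layers)
    for name, (block, layers) in STANDARD_RESNETS.items()
}
print(depths)
```

The count includes the classification layer because that is how the names were originally derived, even though an encoder may not keep that layer.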

EfficientNet#

class fusionlab.encoders.EfficientNet(inverted_residual_setting, cin=3, stochastic_depth_prob=0.2, last_channel=None, norm_layer=None, spatial_dims=2)[source]#

__init__(inverted_residual_setting, cin=3, stochastic_depth_prob=0.2, last_channel=None, norm_layer=None, spatial_dims=2)[source]#

EfficientNet V1 and V2 main class

Parameters:

inverted_residual_setting (Sequence[Union[MBConvConfig, FusedMBConvConfig]]) – Network structure
dropout (float) – The dropout probability
stochastic_depth_prob (float) – The stochastic depth probability
num_classes (int) – Number of classes
norm_layer (Optional[Callable[..., nn.Module]]) – Module specifying the normalization layer to use
last_channel (int) – The number of channels on the penultimate layer
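To make `stochastic_depth_prob` more concrete, here is a minimal sketch of the linear schedule that torchvision's EfficientNet uses to turn this single value into a per-block drop probability. Whether fusionlab applies exactly the same schedule is an assumption; `stochastic_depth_schedule` is a hypothetical helper, and the block counts are those of the standard EfficientNet-B0 layout.

```python
def stochastic_depth_schedule(num_layers_per_stage, stochastic_depth_prob=0.2):
    """Return one drop probability per block, growing linearly with depth."""
    total_blocks = sum(num_layers_per_stage)
    # Block 0 is never dropped; later blocks approach stochastic_depth_prob.
    return [
        stochastic_depth_prob * block_id / total_blocks
        for block_id in range(total_blocks)
    ]


# Blocks per stage in the standard EfficientNet-B0 configuration.
b0_layers = [1, 2, 2, 3, 3, 4, 1]
schedule = stochastic_depth_schedule(b0_layers)
print(len(schedule), schedule[0], schedule[-1])
```

With this schedule the first block always runs, and the last block's drop probability stays just below `stochastic_depth_prob`.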

forward(x)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Return type: Tensor

class fusionlab.encoders.EfficientNetB0(cin=3, spatial_dims=2)[source]#

class fusionlab.encoders.EfficientNetB1(cin=3, spatial_dims=2)[source]#

class fusionlab.encoders.EfficientNetB2(cin=3, spatial_dims=2)[source]#

class fusionlab.encoders.EfficientNetB3(cin=3, spatial_dims=2)[source]#

class fusionlab.encoders.EfficientNetB4(cin=3, spatial_dims=2)[source]#

class fusionlab.encoders.EfficientNetB5(cin=3, spatial_dims=2)[source]#

class fusionlab.encoders.EfficientNetB6(cin=3, spatial_dims=2)[source]#

class fusionlab.encoders.EfficientNetB7(cin=3, spatial_dims=2)[source]#

ConvNeXt#

class fusionlab.encoders.ConvNeXt(in_chans=3, depths=[3, 3, 9, 3], dims=[96, 192, 384, 768], drop_path_rate=0.0, layer_scale_init_value=1e-06, spatial_dims=2)[source]#

A PyTorch implementation of "A ConvNet for the 2020s": https://arxiv.org/pdf/2201.03545.pdf

Parameters:

in_chans (int) – Number of input image channels. Default: 3
num_classes (int) – Number of classes for classification head. Default: 1000
depths (tuple(int)) – Number of blocks at each stage. Default: [3, 3, 9, 3]
dims (tuple(int)) – Feature dimension at each stage. Default: [96, 192, 384, 768]
drop_path_rate (float) – Stochastic depth rate. Default: 0.
layer_scale_init_value (float) – Init value for Layer Scale. Default: 1e-6.
head_init_scale (float) – Init scaling value for classifier weights and biases. Default: 1.
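The `dims` parameter can be read as the channel count of each stage's feature map. The sketch below lists the feature shapes a 2D ConvNeXt would produce, assuming the reference design (a stem that downsamples by 4, then a 2x downsampling before each later stage) and a square input whose side is divisible by 32. That fusionlab follows this layout exactly is an assumption, and `convnext_stage_shapes` is a hypothetical helper.

```python
def convnext_stage_shapes(input_size, dims=(96, 192, 384, 768)):
    """Return (channels, height, width) of the feature map after each stage."""
    shapes = []
    size = input_size // 4  # stem: patchify convolution with stride 4
    for stage, channels in enumerate(dims):
        if stage > 0:
            size //= 2  # downsampling layer with stride 2 before stages 2 to 4
        shapes.append((channels, size, size))
    return shapes


shapes = convnext_stage_shapes(224)
print(shapes)
# [(96, 56, 56), (192, 28, 28), (384, 14, 14), (768, 7, 7)]
```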

forward(x)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

class fusionlab.encoders.ConvNeXtTiny(cin=3, spatial_dims=2)[source]#

class fusionlab.encoders.ConvNeXtSmall(cin=3, spatial_dims=2)[source]#

class fusionlab.encoders.ConvNeXtBase(cin=3, spatial_dims=2)[source]#

class fusionlab.encoders.ConvNeXtLarge(cin=3, spatial_dims=2)[source]#

class fusionlab.encoders.ConvNeXtXLarge(cin=3, spatial_dims=2)[source]#

Vision Transformer#

class fusionlab.encoders.ViT(in_channels, img_size, patch_size, hidden_size=768, mlp_dim=3072, num_layers=12, num_heads=12, pos_embed='conv', dropout_rate=0.0, spatial_dims=2, qkv_bias=False, save_attn=False)[source]#

Vision Transformer (ViT), based on Dosovitskiy et al., "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale" (https://arxiv.org/abs/2010.11929).

ViT supports TorchScript, but only with PyTorch versions after 1.8.

source code: Project-MONAI/MONAI

__init__(in_channels, img_size, patch_size, hidden_size=768, mlp_dim=3072, num_layers=12, num_heads=12, pos_embed='conv', dropout_rate=0.0, spatial_dims=2, qkv_bias=False, save_attn=False)[source]#

Parameters:

in_channels (int) – dimension of input channels.
img_size (Union[Sequence[int], int]) – dimension of input image.
patch_size (Union[Sequence[int], int]) – dimension of patch size.
hidden_size (int, optional) – dimension of hidden layer. Defaults to 768.
mlp_dim (int, optional) – dimension of feedforward layer. Defaults to 3072.
num_layers (int, optional) – number of transformer blocks. Defaults to 12.
num_heads (int, optional) – number of attention heads. Defaults to 12.
pos_embed (str, optional) – position embedding layer type. Defaults to “conv”.
num_classes (int, optional) – number of classes if classification is used. Defaults to 2.
dropout_rate (float, optional) – fraction of the input units to drop. Defaults to 0.0.
spatial_dims (int, optional) – number of spatial dimensions. Defaults to 2.
qkv_bias (bool, optional) – apply bias to the qkv linear layer in self attention block. Defaults to False.
save_attn (bool, optional) – to make accessible the attention in self attention block. Defaults to False.
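The parameters above interact in two simple ways: `img_size` and `patch_size` determine how many patch tokens the transformer sees, and `hidden_size` is split evenly across `num_heads`. The following sketch works through that arithmetic. `vit_token_layout` is a hypothetical helper rather than part of fusionlab, and it does not count any extra class token.

```python
from math import prod


def vit_token_layout(img_size, patch_size, hidden_size=768, num_heads=12, spatial_dims=2):
    """Return (number of patch tokens, channels per attention head)."""
    # Broadcast a single int to one value per spatial dimension.
    if isinstance(img_size, int):
        img_size = (img_size,) * spatial_dims
    if isinstance(patch_size, int):
        patch_size = (patch_size,) * spatial_dims

    for size, patch in zip(img_size, patch_size):
        if size % patch != 0:
            raise ValueError("img_size must be divisible by patch_size")
    if hidden_size % num_heads != 0:
        raise ValueError("hidden_size must be divisible by num_heads")

    num_patches = prod(size // patch for size, patch in zip(img_size, patch_size))
    return num_patches, hidden_size // num_heads


tokens_2d = vit_token_layout(224, 16)
tokens_3d = vit_token_layout((96, 96, 96), 16, spatial_dims=3)
print(tokens_2d)  # (196, 64)
print(tokens_3d)  # (216, 64)
```

A 224x224 image with 16x16 patches therefore yields a 14x14 grid of 196 tokens, each attended to by 12 heads of width 64.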

forward(x, return_features=False)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

fusionlab.encoders.VisionTransformer#: alias of ViT

Mix Transformer#

class fusionlab.encoders.MiT(in_channels=3, embed_dims=[32, 64, 160, 256], depths=[2, 2, 2, 2])[source]#

Mix Transformer

source code: sithu31296/semantic-segmentation

forward(x, return_features=False)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Return type: Tensor

class fusionlab.encoders.MiTB0(in_channels=3)[source]#

class fusionlab.encoders.MiTB1(in_channels=3)[source]#

class fusionlab.encoders.MiTB2(in_channels=3)[source]#

class fusionlab.encoders.MiTB3(in_channels=3)[source]#

class fusionlab.encoders.MiTB4(in_channels=3)[source]#

class fusionlab.encoders.MiTB5(in_channels=3)[source]#