Encoders#

PyTorch Encoders#

AlexNet#

class fusionlab.encoders.AlexNet(cin=3, spatial_dims=2)[source]#
forward(x)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Return type:

Tensor
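
The note above applies to every encoder on this page. The following sketch illustrates the difference between calling the instance and calling forward directly. It uses a simplified, pure-Python stand-in for torch.nn.Module (TinyModule and Doubler are hypothetical names, not part of fusionlab or PyTorch), so it shows the mechanism only and is not PyTorch's actual implementation.

```python
# Simplified stand-in for torch.nn.Module, for illustration only.
# TinyModule and Doubler are hypothetical; real PyTorch hooks have more options.
class TinyModule:
    def __init__(self):
        self._forward_hooks = []

    def register_forward_hook(self, hook):
        # Store a callable that receives (module, input, output).
        self._forward_hooks.append(hook)

    def forward(self, x):
        raise NotImplementedError

    def __call__(self, x):
        # Calling the instance runs forward() and then every registered hook.
        out = self.forward(x)
        for hook in self._forward_hooks:
            result = hook(self, x, out)
            if result is not None:
                out = result
        return out


class Doubler(TinyModule):
    def forward(self, x):
        return 2 * x


calls = []
encoder = Doubler()
encoder.register_forward_hook(lambda module, inp, out: calls.append(out))

via_call = encoder(3)             # runs forward, then the hook
via_forward = encoder.forward(3)  # runs forward only; the hook is skipped
```

Both calls return the same value, but only the first one triggers the hook, which is why encoder(x) is preferred over encoder.forward(x).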

VGG#

class fusionlab.encoders.VGG16(cin=3, spatial_dims=2)[source]#
forward(x)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class fusionlab.encoders.VGG19(cin=3, spatial_dims=2)[source]#
forward(x)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

InceptionNet#

class fusionlab.encoders.InceptionNetV1(cin=3, spatial_dims=2)[source]#
forward(x)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

ResNet#

class fusionlab.encoders.ResNet(block, layers, zero_init_residual=False, groups=1, width_per_group=64, replace_stride_with_dilation=None, norm_layer=None, cin=3, spatial_dims=2)[source]#
forward(x)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Return type:

Tensor

class fusionlab.encoders.ResNet18(cin=3, spatial_dims=2)[source]#
class fusionlab.encoders.ResNet34(cin=3, spatial_dims=2)[source]#
class fusionlab.encoders.ResNet50(cin=3, spatial_dims=2)[source]#
class fusionlab.encoders.ResNet101(cin=3, spatial_dims=2)[source]#
class fusionlab.encoders.ResNet152(cin=3, spatial_dims=2)[source]#
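
As a rough guide to the block and layers arguments, the sketch below relates the number in each variant's name to a per-stage block count. It assumes the standard configurations from the original ResNet paper (for example, [3, 4, 6, 3] bottleneck blocks for ResNet50); these values are not taken from fusionlab's source, and the "basic" and "bottleneck" keys are placeholder labels rather than fusionlab class names.

```python
# Convolutions per residual block type (assumed, following the ResNet paper).
CONVS_PER_BLOCK = {"basic": 2, "bottleneck": 3}

# Assumed standard per-stage block counts; not read from fusionlab's source.
STANDARD_CONFIGS = {
    "ResNet18": ("basic", [2, 2, 2, 2]),
    "ResNet34": ("basic", [3, 4, 6, 3]),
    "ResNet50": ("bottleneck", [3, 4, 6, 3]),
    "ResNet101": ("bottleneck", [3, 4, 23, 3]),
    "ResNet152": ("bottleneck", [3, 8, 36, 3]),
}


def nominal_depth(block, layers):
    # Stem convolution + convolutions inside residual blocks + classifier layer.
    # The classifier is counted by naming convention, even if an encoder omits it.
    return 1 + CONVS_PER_BLOCK[block] * sum(layers) + 1


depths = {
    name: nominal_depth(block, layers)
    for name, (block, layers) in STANDARD_CONFIGS.items()
}
```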

EfficientNet#

class fusionlab.encoders.EfficientNet(inverted_residual_setting, cin=3, stochastic_depth_prob=0.2, last_channel=None, norm_layer=None, spatial_dims=2)[source]#
__init__(inverted_residual_setting, cin=3, stochastic_depth_prob=0.2, last_channel=None, norm_layer=None, spatial_dims=2)[source]#

EfficientNet V1 and V2 main class

Parameters:
  • inverted_residual_setting (Sequence[Union[MBConvConfig, FusedMBConvConfig]]) – Network structure

  • dropout (float) – The dropout probability

  • stochastic_depth_prob (float) – The stochastic depth probability

  • num_classes (int) – Number of classes

  • norm_layer (Optional[Callable[..., nn.Module]]) – Module specifying the normalization layer to use

  • last_channel (int) – The number of channels on the penultimate layer
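
To make stochastic_depth_prob more concrete, here is a minimal sketch of how a single value can be spread across blocks. It assumes the linear schedule used in torchvision's EfficientNet, where the drop probability grows with the block index; whether fusionlab applies exactly this schedule is an assumption. The helper name stochastic_depth_schedule and the per-stage block counts are illustrative.

```python
def stochastic_depth_schedule(blocks_per_stage, stochastic_depth_prob=0.2):
    """Return per-block drop probabilities, grouped by stage.

    Assumed rule: prob = stochastic_depth_prob * block_index / total_blocks,
    so the first block is never dropped and later blocks are dropped more often.
    """
    total = sum(blocks_per_stage)
    schedule = []
    block_id = 0
    for num_blocks in blocks_per_stage:
        stage = []
        for _ in range(num_blocks):
            stage.append(stochastic_depth_prob * block_id / total)
            block_id += 1
        schedule.append(stage)
    return schedule


# Illustrative block counts for a seven-stage network (16 blocks in total).
schedule = stochastic_depth_schedule([1, 2, 2, 3, 3, 4, 1])
first_prob = schedule[0][0]
last_prob = schedule[-1][-1]
```

Under this rule the last block's probability is slightly below stochastic_depth_prob, because the index of the final block is total_blocks - 1.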

forward(x)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Return type:

Tensor

class fusionlab.encoders.EfficientNetB0(cin=3, spatial_dims=2)[source]#
class fusionlab.encoders.EfficientNetB1(cin=3, spatial_dims=2)[source]#
class fusionlab.encoders.EfficientNetB2(cin=3, spatial_dims=2)[source]#
class fusionlab.encoders.EfficientNetB3(cin=3, spatial_dims=2)[source]#
class fusionlab.encoders.EfficientNetB4(cin=3, spatial_dims=2)[source]#
class fusionlab.encoders.EfficientNetB5(cin=3, spatial_dims=2)[source]#
class fusionlab.encoders.EfficientNetB6(cin=3, spatial_dims=2)[source]#
class fusionlab.encoders.EfficientNetB7(cin=3, spatial_dims=2)[source]#

ConvNeXt#

class fusionlab.encoders.ConvNeXt(in_chans=3, depths=[3, 3, 9, 3], dims=[96, 192, 384, 768], drop_path_rate=0.0, layer_scale_init_value=1e-06, spatial_dims=2)[source]#
A PyTorch implementation of “A ConvNet for the 2020s”:

https://arxiv.org/pdf/2201.03545.pdf

Parameters:
  • in_chans (int) – Number of input image channels. Default: 3

  • num_classes (int) – Number of classes for classification head. Default: 1000

  • depths (tuple(int)) – Number of blocks at each stage. Default: [3, 3, 9, 3]

  • dims (tuple(int)) – Feature dimension at each stage. Default: [96, 192, 384, 768]

  • drop_path_rate (float) – Stochastic depth rate. Default: 0.

  • layer_scale_init_value (float) – Init value for Layer Scale. Default: 1e-6.

  • head_init_scale (float) – Init scaling value for classifier weights and biases. Default: 1.
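
The dims argument sets the channel count of each stage. The sketch below estimates the feature-map shape per stage for a square 2D input, assuming the layout described in the ConvNeXt paper: a stride-4 stem followed by stride-2 downsampling before each later stage. That layout is an assumption about this encoder, and convnext_stage_shapes is a hypothetical helper, not a fusionlab function.

```python
def convnext_stage_shapes(input_size, dims=(96, 192, 384, 768)):
    """Return (channels, height, width) for each stage of a square 2D input."""
    shapes = []
    size = input_size // 4  # assumed stride-4 patchify stem
    for stage_index, channels in enumerate(dims):
        if stage_index > 0:
            size //= 2  # assumed stride-2 downsampling before stages 2 to 4
        shapes.append((channels, size, size))
    return shapes


shapes = convnext_stage_shapes(224)
```

For a 224 x 224 input this gives spatial sizes of 56, 28, 14, and 7, one per entry in dims.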

forward(x)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class fusionlab.encoders.ConvNeXtTiny(cin=3, spatial_dims=2)[source]#
class fusionlab.encoders.ConvNeXtSmall(cin=3, spatial_dims=2)[source]#
class fusionlab.encoders.ConvNeXtBase(cin=3, spatial_dims=2)[source]#
class fusionlab.encoders.ConvNeXtLarge(cin=3, spatial_dims=2)[source]#
class fusionlab.encoders.ConvNeXtXLarge(cin=3, spatial_dims=2)[source]#

Vision Transformer#

class fusionlab.encoders.ViT(in_channels, img_size, patch_size, hidden_size=768, mlp_dim=3072, num_layers=12, num_heads=12, pos_embed='conv', dropout_rate=0.0, spatial_dims=2, qkv_bias=False, save_attn=False)[source]#

Vision Transformer (ViT), based on Dosovitskiy et al., “An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale” (https://arxiv.org/abs/2010.11929).

ViT supports TorchScript, but only with PyTorch versions after 1.8.

source code: Project-MONAI/MONAI

__init__(in_channels, img_size, patch_size, hidden_size=768, mlp_dim=3072, num_layers=12, num_heads=12, pos_embed='conv', dropout_rate=0.0, spatial_dims=2, qkv_bias=False, save_attn=False)[source]#
Parameters:
  • in_channels (int) – dimension of input channels.

  • img_size (Union[Sequence[int], int]) – dimension of input image.

  • patch_size (Union[Sequence[int], int]) – dimension of patch size.

  • hidden_size (int, optional) – dimension of hidden layer. Defaults to 768.

  • mlp_dim (int, optional) – dimension of feedforward layer. Defaults to 3072.

  • num_layers (int, optional) – number of transformer blocks. Defaults to 12.

  • num_heads (int, optional) – number of attention heads. Defaults to 12.

  • pos_embed (str, optional) – position embedding layer type. Defaults to “conv”.

  • num_classes (int, optional) – number of classes if classification is used. Defaults to 2.

  • dropout_rate (float, optional) – fraction of the input units to drop. Defaults to 0.0.

  • spatial_dims (int, optional) – number of spatial dimensions. Defaults to 2.

  • qkv_bias (bool, optional) – apply bias to the qkv linear layer in self attention block. Defaults to False.

  • save_attn (bool, optional) – to make accessible the attention in self attention block. Defaults to False.
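
The parameters above are linked: img_size and patch_size determine how many patch tokens the transformer sees, and hidden_size is split across num_heads. The sketch below works through that arithmetic. vit_token_layout is a hypothetical helper, not part of fusionlab or MONAI, and its requirement that img_size be divisible by patch_size is a simplifying assumption of the sketch.

```python
from math import prod


def vit_token_layout(img_size, patch_size, hidden_size=768, num_heads=12, spatial_dims=2):
    """Compute the patch grid, token count, and per-head width for a ViT setup."""

    def as_tuple(value):
        # Expand a single int to one value per spatial dimension.
        return (value,) * spatial_dims if isinstance(value, int) else tuple(value)

    img = as_tuple(img_size)
    patch = as_tuple(patch_size)
    if len(img) != spatial_dims or len(patch) != spatial_dims:
        raise ValueError("img_size and patch_size must match spatial_dims")
    if any(i % p for i, p in zip(img, patch)):
        # Simplifying assumption of this sketch.
        raise ValueError("img_size must be divisible by patch_size on every axis")
    if hidden_size % num_heads:
        raise ValueError("hidden_size must be divisible by num_heads")

    grid = tuple(i // p for i, p in zip(img, patch))
    return {
        "grid": grid,
        "num_patches": prod(grid),
        "head_dim": hidden_size // num_heads,
    }


layout_2d = vit_token_layout(img_size=224, patch_size=16)
layout_3d = vit_token_layout(img_size=(96, 96, 96), patch_size=16, spatial_dims=3)
```

With the default hidden_size and num_heads, each attention head works on 64 channels; a 224 x 224 image with 16 x 16 patches yields a 14 x 14 grid of tokens.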

forward(x, return_features=False)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

fusionlab.encoders.VisionTransformer#

alias of ViT

Mix Transformer#

class fusionlab.encoders.MiT(in_channels=3, embed_dims=[32, 64, 160, 256], depths=[2, 2, 2, 2])[source]#

Mix Transformer

source code: sithu31296/semantic-segmentation

forward(x, return_features=False)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Return type:

Tensor

class fusionlab.encoders.MiTB0(in_channels=3)[source]#
class fusionlab.encoders.MiTB1(in_channels=3)[source]#
class fusionlab.encoders.MiTB2(in_channels=3)[source]#
class fusionlab.encoders.MiTB3(in_channels=3)[source]#
class fusionlab.encoders.MiTB4(in_channels=3)[source]#
class fusionlab.encoders.MiTB5(in_channels=3)[source]#