fusionlab.datasets.utils module

class fusionlab.datasets.utils.HFDataset(dataset)

Bases: Dataset

Base Hugging Face dataset wrapper class.

Parameters:
  • dataset – a dataset object that implements __getitem__

class fusionlab.datasets.utils.LSTimeClassificationDataset(data_dir, annotation_path, class_map, column_names)

Bases: Dataset

Dataset for label-studio timeseries classification task

__init__(data_dir, annotation_path, class_map, column_names)

Dataset for label-studio timeseries classification task

Parameters:
  • data_dir (str) – directory of csv files

  • annotation_path (str) – path to annotation json file

  • class_map (dict) – a dictionary mapping class names to class indices

  • column_names (List[str]) – A list of column names for the signal data in the CSV files.

Examples::
>>> ds = LSTimeClassificationDataset(
...     data_dir=DATA_DIR,
...     annotation_path=ANNOTATION_PATH,
...     class_map={"Normal": 1, "AF": 2, "AV Block": 3, "Noise": 4},
...     column_names=['i', 'ii', 'iii'])
>>> signals, label = ds[0]

preprocess(signals)

class fusionlab.datasets.utils.LSTimeSegDataset(data_dir, annotation_path, class_map, column_names)

Bases: Dataset

Dataset for label-studio timeseries segmentation task

__init__(data_dir, annotation_path, class_map, column_names)

Dataset for label-studio timeseries segmentation task

Parameters:
  • data_dir (str) – directory of csv files

  • annotation_path (str) – path to annotation json file

  • class_map (dict) – a dictionary mapping class names to class indices

  • column_names (List[str]) – A list of column names for the signal data in the CSV files.

Examples::
>>> ds = LSTimeSegDataset(data_dir="./12",
...                       annotation_path="./12.json",
...                       class_map={"N": 1, "p": 2, "t": 3},
...                       column_names=['i', 'ii', 'iii', 'avr', 'avl', 'avf',
...                                     'v1', 'v2', 'v3', 'v4', 'v5', 'v6'])
>>> signals, mask = ds[0]

preprocess(signals)

fusionlab.datasets.utils.count_parameters(model, trainable_only=False)

Returns the number of parameters in a model

Parameters:
  • model (Module) – a pytorch model

  • trainable_only (bool) – if True, only count trainable parameters

Returns:

num_parameters – the number of parameters in the model

Return type:

int

Reference: https://discuss.pytorch.org/t/how-do-i-check-the-number-of-parameters-of-a-model/4325/9
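The counting described above can be sketched in a few lines. This is a minimal illustration, not the library's source: the name count_parameters_sketch is hypothetical, and it assumes the count is the total number of elements across model.parameters(), optionally restricted to tensors with requires_grad set.

```python
import torch.nn as nn


def count_parameters_sketch(model: nn.Module, trainable_only: bool = False) -> int:
    """Sum the element counts of every parameter tensor in ``model``.

    Hypothetical helper for illustration; assumed to mirror the documented
    behaviour of ``count_parameters``.
    """
    return sum(
        p.numel()
        for p in model.parameters()
        # Keep frozen parameters unless the caller asked for trainable ones only.
        if p.requires_grad or not trainable_only
    )


# A small model: Linear(4, 8) has 4*8 + 8 = 40 parameters,
# Linear(8, 2) has 8*2 + 2 = 18 parameters.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

# Freeze the first layer so the two counts differ.
for p in model[0].parameters():
    p.requires_grad = False

total = count_parameters_sketch(model)
trainable = count_parameters_sketch(model, trainable_only=True)
print(total, trainable)  # 58 18
```

Freezing a layer, as done here, is the usual reason the two counts diverge.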

fusionlab.datasets.utils.download_file(url, download_root, extract_root=None, filename=None, extract=False)

Download a file from a URL and optionally extract it to a target directory.

Parameters:
  • url (str) – URL to download the file from

  • download_root (str) – directory to place the downloaded file in

  • extract_root (str, optional) – directory to extract the downloaded file to

  • filename (str, optional) – name to save the file under. If None, use the basename of the URL

  • extract (bool, optional) – if True, extract the downloaded file. Otherwise, do not extract.

Return type:

None
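To make the parameter semantics concrete, here is a rough standard-library sketch of a function with the same signature. It is not the library's implementation: the name download_file_sketch is hypothetical, the fallback of extract_root to download_root is an assumption, and the archive formats the real function supports are not documented here.

```python
import os
import shutil
import urllib.parse
import urllib.request
from typing import Optional


def download_file_sketch(
    url: str,
    download_root: str,
    extract_root: Optional[str] = None,
    filename: Optional[str] = None,
    extract: bool = False,
) -> None:
    """Download ``url`` into ``download_root`` and optionally unpack it."""
    os.makedirs(download_root, exist_ok=True)

    # If no filename is given, use the basename of the URL path.
    if filename is None:
        filename = os.path.basename(urllib.parse.urlparse(url).path)
    target = os.path.join(download_root, filename)

    # Stream the response to disk instead of loading it all into memory.
    with urllib.request.urlopen(url) as response, open(target, "wb") as fh:
        shutil.copyfileobj(response, fh)

    if extract:
        # Assumption: extract next to the download when extract_root is None.
        destination = extract_root or download_root
        # unpack_archive picks the format (zip, tar, ...) from the extension.
        shutil.unpack_archive(target, destination)


# Usage (placeholder URL, shown for illustration only):
# download_file_sketch("https://example.com/data.zip", "./downloads",
#                      extract_root="./data", extract=True)
```

Like the documented function, the sketch returns None; callers locate the result through download_root and extract_root.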

fusionlab.datasets.utils.standardize_tensor(tensor, dim=0)

Standardize a tensor by channel

Parameters:
  • tensor (torch.Tensor) – shape: (Channels, num_samples)

  • dim (int) – dimension to standardize

Returns:

the standardized tensor, shape: (Channels, num_samples)

Return type:

torch.Tensor
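As an illustration of what standardization along a dimension means, the sketch below subtracts the mean and divides by the standard deviation computed along dim. It is a guess at the behaviour, not the library's code: the name standardize_sketch and the eps term are assumptions, and the example reduces over the sample axis (dim=1) so that each channel of a (Channels, num_samples) tensor ends up with zero mean and unit variance.

```python
import torch


def standardize_sketch(tensor: torch.Tensor, dim: int = 0, eps: float = 1e-8) -> torch.Tensor:
    """Zero-mean, unit-variance scaling along ``dim`` (hypothetical helper)."""
    # keepdim=True keeps the reduced axis so the statistics broadcast back.
    mean = tensor.mean(dim=dim, keepdim=True)
    std = tensor.std(dim=dim, keepdim=True)
    # eps is an assumption of this sketch; it guards against division by zero.
    return (tensor - mean) / (std + eps)


# Two channels with very different scales, four samples each.
signals = torch.tensor([[1.0, 2.0, 3.0, 4.0],
                        [10.0, 20.0, 30.0, 40.0]])

out = standardize_sketch(signals, dim=1)
print(out.shape)        # same shape as the input: torch.Size([2, 4])
print(out.mean(dim=1))  # approximately zero for each channel
```

After scaling, both channels hold the same values even though their original amplitudes differed by a factor of ten.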