sc2bench.transforms


sc2bench.transforms.codec

sc2bench.transforms.codec.register_codec_transform_module(cls)[source]

Registers a codec transform class.

Parameters:

cls (class) – codec transform class to be registered

Returns:

registered codec transform class

Return type:

class
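Such a registry decorator can be sketched as follows (a minimal illustration of the pattern; the dict name and example class below are hypothetical, not sc2bench's actual code):

```python
# Minimal sketch of a registry decorator in the style of
# register_codec_transform_module; the registry dict name and the
# registered class are illustrative assumptions
CODEC_TRANSFORM_CLASS_DICT = {}

def register_codec_transform_module(cls):
    """Registers a codec transform class under its name and returns it unchanged."""
    CODEC_TRANSFORM_CLASS_DICT[cls.__name__] = cls
    return cls

@register_codec_transform_module
class MyCodecTransform:
    def __call__(self, img):
        return img
```

Because the decorator returns the class itself, the class can still be used normally after registration, while configuration files can look it up by name.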

class sc2bench.transforms.codec.WrappedRandomResizedCrop(interpolation=None, **kwargs)[source]

torchvision's RandomResizedCrop wrapped so that the interpolation mode can be specified as a str.

Parameters:
  • interpolation (str or None) – desired interpolation mode (‘nearest’, ‘bicubic’, ‘bilinear’, ‘box’, ‘hamming’, ‘lanczos’)

  • kwargs (dict) – kwargs for RandomResizedCrop in torchvision

class sc2bench.transforms.codec.WrappedResize(interpolation=None, **kwargs)[source]

torchvision's Resize wrapped so that the interpolation mode can be specified as a str.

Parameters:
  • interpolation (str or None) – desired interpolation mode (‘nearest’, ‘bicubic’, ‘bilinear’, ‘box’, ‘hamming’, ‘lanczos’)

  • kwargs (dict) – kwargs for Resize in torchvision
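The wrapping amounts to resolving the interpolation name to a resampling mode before delegating to the torchvision transform. A minimal sketch using Pillow's integer resampling constants (the mapping below is an illustrative assumption, not sc2bench's actual code):

```python
# Map the documented interpolation names to Pillow's resampling constants
# (values follow PIL.Image: NEAREST=0, LANCZOS=1, BILINEAR=2, BICUBIC=3,
# BOX=4, HAMMING=5)
INTERPOLATION_MODE_DICT = {
    'nearest': 0, 'lanczos': 1, 'bilinear': 2,
    'bicubic': 3, 'box': 4, 'hamming': 5,
}

def resolve_interpolation(name):
    """Returns the resampling constant for a mode name, or None if name is None."""
    return None if name is None else INTERPOLATION_MODE_DICT[name]
```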

class sc2bench.transforms.codec.PILImageModule(returns_file_size=False, open_kwargs=None, **save_kwargs)[source]

A generalized PIL module to compress and decompress images, e.g., as part of a transform pipeline.

Parameters:
  • returns_file_size (bool) – returns file size of compressed object in addition to PIL image if True

  • open_kwargs (dict or None) – kwargs to be used as part of Image.open(img_buffer, **open_kwargs)

  • save_kwargs (dict or None) – kwargs to be used as part of Image.save(img_buffer, **save_kwargs)

forward(pil_img, *args)[source]

Saves PIL Image to BytesIO and reopens the image saved in the buffer.

Parameters:

pil_img (PIL.Image.Image) – image to be transformed.

Returns:

compressed and decompressed image, or together with its file size if returns_file_size=True

Return type:

PIL.Image.Image or (PIL.Image.Image, int)
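The save-and-reopen round trip can be sketched as follows (assumes Pillow is installed; the helper name and the JPEG save_kwargs are illustrative choices, not sc2bench's actual implementation):

```python
from io import BytesIO

from PIL import Image

def pil_roundtrip(pil_img, returns_file_size=False, **save_kwargs):
    """Compresses a PIL image into an in-memory buffer and reopens it."""
    img_buffer = BytesIO()
    pil_img.save(img_buffer, **save_kwargs)  # compress into the buffer
    file_size = img_buffer.tell()            # compressed size in bytes
    img_buffer.seek(0)
    reopened_img = Image.open(img_buffer)    # decompress from the buffer
    return (reopened_img, file_size) if returns_file_size else reopened_img

img = Image.new('RGB', (64, 64), color=(120, 60, 30))
out_img, size = pil_roundtrip(img, returns_file_size=True,
                              format='JPEG', quality=50)
```

The reopened image carries the codec's compression artifacts, which is what makes this useful for benchmarking downstream models against lossy inputs.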

class sc2bench.transforms.codec.PILTensorModule(returns_file_size=False, open_kwargs=None, **save_kwargs)[source]

A generalized PIL module to compress and decompress tensors, e.g., as part of a transform pipeline.

Parameters:
  • returns_file_size (bool) – returns file size of compressed object in addition to the reconstructed tensor if True

  • open_kwargs (dict or None) – kwargs to be used as part of Image.open(img_buffer, **open_kwargs)

  • save_kwargs (dict or None) – kwargs to be used as part of Image.save(img_buffer, **save_kwargs)

forward(x, *args)[source]

Splits tensor’s channels into sub-tensors (3 or fewer channels each), normalizes each using its min and max values, saves the normalized sub-tensor to BytesIO, and reopens the sub-tensor saved in the buffer to reconstruct the input tensor.

Parameters:

x (torch.Tensor) – image tensor (C, H, W) to be transformed.

Returns:

compressed and decompressed image tensor, or together with its file size if returns_file_size=True

Return type:

torch.Tensor or (torch.Tensor, int)
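The channel-grouping and normalization steps can be sketched in pure Python (shape and value arithmetic only; the function names are hypothetical and the real method operates on torch tensors):

```python
def split_channel_indices(num_channels, group_size=3):
    """Splits C channel indices into groups of at most group_size,
    so each group can be saved as a (1- to 3-channel) image."""
    return [list(range(i, min(i + group_size, num_channels)))
            for i in range(0, num_channels, group_size)]

def minmax_normalize(values):
    """Min-max normalizes a flat list of numbers to [0, 1]."""
    lo, hi = min(values), max(values)
    value_range = (hi - lo) or 1.0  # guard against constant inputs
    return [(v - lo) / value_range for v in values]
```

Keeping the per-group min and max values is what allows the decompressed sub-tensors to be rescaled back to (approximately) the original value range.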

class sc2bench.transforms.codec.BPGModule(encoder_path, decoder_path, color_mode='ycbcr', encoder='x265', subsampling_mode='444', bit_depth='8', quality=50, returns_file_size=False)[source]

A BPG module to compress and decompress images, e.g., as part of a transform pipeline.

Modified from https://github.com/InterDigitalInc/CompressAI/blob/master/compressai/utils/bench/codecs.py

Fabrice Bellard: “BPG Image format”

Warning

You need to manually install BPG software beforehand and confirm the encoder and decoder paths. For Debian machines (e.g., Ubuntu), you can use this script.

Parameters:
  • encoder_path (str) – file path of BPG encoder you manually installed

  • decoder_path (str) – file path of BPG decoder you manually installed

  • color_mode (str) – color mode (‘ycbcr’ or ‘rgb’)

  • encoder (str) – encoder type (‘x265’ or ‘jctvc’)

  • subsampling_mode (str or int) – subsampling mode (‘420’ or ‘444’)

  • bit_depth (str or int) – bit depth (8 or 10)

  • quality (int) – quality value in range [0, 51]

  • returns_file_size (bool) – returns file size of compressed object in addition to PIL image if True

forward(pil_img)[source]

Compresses and decompresses PIL Image using BPG software.

Parameters:

pil_img (PIL.Image.Image) – image to be transformed.

Returns:

compressed and decompressed image, or together with its file size if returns_file_size=True

Return type:

PIL.Image.Image or (PIL.Image.Image, int)
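Under the hood, such a module shells out to the installed encoder/decoder binaries. A sketch of how the encoder command line could be assembled from the documented parameters (the flag names follow bpgenc's CLI as used in CompressAI's benchmark utilities, but treat them as an assumption and verify against your installed bpgenc):

```python
def build_bpgenc_command(encoder_path, input_path, output_path, quality=50,
                         bit_depth=8, subsampling_mode='444',
                         encoder='x265', color_mode='ycbcr'):
    """Builds an argument list for subprocess.run to invoke bpgenc."""
    # -o output, -q quality, -f chroma subsampling, -e encoder backend,
    # -c color space, -b bit depth
    return [encoder_path, '-o', output_path, '-q', str(quality),
            '-f', str(subsampling_mode), '-e', encoder,
            '-c', color_mode, '-b', str(bit_depth), input_path]
```

The resulting list would be passed to `subprocess.run(...)`; decoding works analogously with the decoder binary.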

class sc2bench.transforms.codec.VTMModule(encoder_path, decoder_path, config_path, color_mode='ycbcr', quality=63, returns_file_size=False)[source]

A VTM module to compress and decompress images, e.g., as part of a transform pipeline.

Modified from https://github.com/InterDigitalInc/CompressAI/blob/master/compressai/utils/bench/codecs.py

The Joint Video Exploration Team: “VTM reference software for VVC”

Warning

You need to manually install VTM software beforehand and confirm the encoder and decoder paths. For Debian machines (e.g., Ubuntu), you can use this script.

Parameters:
  • encoder_path (str) – file path of VTM encoder you manually installed

  • decoder_path (str) – file path of VTM decoder you manually installed

  • config_path (str) – VTM configuration file path

  • color_mode (str) – color mode (‘ycbcr’ or ‘rgb’)

  • quality (int) – quality value in range [0, 63]

  • returns_file_size (bool) – returns file size of compressed object in addition to PIL image if True

forward(pil_img)[source]

Compresses and decompresses PIL Image using VTM software.

Parameters:

pil_img (PIL.Image.Image) – image to be transformed.

Returns:

compressed and decompressed image, or together with its file size if returns_file_size=True

Return type:

PIL.Image.Image or (PIL.Image.Image, int)


sc2bench.transforms.collator

sc2bench.transforms.collator.cat_list(images, fill_value=0)[source]

Concatenates a list of images into a single batch tensor whose height and width are the maximum over the list, filling the empty space with a specified value.

Parameters:
  • images (list[torch.Tensor]) – list of image tensors

  • fill_value (int) – value to be filled

Returns:

batched image tensor

Return type:

torch.Tensor
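The sizing logic can be illustrated with shape arithmetic alone (the helper name is hypothetical; the real function allocates a torch tensor filled with fill_value and copies each image into its top-left corner):

```python
def batched_canvas_shape(shapes):
    """Computes the batch tensor shape for a list of (C, H, W) image shapes:
    the canvas takes the maximum extent along each dimension."""
    max_dims = tuple(max(dims) for dims in zip(*shapes))
    return (len(shapes),) + max_dims
```

This lets images of different resolutions share one batch tensor without resizing them.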

sc2bench.transforms.collator.pascal_seg_collate_fn(batch)[source]

Collates input data for PASCAL VOC 2012 segmentation.

Parameters:

batch (list or tuple) – list/tuple of triplets (image, target, supp_dict), where supp_dict can be an empty dict

Returns:

collated images, targets, and supplementary dicts

Return type:

(torch.Tensor, torch.Tensor, list[dict])

sc2bench.transforms.collator.pascal_seg_eval_collate_fn(batch)[source]

Collates input data for PASCAL VOC 2012 segmentation for evaluation.

Parameters:

batch (list or tuple) – list/tuple of tuples (image, target)

Returns:

collated images and targets

Return type:

(torch.Tensor, torch.Tensor)


sc2bench.transforms.misc

sc2bench.transforms.misc.register_misc_transform_module(cls)[source]

Registers a miscellaneous transform class.

Parameters:

cls (class) – miscellaneous transform class to be registered

Returns:

registered miscellaneous transform class

Return type:

class

sc2bench.transforms.misc.default_collate_w_pil(batch)[source]

Puts each data field into a tensor or PIL Image with outer dimension batch size.

Parameters:

batch – single batch to be collated

Returns:

collated batch

class sc2bench.transforms.misc.ClearTargetTransform[source]

A transform module that replaces target with an empty list.

forward(sample, *args)[source]

Replaces target data field with an empty list.

Parameters:

sample (PIL.Image.Image or torch.Tensor) – image or image tensor

Returns:

sample and an empty list

Return type:

(PIL.Image.Image or torch.Tensor, list)

class sc2bench.transforms.misc.AdaptivePad(fill=0, padding_position='hw', padding_mode='constant', factor=128, returns_org_patch_size=False)[source]

A transform module that adaptively determines the size of padded sample.

Parameters:
  • fill (int) – padded value

  • padding_position (str) – ‘hw’ (default) splits the required padding evenly, adding half of the horizontal padding to each of the left and right sides and half of the vertical padding to each of the top and bottom; ‘right_bottom’ adds all padding to the right and bottom sides only

  • padding_mode (str) – padding mode passed to pad module

  • factor (int) – factor by which the padded height and width should be divisible

  • returns_org_patch_size (bool) – if True, returns the patch size of the original input

forward(x)[source]

Adaptively determines the size of padded image or image tensor.

Parameters:

x (PIL.Image.Image or torch.Tensor) – image or image tensor

Returns:

padded image or image tensor, and the patch size of the input (height, width) if returns_org_patch_size=True

Return type:

PIL.Image.Image or torch.Tensor or (PIL.Image.Image or torch.Tensor, list[int, int])
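The size computation implied by the factor parameter can be sketched as rounding each dimension up to the next multiple of factor (inferred from the parameter description; the function name is illustrative):

```python
import math

def adaptive_padded_size(height, width, factor=128):
    """Rounds height and width up to the next multiple of factor."""
    padded_height = math.ceil(height / factor) * factor
    padded_width = math.ceil(width / factor) * factor
    return padded_height, padded_width
```

Padding to a fixed factor is a common requirement for models whose downsampling stages (e.g., strided convolutions) need input dimensions divisible by the total stride.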

class sc2bench.transforms.misc.CustomToTensor(converts_sample=True, converts_target=True)[source]

A customized ToTensor module that can be applied to sample and target selectively.

Parameters:
  • converts_sample (bool) – if True, applies to_tensor to sample

  • converts_target (bool) – if True, applies torch.as_tensor to target

class sc2bench.transforms.misc.SimpleQuantizer(num_bits)[source]

A module that quantizes a tensor with its half() function if num_bits=16 (FP16), or with Jacob et al.’s method if num_bits=8 (INT8 plus one FP32 scale parameter).

Benoit Jacob, Skirmantas Kligys, Bo Chen, Menglong Zhu, Matthew Tang, Andrew Howard, Hartwig Adam, Dmitry Kalenichenko: “Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference” @ CVPR 2018 (2018)

Parameters:

num_bits (int) – number of bits for quantization

forward(z)[source]

Quantizes tensor.

Parameters:

z (torch.Tensor) – tensor

Returns:

quantized tensor

Return type:

torch.Tensor or torchdistill.common.tensor_util.QuantizedTensor
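The 8-bit case can be sketched as affine (min-max) quantization in the spirit of Jacob et al.: one FP32 scale and a zero point map floats to the integer range [0, 255]. This is a pure-Python illustration, not sc2bench's actual implementation:

```python
def quantize_uint8(values):
    """Affine-quantizes a list of floats to 8-bit integers.
    Returns the quantized values plus the scale and zero point
    needed to dequantize them later."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255.0 or 1.0  # guard against constant inputs
    zero_point = round(-lo / scale)
    quantized = [min(255, max(0, round(v / scale) + zero_point))
                 for v in values]
    return quantized, scale, zero_point
```

Storing only 8-bit values plus one scale and zero point is what makes this attractive for reducing the transfer cost of intermediate features.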

class sc2bench.transforms.misc.SimpleDequantizer(num_bits)[source]

A module that dequantizes a quantized tensor to FP32. If num_bits=8, it uses Jacob et al.’s method.

Benoit Jacob, Skirmantas Kligys, Bo Chen, Menglong Zhu, Matthew Tang, Andrew Howard, Hartwig Adam, Dmitry Kalenichenko: “Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference” @ CVPR 2018 (2018)

Parameters:

num_bits (int) – number of bits used for quantization

forward(z)[source]

Dequantizes quantized tensor.

Parameters:

z (torch.Tensor or torchdistill.common.tensor_util.QuantizedTensor) – quantized tensor

Returns:

dequantized tensor

Return type:

torch.Tensor
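The matching dequantization step maps the 8-bit integers back to floats with the stored scale and zero point (Jacob et al.-style; an illustrative sketch, not sc2bench's actual code):

```python
def dequantize_uint8(quantized, scale, zero_point):
    """Reconstructs approximate float values from 8-bit quantized ones."""
    return [(qi - zero_point) * scale for qi in quantized]
```

The reconstruction error is bounded by half the scale per element, which is why a well-chosen (min, max) range matters for quantization quality.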