sc2bench.transforms


sc2bench.transforms.codec

sc2bench.transforms.codec.register_codec_transform_module(cls)[source]

Registers a codec transform class.

Parameters:

cls (class) – codec transform class to be registered

Returns:

registered codec transform class

Return type:

class
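Such a registry decorator can be sketched as follows (a minimal illustration of the pattern; the dict name and example class below are hypothetical, not sc2bench's actual code):

```python
# Minimal sketch of a registry decorator in the style of
# register_codec_transform_module; the registry dict name and the
# registered class are illustrative assumptions
CODEC_TRANSFORM_CLASS_DICT = {}

def register_codec_transform_module(cls):
    """Registers a codec transform class under its name and returns it unchanged."""
    CODEC_TRANSFORM_CLASS_DICT[cls.__name__] = cls
    return cls

@register_codec_transform_module
class MyCodecTransform:
    def __call__(self, img):
        return img
```

Because the decorator returns the class itself, the class can still be used normally after registration, while configuration files can look it up by name.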

class sc2bench.transforms.codec.WrappedRandomResizedCrop(interpolation=None, **kwargs)[source]

torchvision's RandomResizedCrop wrapped so that the interpolation mode can be specified as a str.

Parameters:
  • interpolation (str or None) – desired interpolation mode (‘nearest’, ‘bicubic’, ‘bilinear’, ‘box’, ‘hamming’, ‘lanczos’)

  • kwargs (dict) – kwargs for RandomResizedCrop in torchvision

class sc2bench.transforms.codec.WrappedResize(interpolation=None, **kwargs)[source]

torchvision's Resize wrapped so that the interpolation mode can be specified as a str.

Parameters:
  • interpolation (str or None) – desired interpolation mode (‘nearest’, ‘bicubic’, ‘bilinear’, ‘box’, ‘hamming’, ‘lanczos’)

  • kwargs (dict) – kwargs for Resize in torchvision
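The wrapping amounts to resolving the interpolation name to a resampling mode before delegating to the torchvision transform. A minimal sketch using Pillow's integer resampling constants (the mapping below is an illustrative assumption, not sc2bench's actual code):

```python
# Map the documented interpolation names to Pillow's resampling constants
# (values follow PIL.Image: NEAREST=0, LANCZOS=1, BILINEAR=2, BICUBIC=3,
# BOX=4, HAMMING=5)
INTERPOLATION_MODE_DICT = {
    'nearest': 0, 'lanczos': 1, 'bilinear': 2,
    'bicubic': 3, 'box': 4, 'hamming': 5,
}

def resolve_interpolation(name):
    """Returns the resampling constant for a mode name, or None if name is None."""
    return None if name is None else INTERPOLATION_MODE_DICT[name]
```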

class sc2bench.transforms.codec.PILImageModule(returns_file_size=False, open_kwargs=None, **save_kwargs)[source]

A generalized PIL module to compress and decompress images, e.g., as part of a transform pipeline.

Parameters:
  • returns_file_size (bool) – returns file size of compressed object in addition to PIL image if True

  • open_kwargs (dict or None) – kwargs to be used as part of Image.open(img_buffer, **open_kwargs)

  • save_kwargs (dict or None) – kwargs to be used as part of Image.save(img_buffer, **save_kwargs)

forward(pil_img, *args)[source]

Saves PIL Image to BytesIO and reopens the image saved in the buffer.

Parameters:

pil_img (PIL.Image.Image) – image to be transformed.

Returns:

compressed and decompressed image, or together with its file size if returns_file_size=True

Return type:

PIL.Image.Image or (PIL.Image.Image, int)
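The save-and-reopen round trip can be sketched as follows (assumes Pillow is installed; the helper name and the JPEG save_kwargs are illustrative choices, not sc2bench's actual implementation):

```python
from io import BytesIO

from PIL import Image

def pil_roundtrip(pil_img, returns_file_size=False, **save_kwargs):
    """Compresses a PIL image into an in-memory buffer and reopens it."""
    img_buffer = BytesIO()
    pil_img.save(img_buffer, **save_kwargs)  # compress into the buffer
    file_size = img_buffer.tell()            # compressed size in bytes
    img_buffer.seek(0)
    reopened_img = Image.open(img_buffer)    # decompress from the buffer
    return (reopened_img, file_size) if returns_file_size else reopened_img

img = Image.new('RGB', (64, 64), color=(120, 60, 30))
out_img, size = pil_roundtrip(img, returns_file_size=True,
                              format='JPEG', quality=50)
```

The reopened image carries the codec's compression artifacts, which is what makes this useful for benchmarking downstream models against lossy inputs.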

class sc2bench.transforms.codec.PILTensorModule(returns_file_size=False, open_kwargs=None, **save_kwargs)[source]

A generalized PIL module to compress and decompress tensors, e.g., as part of a transform pipeline.

Parameters:
  • returns_file_size (bool) – returns file size of compressed object in addition to the reconstructed tensor if True

  • open_kwargs (dict or None) – kwargs to be used as part of Image.open(img_buffer, **open_kwargs)

  • save_kwargs (dict or None) – kwargs to be used as part of Image.save(img_buffer, **save_kwargs)

forward(x, *args)[source]

Splits tensor’s channels into sub-tensors (3 or fewer channels each), normalizes each using its min and max values, saves the normalized sub-tensor to BytesIO, and reopens the sub-tensor saved in the buffer to reconstruct the input tensor.

Parameters:

x (torch.Tensor) – image tensor (C, H, W) to be transformed.

Returns:

compressed and decompressed image tensor, or together with its file size if returns_file_size=True

Return type:

torch.Tensor or (torch.Tensor, int)
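The channel-grouping and normalization steps can be sketched in pure Python (shape and value arithmetic only; the function names are hypothetical and the real method operates on torch tensors):

```python
def split_channel_indices(num_channels, group_size=3):
    """Splits C channel indices into groups of at most group_size,
    so each group can be saved as a (1- to 3-channel) image."""
    return [list(range(i, min(i + group_size, num_channels)))
            for i in range(0, num_channels, group_size)]

def minmax_normalize(values):
    """Min-max normalizes a flat list of numbers to [0, 1]."""
    lo, hi = min(values), max(values)
    value_range = (hi - lo) or 1.0  # guard against constant inputs
    return [(v - lo) / value_range for v in values]
```

Keeping the per-group min and max values is what allows the decompressed sub-tensors to be rescaled back to (approximately) the original value range.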

class sc2bench.transforms.codec.BPGModule(encoder_path, decoder_path, color_mode='ycbcr', encoder='x265', subsampling_mode='444', bit_depth='8', quality=50, returns_file_size=False)[source]

A BPG module to compress and decompress images, e.g., as part of a transform pipeline.

Modified from https://github.com/InterDigitalInc/CompressAI/blob/master/compressai/utils/bench/codecs.py

Fabrice Bellard: “BPG Image format”

Warning

You need to manually install BPG software beforehand and confirm the encoder and decoder paths. For Debian machines (e.g., Ubuntu), you can use this script.

Parameters:
  • encoder_path (str) – file path of BPG encoder you manually installed

  • decoder_path (str) – file path of BPG decoder you manually installed

  • color_mode (str) – color mode (‘ycbcr’ or ‘rgb’)

  • encoder (str) – encoder type (‘x265’ or ‘jctvc’)

  • subsampling_mode (str or int) – subsampling mode (‘420’ or ‘444’)

  • bit_depth (str or int) – bit depth (8 or 10)

  • quality (int) – quality value in range [0, 51]

  • returns_file_size (bool) – returns file size of compressed object in addition to PIL image if True

forward(pil_img)[source]

Compresses and decompresses PIL Image using BPG software.

Parameters:

pil_img (PIL.Image.Image) – image to be transformed.

Returns:

compressed and decompressed image, or together with its file size if returns_file_size=True

Return type:

PIL.Image.Image or (PIL.Image.Image, int)
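Under the hood, such a module shells out to the installed encoder/decoder binaries. A sketch of how the encoder command line could be assembled from the documented parameters (the flag names follow bpgenc's CLI as used in CompressAI's benchmark utilities, but treat them as an assumption and verify against your installed bpgenc):

```python
def build_bpgenc_command(encoder_path, input_path, output_path, quality=50,
                         bit_depth=8, subsampling_mode='444',
                         encoder='x265', color_mode='ycbcr'):
    """Builds an argument list for subprocess.run to invoke bpgenc."""
    # -o output, -q quality, -f chroma subsampling, -e encoder backend,
    # -c color space, -b bit depth
    return [encoder_path, '-o', output_path, '-q', str(quality),
            '-f', str(subsampling_mode), '-e', encoder,
            '-c', color_mode, '-b', str(bit_depth), input_path]
```

The resulting list would be passed to `subprocess.run(...)`; decoding works analogously with the decoder binary.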

class sc2bench.transforms.codec.VTMModule(encoder_path, decoder_path, config_path, color_mode='ycbcr', quality=63, returns_file_size=False)[source]

A VTM module to compress and decompress images, e.g., as part of a transform pipeline.

Modified from https://github.com/InterDigitalInc/CompressAI/blob/master/compressai/utils/bench/codecs.py

The Joint Video Exploration Team: “VTM reference software for VVC”

Warning

You need to manually install VTM software beforehand and confirm the encoder and decoder paths. For Debian machines (e.g., Ubuntu), you can use this script.

Parameters:
  • encoder_path (str) – file path of VTM encoder you manually installed

  • decoder_path (str) – file path of VTM decoder you manually installed

  • config_path (str) – VTM configuration file path

  • color_mode (str) – color mode (‘ycbcr’ or ‘rgb’)

  • quality (int) – quality value in range [0, 63]

  • returns_file_size (bool) – returns file size of compressed object in addition to PIL image if True

forward(pil_img)[source]

Compresses and decompresses PIL Image using VTM software.

Parameters:

pil_img (PIL.Image.Image) – image to be transformed.

Returns:

compressed and decompressed image, or together with its file size if returns_file_size=True

Return type:

PIL.Image.Image or (PIL.Image.Image, int)


sc2bench.transforms.collator

sc2bench.transforms.collator.cat_list(images, fill_value=0)[source]

Concatenates a list of images into a single batch tensor whose height and width are the maximum over the list, filling the empty space with a specified value.

Parameters:
  • images (list[torch.Tensor]) – list of image tensors

  • fill_value (int) – value to be filled

Returns:

batched image tensor

Return type:

torch.Tensor
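The sizing logic can be illustrated with shape arithmetic alone (the helper name is hypothetical; the real function allocates a torch tensor filled with fill_value and copies each image into its top-left corner):

```python
def batched_canvas_shape(shapes):
    """Computes the batch tensor shape for a list of (C, H, W) image shapes:
    the canvas takes the maximum extent along each dimension."""
    max_dims = tuple(max(dims) for dims in zip(*shapes))
    return (len(shapes),) + max_dims
```

This lets images of different resolutions share one batch tensor without resizing them.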

sc2bench.transforms.collator.pascal_seg_collate_fn(batch)[source]

Collates input data for PASCAL VOC 2012 segmentation.

Parameters:

batch (list or tuple) – list/tuple of triplets (image, target, supp_dict), where supp_dict can be an empty dict

Returns:

collated images, targets, and supplementary dicts

Return type:

(torch.Tensor, torch.Tensor, list[dict])

sc2bench.transforms.collator.pascal_seg_eval_collate_fn(batch)[source]

Collates input data for PASCAL VOC 2012 segmentation for evaluation.

Parameters:

batch (list or tuple) – list/tuple of tuples (image, target)

Returns:

collated images and targets

Return type:

(torch.Tensor, torch.Tensor)


sc2bench.transforms.misc

sc2bench.transforms.misc.register_misc_transform_module(cls)[source]

Registers a miscellaneous transform class.

Parameters:

cls (class) – miscellaneous transform class to be registered

Returns:

registered miscellaneous transform class

Return type:

class

sc2bench.transforms.misc.default_collate_w_pil(batch)[source]

Puts each data field into a tensor or PIL Image with outer dimension batch size.

Parameters:

batch – single batch to be collated

Returns:

collated batch

class sc2bench.transforms.misc.ClearTargetTransform[source]

A transform module that replaces target with an empty list.

forward(sample, *args)[source]

Replaces target data field with an empty list.

Parameters:

sample (PIL.Image.Image or torch.Tensor) – image or image tensor

Returns:

sample and an empty list

Return type:

(PIL.Image.Image or torch.Tensor, list)

class sc2bench.transforms.misc.AdaptivePad(fill=0, padding_position='hw', padding_mode='constant', factor=128, returns_org_patch_size=False)[source]

A transform module that adaptively determines the size of padded sample.

Parameters:
  • fill (int) – padded value

  • padding_position (str) – ‘hw’ (default) splits the required padding evenly, adding half of the horizontal padding to each of the left and right sides and half of the vertical padding to each of the top and bottom; ‘right_bottom’ adds all padding to the right and bottom sides only

  • padding_mode (str) – padding mode passed to pad module

  • factor (int) – factor by which the padded height and width should be divisible

  • returns_org_patch_size (bool) – if True, returns the patch size of the original input

forward(x)[source]

Adaptively determines the size of padded image or image tensor.

Parameters:

x (PIL.Image.Image or torch.Tensor) – image or image tensor

Returns:

padded image or image tensor, and the patch size of the input (height, width) if returns_org_patch_size=True

Return type:

PIL.Image.Image or torch.Tensor or (PIL.Image.Image or torch.Tensor, list[int, int])
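The size computation implied by the factor parameter can be sketched as rounding each dimension up to the next multiple of factor (inferred from the parameter description; the function name is illustrative):

```python
import math

def adaptive_padded_size(height, width, factor=128):
    """Rounds height and width up to the next multiple of factor."""
    padded_height = math.ceil(height / factor) * factor
    padded_width = math.ceil(width / factor) * factor
    return padded_height, padded_width
```

Padding to a fixed factor is a common requirement for models whose downsampling stages (e.g., strided convolutions) need input dimensions divisible by the total stride.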

class sc2bench.transforms.misc.CustomToTensor(converts_sample=True, converts_target=True)[source]

A customized ToTensor module that can be applied to sample and target selectively.

Parameters:
  • converts_sample (bool) – if True, applies to_tensor to sample

  • converts_target (bool) – if True, applies torch.as_tensor to target

class sc2bench.transforms.misc.SimpleQuantizer(num_bits)[source]

A module that quantizes a tensor with its half() function if num_bits=16 (FP16), or with Jacob et al.’s method if num_bits=8 (INT8 plus one FP32 scale parameter).

Benoit Jacob, Skirmantas Kligys, Bo Chen, Menglong Zhu, Matthew Tang, Andrew Howard, Hartwig Adam, Dmitry Kalenichenko: “Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference” @ CVPR 2018 (2018)

Parameters:

num_bits (int) – number of bits for quantization

forward(z)[source]

Quantizes tensor.

Parameters:

z (torch.Tensor) – tensor

Returns:

quantized tensor

Return type:

torch.Tensor or torchdistill.common.tensor_util.QuantizedTensor
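The 8-bit case can be sketched as affine (min-max) quantization in the spirit of Jacob et al.: one FP32 scale and a zero point map floats to the integer range [0, 255]. This is a pure-Python illustration, not sc2bench's actual implementation:

```python
def quantize_uint8(values):
    """Affine-quantizes a list of floats to 8-bit integers.
    Returns the quantized values plus the scale and zero point
    needed to dequantize them later."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255.0 or 1.0  # guard against constant inputs
    zero_point = round(-lo / scale)
    quantized = [min(255, max(0, round(v / scale) + zero_point))
                 for v in values]
    return quantized, scale, zero_point
```

Storing only 8-bit values plus one scale and zero point is what makes this attractive for reducing the transfer cost of intermediate features.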

class sc2bench.transforms.misc.SimpleDequantizer(num_bits)[source]

A module that dequantizes a quantized tensor to FP32. If num_bits=8, it uses Jacob et al.’s method.

Benoit Jacob, Skirmantas Kligys, Bo Chen, Menglong Zhu, Matthew Tang, Andrew Howard, Hartwig Adam, Dmitry Kalenichenko: “Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference” @ CVPR 2018 (2018)

Parameters:

num_bits (int) – number of bits used for quantization

forward(z)[source]

Dequantizes quantized tensor.

Parameters:

z (torch.Tensor or torchdistill.common.tensor_util.QuantizedTensor) – quantized tensor

Returns:

dequantized tensor

Return type:

torch.Tensor
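The matching dequantization step maps the 8-bit integers back to floats with the stored scale and zero point (Jacob et al.-style; an illustrative sketch, not sc2bench's actual code):

```python
def dequantize_uint8(quantized, scale, zero_point):
    """Reconstructs approximate float values from 8-bit quantized ones."""
    return [(qi - zero_point) * scale for qi in quantized]
```

The reconstruction error is bounded by half the scale per element, which is why a well-chosen (min, max) range matters for quantization quality.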