Benchmarks
This page summarizes the reproducible experimental results supported by the official torchdistill repository.
ImageNet (ILSVRC 2012)
ImageNet Large Scale Visual Recognition Challenge 2012 (ILSVRC 2012)
Student: ResNet-18
Top-1 validation accuracy [%] of the ResNet-18 student:

| Method | Original work | torchdistill | Config |
|---|---|---|---|
| Pretrained | N/A | 69.76* | |
| KD | N/A | 71.37 | |
| AT | 70.70 | 70.90 | |
| FT | 71.43** | 71.56 | |
| CRD | 71.17 | 70.93 | |
| Tf-KD | 70.42 | 70.52 | |
| SSKD | 71.62 | 70.09 | |
| \(L_2\) | 70.90 | 71.08 | |
| PAD-\(L_2\) | 71.71 | 71.71 | |
| KR | 71.61 | 71.64 | |
| ICKD | 72.19 | 68.34 | |
| DIST | 72.07 | 71.95 | |
| SRD | 71.63 | 71.92 | |
| KD w/ LS | 71.42 | 71.23 | |
* The pretrained ResNet-34 and ResNet-18 are provided by torchvision.
** FT is assessed with ILSVRC 2015 in the original work.
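Many of the entries above build on the vanilla knowledge distillation (KD) objective, which combines a hard-label cross-entropy term with a temperature-scaled KL divergence between teacher and student logits. The sketch below illustrates that objective in plain PyTorch for reference only; it is not torchdistill's own implementation, and the function name `kd_loss` and the default `temperature`/`alpha` values are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits: torch.Tensor,
            teacher_logits: torch.Tensor,
            targets: torch.Tensor,
            temperature: float = 4.0,
            alpha: float = 0.9) -> torch.Tensor:
    """Vanilla KD objective (Hinton et al.): weighted sum of a hard-label
    cross-entropy term and a temperature-scaled KL divergence between the
    teacher's and student's softened output distributions."""
    # Hard-label cross-entropy between student predictions and ground truth.
    ce = F.cross_entropy(student_logits, targets)
    # Soft-label term: KL(teacher || student) on temperature-softened logits,
    # scaled by T^2 to keep gradient magnitudes comparable across temperatures.
    log_p_student = F.log_softmax(student_logits / temperature, dim=1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=1)
    kl = F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2
    return alpha * kl + (1.0 - alpha) * ce
```

With teacher and student logits of shape (batch, 1000) for ImageNet and integer class targets, the returned scalar is back-propagated through the student as usual; the hyperparameters actually used for each row are defined in the linked configs.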
Original work
FT: "Paraphrasing Complex Network: Network Compression via Factor Transfer"
Tf-KD: "Revisiting Knowledge Distillation via Label Smoothing Regularization"
\(L_2\), PAD-\(L_2\): "Prime-Aware Adaptive Distillation"
ICKD: "Exploring Inter-Channel Correlation for Diversity-Preserved Knowledge Distillation"
SRD: "Understanding the Role of the Projector in Knowledge Distillation"
KD w/ LS: "Logit Standardization in Knowledge Distillation"