Deep neural networks are capable of learning powerful representations, but they are often limited by heavy network architectures and high computational cost. Knowledge distillation (KD) is one of the effective approaches to model compression and inference acceleration. However, the resulting student models still retain parameter redundancy.
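For concreteness, the following is a minimal sketch of the standard soft-target distillation objective (in the style of Hinton et al.), not the specific method described here; the function name `kd_loss` and the default values of the temperature `T` and weighting factor `alpha` are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Sketch of the standard soft-target KD loss.

    Combines cross-entropy on ground-truth labels with a KL divergence
    between temperature-softened teacher and student distributions.
    T and alpha are hypothetical defaults, not values from this work.
    """
    # Hard-label term: ordinary cross-entropy against the true labels.
    ce = F.cross_entropy(student_logits, labels)
    # Soft-label term: KL(teacher || student) at temperature T,
    # scaled by T^2 to keep gradient magnitudes comparable.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    return alpha * ce + (1.0 - alpha) * soft
```

In this sketch the student is trained to match both the hard labels and the teacher's softened output distribution; the distilled student is smaller than the teacher but, as noted above, may still carry redundant parameters.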