Results 11 to 20 of about 141,062
Multiple-Stage Knowledge Distillation
Knowledge distillation (KD) is a method in which a teacher network guides the learning of a student network, thereby improving the student network's performance. (A minimal sketch of the standard KD loss follows this entry.)
Chuanyun Xu +6 more
doaj +1 more source
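The snippet above defines KD in its classic form. As a concrete illustration, here is a minimal sketch of the standard softened-logit objective from Hinton et al. (2015), not the multi-stage scheme of this particular paper; the temperature T and mixing weight alpha are illustrative defaults.

```python
# Minimal sketch of the classic KD loss (Hinton et al., 2015).
# T and alpha are illustrative hyperparameters, not values from the paper above.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Blend hard-label cross-entropy with KL divergence to the
    teacher's temperature-softened distribution."""
    soft_teacher = F.softmax(teacher_logits / T, dim=1)
    log_soft_student = F.log_softmax(student_logits / T, dim=1)
    # T*T rescales gradients so the soft term stays comparable
    # in magnitude to the hard-label term.
    distill = F.kl_div(log_soft_student, soft_teacher,
                       reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * distill + (1.0 - alpha) * hard

if __name__ == "__main__":
    s, t = torch.randn(8, 10), torch.randn(8, 10)
    y = torch.randint(0, 10, (8,))
    print(kd_loss(s, t, y).item())
```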
Spot-Adaptive Knowledge Distillation
12 pages, 8 ...
Jie Song +3 more
openaire +3 more sources
Similarity and Consistency by Self-distillation Method [PDF]
Due to the high data pre-processing costs and missed local-feature detection in self-distillation methods for model compression, a similarity and consistency by self-distillation (SCD) method is proposed to improve model classification accuracy. Firstly ...
WAN Xu, MAO Yingchi, WANG Zibo, LIU Yi, PING Ping
doaj +1 more source
Distilling Knowledge via Knowledge Review [PDF]
CVPR ...
Chen, Pengguang +3 more
openaire +2 more sources
Review of Recent Distillation Studies [PDF]
Knowledge distillation has gained a lot of interest in recent years because it allows for compressing a large deep neural network (teacher DNN) into a smaller DNN (student DNN), while maintaining its accuracy.
Gao Minghong
doaj +1 more source
Recurrent Knowledge Distillation [PDF]
Knowledge distillation compacts deep networks by letting a small student network learn from a large teacher network. The accuracy of knowledge distillation has recently benefited from adding residual layers. We propose to reduce the size of the student network even further by recasting multiple residual layers in the teacher network into a single recurrent ... (A sketch of the weight-sharing idea follows this entry.)
Pintea, S. +2 more
openaire +3 more sources
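As a rough illustration of the recasting idea described above, the sketch below replaces several distinct residual layers with one weight-shared block applied repeatedly, trading depth for parameter count. The class name RecurrentResidual and the step count are hypothetical; the paper's exact architecture is not reproduced here.

```python
# Hedged sketch: one weight-shared residual body unrolled n_steps
# times, standing in for n_steps distinct residual layers.
import torch
import torch.nn as nn

class RecurrentResidual(nn.Module):
    def __init__(self, channels, n_steps=3):
        super().__init__()
        self.n_steps = n_steps
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        # The same parameters are reused at every step, so the
        # parameter count is that of a single residual layer.
        for _ in range(self.n_steps):
            x = torch.relu(x + self.body(x))
        return x
```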
A Virtual Knowledge Distillation via Conditional GAN
Knowledge distillation aims at transferring the knowledge from a pre-trained complex model, called the teacher, to a relatively smaller and faster one, called the student. Unlike previous works that transfer the teacher's softened distributions or feature ... (A sketch of distillation on generator-made samples follows this entry.)
Sihwan Kim
doaj +1 more source
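To make the "virtual" transfer concrete, here is a hedged sketch in which a conditional generator synthesizes class-conditioned inputs and the student is trained to match the teacher on them, with no real data in the distillation step. The generator architecture, shapes, and function names are illustrative assumptions, not this paper's implementation.

```python
# Hedged sketch: distillation on samples from a conditional generator.
# Architecture and dimensions (z_dim, img_dim) are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CondGenerator(nn.Module):
    def __init__(self, z_dim=100, n_classes=10, img_dim=28 * 28):
        super().__init__()
        self.embed = nn.Embedding(n_classes, z_dim)
        self.net = nn.Sequential(
            nn.Linear(z_dim, 256), nn.ReLU(),
            nn.Linear(256, img_dim), nn.Tanh(),
        )

    def forward(self, z, labels):
        # Condition the noise on the class via elementwise gating.
        return self.net(z * self.embed(labels))

def virtual_distill_step(gen, teacher, student, batch=64, n_classes=10, T=4.0):
    z = torch.randn(batch, 100)
    y = torch.randint(0, n_classes, (batch,))
    x = gen(z, y)                      # "virtual" samples, no real data
    with torch.no_grad():
        t_logits = teacher(x)
    s_logits = student(x)
    return F.kl_div(F.log_softmax(s_logits / T, dim=1),
                    F.softmax(t_logits / T, dim=1),
                    reduction="batchmean") * (T * T)
```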
Annealing Knowledge Distillation [PDF]
Significant memory and computational requirements of large deep neural networks restrict their application on edge devices. Knowledge distillation (KD) is a prominent model compression technique for deep neural networks in which the knowledge of a trained large teacher model is transferred to a smaller student model. (A sketch of a temperature-annealing schedule follows this entry.)
Jafari, Aref +3 more
openaire +2 more sources
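As an illustration of the annealing idea, the sketch below decays the softening temperature linearly over training, so the teacher's targets start heavily smoothed and sharpen as the student matures. The linear schedule and the KL formulation are assumptions for illustration and may differ from the paper's exact two-phase procedure.

```python
# Hedged sketch of temperature annealing for KD; schedule shape
# (linear) and loss choice (KL) are illustrative assumptions.
import torch
import torch.nn.functional as F

def annealed_temperature(epoch, total_epochs, t_max=8.0, t_min=1.0):
    """Linearly decay the softening temperature from t_max to t_min."""
    frac = min(epoch / max(total_epochs - 1, 1), 1.0)
    return t_max + frac * (t_min - t_max)

def annealing_kd_loss(student_logits, teacher_logits, epoch, total_epochs):
    T = annealed_temperature(epoch, total_epochs)
    target = F.softmax(teacher_logits / T, dim=1)
    pred = F.log_softmax(student_logits / T, dim=1)
    return F.kl_div(pred, target, reduction="batchmean") * (T * T)
```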
Feature fusion-based collaborative learning for knowledge distillation
Deep neural networks have achieved great success in a variety of applications, such as self-driving cars and intelligent robotics. Meanwhile, knowledge distillation has received increasing attention as an effective model compression technique for ...
Yiting Li +4 more
doaj +1 more source
Hint-Dynamic Knowledge Distillation
Knowledge Distillation (KD) transfers knowledge from a high-capacity teacher model to promote a smaller student model. Existing efforts guide the distillation by matching prediction logits, feature embeddings, etc., while leaving how to efficiently utilize them in conjunction less explored. (A sketch combining logit and feature matching follows this entry.)
Liu, Yiyang +4 more
openaire +2 more sources
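Following the snippet's distinction between logit matching and feature-embedding ("hint") matching, here is a minimal FitNets-style sketch combining the two signals with a fixed weight. The dynamic weighting that gives Hint-Dynamic KD its name is not reproduced; the 1x1 projection aligning student and teacher channel widths is a common convention rather than this paper's API.

```python
# Hedged sketch: fixed-weight combination of a feature ("hint") term
# and a softened-logit term. beta and T are illustrative defaults.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HintLoss(nn.Module):
    def __init__(self, s_channels, t_channels):
        super().__init__()
        # Project student features into the teacher's channel space.
        self.proj = nn.Conv2d(s_channels, t_channels, kernel_size=1)

    def forward(self, s_feat, t_feat, s_logits, t_logits, T=4.0, beta=0.5):
        feat_term = F.mse_loss(self.proj(s_feat), t_feat)
        logit_term = F.kl_div(F.log_softmax(s_logits / T, dim=1),
                              F.softmax(t_logits / T, dim=1),
                              reduction="batchmean") * (T * T)
        return beta * feat_term + (1.0 - beta) * logit_term
```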

