Results 31 to 40 of about 4,930,132 (374)
EPSD: Early Pruning with Self-Distillation for Efficient Model Compression
Neural network compression techniques, such as knowledge distillation (KD) and network pruning, have received increasing attention. Recent work 'Prune, then Distill' reveals that a pruned student-friendly teacher network can benefit the performance of KD.
Dong Chen +8 more
semanticscholar +3 more sources
Masked image modeling (MIM) is a learning method in which the unmasked components of the input are utilized to learn and predict the masked signal, enabling learning from large amounts of unannotated data.
Xuying Wang +4 more
doaj +1 more source
Mind the Trade-off: Debiasing NLU Models without Degrading the In-distribution Performance [PDF]
Models for natural language understanding (NLU) tasks often rely on the idiosyncratic biases of the dataset, which make them brittle against test cases outside the training distribution.
Gurevych, Iryna +2 more
core +2 more sources
Knowledge distillation is the procedure of transferring "knowledge" from a large model (the teacher) to a more compact one (the student), often being used in the context of model compression. When both models have the same architecture, this procedure is called self-distillation.
Pham, Minh +3 more
openaire +2 more sources
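The snippet above defines self-distillation as knowledge distillation between two models of the same architecture. As a minimal illustration of the underlying loss (a sketch assuming the standard temperature-softened KD objective of Hinton et al., not the specific method of this paper):

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; a higher T yields a softer distribution.
    scaled = [z / T for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, T=2.0):
    # KL(teacher || student) on the softened distributions, scaled by T^2
    # as is conventional in knowledge distillation.
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return T * T * sum(pi * (math.log(pi) - math.log(qi))
                       for pi, qi in zip(p, q))
```

In self-distillation the "teacher" and "student" share one architecture (e.g. an earlier checkpoint of the same network teaches the current one), so identical logits drive this loss to zero.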
Bayesian Optimization Meets Self-Distillation
Bayesian optimization (BO) has contributed greatly to improving model performance by suggesting promising hyperparameter configurations iteratively based on observations from multiple training trials. However, only partial knowledge (i.e., the measured performances of trained models and their hyperparameter configurations) from previous trials is ...
Lee, HyunJae +5 more
openaire +2 more sources
Self-distillation for Surgical Action Recognition
Surgical scene understanding is a key prerequisite for context-aware decision support in the operating room. While deep learning-based approaches have already reached or even surpassed human performance in various fields, the task of surgical action recognition remains a major challenge.
Yamlahi, Amine +11 more
openaire +2 more sources
Self-Learning for Few-Shot Remote Sensing Image Captioning
Large-scale caption-labeled remote sensing image samples are expensive to acquire, and the training samples available in practical application scenarios are generally limited.
Haonan Zhou +3 more
doaj +1 more source
A Lightweight Graph Neural Network Algorithm for Action Recognition Based on Self-Distillation
Recognizing human actions can help in numerous ways, such as health monitoring, intelligent surveillance, virtual reality and human–computer interaction. A quick and accurate detection algorithm is required for daily real-time detection. This paper first
Miao Feng, Jean Meunier
doaj +1 more source
Self-Distillation: Towards Efficient and Compact Neural Networks [PDF]
Remarkable achievements have been obtained by deep neural networks in the last several years. However, the breakthrough in neural networks accuracy is always accompanied by explosive growth of computation and parameters, which leads to a severe limitation of model deployment. In this paper, we propose a novel knowledge distillation technique named self-distillation.
Zhang, Linfeng +2 more
openaire +2 more sources
MaskCLIP: Masked Self-Distillation Advances Contrastive Language-Image Pretraining [PDF]
This paper presents a simple yet effective framework MaskCLIP, which incorporates a newly proposed masked self-distillation into contrastive language-image pretraining.
Xiaoyi Dong +11 more
semanticscholar +1 more source