Results 11 to 20 of about 4,930,132 (374)

InfoMSD: an information-maximization self-distillation framework for parameter-efficient fine-tuning on artwork images [PDF]

open access: yes (Frontiers in Artificial Intelligence)
In recent years, despite the remarkable performance of large-scale vision language models across various visual classification tasks, their substantial parameter counts and high fine-tuning costs have hindered deployment in resource-constrained cultural ...
Feng Guan   +3 more
doaj   +2 more sources

Toward Generalized Multistage Clustering: Multiview Self-Distillation

open access: yes (IEEE Transactions on Neural Networks and Learning Systems, 2023)
Existing multistage clustering methods independently learn salient features from multiple views and then perform the clustering task. In particular, multi-view clustering (MVC) has attracted considerable attention in multi-view and multi-modal scenarios.
Jiatai Wang, Zhiwei Xu, Xin Wang, Tao Li
openaire   +4 more sources

SkillFactory: Self-Distillation For Learning Cognitive Behaviors [PDF]

open access: green
Reasoning models leveraging long chains of thought employ various cognitive skills, such as verifying their answers, backtracking, retrying with an alternate method, and more. Previous work has shown that when a base language model exhibits these skills, further training with reinforcement learning (RL) can teach it to leverage them.
Zayne Sprague   +5 more
openalex   +3 more sources

COSMOS: Cross-Modality Self-Distillation for Vision Language Pre-training [PDF]

open access: green (Computer Vision and Pattern Recognition)
Vision-Language Models (VLMs) trained with contrastive loss have achieved significant advancements in various vision and language tasks. However, the global nature of the contrastive loss makes VLMs focus predominantly on foreground objects, neglecting ...
Sanghwan Kim   +4 more
openalex   +2 more sources

Leave No One Behind: Online Self-Supervised Self-Distillation for Sequential Recommendation [PDF]

open access: yes (The Web Conference)
Sequential recommendation methods play a pivotal role in modern recommendation systems. A key challenge lies in accurately modeling user preferences in the face of data sparsity. To tackle this challenge, recent methods leverage contrastive learning (CL)
Shaowei Wei   +7 more
semanticscholar   +3 more sources

Self-Distillation Improves DNA Sequence Inference [PDF]

open access: green
Self-supervised pretraining (SSP) has been recognized as a method to enhance prediction accuracy in various downstream tasks. However, its efficacy for DNA sequences remains somewhat constrained. This limitation stems primarily from the fact that most existing SSP approaches in genomics focus on masked language modeling of individual sequences ...
Tong Yu   +4 more
openalex   +3 more sources

Self-Distillation for Unsupervised 3D Domain Adaptation [PDF]

open access: green (2023)
Adriano Cardace   +4 more
openalex   +4 more sources

A Teacher-Free Graph Knowledge Distillation Framework With Dual Self-Distillation

open access: yes (IEEE Transactions on Knowledge and Data Engineering)
Recent years have witnessed great success in handling graph-related tasks with Graph Neural Networks (GNNs). Despite the academic success of GNNs, Multi-Layer Perceptrons (MLPs) remain the primary workhorse for practical industrial applications.
Lirong Wu   +4 more
semanticscholar   +3 more sources

Understanding the Gains from Repeated Self-Distillation

open access: yes (Neural Information Processing Systems)
Self-Distillation is a special type of knowledge distillation where the student model has the same architecture as the teacher model. Despite using the same architecture and the same training data, self-distillation has been empirically observed to ...
Divyansh Pareek, S. S. Du, Sewoong Oh
semanticscholar   +3 more sources
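The snippet above defines self-distillation as knowledge distillation in which the student shares the teacher's architecture. A minimal sketch of repeated self-distillation, using ridge regression as a stand-in model class (the setting, model, blend weight `alpha`, and round count are illustrative assumptions, not taken from any of the listed papers):

```python
import numpy as np

def fit_ridge(X, y, lam=1.0):
    # Closed-form ridge regression: w = (X'X + lam*I)^-1 X'y.
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def self_distill(X, y, lam=1.0, alpha=0.5, rounds=2):
    # Round 0: the "teacher" is fit on the ground-truth labels.
    w = fit_ridge(X, y, lam)
    for _ in range(rounds):
        # Each student uses the same model class (same "architecture")
        # and the same inputs, but fits a blend of the original labels
        # and the previous model's predictions (soft targets).
        soft_targets = X @ w
        w = fit_ridge(X, alpha * y + (1 - alpha) * soft_targets, lam)
    return w

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))
y = X @ rng.standard_normal(5) + 0.1 * rng.standard_normal(100)
w_teacher = fit_ridge(X, y)
w_student = self_distill(X, y, rounds=3)
```

Because the ridge operator shrinks the soft targets each round, each distilled solution has norm no larger than the teacher's; this regularizing effect is one intuition for why repeating the procedure can change generalization even though the architecture and data never change.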
