Results 331 to 340 of about 4,930,132
Some of the following articles may not be open access.

Enhancing Multilingual Capabilities of Large Language Models through Self-Distillation from Resource-Rich Languages

Annual Meeting of the Association for Computational Linguistics
While large language models (LLMs) have been pre-trained on multilingual corpora, their performance in most languages still lags behind that in a few resource-rich languages.
Yuan Zhang   +7 more
semanticscholar   +1 more source

Probabilistic online self-distillation

Neurocomputing, 2022
Maria Tzelepi   +2 more
openaire   +1 more source

UNDIAL: Self-Distillation with Adjusted Logits for Robust Unlearning in Large Language Models

North American Chapter of the Association for Computational Linguistics
Mitigating the retention of sensitive or private information in large language models is essential for enhancing privacy and safety. Existing unlearning methods, like Gradient Ascent and Negative Preference Optimization, directly tune models to remove ...
Yijiang River Dong   +4 more
semanticscholar   +1 more source

Self-Distillation for Nonlinear Process Monitoring

International Symposium on Computer Science and Intelligent Control
Currently, lightweight deep latent variable models are widely applied in industrial process monitoring, combining the strengths of traditional latent variable methods and deep neural networks.
Honghai Zhang, Guangjie Chen, Le Zhou
semanticscholar   +1 more source

Momentum-GS: Momentum Gaussian Self-Distillation for High-Quality Large Scene Reconstruction

arXiv.org
3D Gaussian Splatting has demonstrated notable success in large-scale scene reconstruction, but challenges persist due to high training memory consumption and storage overhead.
Jixuan Fan   +3 more
semanticscholar   +1 more source

SD-FSOD: Self-Distillation Paradigm via Distribution Calibration for Few-Shot Object Detection

IEEE Transactions on Circuits and Systems for Video Technology
Few-shot object detection (FSOD) aims to detect novel targets with only a few instances of the associated samples. Although combinations of distillation techniques and meta-learning paradigms have been acknowledged as the primary strategies for FSOD ...
Han Chen   +7 more
semanticscholar   +1 more source

How JEPA Avoids Noisy Features: The Implicit Bias of Deep Linear Self Distillation Networks

Neural Information Processing Systems
Two competing paradigms exist for self-supervised learning of data representations. Joint Embedding Predictive Architecture (JEPA) is a class of architectures in which semantically similar inputs are encoded into representations that are predictive of ...
Etai Littwin   +6 more
semanticscholar   +1 more source

CADS: A Self-Supervised Learner via Cross-Modal Alignment and Deep Self-Distillation for CT Volume Segmentation

IEEE Transactions on Medical Imaging
Self-supervised learning (SSL) has long had great success in advancing the field of annotation-efficient learning. However, when applied to CT volume segmentation, most SSL methods suffer from two limitations, including rarely using the information ...
Yiwen Ye   +3 more
semanticscholar   +1 more source

SSSD: Self-Supervised Self Distillation

IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023
Wei-Chi Chen, Wei-Ta Chu
openaire   +1 more source

Open Vocabulary 3D Scene Understanding via Geometry Guided Self-Distillation

European Conference on Computer Vision
The scarcity of large-scale 3D-text paired data poses a great challenge on open vocabulary 3D scene understanding, and hence it is popular to leverage internet-scale 2D data and transfer their open vocabulary capabilities to 3D models through knowledge ...
Pengfei Wang   +5 more
semanticscholar   +1 more source
