Results 331 to 340 of about 4,930,132
Some of the following articles may not be open access.
Annual Meeting of the Association for Computational Linguistics
While large language models (LLMs) have been pre-trained on multilingual corpora, their performance still lags behind in most languages compared to a few resource-rich languages.
Yuan Zhang +7 more
Probabilistic online self-distillation
Neurocomputing, 2022
Tzelepi, Maria +2 more
UNDIAL: Self-Distillation with Adjusted Logits for Robust Unlearning in Large Language Models
North American Chapter of the Association for Computational Linguistics
Mitigating the retention of sensitive or private information in large language models is essential for enhancing privacy and safety. Existing unlearning methods, like Gradient Ascent and Negative Preference Optimization, directly tune models to remove ...
Yijiang River Dong +4 more
Self-Distillation for Nonlinear Process Monitoring
International Symposium on Computer Science and Intelligent Control
Currently, lightweight deep latent variable models are widely applied in industrial process monitoring, combining the strengths of traditional latent variable methods and deep neural networks.
Honghai Zhang, Guangjie Chen, Le Zhou
Momentum-GS: Momentum Gaussian Self-Distillation for High-Quality Large Scene Reconstruction
arXiv.org
3D Gaussian Splatting has demonstrated notable success in large-scale scene reconstruction, but challenges persist due to high training memory consumption and storage overhead.
Jixuan Fan +3 more
SD-FSOD: Self-Distillation Paradigm via Distribution Calibration for Few-Shot Object Detection
IEEE Transactions on Circuits and Systems for Video Technology (Print)
Few-shot object detection (FSOD) aims to detect novel targets with only a few instances of the associated samples. Although combinations of distillation techniques and meta-learning paradigms have been acknowledged as the primary strategies for FSOD ...
Han Chen +7 more
How JEPA Avoids Noisy Features: The Implicit Bias of Deep Linear Self Distillation Networks
Neural Information Processing Systems
Two competing paradigms exist for self-supervised learning of data representations. Joint Embedding Predictive Architecture (JEPA) is a class of architectures in which semantically similar inputs are encoded into representations that are predictive of ...
Etai Littwin +6 more
IEEE Transactions on Medical Imaging
Self-supervised learning (SSL) has long had great success in advancing the field of annotation-efficient learning. However, when applied to CT volume segmentation, most SSL methods suffer from two limitations, including rarely using the information ...
Yiwen Ye +3 more
SSSD: Self-Supervised Self Distillation
2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023
Wei-Chi Chen, Wei-Ta Chu
Open Vocabulary 3D Scene Understanding via Geometry Guided Self-Distillation
European Conference on Computer Vision
The scarcity of large-scale 3D-text paired data poses a great challenge to open vocabulary 3D scene understanding, and hence it is popular to leverage internet-scale 2D data and transfer their open vocabulary capabilities to 3D models through knowledge ...
Pengfei Wang +5 more