
GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints [PDF]

open access: yes, Conference on Empirical Methods in Natural Language Processing, 2023
Multi-query attention (MQA), which only uses a single key-value head, drastically speeds up decoder inference. However, MQA can lead to quality degradation, and moreover it may not be desirable to train a separate model just for faster inference.
J. Ainslie   +5 more
semanticscholar   +1 more source

TPH-YOLOv5: Improved YOLOv5 Based on Transformer Prediction Head for Object Detection on Drone-captured Scenarios [PDF]

open access: yes, 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), 2021
Object detection on drone-captured scenarios is a recent popular task. As drones always navigate in different altitudes, the object scale varies violently, which burdens the optimization of networks.
Xingkui Zhu   +3 more
semanticscholar   +1 more source

Dynamic Head: Unifying Object Detection Heads with Attentions [PDF]

open access: yes, Computer Vision and Pattern Recognition, 2021
The complex nature of combining localization and classification in object detection has resulted in the flourished development of methods. Previous works tried to improve the performance in various object detection heads but failed to present a unified ...
Xiyang Dai   +6 more
semanticscholar   +1 more source

Head and neck squamous cell carcinoma

open access: yes, Nature Reviews Disease Primers, 2020
Head and neck squamous cell carcinomas (HNSCCs) originate from the mucosal epithelium in the oral cavity, pharynx and larynx, and are caused by viral infection or carcinogen exposure.
Daniel E. Johnson   +5 more
semanticscholar   +2 more sources

One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing [PDF]

open access: yes, Computer Vision and Pattern Recognition, 2020
We propose a neural talking-head video synthesis model and demonstrate its application to video conferencing. Our model learns to synthesize a talking-head video using a source image containing the target person’s appearance and a driving video that ...
Ting-Chun Wang, Arun Mallya, Ming-Yu Liu
semanticscholar   +1 more source

Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned [PDF]

open access: yes, Annual Meeting of the Association for Computational Linguistics, 2019
Multi-head self-attention is a key component of the Transformer, a state-of-the-art architecture for neural machine translation. In this work we evaluate the contribution made by individual attention heads to the overall performance of the model and ...
Elena Voita   +4 more
semanticscholar   +1 more source

Ethics in educational research: review boards, ethical issues and researcher development [PDF]

open access: yes, 2020
Educational research, and research in the Social Sciences more generally, has experienced a growth in the introduction of ethical review boards since the 1990s.
Head, George
core   +1 more source

Effect of Operating Head on Dynamic Behavior of a Pump–Turbine Runner in Turbine Mode

open access: yes, Energies, 2022
Pumped storage units improve the stability of the power grid, and the key component is the pump–turbine. A pump–turbine usually needs to start and shutdown frequently, and the operating head varies greatly due to changes in the water level of the ...
Xiangyang Li   +8 more
doaj   +1 more source

OPTIMASI UNJUK KERJA KINCIR AIR UNDERSHOT

open access: yes, Rekayasa Mesin, 2023
This research investigated the performance of an undershot waterwheel with hydraulic channel modifications. Testing was carried out on an undershot waterwheel with a diameter of 0.48 m, a width of 0.10 m and a blade count of 12 ...
Agato Agato   +4 more
doaj   +1 more source

SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning [PDF]

open access: yes, International Symposium on High-Performance Computer Architecture, 2020
The attention mechanism is becoming increasingly popular in Natural Language Processing (NLP) applications, showing superior performance to convolutional and recurrent architectures.
Hanrui Wang, Zhekai Zhang, Song Han
semanticscholar   +1 more source
