GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints [PDF]
Multi-query attention (MQA), which only uses a single key-value head, drastically speeds up decoder inference. However, MQA can lead to quality degradation, and moreover it may not be desirable to train a separate model just for faster inference.
J. Ainslie +5 more
semanticscholar +1 more source
TPH-YOLOv5: Improved YOLOv5 Based on Transformer Prediction Head for Object Detection on Drone-captured Scenarios [PDF]
Object detection on drone-captured scenarios is a recent popular task. As drones always navigate in different altitudes, the object scale varies violently, which burdens the optimization of networks.
Xingkui Zhu +3 more
semanticscholar +1 more source
Dynamic Head: Unifying Object Detection Heads with Attentions [PDF]
The complex nature of combining localization and classification in object detection has resulted in the flourished development of methods. Previous works tried to improve the performance in various object detection heads but failed to present a unified ...
Xiyang Dai +6 more
semanticscholar +1 more source
Head and neck squamous cell carcinoma
Head and neck squamous cell carcinomas (HNSCCs) originate from the mucosal epithelium in the oral cavity, pharynx and larynx, and are caused by viral infection or carcinogen exposure.
Daniel E. Johnson +5 more
semanticscholar +2 more sources
One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing [PDF]
We propose a neural talking-head video synthesis model and demonstrate its application to video conferencing. Our model learns to synthesize a talking-head video using a source image containing the target person’s appearance and a driving video that ...
Ting-Chun Wang, Arun Mallya, Ming-Yu Liu
semanticscholar +1 more source
Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned [PDF]
Multi-head self-attention is a key component of the Transformer, a state-of-the-art architecture for neural machine translation. In this work we evaluate the contribution made by individual attention heads to the overall performance of the model and ...
Elena Voita +4 more
semanticscholar +1 more source
SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning [PDF]
The attention mechanism is becoming increasingly popular in Natural Language Processing (NLP) applications, showing superior performance than convolutional and recurrent architectures.
Hanrui Wang, Zhekai Zhang, Song Han
semanticscholar +1 more source
Reviewing the epidemiology of head and neck cancer: definitions, trends and risk factors
Introduction: Head and neck cancer appears to be increasing in incidence, with potential changes in aetiology proposed. This paper aims to provide a narrative overview of the epidemiological literature to describe the disease burden and trends in terms of
M. Gormley +4 more
semanticscholar +1 more source
Head to toe, in the head [PDF]
Sometime about 250,000 y ago, primates started talking to each other (1). Before that time facial expressions and body language were the main modes of communication among primates. Even today in the presence of our sophisticated language system, face and body gestures play a major role in human communication.
openaire +3 more sources
Head-to-Head Polymers XXXV. Head-to-Head Poly(2-vinylnaphthalene) [PDF]
Head-to-head poly(2-vinylnaphthalene) was prepared in five steps from 2-naphthylacetic acid. The methyl ester was brominated with N-bromosuccinimide and 2-(2-naphthyl)-2-bromoacetate treated with a copper/zinc couple, which gave dimethyl 2,3-di(2-naphthyl)succinate in moderate yield.
Nanasawa, Masato, Hu, Liping, Vogl, Otto
openaire +1 more source

