Some of the following articles may not be open access.

ViT-YOLO: Transformer-Based YOLO for Object Detection

2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), 2021
Drone-captured images have overwhelming characteristics including dramatic scale variance, complicated background filled with distractors, and flexible viewpoints, which pose enormous challenges for general object detectors based on common convolutional ...
Zixiao Zhang   +5 more

CAS-ViT: Convolutional Additive Self-attention Vision Transformers for Efficient Mobile Applications

IEEE Transactions on Image Processing
Vision Transformers (ViTs) mark a revolutionary advance in neural networks with their token mixer's powerful global context capability. However, the pairwise token affinity and complex matrix operations limit its deployment on resource-constrained ...
Tianfang Zhang   +5 more

Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation

Neural Information Processing Systems
Existing parameter-efficient fine-tuning (PEFT) methods have achieved significant success on vision transformers (ViTs) adaptation by improving parameter efficiency.
Wangbo Zhao   +7 more

LA-ViT: A Network With Transformers Constrained by Learned-Parameter-Free Attention for Interpretable Grading in a New Laryngeal Histopathology Image Dataset

IEEE Journal of Biomedical and Health Informatics
Grading laryngeal squamous cell carcinoma (LSCC) based on histopathological images is a clinically significant yet challenging task. However, more low-effect background semantic information appeared in the feature maps, feature channels, and class ...
Pan Huang   +11 more

LF-ViT: Reducing Spatial Redundancy in Vision Transformer for Efficient Image Recognition

AAAI Conference on Artificial Intelligence
The Vision Transformer (ViT) excels in accuracy when handling high-resolution images, yet it confronts the challenge of significant spatial redundancy, leading to increased computational and memory requirements.
Youbing Hu   +6 more

Hybrid ViT-CNN Network for Fine-Grained Image Classification

IEEE Signal Processing Letters
In recent years, vision transformer (ViT) has achieved remarkable breakthroughs in fine-grained visual classification (FGVC) because of its self-attention mechanism that excels in extracting distinctive features from different pixels.
Ran Shao, Xiaojun Bi, Zheng Chen

RFAConv-CBM-ViT: Enhanced Vision Transformer for Metal Surface Defect Detection

Journal of Supercomputing
Hao Wei   +3 more
