Some of the following articles may not be open access.
ViT-YOLO: Transformer-Based YOLO for Object Detection
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), 2021
Drone-captured images have overwhelming characteristics including dramatic scale variance, complicated background filled with distractors, and flexible viewpoints, which pose enormous challenges for general object detectors based on common convolutional ...
Zixiao Zhang +5 more
semanticscholar +1 more source
CAS-ViT: Convolutional Additive Self-attention Vision Transformers for Efficient Mobile Applications
IEEE Transactions on Image Processing
Vision Transformers (ViTs) mark a revolutionary advance in neural networks with their token mixer's powerful global context capability. However, the pairwise token affinity and complex matrix operations limit its deployment on resource-constrained ...
Tianfang Zhang +5 more
semanticscholar +1 more source
Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation
Neural Information Processing Systems
Existing parameter-efficient fine-tuning (PEFT) methods have achieved significant success on vision transformers (ViTs) adaptation by improving parameter efficiency.
Wangbo Zhao +7 more
semanticscholar +1 more source
IEEE journal of biomedical and health informatics
Grading laryngeal squamous cell carcinoma (LSCC) based on histopathological images is a clinically significant yet challenging task. However, more low-effect background semantic information appeared in the feature maps, feature channels, and class ...
Pan Huang +11 more
semanticscholar +1 more source
LF-ViT: Reducing Spatial Redundancy in Vision Transformer for Efficient Image Recognition
AAAI Conference on Artificial Intelligence
The Vision Transformer (ViT) excels in accuracy when handling high-resolution images, yet it confronts the challenge of significant spatial redundancy, leading to increased computational and memory requirements.
Youbing Hu +6 more
semanticscholar +1 more source
Hybrid ViT-CNN Network for Fine-Grained Image Classification
IEEE Signal Processing Letters
In recent years, vision transformer (ViT) has achieved remarkable breakthroughs in fine-grained visual classification (FGVC) because of its self-attention mechanism that excels in extracting distinctive features from different pixels.
Ran Shao, Xiaojun Bi, Zheng Chen
semanticscholar +1 more source
RFAConv-CBM-ViT: enhanced vision transformer for metal surface defect detection
Journal of Supercomputing
Hao Wei +3 more
semanticscholar +1 more source

