Multiscale Vision Transformers [PDF]

open access: yes
2021 IEEE/CVF International Conference on Computer Vision (ICCV), 2021
Technical ...
Haoqi Fan   +6 more

Vicinity Vision Transformer

open access: yes
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023
code: https://github.com/OpenNLPLab/Vicinity-Vision ...
Weixuan Sun   +9 more

Transformers in Vision: A Survey [PDF]

open access: yes
ACM Computing Surveys, 2022
Astounding results from Transformer models on natural language tasks have intrigued the vision community to study their application to computer vision problems. Among their salient benefits, Transformers enable modeling long dependencies between input sequence elements and support parallel processing of sequences, as compared to recurrent networks, e.g.,
Salman H. Khan   +5 more

Dual Vision Transformer

open access: yes
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023
Prior works have proposed several strategies to reduce the computational cost of the self-attention mechanism. Many of these works decompose the self-attention procedure into regional and local feature extraction procedures, each of which incurs a much smaller computational complexity.
Ting Yao   +5 more

Super Vision Transformer

open access: yes
International Journal of Computer Vision, 2023
We attempt to reduce the computational costs of vision transformers (ViTs), which grow quadratically with the number of tokens. We present a novel training paradigm that trains only one ViT model at a time, but is capable of providing improved image recognition performance with various computational costs. Here, the trained ViT model, termed super vision
Mingbao Lin   +6 more

Scaling Vision Transformers

open access: yes
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
Attention-based neural networks such as the Vision Transformer (ViT) have recently attained state-of-the-art results on many computer vision benchmarks. Scale is a primary ingredient in attaining excellent results; therefore, understanding a model's scaling properties is key to designing future generations effectively.
Xiaohua Zhai   +3 more

Building Extraction With Vision Transformer [PDF]

open access: yes
IEEE Transactions on Geoscience and Remote Sensing, 2022
Submitted to ...
Libo Wang   +3 more

Peripheral Vision Transformer

open access: yes
Advances in Neural Information Processing Systems 35, 2022
Human vision possesses a special type of visual processing system called peripheral vision. By partitioning the entire visual field into multiple contour regions based on the distance to the center of our gaze, peripheral vision gives us the ability to perceive various visual features in different regions.
Juhong Min   +3 more

Reversible Vision Transformers

open access: yes
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
We present Reversible Vision Transformers, a memory-efficient architecture design for visual recognition. By decoupling the GPU memory requirement from the depth of the model, Reversible Vision Transformers enable scaling up architectures with efficient memory usage.
Karttikeya Mangalam   +6 more

Vision Transformer with Progressive Sampling [PDF]

open access: yes
2021 IEEE/CVF International Conference on Computer Vision (ICCV), 2021
Accepted to ICCV ...
Xiaoyu Yue   +6 more
