Prior works have proposed several strategies to reduce the computational cost of the self-attention mechanism. Many of these works decompose the self-attention procedure into regional and local feature extraction procedures, each of which incurs a much smaller computational complexity.
Ting Yao +5 more
openaire +3 more sources
Multiscale Vision Transformers [PDF]
Technical ...
Fan, Haoqi +6 more
openaire +2 more sources
code: https://github.com/OpenNLPLab/Vicinity-Vision ...
Weixuan Sun +9 more
openaire +3 more sources
A Survey on Vision Transformer [PDF]
Transformer, first applied to the field of natural language processing, is a type of deep neural network mainly based on the self-attention mechanism. Thanks to its strong representation capabilities, researchers are looking at ways to apply transformer to computer vision tasks.
Kai Han +12 more
openaire +2 more sources
Reversible Vision Transformers
We present Reversible Vision Transformers, a memory efficient architecture design for visual recognition. By decoupling the GPU memory requirement from the depth of the model, Reversible Vision Transformers enable scaling up architectures with efficient memory usage.
Mangalam, Karttikeya +6 more
openaire +2 more sources
Building Extraction With Vision Transformer [PDF]
Submitted to ...
Libo Wang +3 more
openaire +2 more sources
Vision Transformer with Progressive Sampling [PDF]
Accepted to ICCV ...
Yue, X +6 more
openaire +3 more sources
Semi-supervised Vision Transformers
We study the training of Vision Transformers for semi-supervised image classification. Transformers have recently demonstrated impressive performance on a multitude of supervised learning tasks. Surprisingly, we show Vision Transformers perform significantly worse than Convolutional Neural Networks when only a small set of labeled data is available ...
Zejia Weng +4 more
openaire +2 more sources
This study introduces an optimal topology of vision transformers for real-time video action recognition in a cloud-based solution. Although model performance is a key criterion for real-time video analysis use cases, inference latency plays a more ...
Saman Sarraf, Milton Kabia
doaj +1 more source
Measuring miniature eye movements by means of a SQUID magnetometer [PDF]
A new technique to measure small eye movements is reported. The precise recording of human eye movements is necessary for research on visual fatigue induced by visual display units [1]. So far all methods used have disadvantages: especially those which are ...
Breukink, E.W. +4 more
core +4 more sources

