We attempt to reduce the computational costs in vision transformers (ViTs), which increase quadratically in the token number. We present a novel training paradigm that trains only one ViT model at a time, but is capable of providing improved image recognition performance with various computational costs. Here, the trained ViT model, termed super vision
Lin, Mingbao +5 more
openaire +2 more sources
BUViTNet: Breast Ultrasound Detection via Vision Transformers
Convolutional neural networks (CNNs) have enhanced ultrasound image-based early breast cancer detection. Vision transformers (ViTs) have recently surpassed CNNs as the most effective method for natural image analysis. ViTs have proven their capability of
Gelan Ayana, Se-woon Choe
doaj +1 more source
LIFT: Learned Invariant Feature Transform [PDF]
We introduce a novel Deep Network architecture that implements the full feature point handling pipeline, that is, detection, orientation estimation, and feature description.
Fua, Pascal +3 more
core +2 more sources
3D-Vision-Transformer Stacking Ensemble for Assessing Prostate Cancer Aggressiveness from T2w Images
Vision transformers represent the cutting-edge topic in computer vision and are usually employed on two-dimensional data following a transfer learning approach.
Eva Pachetti, Sara Colantonio
doaj +1 more source
Survey of Vision Transformers (ViT) [PDF]
The Vision Transformer (ViT), an application of the Transformer architecture with an encoder-decoder structure, has garnered remarkable success in the field of computer vision. Over the past few years, research centered around ViT has witnessed a prolific ...
LI Yujie, MA Zihang, WANG Yifu, WANG Xinghe, TAN Benying
doaj +1 more source
Vision Transformers for Remote Sensing Image Classification
In this paper, we propose a remote-sensing scene-classification method based on vision transformers. These types of networks, which are now recognized as state-of-the-art models in natural language processing, do not rely on convolution layers as in ...
Yakoub Bazi +4 more
doaj +1 more source
Sign language recognition with transformer networks [PDF]
Sign languages are complex languages. Research into them is ongoing, supported by large video corpora of which only small parts are annotated. Sign language recognition can be used to speed up the annotation process of these corpora, in order to aid ...
Dambre, Joni +2 more
core
Measurements With A Quantum Vision Transformer: A Naive Approach [PDF]
In mainstream machine learning, transformers are gaining widespread usage. As Vision Transformers rise in popularity in computer vision, they now aim to tackle a wide variety of machine learning applications.
Pasquali Dominic +2 more
doaj +1 more source
Vision Transformers in Image Restoration: A Survey
The Vision Transformer (ViT) architecture has been remarkably successful in image restoration. For a while, Convolutional Neural Networks (CNN) predominated in most computer vision tasks.
Anas M. Ali +5 more
doaj +1 more source
Vision Transformers Are Robust Learners
Transformers, composed of multiple self-attention layers, hold strong promises toward a generic learning primitive applicable to different data modalities, including the recent breakthroughs in computer vision achieving state-of-the-art (SOTA) standard accuracy. What remains largely unexplored is their robustness evaluation and attribution.
Paul, Sayak; Chen, Pin-Yu
openaire +2 more sources

