Vision transformer - Open Access .click

Vision Transformer with Progressive Sampling [PDF]

2021 IEEE/CVF International Conference on Computer Vision (ICCV), 2021
Accepted to ICCV ...
Xiaoyu Yue +6 more
openaire +5 more sources

Vision Transformer in Industrial Visual Inspection

Applied Sciences, 2022
Artificial intelligence as an approach to visual inspection in industrial applications has been considered for decades. Recent successes, driven by advances in deep learning, present a potential paradigm shift and have the potential to facilitate an ...
Nils Hütten, Richard Meyes, Tobias Meisen +2 more
doaj +2 more sources

Enhancing Security: Infused Hybrid Vision Transformer for Signature Verification [PDF]

IEEE Access
Handwritten signature verification is challenging because handwritten signatures vary widely in orientation, thickness, and appearance.
Muhammad Ishfaq +3 more
doaj +2 more sources

Super Vision Transformer

International Journal of Computer Vision, 2023
We attempt to reduce the computational costs of vision transformers (ViTs), which increase quadratically with the number of tokens. We present a novel training paradigm that trains only one ViT model at a time, but is capable of providing improved image recognition performance with various computational costs. Here, the trained ViT model, termed super vision
Mingbao Lin +6 more
openaire +2 more sources

Scaling Vision Transformers

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
Attention-based neural networks such as the Vision Transformer (ViT) have recently attained state-of-the-art results on many computer vision benchmarks. Scale is a primary ingredient in attaining excellent results; therefore, understanding a model's scaling properties is key to designing future generations effectively.
Xiaohua Zhai +3 more
openaire +2 more sources

A Review of Transformer-Based Approaches for Image Captioning

Applied Sciences, 2023
Visual understanding is a research area that bridges the gap between computer vision and natural language processing. Image captioning is a visual understanding task in which natural language descriptions of images are automatically generated using ...
Oscar Ondeng, Heywood Ouma, Peter Akuon
doaj +1 more source

STHarDNet: Swin Transformer with HarDNet for MRI Segmentation

Applied Sciences, 2022
In magnetic resonance imaging (MRI) segmentation, conventional approaches utilize U-Net models with encoder–decoder structures, segmentation models using vision transformers, or models that combine a vision transformer with an encoder–decoder model ...
Yeonghyeon Gu, Zhegao Piao, Seong Joon Yoo +2 more
doaj +1 more source

Building Extraction With Vision Transformer [PDF]

IEEE Transactions on Geoscience and Remote Sensing, 2022
Submitted to ...
Libo Wang +3 more
openaire +2 more sources

Peripheral Vision Transformer

Advances in Neural Information Processing Systems 35, 2022
Human vision possesses a special type of visual processing system called peripheral vision. By partitioning the entire visual field into multiple contour regions based on the distance to the center of our gaze, peripheral vision gives us the ability to perceive various visual features in different regions.
Juhong Min +3 more
openaire +3 more sources

Convolutional Neural Networks or Vision Transformers: Who Will Win the Race for Action Recognitions in Visual Data?

Sensors, 2023
Understanding actions in videos remains a significant challenge in computer vision and has been the subject of considerable research in recent decades.
Oumaima Moutik +6 more
doaj +1 more source
