
ViTT: Vision Transformer Tracker [PDF]

open access: yes; Sensors, 2021
This paper presents a new transformer-based model for multi-object tracking (MOT). MOT is a spatiotemporal correlation task among objects of interest and one of the crucial technologies for multi-unmanned aerial vehicle (Multi-UAV) systems.
Xiaoning Zhu   +4 more
doaj   +3 more sources

Review of Transformer in Computer Vision [PDF]

open access: yes; Jisuanji kexue, 2023
The Transformer is an attention-based encoder-decoder architecture. Due to its long-range sequence modeling and parallel computing capability, the Transformer has made a significant breakthrough in natural language processing and is gradually expanding to ...
CHEN Luoxuan, LIN Chengchuang, ZHENG Zhaoliang, MO Zefeng, HUANG Xinyi, ZHAO Gansen
doaj   +1 more source

Evaluation and Comparison of Semantic Segmentation Networks for Rice Identification Based on Sentinel-2 Imagery

open access: yes; Remote Sensing, 2023
Efficient and accurate rice identification based on high spatial and temporal resolution remote sensing imagery is essential for achieving precision agriculture and ensuring food security.
Huiyao Xu, Jia Song, Yunqiang Zhu
doaj   +1 more source

CSiT: A Multiscale Vision Transformer for Hyperspectral Image Classification

open access: yes; IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2022
The hyperspectral image (HSI) has nearly continuous spectral information; thus, the target of interest can be accurately identified by the subtle details of spectral properties.
Wenxuan He   +4 more
doaj   +1 more source

Transformer-Based Semantic Segmentation for Extraction of Building Footprints from Very-High-Resolution Images

open access: yes; Sensors, 2023
Semantic segmentation with deep learning networks has become an important approach to the extraction of objects from very high-resolution remote sensing images.
Jia Song, A-Xing Zhu, Yunqiang Zhu
doaj   +1 more source

A Review of Transformer-Based Approaches for Image Captioning

open access: yes; Applied Sciences, 2023
Visual understanding is a research area that bridges the gap between computer vision and natural language processing. Image captioning is a visual understanding task in which natural language descriptions of images are automatically generated using ...
Oscar Ondeng, Heywood Ouma, Peter Akuon
doaj   +1 more source

STHarDNet: Swin Transformer with HarDNet for MRI Segmentation

open access: yes; Applied Sciences, 2022
In magnetic resonance imaging (MRI) segmentation, conventional approaches utilize U-Net models with encoder–decoder structures, segmentation models using vision transformers, or models that combine a vision transformer with an encoder–decoder model ...
Yeonghyeon Gu   +2 more
doaj   +1 more source

Convolutional Neural Networks or Vision Transformers: Who Will Win the Race for Action Recognitions in Visual Data?

open access: yes; Sensors, 2023
Understanding actions in videos remains a significant challenge in computer vision and has been the subject of considerable research in recent decades.
Oumaima Moutik   +6 more
doaj   +1 more source

Transformer-based ripeness segmentation for tomatoes

open access: yes; Smart Agricultural Technology, 2023
With the recent development of computer vision technology, various computer vision techniques have been applied to agriculture. Recently, the Transformer network has been introduced to image recognition, which allows a different approach to extracting ...
Risa Shinoda   +3 more
doaj   +1 more source

Transformer architectures for computer vision: A comprehensive review and future research directions [PDF]

open access: yes; EPJ Web of Conferences
In the past, long-range dependencies and contextual relationships in videos were captured using Convolutional Neural Networks (CNNs). Recently, Transformers have begun to be used for capturing the long-range dependencies and contextual relationships in ...
Ugile Tukaram, Uke Nilesh
doaj   +1 more source
