gSwin: Gated MLP Vision Model with Hierarchical Structure of Shifted Window
Following its success in the language domain, the self-attention mechanism (transformer) has been adopted in the vision domain and has recently achieved great success. In a separate stream, the multi-layer perceptron (MLP) has also been explored in the vision domain.
Go, Mocho, Tachibana, Hideyuki
core +1 more source
Center Point Target Detection Algorithm Based on Improved Swin Transformer [PDF]
To address the shortcomings of Swin Transformer in extracting local feature information and expressing features, this paper proposes a center point target detection algorithm based on an improved Swin Transformer to improve its performance in target detection.
LIU Jiasen, HUANG Jun
doaj +1 more source
A wheat spike detection method based on Transformer
Wheat spike detection has important research significance for production estimation and crop field management. With the development of deep learning-based algorithms, researchers tend to solve the detection task by convolutional neural networks (CNNs ...
Qiong Zhou +11 more
doaj +1 more source
SSformer: A Lightweight Transformer for Semantic Segmentation
It is widely believed that Transformers perform better in semantic segmentation than convolutional neural networks. Nevertheless, the original Vision Transformer may lack the inductive biases of local neighborhoods and has a high time complexity.
Gao, Pan, Shi, Wentao, Xu, Jing
core
Swin on Axes: Extending Swin Transformers to Quadtree Image Representations [PDF]
In recent years, Transformer models have revolutionized machine learning. While this has resulted in impressive results in the field of Natural Language Processing, Computer Vision quickly stumbled upon computation and memory problems due to the high resolution and dimensionality of the input data. This is particularly true for video, where the number
Marc Oliu +3 more
openaire +2 more sources
Classification and Model Explanation of Traditional Dwellings Based on Improved Swin Transformer
Feature extraction and classification of traditional dwellings play a significant role in preserving these structures and ensuring their sustainable development.
Shangbo Miao +3 more
doaj +1 more source
A Swin Transformer-Based Encoding Booster Integrated in U-Shaped Network for Building Extraction
Building extraction is a popular topic in remote sensing image processing. Efficient building extraction algorithms can identify and segment building areas to provide informative data for downstream tasks.
Xiao Xiao +5 more
doaj +1 more source
STUCNET – SWIN TRANSFORMER-V2 UNET FOR CRACK SEGMENTATION NETWORK [PDF]
Automatic crack detection on road surfaces is an important task for supporting the quality control of road infrastructure in transportation. Various methods have been proposed for crack segmentation, but their accuracy is still limited.
Nguyen, Le Hoang Tung, Phan, Hai-Hong
core +1 more source
Vision transformer and explainable transfer learning models for auto detection of kidney cyst, stone and tumor from CT-radiography [PDF]
Renal failure, a public health concern, and the scarcity of nephrologists around the globe have necessitated the development of an AI-based system to auto-diagnose kidney diseases.
Alam, Md Golam Rabiul +5 more
core +2 more sources
Evaluation and Mitigation of Faults Affecting Swin Transformers
In the last decade, a huge effort has been spent on assessing the reliability of Convolutional Neural Networks (CNNs), probably the most popular architecture for image classification tasks. However, modern Deep Neural Networks (DNNs) are rapidly overtaking CNNs, as state-of-the-art results for many tasks are achieved with the Transformers, innovative ...
Gabriele Gavarini +2 more
openaire +2 more sources

