Swin transformer - Open Access .click


BMPCQA: Bioinspired Metaverse Point Cloud Quality Assessment Based on Large Multimodal Models

Advanced Intelligent Systems, EarlyView.
This study presents a bioinspired metaverse point cloud quality assessment metric, which simulates the human visual evaluation process to perform the point cloud quality assessment task. It first extracts rendering projection video features, normal image features, and point cloud patch features, which are then fed into a large multimodal model to ...
Huiyu Duan +7 more
wiley +1 more source

SWCGAN: Generative Adversarial Network Combining Swin Transformer and CNN for Remote Sensing Image Super-Resolution

IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2022
Easy and efficient acquisition of high-resolution remote sensing images is of importance in geographic information systems. Previously, deep neural networks composed of convolutional layers have achieved impressive progress in super-resolution ...
Jingzhi Tu +3 more
doaj +1 more source

Semantic-Aware Local-Global Vision Transformer

2022
Vision Transformers have achieved remarkable progress, among which Swin Transformer has demonstrated the tremendous potential of Transformers for vision tasks.
Chen, Fanglin +4 more
core

Swin-FER: Swin Transformer for Facial Expression Recognition

Applied Sciences
The ability of transformers to capture global context information is highly beneficial for recognizing subtle differences in facial expressions. However, compared to convolutional neural networks, transformers require the computation of dependencies between each element and all other elements, leading to high computational complexity. Additionally, the
Mei Bie +4 more
openaire +2 more sources

Source Microphone Identification Using Swin Transformer

Applied Sciences, 2023
Microphone identification is a crucial challenge in the field of digital audio forensics. The ability to accurately identify the type of microphone used to record a piece of audio can provide important information for forensic analysis and crime investigations.
Mustafa Qamhan +2 more
openaire +2 more sources

VAE+DDPG: An Attention‐Enhanced Variational Autoencoder for Deep Reinforcement Learning‐Based Autonomous Navigation in Low‐Light Environments

Advanced Intelligent Systems, EarlyView.
Variational Autoencoder+Deep Deterministic Policy Gradient addresses low‐light failures of infrared depth sensing for indoor robot navigation. Stage 1 pretrains an attention‐enhanced Variational Autoencoder (Convolutional Block Attention Module+Feature Pyramid Network) to map dark depth frames to a well‐lit reconstruction, yielding a 128‐D latent code ...
Uiseok Lee +7 more
wiley +1 more source

Multi-Focus Microscopy Image Fusion Based on Swin Transformer Architecture

Applied Sciences, 2023
In this study, we introduce the U-Swin fusion model, an effective and efficient transformer-based architecture designed for the fusion of multi-focus microscope images.
Han Hank Xia, Hao Gao, Hang Shao, Kun Gao, Wei Liu +4 more
doaj +1 more source

Pattern Attention Transformer with Doughnut Kernel

2023
We present a new architecture, the Pattern Attention Transformer (PAT), which is composed of the new doughnut kernel. Compared with tokens in the NLP field, Transformers in computer vision face the problem of handling the high resolution of ...
Sheng, WenYuan
core

KDLM: Lightweight Brain Tumor Segmentation via Knowledge Distillation

Advanced Intelligent Systems, EarlyView.
A lightweight student network is designed, which is based on multiscale and multilevel feature fusion and combined with the residual channel attention mechanism to achieve efficient feature extraction and fusion with very few parameters. A dual‐teacher collaborative knowledge distillation framework is proposed.
Baotian Li +4 more
wiley +1 more source

HEAL-SWIN: A Vision Transformer On The Sphere

2023
High-resolution wide-angle fisheye images are becoming more and more important for robotics applications such as autonomous driving. However, using ordinary convolutional neural networks or vision transformers on this data is problematic due to ...
Carlsson, Oscar +6 more
core
