Swin transformer - Open Access .click

Speech Swin-Transformer: Exploring a Hierarchical Transformer with Shifted Windows for Speech Emotion Recognition

Swin-Transformer has demonstrated remarkable success in computer vision by leveraging its hierarchical feature representation based on Transformer. In speech signals, emotional information is distributed across different scales of speech features, e.g., ...
Lian, Hailun +6 more
core

SwinV2DNet: Pyramid and Self-Supervision Compounded Feature Learning for Remote Sensing Images Change Detection

2023
Among the current mainstream change detection networks, transformer is deficient in the ability to capture accurate low-level details, while convolutional neural network (CNN) is wanting in the capacity to understand global information and establish ...
Liu, Jia, Wei, Zhihui, Wu, Zebin, Zheng, Dalong +3 more
core

LLVMs4Protest: Harnessing the Power of Large Language and Vision Models for Deciphering Protests in the News

2023
Large language and vision models have transformed how social movements scholars identify protest and extract key protest attributes from multi-modal data such as texts, images, and videos.
Zhang, Yongjun
core

Efficient Wheat Disease Identification Using Hybrid Swin-SHARP Vision Model

IEEE Access
Accurate identification of wheat diseases is an essential component for increasing crop yields and guaranteeing global food security. However, subjective opinions, errors, and laborious procedures frequently limit traditional approaches, which are based ...
Waqar Khalid +3 more
doaj +1 more source

YotoR-You Only Transform One Representation

This paper introduces YotoR (You Only Transform One Representation), a novel deep learning model for object detection that combines Swin Transformers and YoloR architectures.
Loncomilla, Patricio +2 more
core

Sounds like gambling: detection of gambling venue visitation from sounds in gamblers’ environments using a transformer

Objective digital measurement of gamblers visiting gambling venues is conducted using cashless cards and facial recognition systems, but these methods are confined within a single gambling venue.
304901/profile-ja.html +12 more
core

Swin Transformer Fusion Network for Image Quality Assessment

IEEE Access
This paper presents an efficient deep-learning model named Swin Transformer fusion network (STFN) for full-reference image quality assessment (FR-IQA). The STFN model uses the first and second stages of the Swin Transformer for feature extraction.
Hyeongmyeon Kim, Changhoon Yim
doaj +1 more source

DCHT: Deep Complex Hybrid Transformer for Speech Enhancement

2023
Most of the current deep learning-based approaches for speech enhancement only operate in the spectrogram or waveform domain. Although a cross-domain transformer combining waveform- and spectrogram-domain inputs has been proposed, its performance can be ...
Li, Jialu, Li, Junhui, Wang, Pu, Zhang, Youshan +3 more
core

PPLA-Transformer: An Efficient Transformer for Defect Detection with Linear Attention Based on Pyramid Pooling

Sensors
Defect detection is crucial for quality control in industrial products. The defects in industrial products are typically subtle, leading to reduced accuracy in detection.
Xiaona Song +4 more
doaj +1 more source

Railway Signal Relay Voiceprint Fault Diagnosis Method Based on Swin-Transformer and Fusion of Gaussian-Laplacian Pyramid

Mathematics
Fault diagnosis of railway signal relays is crucial for the operational safety and efficiency of railway systems. With the continuous advancement of deep learning techniques in various applications, voiceprint-based fault diagnosis has emerged as a ...
Yi Liu +4 more
doaj +1 more source
