Content understanding - Open Access .click


HierTTS: Expressive End-to-End Text-to-Waveform Using a Multi-Scale Hierarchical Variational Auto-Encoder

Applied Sciences, 2023
End-to-end text-to-speech (TTS) models that directly generate waveforms from text are gaining popularity. However, existing end-to-end models are still not natural enough in their prosodic expressiveness.
Zengqiang Shang +4 more
doaj +1 more source

Explore Long-Range Context Features for Speaker Verification

Applied Sciences, 2023
Multi-scale context information, especially long-range dependency, has been shown to be beneficial for speaker verification (SV) tasks. In this paper, we propose three methods to systematically explore long-range context SV feature extraction based on ResNet ...
Zhuo Li +4 more
doaj +1 more source

An individualization approach for head-related transfer function in arbitrary directions based on deep learning

JASA Express Letters, 2022
This paper provides an individualization approach for head-related transfer function (HRTF) in arbitrary directions based on deep learning by utilizing dual-autoencoder architecture to establish the relationship between HRTF magnitude spectrum and ...
Dingding Yao +6 more
doaj +1 more source

Improving Transformer Based End-to-End Code-Switching Speech Recognition Using Language Identification

Applied Sciences, 2021
A recurrent neural network (RNN) based attention model has been used in code-switching speech recognition (CSSR). However, due to the sequential computation constraint of RNN, there are stronger short-range dependencies and weaker long-range ...
Zheying Huang +5 more
doaj +1 more source

Confidence Learning for Semi-Supervised Acoustic Event Detection

Applied Sciences, 2021
In recent years, the involvement of synthetic strongly labeled data, weakly labeled data, and unlabeled data has drawn much research attention in semi-supervised acoustic event detection (SAED).
Yuzhuo Liu +4 more
doaj +1 more source

Query-by-Example with Acoustic Word Embeddings Using wav2vec Pretraining

Jisuanji kexue, 2022
Query-by-Example is a popular keyword detection method in the absence of speech resources. It can build a keyword query system with excellent performance when there are few labeled voice resources and a lack of pronunciation dictionaries. In recent years ...
LI Zhao-qi, LI Ta
doaj +1 more source

A Feature Optimization Approach Based on Inter-Class and Intra-Class Distance for Ship Type Classification

Sensors, 2020
Deep learning based methods have achieved state-of-the-art results on the task of ship type classification. However, most existing ship type classification algorithms take time–frequency (TF) features as input, the underlying discriminative information ...
Chen Li +4 more
doaj +1 more source

Temporal Convolution Network Based Joint Optimization of Acoustic-to-Articulatory Inversion

Applied Sciences, 2021
Articulatory features have proved effective in speech recognition and speech synthesis. However, acquiring articulatory features has always been a difficult research problem.
Guolun Sun, Zhihua Huang, Li Wang, Pengyuan Zhang +3 more
doaj +1 more source

A Pronunciation Prior Assisted Vowel Reduction Detection Framework with Multi-Stream Attention Method

Applied Sciences, 2021
Vowel reduction is a common pronunciation phenomenon in stress-timed languages like English. Native speakers tend to weaken unstressed vowels into a schwa-like sound.
Zongming Liu, Zhihua Huang, Li Wang, Pengyuan Zhang +3 more
doaj +1 more source

Continuous speech recognition by convolutional neural networks

工程科学学报 (Chinese Journal of Engineering), 2015
Convolutional neural networks (CNNs), which show success in achieving translation invariance for many image processing tasks, were investigated for continuous speech recognition.
ZHANG Qing-qing, LIU Yong, PAN Jie-lin, YAN Yong-hong +3 more
doaj +1 more source
