Speech processing - Open Access .click


2019
The prosody of the speech signal conveys information beyond the linguistic content of the message: prosody structures the utterance and also conveys information about the speaker's attitude and emotion. Sound duration, energy, and fundamental frequency are the prosodic features. However, their automatic computation and usage are not straightforward.
openaire +2 more sources

Evaluating Prosodic Processing for Incremental Speech Synthesis

2012
Baumann T, Schlangen D. Evaluating Prosodic Processing for Incremental Speech Synthesis.
Schlangen, David +4 more
core +1 more source

Federated Learning for Privacy-Friendly Health Apps: A Case Study on Ovulation Tracking

Journal of Sensor and Actuator Networks
In an era of increasing reliance on digital health solutions, safeguarding user privacy has emerged as a paramount concern. Health applications often need to balance advanced AI functionalities with sufficient privacy measures to ensure user engagement ...
Nikolaos Pavlidis +12 more
doaj +1 more source

Effective Exploitation of Posterior Information for Attention-Based Speech Recognition

IEEE Access, 2020
End-to-end attention-based modeling is increasingly popular for tackling sequence-to-sequence mapping tasks. Traditional attention mechanisms utilize prior input information to derive attention, which then conditions the output.
Jian Tang +4 more
doaj +1 more source

Effective Dereverberation with a Lower Complexity at Presence of the Noise

Applied Sciences, 2022
Adaptive beamforming and deconvolution techniques have shown effectiveness for reducing noise and reverberation. The minimum variance distortionless response (MVDR) beamformer is the most widely used for adaptive beamforming, whereas multichannel linear ...
Fengqi Tan, Changchun Bao, Jing Zhou
doaj +1 more source

Speech Structure and Its Application to Robust Speech Processing [PDF]

New Generation Computing, 2010
Nobuaki Minematsu +3 more
openaire +2 more sources

Overlap-add Windows with Maximum Energy Concentration for Speech and Audio Processing

2019
Processing of speech and audio signals with time-frequency representations requires windowing methods that allow perfect reconstruction of the original signal and whose processing artifacts behave predictably.
Bäckström, T., Tom Backstrom
core +1 more source

Automatic Assessment of Parkinson’s Disease Using Speech Representations of Phonation and Articulation

2022
Speech from people with Parkinson's disease (PD) is likely to be degraded in phonation, articulation, and prosody. Motivated to describe articulation deficits comprehensively, we investigated 1) the universal phonological features that model ...
Mittapalle, Kiran +5 more
core +1 more source

A Formant Modification Method for Improved ASR of Children’s Speech

2022
Differences in acoustic characteristics between children’s and adults’ speech degrade the performance of automatic speech recognition systems when systems trained on adults’ speech are used to recognize children’s speech.
Alku, Paavo +3 more
core +1 more source

Segment boundary detection directed attention for online end-to-end speech recognition

EURASIP Journal on Audio, Speech, and Music Processing, 2020
Attention-based encoder-decoder models have recently shown competitive performance for automatic speech recognition (ASR) compared to conventional ASR systems.
Junfeng Hou, Wu Guo, Yan Song, Li-Rong Dai +3 more
doaj +1 more source
