Speech Processing and Prosody [PDF]
The prosody of the speech signal conveys information beyond the linguistic content of the message: it structures the utterance and carries information about the speaker's attitude and emotion. The prosodic features are the duration of sounds, energy, and fundamental frequency. However, computing and using them automatically is not straightforward.
openaire +2 more sources
Evaluating Prosodic Processing for Incremental Speech Synthesis
Baumann T, Schlangen D. Evaluating Prosodic Processing for Incremental Speech Synthesis.
Schlangen, David +4 more
core +1 more source
Federated Learning for Privacy-Friendly Health Apps: A Case Study on Ovulation Tracking
In an era of increasing reliance on digital health solutions, safeguarding user privacy has emerged as a paramount concern. Health applications often need to balance advanced AI functionalities with sufficient privacy measures to ensure user engagement ...
Nikolaos Pavlidis +12 more
doaj +1 more source
Effective Exploitation of Posterior Information for Attention-Based Speech Recognition
End-to-end attention-based modeling is increasingly popular for tackling sequence-to-sequence mapping tasks. Traditional attention mechanisms utilize prior input information to derive attention, which then conditions the output.
Jian Tang +4 more
doaj +1 more source
Effective Dereverberation with a Lower Complexity at Presence of the Noise
Adaptive beamforming and deconvolution techniques have shown effectiveness for reducing noise and reverberation. The minimum variance distortionless response (MVDR) beamformer is the most widely used for adaptive beamforming, whereas multichannel linear ...
Fengqi Tan, Changchun Bao, Jing Zhou
doaj +1 more source
Speech Structure and Its Application to Robust Speech Processing [PDF]
Nobuaki Minematsu +3 more
openaire +2 more sources
Overlap-add Windows with Maximum Energy Concentration for Speech and Audio Processing
Processing of speech and audio signals with time-frequency representations requires windowing methods that allow perfect reconstruction of the original signal and whose processing artifacts behave predictably.
Tom Bäckström
core +1 more source
Speech from people with Parkinson's disease (PD) is likely to be degraded in phonation, articulation, and prosody. Motivated to describe articulation deficits comprehensively, we investigated 1) the universal phonological features that model ...
Mittapalle, Kiran +5 more
core +1 more source
A Formant Modification Method for Improved ASR of Children’s Speech
Differences in acoustic characteristics between children’s and adults’ speech degrade performance of automatic speech recognition systems when systems trained using adults’ speech are used to recognize children’s speech.
Alku, Paavo +3 more
core +1 more source
Segment boundary detection directed attention for online end-to-end speech recognition
Attention-based encoder-decoder models have recently shown competitive performance for automatic speech recognition (ASR) compared to conventional ASR systems.
Junfeng Hou +3 more
doaj +1 more source

