Building competitive direct acoustics-to-word models for English conversational speech recognition
Direct acoustics-to-word (A2W) models in the end-to-end paradigm have received increasing attention compared to conventional sub-word based automatic speech recognition models using phones, characters, or context-dependent hidden Markov model states ...
Audhkhasi, Kartik +4 more
core +2 more sources
Neural Markers of Speech Comprehension: Measuring EEG Tracking of Linguistic Speech Representations, Controlling the Speech Acoustics.
Gillis M +4 more
europepmc +3 more sources
Aligning syntactic structure to the dynamics of verbal communication: A pipeline for annotating syntactic phrases onto speech acoustics.
Iaia C, Tavano A.
europepmc +3 more sources
How Do Enriched Speech Acoustics Support Language Acquisition in Children With Hearing Loss? A Narrative Review.
Hahn LE +3 more
europepmc +2 more sources
Digital remote assessment of speech acoustics in cognitively unimpaired adults: feasibility, reliability and associations with amyloid pathology.
van den Berg RL +17 more
europepmc +3 more sources
Identification of Affective State Change in Adults With Aphasia Using Speech Acoustics.
Gillespie S +5 more
europepmc +3 more sources
Vowel reduction is a common pronunciation phenomenon in stress-timed languages like English. Native speakers tend to weaken unstressed vowels into a schwa-like sound.
Zongming Liu +3 more
doaj +1 more source
Nowadays, most end-to-end task-oriented dialog systems are based on sequence-to-sequence (Seq2seq), which is an encoder-decoder framework. These systems sometimes produce grammatically correct, but logically incorrect responses.
Junqing He +4 more
doaj +1 more source
GuidedMix: An on‐the‐fly data augmentation approach for robust speaker recognition system
Data augmentation is an essential technique for building a high‐robustness speaker recognition system. This letter proposes a novel on‐the‐fly data augmentation strategy called GuidedMix.
Runqiu Xiao +4 more
doaj +1 more source
Collecting language, speech acoustics, and facial expression to predict psychosis and other clinical outcomes: strategies from the AMP® SCZ initiative.
Bilgrami ZR +77 more
europepmc +3 more sources

