Speech segmentation - Open Access .click


Brain-inspired speech segmentation for automatic speech recognition using the speech envelope as a temporal reference. [PDF]

Sci Rep, 2016
Speech segmentation is a crucial step in automatic speech recognition because additional speech analyses are performed for each framed speech segment.
Lee B, Cho KH.
europepmc +2 more sources

Automatic Speech Segmentation Based on HMM [PDF]

Radioengineering, 2007
This contribution deals with the problem of automatic phoneme segmentation using HMMs. Automating the speech segmentation task is important for applications where a large amount of data must be processed, so manual segmentation is out of the ...
M. Kroul
doaj +2 more sources

Lexical knowledge boosts statistically-driven speech segmentation. [PDF]

J Exp Psychol Learn Mem Cogn, 2019
The hypothesis that known words can serve as anchors for discovering new words in connected speech has computational and empirical support. However, evidence for how the bootstrapping effect of known words interacts with other mechanisms of lexical acquisition, such as statistical learning, is incomplete.
Palmer SD, Hutson J, White L, Mattys SL.
europepmc +6 more sources

Prosodic cues enhance rule learning by changing speech segmentation mechanisms. [PDF]

Front Psychol, 2015
Prosody has been claimed to have a critical role in the acquisition of grammatical information from speech. The exact mechanisms by which prosodic cues enhance learning remain unknown.
de Diego-Balaguer R +2 more
europepmc +2 more sources

Segmentation of Speech and Humming in Vocal Input [PDF]

Radioengineering, 2012
Non-verbal vocal interaction (NVVI) is an interaction method in which sounds other than speech produced by a human are used, such as humming. NVVI complements traditional speech recognition systems with continuous control.
A. J. Sporka, O. Polacek, J. Havlik
doaj +2 more sources

HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units [PDF]

IEEE/ACM Transactions on Audio Speech and Language Processing, 2021
Self-supervised approaches for speech representation learning are challenged by three unique problems: (1) there are multiple sound units in each input utterance, (2) there is no lexicon of input sound units during the pre-training phase, and (3) sound ...
Wei-Ning Hsu +5 more
semanticscholar +1 more source

Selection of acoustic modeling unit for Tibetan speech recognition based on deep learning [PDF]

MATEC Web of Conferences, 2021
The selection of the speech recognition modeling unit is the primary problem of acoustic modeling in speech recognition, and different acoustic modeling units will directly affect the overall performance of speech recognition.
Gong Baojia +4 more
doaj +1 more source

End-to-End Simultaneous Speech Translation with Differentiable Segmentation [PDF]

Annual Meeting of the Association for Computational Linguistics, 2023
End-to-end simultaneous speech translation (SimulST) outputs translation while receiving the streaming speech inputs (a.k.a. streaming speech translation), and hence needs to segment the speech inputs and then translate based on the current received ...
Shaolei Zhang, Yang Feng
semanticscholar +1 more source

SHAS: Approaching optimal Segmentation for End-to-End Speech Translation [PDF]

Interspeech, 2022
Speech translation models are unable to directly process long audios, like TED talks, which have to be split into shorter segments. Speech translation datasets provide manual segmentations of the audios, which are not available in real-world scenarios ...
Yiannis (Ioannis) Tsiamas +3 more
semanticscholar +1 more source

Phonemic segmentation of narrative speech in human cerebral cortex

Nature Communications, 2023
Speech processing requires extracting meaning from acoustic patterns using a set of intermediate representations based on a dynamic segmentation of the speech stream.
Xue L Gong +5 more
semanticscholar +1 more source
