Results 281 to 290 of about 1,383,551 (338)
Some of the following articles may not be open access.
Persian speech sentence segmentation without speech recognition
2014 Iranian Conference on Intelligent Systems (ICIS), 2014
In this paper, we propose a method for detecting Persian speech sentence boundaries using a set of prosodic features and the spectral centroid. No speech recognizer is used in the proposed method. Silent regions are first detected using four features: spectral centroid, zero-crossing rate, energy, and pitch. (An illustrative sketch of such silence detection follows this entry.)
Hoda Sadat Jafari +1 more
openaire +1 more source
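The abstract above names four acoustic features for silence detection. As a minimal illustrative sketch (not the authors' implementation), the following NumPy code computes frame-level energy, zero-crossing rate and spectral centroid and marks low-energy, low-centroid frames as silent; the pitch feature and the prosodic boundary scoring are omitted, and all frame sizes and thresholds are placeholders.

```python
import numpy as np

def frame_signal(x, sr, frame_ms=25, hop_ms=10):
    """Split a mono signal into overlapping analysis frames."""
    flen, hop = int(sr * frame_ms / 1000), int(sr * hop_ms / 1000)
    n = 1 + max(0, (len(x) - flen) // hop)
    return np.stack([x[i * hop:i * hop + flen] for i in range(n)])

def silence_features(x, sr):
    """Frame-level energy, zero-crossing rate and spectral centroid."""
    frames = frame_signal(x, sr)
    energy = np.mean(frames ** 2, axis=1)
    zcr = np.mean(np.abs(np.diff(np.sign(frames), axis=1)) > 0, axis=1)
    mag = np.abs(np.fft.rfft(frames, axis=1))
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sr)
    centroid = (mag * freqs).sum(axis=1) / (mag.sum(axis=1) + 1e-12)
    return energy, zcr, centroid

def silent_frames(x, sr, energy_ratio=0.05):
    """Crude silence mask: low-energy frames with a low spectral centroid.
    The paper additionally uses ZCR and pitch to make the decision robust;
    this rule and the 0.05 energy ratio are illustrative assumptions."""
    energy, zcr, centroid = silence_features(x, sr)
    return (energy < energy_ratio * np.median(energy)) & (centroid < np.median(centroid))
```

Sentence boundaries would then be hypothesized inside sufficiently long silent runs and scored with the prosodic features mentioned in the abstract.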
Audiovisual automatic speech segmentation
2011 IEEE 19th Signal Processing and Communications Applications Conference (SIU), 2011
Audiovisual speech segmentation, which uses visual information together with audio data, is introduced. Combining the two modalities lowers the average absolute boundary error between manual and automatic segmentation, which directly affects the quality of speech processing systems built on the segmented database.
Eren Akdemir, Tolga Ciloglu
openaire +1 more source
Segmenting Speech by Mouth: The Role of Oral Prosodic Cues for Visual Speech Segmentation
Language and Speech, 2022
Adults are able to use visual prosodic cues in the speaker’s face to segment speech. Furthermore, eye-tracking data suggest that learners will shift their gaze to the mouth during visual speech segmentation. Although these findings suggest that the mouth may be viewed more than the eyes or nose during visual speech segmentation, no study has examined ...
Aaron D. Mitchel +3 more
openaire +2 more sources
CTC-Segmentation of Large Corpora for German End-to-End Speech Recognition
International Conference on Speech and Computer, 2020
Recent end-to-end Automatic Speech Recognition (ASR) systems have demonstrated the ability to outperform conventional hybrid DNN/HMM ASR. Aside from architectural improvements, those models grew in terms of depth, parameters and model ... (A rough illustration of CTC-style alignment follows this entry.)
Ludwig Kürzinger +4 more
semanticscholar +1 more source
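CTC-segmentation aligns a known transcript to the frame-wise posteriors of a CTC acoustic model; the paper's released tooling additionally handles long recordings and untranscribed audio. Purely as an illustration of the core alignment step, and not the authors' algorithm, here is a from-scratch Viterbi pass over the blank-interleaved label sequence; it assumes the posterior matrix has enough frames to cover the transcript.

```python
import numpy as np

def ctc_align(log_probs, labels, blank=0):
    """Viterbi alignment of a label sequence to CTC frame posteriors.

    log_probs: (T, V) array of per-frame log posteriors from a CTC model.
    labels:    list of token ids for the reference transcript (non-empty).
    Returns, for each label, the (first_frame, last_frame) it occupies.
    """
    T = log_probs.shape[0]
    # Blank-interleaved state sequence: blank, l1, blank, l2, ..., lL, blank
    states = [blank]
    for l in labels:
        states += [l, blank]
    S = len(states)
    alpha = np.full((T, S), -np.inf)
    back = np.zeros((T, S), dtype=np.int64)
    alpha[0, 0] = log_probs[0, states[0]]
    alpha[0, 1] = log_probs[0, states[1]]
    for t in range(1, T):
        for s in range(S):
            cands, origins = [alpha[t - 1, s]], [s]
            if s >= 1:
                cands.append(alpha[t - 1, s - 1]); origins.append(s - 1)
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and states[s] != blank and states[s] != states[s - 2]:
                cands.append(alpha[t - 1, s - 2]); origins.append(s - 2)
            k = int(np.argmax(cands))
            alpha[t, s] = cands[k] + log_probs[t, states[s]]
            back[t, s] = origins[k]
    # Backtrack from the better of the two admissible final states.
    s = S - 1 if alpha[T - 1, S - 1] >= alpha[T - 1, S - 2] else S - 2
    path = np.zeros(T, dtype=np.int64)
    for t in range(T - 1, -1, -1):
        path[t] = s
        s = back[t, s]
    # The i-th label corresponds to extended-state index 2*i+1.
    spans = []
    for i in range(len(labels)):
        frames = np.where(path == 2 * i + 1)[0]
        spans.append((int(frames[0]), int(frames[-1])))
    return spans
```

Utterance boundaries then fall between the last frame of one utterance's final token and the first frame of the next utterance's first token, scaled by the model's frame shift.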
Segmenting speech using dynamic programming
The Journal of the Acoustical Society of America, 1981
Speech is modeled as a Markov chain. Scoring is developed to convert observations of the speech signal into estimated probabilities of the locations of segment boundaries. Dynamic programming is then used to compute a most-probable segmentation of the speech. (A toy dynamic-programming skeleton follows this entry.)
openaire +2 more sources
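The abstract describes two ingredients: per-frame boundary probabilities derived from a Markov-chain model, and a dynamic program over them. A toy skeleton of the second step, with the boundary probabilities simply assumed as input and illustrative duration limits, might look like this; it is a sketch, not the paper's model.

```python
import numpy as np

def best_segmentation(p_boundary, min_len=3, max_len=40):
    """Most-probable placement of segment boundaries by dynamic programming.

    p_boundary[t] is an estimated probability that a boundary lies at frame t
    (taken as given here). A segmentation scores the log-probability of a
    boundary at each chosen cut and of "no boundary" at every other frame.
    """
    logp = np.log(np.clip(p_boundary, 1e-9, 1.0))
    log1m = np.log(np.clip(1.0 - p_boundary, 1e-9, 1.0))
    inside = np.concatenate(([0.0], np.cumsum(log1m)))  # prefix sums of "no boundary"
    T = len(p_boundary)
    best = np.full(T + 1, -np.inf)
    prev = np.full(T + 1, -1, dtype=int)
    best[0] = 0.0
    for j in range(min_len, T + 1):
        for i in range(max(0, j - max_len), j - min_len + 1):
            # Frames i..j-2 carry "no boundary"; frame j-1 carries a boundary.
            score = best[i] + (inside[j - 1] - inside[i]) + logp[j - 1]
            if score > best[j]:
                best[j], prev[j] = score, i
    # Backtrack the chosen cut positions (frame indices, last one == T).
    cuts, j = [], T
    while j > 0 and prev[j] >= 0:
        cuts.append(j)
        j = prev[j]
    return sorted(cuts)
```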
Automatic visual speech segmentation
2011 IEEE 3rd International Conference on Communication Software and Networks, 2011
Speech recognition techniques that rely on audio features degrade in performance in noisy environments. Visual speech recognition mitigates this by incorporating a visual signal into the recognition process. The performance of an automatic speech recognition (ASR) system can be significantly enhanced with additional information from visual speech ...
Hamed Talea, Khashayar Yaghmaie
openaire +1 more source
Phoneme segmentation of speech
18th International Conference on Pattern Recognition (ICPR'06), 2006
In most approaches to speech recognition, the speech signal is segmented using constant-time segmentation, for example into 25 ms blocks. Constant segmentation risks losing information about the phonemes: different sounds may be merged into a single block, and individual phonemes may be lost completely.
B. Zioko, S. Manandhar, R.C. Wilson
openaire +1 more source
Wavelets in speech segmentation
MELECON 2008 - The 14th IEEE Mediterranean Electrotechnical Conference, 2008
A new event-driven method for speech signal segmentation is presented. The discrete wavelet transform is used for spectral analysis and to build the segmentation procedure; an innovative event detector is the core of the process. The efficiency of the algorithm is tested against a hand-annotated speech corpus. (A rough sketch of wavelet-based event detection follows this entry.)
J. Galka, M. Ziolko
openaire +1 more source
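The paper's actual event detector and its parameters are not reproduced in the abstract. As a rough sketch of the general idea, assuming PyWavelets is installed, one can track per-frame DWT sub-band energies and flag abrupt changes in that profile as segmentation events; the frame sizes, wavelet choice and threshold below are placeholders.

```python
import numpy as np
import pywt  # PyWavelets

def band_energies(frame, wavelet="db4", level=4):
    """Energy of each DWT sub-band of one analysis frame."""
    coeffs = pywt.wavedec(frame, wavelet, level=level)
    return np.array([np.sum(c ** 2) for c in coeffs])

def wavelet_events(x, sr, frame_ms=25, hop_ms=10, k=2.0):
    """Hypothesize segment boundaries where the wavelet band-energy profile
    changes abruptly between consecutive frames (a simple event detector)."""
    flen, hop = int(sr * frame_ms / 1000), int(sr * hop_ms / 1000)
    frames = [x[i:i + flen] for i in range(0, len(x) - flen + 1, hop)]
    profile = np.stack([band_energies(f) for f in frames])
    profile = np.log(profile + 1e-12)              # compress dynamic range
    change = np.linalg.norm(np.diff(profile, axis=0), axis=1)
    threshold = change.mean() + k * change.std()   # illustrative threshold
    event_frames = np.where(change > threshold)[0] + 1
    return event_frames * hop / sr                 # event times in seconds
```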
Automatic Speech Segmentation for Automatic Speech Translation
2013
The article presents selected, effective speech signal processing algorithms and their use in improving automatic speech translation. Automatic speech translation applies natural language processing techniques implemented with algorithms for automatic speech recognition, speaker recognition, automatic text translation and text-to-speech ...
Piotr Kłosowski, Adam Dustor
openaire +1 more source
2018
Spoken language is typically produced in a continuous stream, with the realization of phonemes influenced by adjacent sounds (coarticulation) and few silent boundaries between words. Despite these problems of variability and continuity, listeners generally perceive speech as a coherent sequence of discrete words.
openaire +2 more sources

