Self-Attention Networks for Connectionist Temporal Classification in Speech Recognition
The success of self-attention in NLP has led to recent applications in end-to-end encoder-decoder architectures for speech recognition. Separately, connectionist temporal classification (CTC) has matured as an alignment-free, non-autoregressive approach ...
Huang, Zhiheng +2 more
core +1 more source
Time and information in perceptual adaptation to speech
Presubmission manuscript and supplementary files (stimuli, stimulus presentation code, data, data analysis code). Perceptual adaptation to a talker enables listeners to efficiently resolve the many-to-many mapping between variable speech acoustics and ...
Choi, Ja Young, Perrachione, Tyler
core
Large margin filtering for signal sequence labeling
Signal sequence labeling consists of predicting a sequence of labels given an observed sequence of samples. A naive approach is to filter the signal to reduce the noise and then apply a classification algorithm to the filtered samples.
Flamary, Rémi +2 more
core +4 more sources
Perception and analysis of Chinese accented German vowels
This report describes an investigation to obtain specific knowledge about the production of German vowels by Chinese speakers. The experiments have been conducted to acquire both perceptual and acoustic measures of the vowels.
Hongwei DING +2 more
doaj
Education in basic acoustics for acoustic phonetics and speech science
Students in acoustic phonetics and speech science classes often do not have much technical background; an intuitive means to teach acoustic phenomena to them would, thus, be useful. Regarding speech production, physical demonstrations using vocal-tract models have been shown to be an intuitive way to teach acoustic phenomena. In particular, a series of ...
openaire +2 more sources
Numerous studies have shown that teachers often speak louder in classrooms because of the acoustic properties of the spaces. To improve the acoustics in classrooms, it is necessary to develop relevant acoustic criteria.
Jan RADOSZ
doaj +1 more source
Towards Language-Universal End-to-End Speech Recognition
Building speech recognizers in multiple languages typically involves replicating a monolingual training recipe for each language, or utilizing a multi-task learning approach where models for different languages have separate output labels but share some ...
Kim, Suyoun, Seltzer, Michael L.
core +1 more source
Multitask Learning with Low-Level Auxiliary Tasks for Encoder-Decoder Based Speech Recognition
End-to-end training of deep learning-based models allows for implicit learning of intermediate representations based on the final task loss. However, the end-to-end approach ignores the useful domain knowledge encoded in explicit intermediate-level ...
Livescu, Karen +3 more
core +1 more source
Efficient Implementation of the Room Simulator for Training Deep Neural Network Acoustic Models
In this paper, we describe how to efficiently implement an acoustic room simulator to generate large-scale simulated data for training deep neural networks.
Bacchiani, Michiel +3 more
core +1 more source
Mel-Weighted Single Frequency Filtering Spectrogram for Dialect Identification
In this study, we propose Mel-weighted single frequency filtering (SFF) spectrograms for dialect identification. The spectrum derived using SFF has high spectral resolution for harmonics and resonances while simultaneously maintaining good time ...
Rashmi Kethireddy +3 more
doaj +1 more source

