Attention-Based End-To-End Named Entity Recognition From Speech
Named entities are heavily used in the field of spoken language understanding, which uses speech as an input. The standard way of doing named entity recognition from speech involves a pipeline of two systems, where ...
openaire: EC/H2020/780069/EU//MeMAD
Mikko Kurimo +5 more
core +1 more source
SVMs for Automatic Speech Recognition: a Survey [PDF]
Hidden Markov Models (HMMs) are, undoubtedly, the most employed core technique for Automatic Speech Recognition (ASR). Nevertheless, we are still far from achieving high-performance ASR systems.
Peláez-Moreno, Carmen +8 more
core +1 more source
Avoiding distortions due to speech coding and transmission errors [PDF]
We have extended our previous research on a new approach to automatic speech recognition (ASR) in the GSM environment. Instead of recognizing from the decoded speech signal, our system works from the digital speech representation used by the GSM encoder.
Valverde Albacete, Francisco José +8 more
core +1 more source
A Formant Modification Method for Improved ASR of Children’s Speech
Differences in acoustic characteristics between children’s and adults’ speech degrade performance of automatic speech recognition systems when systems trained using adults’ speech are used to recognize children’s speech.
Alku, Paavo +3 more
core +1 more source
Speaker-independent emotion recognition exploiting a psychologically-inspired binary cascade classification schema [PDF]
In this paper, a psychologically-inspired binary cascade classification schema is proposed for speech emotion recognition.
Kotti, Margarita, Paternò, Fabio
core +1 more source
Recognizing Voice Over IP: A Robust Front-End for Speech Recognition on the World Wide Web [PDF]
The Internet Protocol (IP) environment poses two relevant sources of distortion to the speech recognition problem: lossy speech coding and packet loss. In this paper, we propose a new front-end for speech recognition over IP networks.
Peláez-Moreno, Carmen +5 more
core +1 more source
Band-pass filtering of the time sequences of spectral parameters for robust wireless speech recognition [PDF]
In this paper we address the problem of automatic speech recognition when wireless speech communication systems are involved. In this context, three main sources of distortion should be considered: acoustic environment, speech coding and transmission ...
Peláez-Moreno, Carmen +11 more
core +1 more source
Character-based units for Unlimited Vocabulary Continuous Speech Recognition
We study character-based language models in the state-of-the-art speech recognition framework. This approach has advantages over both word-based systems and so-called end-to-end ASR systems that do not have separate acoustic and language models.
Virpioja, Sami +9 more
core +1 more source
The Effect of Silence Feature in Dimensional Speech Emotion Recognition [PDF]
Silence is a part of human-to-human communication, which can be a clue for human emotion perception. For automatic emotion recognition by a computer, it is not clear whether silence is useful to determine human emotion within a speech.
Atmaja, Bagus Tris +3 more
core +1 more source
Phoneme and sentence-level ensembles for speech recognition
We address the question of whether and how boosting and bagging can be used for speech recognition. In order to do this, we compare two different boosting schemes, one at the phoneme level and one at the utterance level, with a phoneme-level bagging ...
Dimitrakakis, Christos +5 more
core +1 more source

