Speaker Re-identification with Speaker Dependent Speech Enhancement [PDF]
While the use of deep neural networks has significantly boosted speaker recognition performance, it is still challenging to separate speakers in poor acoustic environments.
Hain, Thomas, Huang, Qiang, Shi, Yanpei
core +2 more sources
Research Status and Prospect of Transformer in Speech Recognition
As a new deep learning algorithm framework, Transformer has attracted more and more researchers' attention and has become a current research hotspot. Inspired by humans focusing on important things only, the self-attention mechanism in the Transformer ...
ZHANG Xiaoxu, MA Zhiqiang, LIU Zhiqiang, ZHU Fangyuan, WANG Chunyu
doaj +1 more source
Audio-visual speech recognition with background music using single-channel source separation [PDF]
In this paper, we consider audio-visual speech recognition with background music. The proposed algorithm is an integration of audio-visual speech recognition and single channel source separation (SCSS). We apply the proposed algorithm to recognize spoken
Erdogan, Hakan+4 more
core +1 more source
Recognizing Voice Over IP: A Robust Front-End for Speech Recognition on the World Wide Web [PDF]
The Internet Protocol (IP) environment poses two relevant sources of distortion to the speech recognition problem: lossy speech coding and packet loss. In this paper, we propose a new front-end for speech recognition over IP networks.
Díaz de María, Fernando+2 more
core +2 more sources
Emotional Interactive Simulation System of English Speech Recognition in Virtual Context
With the development of virtual scenes, the simulation fidelity and functionality of virtual reality have matured, providing a new platform and perspective for teaching design.
Dan Li
doaj +1 more source
Selection of acoustic modeling unit for Tibetan speech recognition based on deep learning [PDF]
The selection of the speech recognition modeling unit is the primary problem of acoustic modeling in speech recognition, and different acoustic modeling units will directly affect the overall performance of speech recognition.
Gong Baojia+4 more
doaj +1 more source
The performance of speech recognition systems trained with neutral utterances degrades significantly when these systems are tested with emotional speech. Since everybody can speak emotionally in the real-world environment, it is necessary to take account
Masoud Geravanchizadeh+2 more
doaj +1 more source
Effect of Time-domain Windowing on Isolated Speech Recognition System Performance [PDF]
Speech recognition systems extract textual data from the speech signal. Research in the speech recognition domain is challenging because of the large variability in the speech signal.
Ananthakrishna Thalengala+2 more
doaj +1 more source
A novel privacy-preserving speech recognition framework using bidirectional LSTM
Using speech as the transmission medium in the Internet of Things (IoT) is an effective way to reduce latency while improving the efficiency of human-machine interaction. In the field of speech recognition, Recurrent Neural Network (RNN) has significant
Qingren Wang+4 more
doaj +1 more source
Automatic speech recognition with deep neural networks for impaired speech [PDF]
The final publication is available at https://link.springer.com/chapter/10.1007%2F978-3-319-49169-1_10. Automatic Speech Recognition has reached almost human performance in some controlled scenarios.
España-i-Bonet, Cristina+1 more
core +1 more source