Various time‐frequency (T‐F) masks are being applied to sound source localization tasks. Moreover, deep learning has dramatically advanced T‐F mask estimation. However, existing masks are usually designed for speech separation tasks and are suitable only for single‐channel signals.
Hong Liu et al.
Source: Wiley
A Multi-Source Separation Approach Based on DOA Cue and DNN
Multiple sound source separation in a reverberant environment has become popular in recent years. To improve the quality of the separated signal in a reverberant environment, a separation method based on a DOA cue and a deep neural network (DNN) is ...
Yu Zhang et al.
Source: DOAJ
DeepVQE: Real-Time Deep Voice Quality Enhancement for Joint Acoustic Echo Cancellation, Noise Suppression and Dereverberation
Acoustic echo cancellation (AEC), noise suppression (NS) and dereverberation (DR) are an integral part of modern full-duplex communication systems. As the demand for teleconferencing systems increases, addressing these tasks is required for an effective ...
Evgenii Indenbom et al.
Source: Semantic Scholar
[Retracted] Serialized Recommendation Technology Based on Deep Neural Network
Because the structure of a deep neural network resembles the organization of the biological brain, such networks achieve high efficiency and accuracy in extracting information from deep features. They are capable of multilayer learning, abstract feature representation, and cross‐domain learning, and can handle multisource, heterogeneous data content.
Long Jin, Chia-Huei Wu
Source: Wiley
Machine Learning for Predictive Analytics in the Improvement of English Speech Feature Recognition
The use of deep learning to improve English speaking has seen tremendous development in recent years. This study evaluates the noise that is present in the English speech environment, employs a two‐way search method to select the optimum feature set, and applies a quick correlation filter to remove redundant features in order to increase the accuracy ...
Yan Chen et al.
Source: Wiley
CycleGAN-based Unpaired Speech Dereverberation
Typically, neural network-based speech dereverberation models are trained on paired data, composed of a dry utterance and its corresponding reverberant utterance. The main limitation of this approach is that such models can only be trained on large amounts of data and a variety of room impulse responses when the data is synthetically reverberated ...
Hannah Muckenhirn et al.
Source: OpenAIRE
Crossband Filtering for Weighted Prediction Error-Based Speech Dereverberation
Weighted prediction error (WPE) is a linear prediction-based method extensively used to predict and attenuate the late reverberation component of an observed speech signal.
Tomer Rosenbaum et al.
Source: DOAJ
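The WPE entry above rests on delayed linear prediction: the late reverberation in each STFT frequency bin is predicted from frames at least a few steps in the past and subtracted from the observation. The sketch below is a minimal single-channel, single-pass illustration of that idea using uniform weights; full WPE iterates with variance-based weighting, and the function name and parameters here are illustrative, not taken from any paper above.

```python
import numpy as np

def delayed_linear_prediction(X, taps=10, delay=3):
    """WPE-style dereverberation sketch (single channel, one pass).

    X: complex STFT, shape (frames, freq_bins). Per frequency bin, the
    current frame is predicted from frames `delay` to `delay + taps - 1`
    steps in the past; the prediction (late reverberation estimate) is
    subtracted from the observation.
    """
    T, F = X.shape
    Y = X.copy()
    for f in range(F):
        x = X[:, f]
        # Regression matrix of delayed taps: column k holds x shifted
        # down by (delay + k) frames, zero-padded at the top.
        A = np.zeros((T, taps), dtype=complex)
        for k in range(taps):
            shift = delay + k
            A[shift:, k] = x[:T - shift]
        # Least-squares prediction filter; full WPE would reweight by
        # the estimated per-frame signal variance and iterate.
        g, *_ = np.linalg.lstsq(A, x, rcond=None)
        Y[:, f] = x - A @ g
    return Y
```

Because each bin solves a least-squares problem, the residual never has more energy than the input; the `delay` parameter protects the direct path and early reflections from being cancelled.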
Audio-Visual End-to-End Multi-Channel Speech Separation, Dereverberation and Recognition
Accurate recognition of cocktail party speech containing overlapping speakers, noise and reverberation remains a highly challenging task to date. Motivated by the invariance of visual modality to acoustic signal corruption, an audio-visual multi-channel ...
Guinan Li et al.
Source: Semantic Scholar
Learning Audio-Visual Dereverberation
Reverberation not only degrades the quality of speech for human perception, but also severely impacts the accuracy of automatic speech recognition. Prior work attempts to remove reverberation based on the audio modality only. Our idea is to learn to dereverberate speech from audio-visual observations.
Changan Chen et al.
Source: OpenAIRE
A Single Speech Enhancement Model Unifying Dereverberation, Denoising, Speaker Counting, Separation, and Extraction
We propose a multi-task universal speech enhancement (MUSE) model that can perform five speech enhancement (SE) tasks: dereverberation, denoising, speech separation (SS), target speaker extraction (TSE), and speaker counting.
Kohei Saijo et al.
Source: Semantic Scholar

