Speech Enhancement by Multiple Propagation through the Same Neural Network [PDF]
Monaural speech enhancement aims to remove background noise from an audio recording containing speech in order to improve its clarity and intelligibility. Currently, the most successful solutions for speech enhancement use deep neural networks.
Tomasz Grzywalski, Szymon Drgas
doaj +2 more sources
CST: Complex Sparse Transformer for Low-SNR Speech Enhancement [PDF]
Speech enhancement tasks for audio with a low SNR are challenging. Existing speech enhancement methods are mainly designed for high SNR audio, and they usually use RNNs to model audio sequence features, which causes the model to be unable to learn long ...
Kaijun Tan+6 more
doaj +2 more sources
Towards Model Compression for Deep Learning Based Speech Enhancement. [PDF]
The use of deep neural networks (DNNs) has dramatically elevated the performance of speech enhancement over the last decade. However, achieving strong enhancement performance typically requires a large DNN, which is both memory and computation consuming,
Tan K, Wang D.
europepmc +2 more sources
Dense CNN with Self-Attention for Time-Domain Speech Enhancement. [PDF]
Speech enhancement in the time domain has become increasingly popular in recent years, due to its capability to jointly enhance both the magnitude and the phase of speech.
Pandey A, Wang D.
europepmc +3 more sources
Complex Spectral Mapping for Single- and Multi-Channel Speech Enhancement and Robust ASR. [PDF]
This study proposes a complex spectral mapping approach for single- and multi-channel speech enhancement, where deep neural networks (DNNs) are used to predict the real and imaginary (RI) components of the direct-path signal from noisy and reverberant ...
Wang ZQ, Wang P, Wang D.
europepmc +2 more sources
Speech Enhancement and Dereverberation With Diffusion-Based Generative Models [PDF]
In this work, we build upon our previous publication and use diffusion-based generative models for speech enhancement. We present a detailed overview of the diffusion process that is based on a stochastic differential equation and delve into an extensive
Julius Richter+4 more
semanticscholar +1 more source
Conditional Diffusion Probabilistic Model for Speech Enhancement [PDF]
Speech enhancement is a critical component of many user-oriented audio applications, yet current systems still suffer from distorted and unnatural outputs.
Yen-Ju Lu+5 more
semanticscholar +1 more source
Speech Enhancement with Score-Based Generative Models in the Complex STFT Domain [PDF]
Score-based generative models (SGMs) have recently shown impressive results for difficult generative tasks such as the unconditional and conditional generation of natural images and audio signals. In this work, we extend these models to the complex short-
Simon Welker+2 more
semanticscholar +1 more source
FullSubNet+: Channel Attention Fullsubnet with Complex Spectrograms for Speech Enhancement [PDF]
The previously proposed FullSubNet has achieved outstanding performance in the Deep Noise Suppression (DNS) Challenge and attracted much attention. However, it still encounters issues such as input-output mismatch and coarse processing for frequency bands.
Jun Chen+5 more
semanticscholar +1 more source
Universal Speech Enhancement with Score-based Diffusion [PDF]
Removing background noise from speech audio has been the subject of considerable effort, especially in recent years due to the rise of virtual communication and amateur recordings.
J. Serrà+4 more
semanticscholar +1 more source