Neural Target Speech Extraction: An overview [PDF]
Humans can listen to a target speaker even in challenging acoustic conditions that have noise, reverberation, and interfering speakers. This phenomenon is known as the cocktail party effect.
Kateřina Žmolíková +5 more
semanticscholar +1 more source
Deep Learning and Bidirectional Optical Flow Based Viewport Predictions for 360° Video Coding
The rapid development of virtual reality applications continues to urge better compression of 360° videos owing to the large volume of content.
Jayasingam Adhuran +2 more
doaj +1 more source
DeepVoCoder: A CNN model for compression and coding of narrow band speech [PDF]
This paper proposes a convolutional neural network (CNN)-based encoder model to compress and code speech signals directly from raw input speech. Although the model can synthesize wideband speech by implicit bandwidth extension, narrowband is preferred for
Ilk, Hakki Gokhan +3 more
core +1 more source
With the proliferation of sensors and IoT technologies, stream data are increasingly stored and analysed, but rarely combined, due to the heterogeneity of sources and technologies.
Tarek Elsaleh +5 more
doaj +1 more source
Single channel speech-music separation using matching pursuit and spectral masks [PDF]
A single-channel speech-music separation algorithm based on matching pursuit (MP) with multiple dictionaries and spectral masks is proposed in this work. Training data for speech and music signals are used to build two sets of magnitude spectral vectors
Erdogan, Hakan +2 more
core +1 more source
Rehaussement du signal de parole par EMD et opérateur de Teager-Kaiser [PDF]
The authors would like to thank Professor Mohamed Bahoura from Universite de Quebec a Rimouski for fruitful discussions on time adaptive thresholding. In this paper a speech denoising strategy based on time adaptive thresholding of intrinsic modes ...
BOUDRAA, Abdel-Ouahab +2 more
core +5 more sources
Torchaudio-Squim: Reference-Less Speech Quality and Intelligibility Measures in Torchaudio [PDF]
Measuring quality and intelligibility of a speech signal is usually a critical step in development of speech processing systems. To enable this, a variety of metrics to measure quality and intelligibility under different assumptions have been developed ...
Anurag Kumar +6 more
semanticscholar +1 more source
The auditory-brainstem response to continuous, non repetitive speech is modulated by the speech envelope and reflects speech processing [PDF]
The auditory-brainstem response (ABR) to short and simple acoustical signals is an important clinical tool used to diagnose the integrity of the brainstem.
Braiman, C +4 more
core +1 more source
Tactile modulation of emotional speech samples [PDF]
Copyright © 2012 Katri Salminen et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly ...
Ahmaniemi, Teemu +7 more
core +2 more sources
TF-GridNet: Integrating Full- and Sub-Band Modeling for Speech Separation [PDF]
We propose TF-GridNet for speech separation. The model is a novel deep neural network (DNN) integrating full- and sub-band modeling in the time-frequency (T-F) domain.
Zhongqiu Wang +5 more
semanticscholar +1 more source

