
Wavenet based low rate speech coding

open access: yes
IEEE International Conference on Acoustics, Speech, and Signal Processing, 2017
Traditional parametric coding of speech facilitates low rate but provides poor reconstruction quality because of the inadequacy of the model used. We describe how a WaveNet generative speech model can be used to generate high quality speech from the bit ...
Kleijn, W. Bastiaan   +6 more
core   +2 more sources

Scaling Transformers for Low-Bitrate High-Quality Speech Coding

open access: green
arXiv.org
The tokenization of speech with neural audio codec models is a vital part of modern AI pipelines for the generation or understanding of speech, alone or in a multimodal context.
Julian D. Parker   +6 more
openalex   +2 more sources

End-to-End Neural Speech Coding for Real-Time Communications

open access: yes
IEEE International Conference on Acoustics, Speech, and Signal Processing, 2022
Deep-learning-based methods have shown their advantages in audio coding over traditional ones, but limited attention has been paid to real-time communications (RTC). This paper proposes TFNet, an end-to-end neural speech codec with low latency for RTC.
Xue Jiang   +5 more
semanticscholar   +1 more source

Generative Speech Coding with Predictive Variance Regularization

open access: yes
IEEE International Conference on Acoustics, Speech, and Signal Processing, 2021
The recent emergence of machine-learning based generative models for speech suggests a significant reduction in bit rate for speech codecs is possible. However, the performance of generative models deteriorates significantly with the distortions present ...
W. Kleijn   +7 more
semanticscholar   +1 more source

NESC: Robust Neural End-2-End Speech Coding with GANs

open access: yes
Interspeech, 2022
Neural networks have proven to be a formidable tool to tackle the problem of speech coding at very low bit rates. However, the design of a neural coder that can be operated robustly under real-world conditions remains a major challenge.
N. Pia   +4 more
semanticscholar   +1 more source

Latent-Domain Predictive Neural Speech Coding

open access: yes
IEEE/ACM Transactions on Audio Speech and Language Processing, 2022
Neural audio/speech coding has recently demonstrated its capability to deliver high quality at much lower bitrates than traditional methods. However, existing neural audio/speech codecs employ either acoustic features or learned blind features with a ...
Xue Jiang   +4 more
semanticscholar   +1 more source

Disentangled Feature Learning for Real-Time Neural Speech Coding

open access: yes
IEEE International Conference on Acoustics, Speech, and Signal Processing, 2022
Recently, end-to-end neural audio/speech coding has shown great potential to outperform traditional signal-analysis-based audio codecs. This is mostly achieved by following the VQ-VAE paradigm where blind features are learned, vector-quantized and ...
Xue Jiang   +3 more
semanticscholar   +1 more source

A Streamwise GAN Vocoder for Wideband Speech Coding at Very Low Bit Rate

open access: yes
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2021
Recently, GAN vocoders have seen rapid progress in speech synthesis, starting to outperform autoregressive models in perceptual quality with much higher generation speed.
Ahmed Mustafa   +5 more
semanticscholar   +1 more source

Scalable and Efficient Neural Speech Coding: A Hybrid Design

open access: yes
IEEE/ACM Transactions on Audio Speech and Language Processing, 2021
We present a scalable and efficient neural waveform coding system for speech compression. We formulate the speech coding problem as an autoencoding task, where a convolutional neural network (CNN) performs encoding and decoding as a neural waveform codec
Kai Zhen   +4 more
semanticscholar   +1 more source

Enhancing into the Codec: Noise Robust Speech Coding with Vector-Quantized Autoencoders

open access: yes
IEEE International Conference on Acoustics, Speech, and Signal Processing, 2021
Audio codecs based on discretized neural autoencoders have recently been developed and shown to provide significantly higher compression levels for comparable quality speech output.
Jonah Casebeer   +5 more
semanticscholar   +1 more source
