WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing [PDF]

open access: yes; IEEE Journal on Selected Topics in Signal Processing, 2021
Self-supervised learning (SSL) achieves great success in speech recognition, while limited exploration has been attempted for other speech processing tasks.
Sanyuan Chen   +16 more
semanticscholar   +1 more source

SUPERB: Speech processing Universal PERformance Benchmark [PDF]

open access: yes; Interspeech, 2021
Self-supervised learning (SSL) has proven vital for advancing research in natural language processing (NLP) and computer vision (CV). The paradigm pretrains a shared model on large volumes of unlabeled data and achieves state-of-the-art (SOTA) for ...
Shu-Wen Yang   +19 more
semanticscholar   +1 more source

SpeechFormer++: A Hierarchical Efficient Framework for Paralinguistic Speech Processing [PDF]

open access: yes; IEEE/ACM Transactions on Audio Speech and Language Processing, 2023
Paralinguistic speech processing is important in addressing many issues, such as sentiment and neurocognitive disorder analyses. Recently, the Transformer has achieved remarkable success in the natural language processing field and has demonstrated its ...
Weidong Chen   +4 more
semanticscholar   +1 more source

Transformers in Speech Processing: A Survey [PDF]

open access: yes; arXiv.org, 2023
The remarkable success of transformers in the field of natural language processing has sparked the interest of the speech-processing community, leading to an exploration of their potential for modeling long-range dependencies within speech sequences ...
S. Latif   +5 more
semanticscholar   +1 more source

Toward a realistic model of speech processing in the brain with self-supervised learning [PDF]

open access: yes; Neural Information Processing Systems, 2022
Several deep neural networks have recently been shown to generate activations similar to those of the brain in response to the same input. These algorithms, however, remain largely implausible: they require (1) extraordinarily large amounts of data, (2 ...
Juliette Millet   +7 more
semanticscholar   +1 more source

Torchaudio: Building Blocks for Audio and Speech Processing [PDF]

open access: yes; IEEE International Conference on Acoustics, Speech, and Signal Processing, 2021
This document describes version 0.10 of TorchAudio: building blocks for machine learning applications in the audio and speech processing domain. The objective of TorchAudio is to accelerate the development and deployment of machine learning applications ...
Yao-Yuan Yang   +22 more
semanticscholar   +1 more source

Improving Distortion Robustness of Self-supervised Speech Processing Tasks with Domain Adaptation [PDF]

open access: yes; Interspeech, 2022
Speech distortions are a long-standing problem that degrades the performance of speech processing models trained with supervision. It is high time that we enhance the robustness of speech processing models to obtain good performance when encountering speech ...
Kuan-Po Huang   +3 more
semanticscholar   +1 more source

Continuous speech processing. [PDF]

open access: yes; Curr Opin Physiol, 2020
Brodbeck C, Simon JZ.
europepmc   +2 more sources

The First Multimodal Information Based Speech Processing (MISP) Challenge: Data, Tasks, Baselines and Results

open access: yes; IEEE International Conference on Acoustics, Speech, and Signal Processing, 2022
In this paper we discuss the rationale of the Multimodal Information based Speech Processing (MISP) Challenge, and provide a detailed description of the data recorded, the two evaluation tasks and the corresponding baselines, followed by a summary of ...
Hang Chen   +12 more
semanticscholar   +1 more source

ESPnet: End-to-End Speech Processing Toolkit [PDF]

open access: yes; Interspeech, 2018
This paper introduces a new open source platform for end-to-end speech processing named ESPnet. ESPnet mainly focuses on end-to-end automatic speech recognition (ASR), and adopts widely-used dynamic neural network toolkits, Chainer and PyTorch, as a main ...
Shinji Watanabe   +11 more
semanticscholar   +1 more source
