Recent works in spoken language translation (SLT) have attempted to build end-to-end speech-to-text translation without using source language transcription during learning or decoding.
Besacier, Laurent +2 more
Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation [PDF]
Direct speech-to-speech translation (S2ST) models suffer from data scarcity issues as there exists little parallel S2ST data, compared to the amount of data available for conventional cascaded systems that consist of automatic speech recognition (ASR ...
Sravya Popuri +7 more
StyleS2ST: Zero-shot Style Transfer for Direct Speech-to-speech Translation [PDF]
Direct speech-to-speech translation (S2ST) has gradually become popular as it has many advantages compared with cascade S2ST. However, current research mainly focuses on the accuracy of semantic translation and ignores the speech style transfer from a ...
Kun Song +7 more
Leveraging Pseudo-labeled Data to Improve Direct Speech-to-Speech Translation [PDF]
Direct speech-to-speech translation (S2ST) has drawn more and more attention recently. The task is very challenging due to data scarcity and complex speech-to-speech mapping. In this paper, we report our recent achievements in S2ST.
Qianqian Dong +5 more
Joint Pre-Training with Speech and Bilingual Text for Direct Speech to Speech Translation [PDF]
Direct speech-to-speech translation (S2ST) is an attractive research topic with many advantages compared to cascaded S2ST. However, direct S2ST suffers from the data scarcity problem because the corpora from the speech of the source language to the ...
Kun Wei +7 more
Cascade versus Direct Speech Translation: Do the Differences Still Make a Difference? [PDF]
Five years after the first published proofs of concept, direct approaches to speech translation (ST) are now competing with traditional cascade solutions.
L. Bentivogli +6 more
Leveraging unsupervised and weakly-supervised data to improve direct speech-to-speech translation [PDF]
End-to-end speech-to-speech translation (S2ST) without relying on intermediate text representations is a rapidly emerging frontier of research. Recent works have demonstrated that the performance of such direct S2ST systems is approaching that of ...
Ye Jia +6 more
CTC-based Compression for Direct Speech Translation [PDF]
Previous studies demonstrated that a dynamic phone-informed compression of the input audio is beneficial for speech translation (ST). However, they required a dedicated model for phone recognition and did not test this solution for direct ST, in which a ...
Marco Gaido +3 more
Embodied theories propose that language is understood via mental simulations of sensory states related to perception and action. Given that direct speech (e.g., 'She says, “It’s a lovely day!”') is perceived to be more vivid than indirect speech (e.g ...
Bo Yao
Cascaded Models with Cyclic Feedback for Direct Speech Translation [PDF]
Direct speech translation describes a scenario where only speech inputs and corresponding translations are available. Such data are notoriously limited.
Tsz Kin Lam +2 more

