Realizar transcripciones de entrevistas a texto en catalán con Whisper AI y Softcatalà
This video offers a detailed guide on how to use the Softcatalà website to transcribe audio and video files to text. The Softcatalà portal, accessible at https://www.softcatala.org/transcripcio/, uses Whisper, a ...
Boté-Vericad, Juan-José +1 more
core
Whisper in Medusa's Ear: Multi-head Efficient Decoding for Transformer-based ASR
Large transformer-based models have significant potential for speech transcription and translation. Their self-attention mechanisms and parallel processing enable them to capture complex patterns and dependencies in audio sequences.
Keshet, Joseph +4 more
core
Chuchotons, approche réflexive de la transcription automatisée
Transcribing interviews is a laborious task that can prove time-consuming. Once an obligatory rite of passage for students beginning their research careers in the humanities and social sciences, transcription solutions ...
Ducasse, Arthur +4 more
core +1 more source
mmWave-Whisper: Phone Call Eavesdropping and Transcription Using Millimeter-Wave Radar
This paper introduces mmWave-Whisper, a system that demonstrates the feasibility of full-corpus automated speech recognition (ASR) on phone calls eavesdropped remotely using off-the-shelf frequency modulated continuous wave (FMCW) millimeter-wave radars.
Basak, Suryoday +2 more
core
Transcripción de audio a texto en Sesiones Municipales de Planeta Rica
The document addresses the issue of manually transcribing municipal sessions in Planeta Rica. It investigates the use of open-source tools to automate the transcription of audio to text in these sessions with the aim of improving efficiency and accuracy ...
Ruiz-Melendres, Jaime Andrés +1 more
core
Can Whisper perform speech-based in-context learning?
This paper investigates the in-context learning abilities of the Whisper automatic speech recognition (ASR) models released by OpenAI. A novel speech-based in-context learning (SICL) approach is proposed for test-time adaptation, which can reduce the ...
Zhang, Chao +3 more
core
Deep Acoustic Models for Speech Quality Assessment in Children
In Norway, increased immigration has led to higher demand for Norwegian language instruction, particularly for school-age children who need to learn Norwegian quickly to keep up with their peers.
Truyen, Amanda Johanne Thunes
core
Comparative Analysis of Audio Feature Extraction for Real-Time Talking Portrait Synthesis
This paper explores advancements in real-time talking-head generation, focusing on overcoming challenges in Audio Feature Extraction (AFE), which often introduces latency and limits responsiveness in real-time applications.
Sajad Amouei Sheshkal +7 more
core +1 more source
OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer
Recent studies have highlighted the importance of fully open foundation models. The Open Whisper-style Speech Model (OWSM) is an initial step towards reproducing OpenAI Whisper using public data and open-source toolkits.
Tian, Jinchuan +11 more
core
Nástroj pro programování hlasem
Automatic speech recognition (ASR) systems are an important part of today's user environment, because voice/sound is one of the few possible ways to express oneself, and their use could make more pleasant, or even speed up, interaction between people and ...
Kaňa, Roman
core

