
Realizar transcripciones de entrevistas a texto en catalán con Whisper AI y Softcatalà

open access: yes
This video offers a detailed guide on how to use the Softcatalà website to transcribe audio and video files to text. The Softcatalà portal, accessible at https://www.softcatala.org/transcripcio/, uses Whisper, a ...
Boté-Vericad, Juan-José   +1 more
core  

Whisper in Medusa's Ear: Multi-head Efficient Decoding for Transformer-based ASR

open access: yes
Large transformer-based models have significant potential for speech transcription and translation. Their self-attention mechanisms and parallel processing enable them to capture complex patterns and dependencies in audio sequences.
Keshet, Joseph   +4 more
core  

Chuchotons, approche réflexive de la transcription automatisée

open access: yes
Transcribing interviews is a laborious task that can prove time-consuming. Once an obligatory rite of passage for students beginning their research careers in the humanities and social sciences (SHS), transcription solutions ...
Ducasse, Arthur   +4 more
core   +1 more source

mmWave-Whisper: Phone Call Eavesdropping and Transcription Using Millimeter-Wave Radar

open access: yes
This paper introduces mmWave-Whisper, a system that demonstrates the feasibility of full-corpus automated speech recognition (ASR) on phone calls eavesdropped remotely using off-the-shelf frequency modulated continuous wave (FMCW) millimeter-wave radars.
Basak, Suryoday   +2 more
core  

Transcripción de audio a texto en Sesiones Municipales de Planeta Rica

open access: yes, 2023
The document addresses the issue of manually transcribing municipal sessions in Planeta Rica. It investigates the use of open-source tools to automate the transcription of audio to text in these sessions with the aim of improving efficiency and accuracy ...
Ruiz-Melendres, Jaime Andrés   +1 more
core  

Can Whisper perform speech-based in-context learning?

open access: yes
This paper investigates the in-context learning abilities of the Whisper automatic speech recognition (ASR) models released by OpenAI. A novel speech-based in-context learning (SICL) approach is proposed for test-time adaptation, which can reduce the ...
Zhang, Chao   +3 more
core  

Deep Acoustic Models for Speech Quality Assessment in Children [PDF]

open access: yes
In Norway, increased immigration has led to higher demand for Norwegian language instruction, especially for school-age children who need to learn Norwegian quickly to keep pace with their peers.
Truyen, Amanda Johanne Thunes
core  

Comparative Analysis of Audio Feature Extraction for Real-Time Talking Portrait Synthesis

open access: yes
This paper explores advancements in real-time talking-head generation, focusing on overcoming challenges in Audio Feature Extraction (AFE), which often introduces latency and limits responsiveness in real-time applications.
Sajad Amouei Sheshkal   +7 more
core   +1 more source

OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer

open access: yes
Recent studies have highlighted the importance of fully open foundation models. The Open Whisper-style Speech Model (OWSM) is an initial step towards reproducing OpenAI Whisper using public data and open-source toolkits.
Tian, Jinchuan   +11 more
core  

Nástroj pro programování hlasem [PDF]

open access: yes
Automatic speech recognition (ASR) systems are an important part of today's user environment, because voice/sound is one of the few possible ways to express oneself, and their use could make more pleasant or even speed up interaction between people and ...
Kaňa, Roman
core  
