OpenAI Whisper - Open Access .click


Automated Caption Generation for Video Call with Language Translation [PDF]

E3S Web of Conferences, 2023
In the modern era, virtual communication between individuals is common. Many people’s lives have been made simpler in a number of circumstances by providing subtitles, generating automated captions for social media videos, and language translation from a
Polepaka Sanjeeva +4 more
doaj +2 more sources

Automated Assessment of Word- and Sentence-Level Speech Intelligibility in Developmental Motor Speech Disorders: A Cross-Linguistic Investigation [PDF]

Diagnostics
Background/Objectives: Accurate assessment of speech intelligibility is necessary for individuals with motor speech disorders. Transcription or scaled rating methods by naïve listeners are the most reliable tasks for these purposes; however, they are ...
Micalle Carl, Michal Icht
doaj +2 more sources

Enhancing supermarket robot interaction: an equitable multi-level LLM conversational interface for handling diverse customer intents [PDF]

Frontiers in Robotics and AI
This paper presents the design and evaluation of a comprehensive system to develop voice-based interfaces to support users in supermarkets. These interfaces enable shoppers to convey their needs through both generic and specific queries.
Chandran Nandkumar, Luka Peternel
doaj +2 more sources

From voice to ink (Vink): development and assessment of an automated, free-of-charge transcription tool [PDF]

BMC Research Notes
Background Verbatim transcription of qualitative audio data is a cornerstone of analytic quality and rigor, yet the time and energy required for such transcription can drain resources, delay analysis, and hinder the timely dissemination of qualitative ...
Hannah Tolle +6 more
doaj +2 more sources

Spoken Language Analysis in Aging Research: The Validity of AI-Generated Speech to Text Using OpenAI's Whisper. [PDF]

Gerontology
Introduction: Studying what older adults say can provide important insights into cognitive, affective, and social aspects of aging. Available language analysis tools generally require audio-recorded speech to be transcribed into verbatim text, a task that has historically been performed by humans.
Naffah A, Pfeifer VA, Mehl MR.
europepmc +3 more sources

Adapting OpenAI’s Whisper for Speech Recognition on Code-Switch Mandarin-English SEAME and ASRU2019 Datasets

2024 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
This paper details the experimental results of adapting the OpenAI's Whisper model for Code-Switch Mandarin-English Speech Recognition (ASR) on the SEAME and ASRU2019 corpora. We conducted 2 experiments: a) using adaptation data from 1 to 100/200 hours to demonstrate effectiveness of adaptation, b) examining different language ID setup on Whisper ...
Yizhou Peng, Eng Siong Chng
exaly +3 more sources

Analysis and Classification of the Rusyn Language Using the OpenAI Whisper ASR Model

Rìčnik Ruskoj Bursy
The paper presents a linguistic analysis of the Rusyn language, focusing on its complex and dynamic aspects, such as pronunciation and individual, regional, and historical variations.
Pawel Malecki
exaly +2 more sources

Video Transcripts Summarization using OpenAI Whisper and GPT Model

International Journal for Research in Applied Science and Engineering Technology
Abstract: In today’s digital age, a vast amount of video content is generated and shared on the internet every minute. However, extracting relevant information from these videos can be time-consuming and challenging. This is where video transcript summarization comes in, providing a concise summary of video content without the need to watch the entire ...
exaly +2 more sources

Fine-Tuning OpenAI Whisper-Small for Domain-Specific Medical Speech Recognition within a Microservice Architecture

Informatica (Slovenia)
We fine-tune Whisper-small (244M parameters) on 8.5 hours of in-domain medical audio and evaluate with word error rate (WER). Compared to an unadapted Whisper-small baseline, our fine-tuned model reduces WER from ∼63% to ∼32%. While the relative gain is substantial, this accuracy is not suitable for unsupervised clinical use; we position the system as a ...
Alaeddine Moussa, Noursene Drine
exaly +2 more sources

Instant Transcription and Translation Tool using OpenAI's Whisper ASR Model

International Journal of Science and Research (Raipur, India), 2022
Akarsh Ghale, Janaki K, Devaraj Verma C
exaly +2 more sources
