
Enhancing supermarket robot interaction: an equitable multi-level LLM conversational interface for handling diverse customer intents [PDF]

open access: yes, Frontiers in Robotics and AI
This paper presents the design and evaluation of a comprehensive system to develop voice-based interfaces to support users in supermarkets. These interfaces enable shoppers to convey their needs through both generic and specific queries.
Chandran Nandkumar, Luka Peternel
doaj   +2 more sources

Automated Assessment of Word- and Sentence-Level Speech Intelligibility in Developmental Motor Speech Disorders: A Cross-Linguistic Investigation [PDF]

open access: yes, Diagnostics
Background/Objectives: Accurate assessment of speech intelligibility is necessary for individuals with motor speech disorders. Transcription or scaled rating methods by naïve listeners are the most reliable tasks for these purposes; however, they are ...
Micalle Carl, Michal Icht
doaj   +2 more sources

Нови тенденциї у розвою сучасней линґвистики у Сербї [New trends in the development of contemporary linguistics in Serbia]

open access: yes, Rìčnik Ruskoj Bursy
ANALYSIS AND CLASSIFICATION OF THE RUSYN LANGUAGE USING THE OPENAI WHISPER ASR ARTIFICIAL NEURAL NETWORK MODEL. The article presents a linguistic analysis of the Rusyn language, focusing on its complex and changing aspects, such as pronunciation
Paweł Małecki, Magdalena Piotrowska
doaj   +2 more sources

From voice to ink (Vink): development and assessment of an automated, free-of-charge transcription tool [PDF]

open access: yes, BMC Research Notes
Background: Verbatim transcription of qualitative audio data is a cornerstone of analytic quality and rigor, yet the time and energy required for such transcription can drain resources, delay analysis, and hinder the timely dissemination of qualitative ...
Hannah Tolle   +6 more
doaj   +2 more sources

Spoken Language Analysis in Aging Research: The Validity of AI-Generated Speech to Text Using OpenAI's Whisper. [PDF]

open access: yes, Gerontology
Introduction: Studying what older adults say can provide important insights into cognitive, affective, and social aspects of aging. Available language analysis tools generally require audio-recorded speech to be transcribed into verbatim text, a task that has historically been performed by humans.
Naffah A, Pfeifer VA, Mehl MR.
europepmc   +2 more sources

Automated Caption Generation for Video Call with Language Translation [PDF]

open access: yes, E3S Web of Conferences, 2023
In the modern era, virtual communication between individuals is common. Many people’s lives have been made simpler in a number of circumstances by providing subtitles, generating automated captions for social media videos, and language translation from a
Polepaka Sanjeeva   +4 more
doaj   +1 more source

Evaluating OpenAI’s Whisper ASR: Performance Analysis Across Diverse Accents and Speaker Traits

open access: yes, JASA Express Letters, 2023
This research explores the performance of Whisper's ASR system on different native and non-native English accents. The findings indicate better performance on North American vs British and Irish English accents, and on native vs non-native accents. The analysis also unearths links between speaker traits (sex, L1 typology, and L2 proficiency) and word ...
Graham, Calbert, Roll, Nathan
openaire   +2 more sources

Using HIPAA (Health Insurance Portability and Accountability Act)–Compliant Transcription Services for Virtual Psychiatric Interviews: Pilot Comparison Study

open access: yes, JMIR Mental Health, 2023
Background: Automatic speech recognition (ASR) technology is increasingly being used for transcription in clinical contexts. Although there are numerous transcription services using ASR, few studies have compared the word error rate
Salman Seyedi   +11 more
doaj   +1 more source

Mi-Go: Test Framework which uses YouTube as Data Source for Evaluating Speech Recognition Models like OpenAI's Whisper

open access: yes, 2023
This article introduces Mi-Go, a novel testing framework aimed at evaluating the performance and adaptability of general-purpose speech recognition machine learning models across diverse real-world scenarios. The framework leverages YouTube as a rich and continuously updated data source, accounting for multiple languages, accents, dialects, speaking ...
Wojnar, Tomasz   +2 more
openaire   +2 more sources

Evaluating OpenAI's Whisper ASR for Punctuation Prediction and Topic Modeling of life histories of the Museum of the Person

open access: yes, 2023
Automatic speech recognition (ASR) systems play a key role in applications involving human-machine interactions. Despite their importance, ASR models for the Portuguese language proposed in the last decade have limitations in relation to the correct identification of punctuation marks in automatic transcriptions, which hinder the use of transcriptions ...
Gris, Lucas Rafael Stefanel   +5 more
openaire   +2 more sources
