Speech data has rich acoustic and paralinguistic information with important cues for understanding a speaker's tone, emotion, and intent, yet traditional large language models such as BERT do not incorporate this information.
Li, Yulong +4 more
core +2 more sources
New Trends in the Development of Contemporary Linguistics in Serbia
Analysis and Classification of the Rusyn Language Using the OpenAI Whisper ASR Artificial Neural Network Model. The article presents a linguistic analysis of the Rusyn language, focusing on its complex and changing aspects, such as pronunciation
Paweł Małecki, Magdalena Piotrowska
doaj +1 more source
HPB SmartNotes: The impact of artificial intelligence on surgeon workload in the outpatient office
In this study, we aimed to evaluate the feasibility, linguistic accuracy, and coherence of medical notes generated by integrating an automatic speech recognition (ASR) system and a generative pre-trained transformer (GPT) in an outpatient surgical
Rodrigo Antonio Gasque +6 more
doaj +1 more source
Speech Recognition and Synthesis Models and Platforms for the Kazakh Language
With the rapid development of artificial intelligence and machine learning technologies, automatic speech recognition (ASR) and text-to-speech (TTS) have become key components of the digital transformation of society.
Aidana Karibayeva +3 more
doaj +1 more source
Uncertainty Estimation for a Speech Recognition System
Whisper is a speech recognition system designed by OpenAI; it was trained on 680,000 hours of supervised multilingual, multitask data collected from the web.
Walter Morales-Muñoz +1 more
doaj +1 more source
LLM‐Integrated Human–Robot Interaction System for Microrobots
This paper proposes an LLM‐based control framework for guiding microrobots using human natural language. This framework can convert the natural human speech into safe and executable command sets for reliable navigation in complex environments. The experimental results show high accuracy and robustness in task performance, demonstrating the potential of
Bairong Zhu, Amar Salehi, Tingting Yu
wiley +1 more source
Emotion and Engagement Across the Idol Spectrum: Comparing Virtual and Human Idols
ABSTRACT The increasing adoption of virtual idols in entertainment platforms raises critical questions about how viewers respond to their emotional performances compared to human idols. Despite their growing presence, little is known about whether and how emotional expressivity differs across performer modalities and content formats, or how these ...
Lin Kim +4 more
wiley +1 more source
Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data
Pre-training speech models on large volumes of data has achieved remarkable success. OpenAI Whisper is a multilingual multitask model trained on 680k hours of supervised speech data.
Tian, Jinchuan +15 more
core
ABSTRACT Aim Children born to mothers with chronic Hepatitis B virus (HBV) infection are at substantial risk of developing chronic HBV‐infection without appropriate perinatal post‐exposure treatment. This study aimed to explore midwives' and public health nurses' (PHNs) experiences with HBV‐post‐exposure treatment for infants and identify factors ...
Brita Askeland Winje +4 more
wiley +1 more source
ABSTRACT The location of public services impacts children's living and service‐reception conditions, as well as the work of child welfare service providers. Against the background of growing inequality and segregation in Sweden, this article explores the work of child welfare services when located in the urban periphery.
Tobias Jansson, Kajsa Nolbeck
wiley +1 more source

