Abstract. Skin flash is typically added to breast and chestwall plans to ensure robust target coverage in the presence of respiratory motion, anatomic changes, and small setup uncertainties. Adding skin flash in volumetric modulated arc therapy (VMAT) plans is an iterative and manual process.
Emily Hubley +6 more
wiley +1 more source
DeepTalk: Vocal Style Encoding for Speaker Recognition and Speech Synthesis
Automatic speaker recognition algorithms typically characterize speech audio using short-term spectral features that encode the physiological and anatomical aspects of speech production. Such algorithms do not fully capitalize on speaker-dependent characteristics present in behavioral speech features. In this work, we propose a prosody encoding network
arxiv
Automatic speech recognition and speech variability: A review
Major progress is being recorded regularly on both the technology and exploitation of automatic speech recognition (ASR) and spoken language systems. However, there are still technological barriers to flexible solutions and user satisfaction under some circumstances.
Benzeghiba M. +12 more
openaire +4 more sources
Abstract. Purpose/objectives: Recent technological advancements have increased efficiency for clinical deliverability of online‐adaptive‐radiotherapy (oART). Previous cone‐beam‐computed‐tomography (CBCT) generations lacked the ability to provide reliable Hounsfield‐units (HU), thus requiring oART workflows to rely on synthetic‐CT (sCT) images derived ...
Jingwei Duan +7 more
wiley +1 more source
Improved I-vector-based Speaker Recognition for Utterances with Speaker Generated Non-speech sounds
Conversational speech not only contains several variants of neutral speech but is also prominently interlaced with several speaker generated non-speech sounds such as laughter and breath. A robust speaker recognition system should be capable of recognizing a speaker irrespective of these variations in his speech. An understanding of whether the speaker-
arxiv
Closing the gap in plan quality: Leveraging deep‐learning dose prediction for adaptive radiotherapy
Abstract. Purpose: Balancing quality and efficiency has been a challenge for online adaptive therapy. Most systems start the online re‐optimization with the original planning goals. While some systems allow planners to modify the planning goals, achieving a high‐quality plan within time constraints remains a common barrier.
Sean J. Domal +9 more
wiley +1 more source
Speech-Driven Text Retrieval: Using Target IR Collections for Statistical Language Model Adaptation in Speech Recognition
Speech recognition has of late become a practical technology for real world applications. Aiming at speech-driven text retrieval, which facilitates retrieving information with spoken queries, we propose a method to integrate speech recognition and retrieval methods.
arxiv
Dose rate correction of a diode array for universal wedge field dosimetric verification
Abstract. Purpose: To study the performance of MapCHECK 3 (MC3) in measuring universal wedge fields and propose a dose rate correction strategy to improve MC3 measurement accuracy. Materials and methods: Universal wedge fields with different wedge angles and field sizes were measured at different depths using MC3.
Linyi Shen +6 more
wiley +1 more source
Abstract. Purpose: This study evaluates a novel cone‐beam computed tomography (CBCT) imaging solution integrated onto a conventional C‐arm linear accelerator (linac) with increased gantry speed. The purpose is to assess the impact of improved imaging hardware and reconstruction algorithms on image quality.
Theodore Arsenault +9 more
wiley +1 more source
Modified Mel Filter Bank to Compute MFCC of Subsampled Speech
Mel Frequency Cepstral Coefficients (MFCCs) are the most popularly used speech features in most speech and speaker recognition applications. In this work, we propose a modified Mel filter bank to extract MFCCs from subsampled speech. We also propose a stronger metric which effectively captures the correlation between MFCCs of original speech and MFCC ...
arxiv