Common sound source localization algorithms focus on localizing all the active sources in the environment. Because the source identities are generally unknown, retrieving the location of a speaker of interest requires extra effort. This paper addresses the
Ziteng Wang, Junfeng Li, Yonghong Yan
doaj +1 more source
Building competitive direct acoustics-to-word models for English conversational speech recognition
Direct acoustics-to-word (A2W) models in the end-to-end paradigm have received increasing attention compared to conventional sub-word based automatic speech recognition models using phones, characters, or context-dependent hidden Markov model states ...
Audhkhasi, Kartik +4 more
core +1 more source
Waveguide physical modeling of vocal tract acoustics: flexible formant bandwidth control from increased model dimensionality [PDF]
Digital waveguide physical modeling is often used as an efficient representation of acoustical resonators such as the human vocal tract. Building on the basic one-dimensional (1-D) Kelly-Lochbaum tract model, various speech synthesis techniques ...
Howard, D M, Mullen, J, Murphy, D T
core +1 more source
Polyphonic Piano Transcription with a Note-Based Music Language Model
This paper proposes a note-based music language model (MLM) for improving note-level polyphonic piano transcription. The MLM is based on a recurrent structure, which can model the temporal correlations between notes in music sequences. To combine the
Qi Wang, Ruohua Zhou, Yonghong Yan
doaj +1 more source
Modeling Speech Sound Radiation With Different Degrees of Realism for Articulatory Synthesis
Articulatory synthesis is based on modeling various physical phenomena of speech production, including sound radiation from the mouth. With regard to sound radiation, the most common approach is to approximate it in terms of a simple spherical source of ...
Peter Birkholz +5 more
doaj +1 more source
Robust Beamforming for Amplify-and-Forward MIMO Relay Systems Based on Quadratic Matrix Programming [PDF]
In this paper, robust transceiver design based on minimum-mean-square-error (MMSE) criterion for dual-hop amplify-and-forward MIMO relay systems is investigated.
Ma, Shaodan +3 more
core +3 more sources
The Electromagnetic Articulography Mandarin Accented English (EMA-MAE) Corpus of Acoustic and 3D Articulatory Kinematic Data [PDF]
There is a significant need for more comprehensive electromagnetic articulography (EMA) datasets that can provide matched acoustics and articulatory kinematic data with good spatial and temporal resolution.
Berry, Jeffrey J. +2 more
core +2 more sources
Investigating perceptual discrimination thresholds for attributes of whole-body vibration
Understanding the limitations of haptic perception in humans is critical for the successful design of effective haptic feedback systems; however, it is unclear how perceived discrimination thresholds relate to specific qualitative perceptual attributes ...
Berkay Kullukcu +4 more
doaj +1 more source
An Archaeoacoustics Analysis of Cistercian Architecture: The Case of the Beaulieu Abbey
The Cistercian order is of acoustic interest because previous research has hypothesized that Cistercian architectural structures were designed for longer reverberation times in order to reinforce Gregorian chants.
Sebastian Duran +2 more
doaj +1 more source
Despite the remarkable progress achieved in automatic speech recognition, recognizing far-field speech mixed with various noise sources remains a challenging task.
El-Khamy, Mostafa +2 more
core +1 more source

