Automatic annotation of tennis games: An integration of audio, vision, and learning [PDF]
Fully automatic annotation of tennis games from broadcast video is a task with great potential but enormous challenges. In this paper we describe our approach to this task, which integrates computer vision, machine listening, and machine learning.
Fei Yan +27 more
core +1 more source
Limitations and Performance Analysis of Spherical Sector Harmonics for Sound Field Processing
Developing spherical sector harmonics (SSHs) benefits sound field decomposition and analysis over spherical sector regions. Although SSHs demonstrate potential in the field of spatial audio, a comprehensive investigation into their properties and ...
Hanwen Bi +4 more
doaj +1 more source
Speech enhancement methods based on binaural cue coding
Following the encoding and decoding mechanism of binaural cue coding (BCC), this paper treats the speech and the noise as the left-channel and right-channel signals of the BCC framework, respectively.
Xianyun Wang, Changchun Bao
doaj +1 more source
A framework for invertible, real-time constant-Q transforms
Audio signal processing frequently requires time-frequency representations, and in many applications a non-linear spacing of frequency bands is preferable.
Dörfler, Monika +3 more
core +1 more source
Asynchronous spiking neurons, the natural key to exploit temporal sparsity [PDF]
Inference of deep neural networks for streaming signal (video/audio) processing on edge devices is still challenging. Unlike most state-of-the-art inference engines, which are efficient for static signals, our brain is optimized for real-time dynamic ...
Cavalcante Holanda, Priscila +10 more
core +1 more source
This paper introduces RSoANU, a dataset of real multichannel room impulse responses (RIRs) obtained in a recording studio. Compared to the current publicly available datasets, RSoANU distinguishes itself by featuring RIRs captured using both a 32-channel
Grace Chesworth +2 more
doaj +1 more source
Intelligibility and Listening Effort of Spanish Oesophageal Speech
Communication is a huge challenge for oesophageal speakers, be it for interactions with fellow humans or with digital voice assistants. We aim to quantify these communication challenges (both human-human and human-machine interactions) by ...
Sneha Raman +4 more
doaj +1 more source
Deep Learning for Audio Signal Processing
Given the recent surge of developments in deep learning, this article provides a review of state-of-the-art deep learning techniques for audio signal processing.
Chang, Shuo-yiin +5 more
core +1 more source
Adaptive DCTNet for Audio Signal Classification
In this paper, we investigate DCTNet for audio signal classification. Its output feature is related to Cohen's class of time-frequency distributions. We introduce the use of adaptive DCTNet (A-DCTNet) for audio signal feature extraction.
Gan, Zhe +4 more
core +1 more source
Special Issue on “Sound and Music Computing”
Sound and music computing is a young and highly multidisciplinary research field. [...]
Tapio Lokki +3 more
doaj +1 more source

