Nativization of foreign names in TTS for automatic reading of world news in Swahili [PDF]
When a text-to-speech (TTS) system is required to speak world news, a large fraction of the words to be spoken will be proper names originating in a wide variety of languages.
King, Simon +3 more
core +1 more source
Comparison of spatial sound recording techniques with usage of ambisonics and object-based audio [PDF]
In this article spatial audio recording techniques are compared: scene-based audio and object-based audio. The study involved mixing recordings from a higher-order ambisonic microphone and support microphones, ambisonically encoded on a virtual sphere ...
Bartłomiej Mróz, Patryk Kosior
doaj +1 more source
Optimal Non-Uniform Sampling by Branch-and-Bound Approach for Speech Coding
Speech coding plays a significant role in voice communication and improving network bandwidth efficiency for applications that require long-distance communication or storage space utilization. Non-uniform sampling (NUS) is a technique for the same, which ...
Sakshi Pandey, Amit Banerjee
doaj +1 more source
An adaptive stereo basis method for convolutive blind audio source separation [PDF]
NOTICE: this is the author’s version of a work that was accepted for publication in Neurocomputing. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may ...
Abdallah +40 more
core +1 more source
Audio data compression has revolutionised the way in which the music industry and musicians sell and distribute their products. Our previous research presented a novel codec named ACER (Audio Compression Exploiting Repetition), which achieves data reduction by exploiting irrelevancy and redundancy in musical structure whilst generally maintaining ...
Stuart Cunningham +2 more
wiley +1 more source
Direct Modelling of Magnitude and Phase Spectra for Statistical Parametric Speech Synthesis [PDF]
We propose a simple new representation for the FFT spectrum tailored to statistical parametric speech synthesis. It consists of four feature streams that describe magnitude, phase and fundamental frequency using real numbers.
Espic Calderón, Felipe +2 more
core +1 more source
Real‐Time Audio‐Visual Analysis for Multiperson Videoconferencing
We describe the design of a system consisting of several state‐of‐the‐art real‐time audio and video processing components enabling multimodal stream manipulation (e.g., automatic online editing for multiparty videoconferencing applications) in open, unconstrained environments.
Petr Motlicek +9 more
wiley +1 more source
Perceptual Optimization of Room-In-Room Reproduction with Spatially Distributed Loudspeakers [PDF]
It is often desirable to reproduce a specific room-acoustic scene, e.g. a concert hall in a playback room, in such a way that the listener has a plausible and authentic spatial impression of the original sound source including the room acoustical ...
Grosse, Julian; Par, Steven van de
core +1 more source
A Multi-Frame PCA-Based Stereo Audio Coding Method
With the increasing demand for high quality audio, stereo audio coding has become more and more important. In this paper, a multi-frame coding method based on Principal Component Analysis (PCA) is proposed for the compression of audio signals, including ...
Jing Wang +3 more
doaj +1 more source
Deep Denoising for Hearing Aid Applications
Reduction of unwanted environmental noises is an important feature of today's hearing aids (HA), which is why noise reduction is nowadays included in almost every commercially available device.
Aubreville, Marc +5 more
core +1 more source

