Localization of multiple speakers using microphone arrays remains a challenging problem, especially in the presence of noise and reverberation. State-of-the-art localization algorithms generally exploit the sparsity of speech in some representation for ...
Sushmita Thakallapalli +2 more
doaj +1 more source
iCUS: Intelligent CU Size Selection for HEVC Inter Prediction
The hierarchical quadtree partitioning of Coding Tree Units (CTU) is one of the striking features in HEVC that contributes towards its superior coding performance over its predecessors.
Buddhiprabha Erabadda +3 more
doaj +1 more source
Elastic CRFs for Open-Ontology Slot Filling
Slot filling is a crucial component in task-oriented dialog systems that is used to parse (user) utterances into semantic concepts called slots. An ontology is defined by the collection of slots and the values that each slot can take.
Yinpei Dai +5 more
doaj +1 more source
Speech Processing in Computer Vision Applications [PDF]
Deep learning has been recently proven to be a viable asset in determining features in the field of Speech Analysis. Deep learning methods like Convolutional Neural Networks facilitate the expansion of specific feature information in waveforms, allowing ...
Waterworth, Nicholas
core +2 more sources
Effective Exploitation of Posterior Information for Attention-Based Speech Recognition
End-to-end attention-based modeling is increasingly popular for tackling sequence-to-sequence mapping tasks. Traditional attention mechanisms utilize prior input information to derive attention, which then conditions the output.
Jian Tang +4 more
doaj +1 more source
Effective Dereverberation with a Lower Complexity at Presence of the Noise
Adaptive beamforming and deconvolution techniques have shown effectiveness for reducing noise and reverberation. The minimum variance distortionless response (MVDR) beamformer is the most widely used for adaptive beamforming, whereas multichannel linear ...
Fengqi Tan, Changchun Bao, Jing Zhou
doaj +1 more source
Transfer Learning for Speech and Language Processing [PDF]
Transfer learning is a vital technique that generalizes models trained for one setting or task to other settings or tasks. For example in speech recognition, an acoustic model trained for one language can be used to recognize speech in another language ...
Wang, Dong, Zheng, Thomas Fang
core +1 more source
Federated Learning for Privacy-Friendly Health Apps: A Case Study on Ovulation Tracking
In an era of increasing reliance on digital health solutions, safeguarding user privacy has emerged as a paramount concern. Health applications often need to balance advanced AI functionalities with sufficient privacy measures to ensure user engagement ...
Nikolaos Pavlidis +12 more
doaj +1 more source
Impaired Auditory Temporal Selectivity in the Inferior Colliculus of Aged Mongolian Gerbils [PDF]
Aged humans show severe difficulties in temporal auditory processing tasks (e.g., speech recognition in noise, low-frequency sound localization, gap detection). A degradation of auditory function with age is also evident in experimental animals.
Grothe, Benedikt +2 more
core +1 more source
Segment boundary detection directed attention for online end-to-end speech recognition
Attention-based encoder-decoder models have recently shown competitive performance for automatic speech recognition (ASR) compared to conventional ASR systems.
Junfeng Hou +3 more
doaj +1 more source

