
Spoken term detection with Connectionist Temporal Classification: A novel hybrid CTC-DBN decoder

open access: yes, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, 2010
This paper proposes a novel system for robust keyword detection in continuous speech. Our decoder is composed of a bidirectional Long Short-Term Memory recurrent neural network using a Connectionist Temporal Classification (CTC) output layer, and a Dynamic Bayesian Network (DBN).
Wöllmer, Martin et al.

End-to-End Non-Autoregressive Neural Machine Translation with Connectionist Temporal Classification

open access: yes, Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, 2018
EMNLP ...
Libovický, Jindřich; Helcl, Jindřich

Towards end-to-end speech recognition with transfer learning

open access: yes, EURASIP Journal on Audio, Speech, and Music Processing, 2018
A transfer learning-based end-to-end speech recognition approach is presented in two levels in our framework. Firstly, a feature extraction approach combining multilingual deep neural network (DNN) training with matrix factorization algorithm is ...
Chu-Xiong Qin, Dan Qu, Lian-Hai Zhang

Bidirectional Representations for Low-Resource Spoken Language Understanding

open access: yes, Applied Sciences, 2023
Speech representation models lack the ability to efficiently store semantic information and require fine tuning to deliver decent performance. In this research, we introduce a transformer encoder–decoder framework with a multiobjective training strategy, ...
Quentin Meeus et al.

Design of recognition algorithm for multiclass digital display instrument based on convolution neural network

open access: yes, Biomimetic Intelligence and Robotics, 2023
Digital display instrument identification is a crucial approach for automating the collection of digital display data. In this study, we propose a digital display area detection CTPNpro algorithm to address the problem of recognizing multiclass digital ...
Xuanzhang Wen et al.

Learning Hard Alignments with Variational Inference

open access: yes, 2017
There has recently been significant interest in hard attention models for tasks such as object recognition, visual captioning and speech recognition. Hard attention can offer benefits over soft attention such as decreased computational cost, but training ...
Chiu, Chung-Cheng et al.

Multi-Task Network for Noise-Robust Keyword Spotting and Speaker Verification using CTC-based Soft VAD and Global Query Attention

open access: yes, 2020
Keyword spotting (KWS) and speaker verification (SV) have been studied independently although it is known that acoustic and speaker domains are complementary. In this paper, we propose a multi-task network that performs KWS and SV simultaneously to fully ...
Goo, Jahyun et al.

Attention-based CNN-ConvLSTM for Handwritten Arabic Word Extraction

open access: yes, ELCVIA Electronic Letters on Computer Vision and Image Analysis, 2022
Word extraction is one of the most critical steps in handwritten recognition systems. It is challenging for many reasons, such as the variability of handwritten writing styles, touching and overlapping characters, skewness problems, diacritics ...
Takwa Ben Aicha, Afef Kacem Echi

Simultaneous Neural Machine Translation using Connectionist Temporal Classification

open access: yes, 2019
Simultaneous machine translation is a variant of machine translation that starts the translation process before the end of an input. This task faces a trade-off between translation accuracy and latency. We have to determine when we start the translation for observed inputs so far, to achieve good practical performance. In this work, we propose a neural ...
Chousa, Katsuki et al.

Inter‐Model Feature Fusion for Robust Low‐Resource Speech Recognition

open access: yes, Applied AI Letters, Volume 7, Issue 2, June 2026.
Our Self‐Supervised Feature Fusion (SSF‐FT) method enhances low‐resource speech recognition by adaptively combining features from self‐supervised models trained with Contrastive, Predictive, and Reconstruction objectives. This attention‐weighted ensemble delivers robust performance, particularly in acoustically challenging conditions, extending current ...
Ussen Kimanuka et al.
