Connectionist temporal classification

Spoken term detection with Connectionist Temporal Classification: A novel hybrid CTC-DBN decoder

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, 2010
This paper proposes a novel system for robust keyword detection in continuous speech. Our decoder is composed of a bidirectional Long Short-Term Memory recurrent neural network using a Connectionist Temporal Classification (CTC) output layer, and a Dynamic Bayesian Network (DBN).
Wöllmer, Martin +3 more
openaire +4 more sources

End-to-End Non-Autoregressive Neural Machine Translation with Connectionist Temporal Classification

Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, 2018
EMNLP ...
Libovický, Jindřich; Helcl, Jindřich
openaire +4 more sources

Towards end-to-end speech recognition with transfer learning

EURASIP Journal on Audio, Speech, and Music Processing, 2018
A transfer learning-based end-to-end speech recognition approach is presented in two levels in our framework. Firstly, a feature extraction approach combining multilingual deep neural network (DNN) training with matrix factorization algorithm is ...
Chu-Xiong Qin, Dan Qu, Lian-Hai Zhang
doaj +1 more source

Bidirectional Representations for Low-Resource Spoken Language Understanding

Applied Sciences, 2023
Speech representation models lack the ability to efficiently store semantic information and require fine tuning to deliver decent performance. In this research, we introduce a transformer encoder–decoder framework with a multiobjective training strategy,
Quentin Meeus, Marie-Francine Moens, Hugo Van hamme +2 more
doaj +1 more source

Design of recognition algorithm for multiclass digital display instrument based on convolution neural network

Biomimetic Intelligence and Robotics, 2023
Digital display instrument identification is a crucial approach for automating the collection of digital display data. In this study, we propose a digital display area detection CTPNpro algorithm to address the problem of recognizing multiclass digital ...
Xuanzhang Wen +5 more
doaj +1 more source

Learning Hard Alignments with Variational Inference

2017
There has recently been significant interest in hard attention models for tasks such as object recognition, visual captioning and speech recognition. Hard attention can offer benefits over soft attention such as decreased computational cost, but training
Chiu, Chung-Cheng +5 more
core +1 more source

Multi-Task Network for Noise-Robust Keyword Spotting and Speaker Verification using CTC-based Soft VAD and Global Query Attention

2020
Keyword spotting (KWS) and speaker verification (SV) have been studied independently although it is known that acoustic and speaker domains are complementary. In this paper, we propose a multi-task network that performs KWS and SV simultaneously to fully
Goo, Jahyun +3 more
core +1 more source

Attention-based CNN-ConvLSTM for Handwritten Arabic Word Extraction

ELCVIA Electronic Letters on Computer Vision and Image Analysis, 2022
Word extraction is one of the most critical steps in handwriting recognition systems. It is challenging for many reasons, such as the variability of handwriting styles, touching and overlapping characters, skewness problems, diacritics ...
Takwa Ben Aicha, Afef Kacem Echi
doaj +1 more source

Simultaneous Neural Machine Translation using Connectionist Temporal Classification

2019
Simultaneous machine translation is a variant of machine translation that starts the translation process before the end of an input. This task faces a trade-off between translation accuracy and latency. We have to determine when we start the translation for observed inputs so far, to achieve good practical performance. In this work, we propose a neural
Chousa, Katsuki; Sudoh, Katsuhito; Nakamura, Satoshi +2 more
openaire +2 more sources

Inter‐Model Feature Fusion for Robust Low‐Resource Speech Recognition

Applied AI Letters, Volume 7, Issue 2, June 2026.
Our Self‐Supervised Feature Fusion (SSF‐FT) method enhances low‐resource speech recognition by adaptively combining features from self‐supervised models trained with Contrastive, Predictive, and Reconstruction objectives. This attention‐weighted ensemble delivers robust performance, particularly in acoustically challenging conditions, extending current
Ussen Kimanuka, Ciira wa Maina, Osman Büyük +2 more
wiley +1 more source
