Connectionist temporal classification

A study on constraining Connectionist Temporal Classification for temporal audio alignment

Interspeech 2022, 2022
Connectionist Temporal Classification (CTC) has become a standard for deep learning-based temporal alignment allowing relevant probabilistic distributions to be learned. However, by nature, CTC is a transcription objective that can be minimized without guaranteeing any alignment properties.
Teytaut, Yann, Bouvier, Baptiste, Roebel, Axel +2 more
openaire +2 more sources
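
For context, the CTC objective as it is usually stated in the CTC literature (not this paper's constrained variant) can be sketched as below. The symbols are assumptions introduced for illustration: $x$ is the input of length $T$, $y$ the target label sequence, $\mathcal{B}$ the collapsing map that merges repeated labels and removes blanks, and $p_t(k \mid x)$ the network's output probability for label $k$ at frame $t$.

```latex
\mathcal{L}_{\mathrm{CTC}}(x, y) = -\log p(y \mid x),
\qquad
p(y \mid x) = \sum_{\pi \in \mathcal{B}^{-1}(y)} \prod_{t=1}^{T} p_t(\pi_t \mid x)
```

Because the loss sums over every frame-level path $\pi$ that collapses to $y$, it rewards the correct transcription without singling out any one alignment, which is consistent with the abstract's remark.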

End-to-End Sequence Labeling via Convolutional Recurrent Neural Network with a Connectionist Temporal Classification Layer

International Journal of Computational Intelligence Systems, 2020
Sequence labeling is a common machine-learning task which not only needs the most likely prediction of label for a local input but also seeks the most suitable annotation for the whole input sequence.
Xiaohui Huang +4 more
doaj +1 more source

End to End Alignment Learning of Instructional Videos with Spatiotemporal Hybrid Encoding and Decoding Space Reduction

Applied Sciences, 2021
We solve the problem of how to densely align actions in videos at frame level, with only the order of occurring actions available, in order to save the time-consuming efforts to accurately annotate the temporal boundaries of each action. We propose three
Lin Wang, Xingfu Wang, Ammar Hawbani, Yan Xiong +3 more
doaj +1 more source

Nasal Speech Sounds Detection Using Connectionist Temporal Classification [PDF]

2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018
Phone attributes, also known as distinctive or phonological features, form an important classification of the speech sounds used in automatic speech processing. Training conventional phone attribute detectors (classifiers), whether based on acoustic measurements or deep learning approaches, requires decent phone boundary segmentation.
Milos Cernak, Sibo Tong
openaire +1 more source

Training LDCRF model on unsegmented sequences using connectionist temporal classification [PDF]

2016 6th International Conference on Computer and Knowledge Engineering (ICCKE), 2016
Many machine learning problems such as speech recognition, gesture recognition, and handwriting recognition are concerned with simultaneous segmentation and labeling of sequence data. Latent-dynamic conditional random field (LDCRF) is a well-known discriminative method that has been successfully used for this task.
Atashin, Amir Ahooye +2 more
openaire +2 more sources

A Helium Speech Unscrambling Algorithm Based on Deep Learning

Information, 2023
Helium speech, the language spoken by divers in the deep sea who breathe a high-pressure helium–oxygen mixture, is almost unintelligible. To accurately unscramble helium speech, a neural network based on deep learning is proposed.
Yonghong Chen, Shibing Zhang
doaj +1 more source

Towards multilingual end‐to‐end speech recognition for air traffic control

IET Intelligent Transport Systems, 2021
In this work, an end‐to‐end framework is proposed to achieve multilingual automatic speech recognition (ASR) in air traffic control (ATC) systems. Considering the standard ATC procedure, a recurrent neural network (RNN) based framework is selected to ...
Yi Lin, Bo Yang, Dongyue Guo, Peng Fan
doaj +1 more source

Word Beam Search: A Connectionist Temporal Classification Decoding Algorithm [PDF]

2018 16th International Conference on Frontiers in Handwriting Recognition (ICFHR), 2018
Recurrent Neural Networks (RNNs) are used for sequence recognition tasks such as Handwritten Text Recognition (HTR) or speech recognition. If trained with the Connectionist Temporal Classification (CTC) loss function, the output of such an RNN is a matrix containing character probabilities for each time-step.
Harald Scheidl, Stefan Fiel, Robert Sablatnig +2 more
openaire +1 more source
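
To illustrate what decoding such a per-time-step probability matrix involves, here is a minimal sketch of best-path (greedy) CTC decoding. This is the simplest baseline decoder, not the word beam search algorithm of the paper above. The function name `best_path_decode`, the toy alphabet, and the blank index are all assumptions made for this example.

```python
from typing import List, Sequence


def best_path_decode(probs: Sequence[Sequence[float]], alphabet: str, blank: int) -> str:
    """Greedy CTC decoding: take the most likely label per time step,
    merge consecutive repeats, then drop blanks."""
    decoded: List[str] = []
    previous = blank  # treat the position before the first frame as blank
    for frame in probs:
        # Index of the highest-probability label in this time step.
        best = max(range(len(frame)), key=lambda k: frame[k])
        # Emit only when the label is not blank and differs from the previous frame.
        if best != blank and best != previous:
            decoded.append(alphabet[best])
        previous = best
    return "".join(decoded)


# Toy matrix: rows are time steps, columns are "a", "b", and blank (index 2).
toy_probs = [
    [0.8, 0.1, 0.1],  # a
    [0.7, 0.2, 0.1],  # a (repeat, merged)
    [0.1, 0.1, 0.8],  # blank (separates the two a's)
    [0.6, 0.2, 0.2],  # a
    [0.1, 0.7, 0.2],  # b
]
result = best_path_decode(toy_probs, alphabet="ab", blank=2)
print(result)
```

The blank between the second and fourth frames is what allows the doubled "a" to survive the merge step. Beam-search decoders, including dictionary-constrained ones, keep several candidate paths instead of committing to the per-frame argmax.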

Decoding Handwriting Trajectories from Intracortical Brain Signals for Brain-to-Text Communication. [PDF]

Adv Sci (Weinh)
By developing a novel framework that optimizes both shape and temporal loss during decoder training, the authors successfully reconstruct human‐recognizable handwriting trajectories from intracortical neural signals for both Chinese characters and English letters, effectively resolving the temporal misalignment problem in clinical BCIs, thereby ...
Xu G +6 more
europepmc +2 more sources

Connectionist natural language parsing [PDF]

2002
The key developments of two decades of connectionist parsing are reviewed. Connectionist parsers are assessed according to their ability to learn to represent syntactic structures from examples automatically, without being presented with symbolic grammar
Berg +52 more
core +1 more source
