Automatic tagging and geotagging in video collections and communities [PDF]

open access: yes, 2011
Automatically generated tags and geotags hold great promise to improve access to video collections and online communities. We give an overview of three tasks offered in the MediaEval 2010 benchmarking initiative, describing for each its use scenario ...
Jones, Gareth J.F.   +3 more
core   +1 more source

Acoustic Word Embedding System for Code-Switching Query-by-example Spoken Term Detection [PDF]

open access: yes, 2021 12th International Symposium on Chinese Spoken Language Processing (ISCSLP), 2021
In this paper, we propose a deep convolutional neural network-based acoustic word embedding system for code-switching query-by-example spoken term detection. Unlike previous configurations, we combine audio data in two languages for training instead of using only a single language.
Ma, Murong   +5 more
openaire   +2 more sources

Query-by-Example Spoken Term Detection ALBAYZIN 2012 evaluation: overview, systems, results, and discussion [PDF]

open access: yes, EURASIP Journal on Audio, Speech, and Music Processing, 2013
Query-by-Example Spoken Term Detection (QbE STD) aims at retrieving data from a speech data repository given an acoustic query containing the term of interest as input. It has recently received much interest due to the high volume of information stored in audio or audiovisual format.
Tejedor, J.   +6 more
openaire   +6 more sources

Language Transfer of Audio Word2Vec: Learning Audio Segment Representations without Target Language Data

open access: yes, 2017
Audio Word2Vec offers vector representations of fixed dimensionality for variable-length audio segments using a Sequence-to-sequence Autoencoder (SA).
Lee, Hung-Yi   +2 more
core   +1 more source

Unsupervised Spoken Term Detection with Spoken Queries by Multi-level Acoustic Patterns with Varying Model Granularity

open access: yes, 2015
This paper presents a new approach for unsupervised Spoken Term Detection with spoken queries using multiple sets of acoustic patterns automatically discovered from the target corpus.
Chan, Chun-an   +2 more
core   +1 more source

Simultaneous Localization and Recognition of Dynamic Hand Gestures [PDF]

open access: yes, 2004
A framework for the simultaneous localization and recognition of dynamic hand gestures is proposed. At the core of this framework is a dynamic space-time warping (DSTW) algorithm that aligns a pair of query and model gestures in both space and time. For
J. Alon   +4 more
core   +7 more sources

Query-by-Example Spoken Term Detection [PDF]

open access: yes, 2014
This thesis deals with searching for terms in speech using spoken examples (QbE STD). Terms are entered in spoken form and searched for in a set of speech recordings; the output of the search is a list of detections with their scores and timing. In the thesis we describe,
Fapšo, Michal
core  

Multi-Task Network for Noise-Robust Keyword Spotting and Speaker Verification using CTC-based Soft VAD and Global Query Attention

open access: yes, 2020
Keyword spotting (KWS) and speaker verification (SV) have been studied independently although it is known that acoustic and speaker domains are complementary. In this paper, we propose a multi-task network that performs KWS and SV simultaneously to fully
Goo, Jahyun   +3 more
core   +1 more source

Fast and Accurate OOV Decoder on High-Level Features

open access: yes, 2017
This work proposes a novel approach to the out-of-vocabulary (OOV) keyword search (KWS) task. The proposed approach is based on using high-level features from an automatic speech recognition (ASR) system, so-called phoneme posterior based (PPB) features, for
Khokhlov, Yuri   +3 more
core   +1 more source

Evaluation of spoken document retrieval for historic speech collections [PDF]

open access: yes, 2008
The re-use of spoken word audio collections maintained by audiovisual archives is severely hindered by their generally limited access. The CHoral project, which is part of the CATCH program funded by the Dutch Research Council, aims to provide users of ...
Heeren, W.   +4 more
core   +1 more source
