
Sparse Subspace Modeling for Query by Example Spoken Term Detection [PDF]

open access: yes, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2018
This paper focuses on the problem of query by example spoken term detection (QbE-STD) in a zero-resource scenario. Current state-of-the-art approaches to this problem rely on dynamic-programming-based template matching techniques using phone posterior features extracted at the output of a deep neural network. Previously, it has been shown that the
Dhananjay Ram   +2 more
openaire   +3 more sources

Query-by-Example Speech Search Using Recurrent Neural Acoustic Word Embeddings With Temporal Context

open access: yes, IEEE Access, 2019
Acoustic word embeddings (AWEs) have been popular in low-resource query-by-example speech search. They use vector distances to find the spoken query in the search content, which requires much less computation than the conventional dynamic time warping (DTW)
Yougen Yuan   +4 more
doaj   +3 more sources

Exploring the Effectiveness of Feature Reduction and Kernel-Based Matching for Query-by-Example Spoken Term Detection Using CNN

open access: yes, IEEE Access
Query-by-example spoken term detection (QbE-STD) refers to the search for an audio query in a repository of audio utterances. A common approach for QbE-STD involves computing a matching matrix between the feature representations of the query and the ...
Manisha Naik Gaonkar   +3 more
doaj   +3 more sources

Learning Acoustic Word Embeddings With Dynamic Time Warping Triplet Networks

open access: yes, IEEE Access, 2020
In recent years, acoustic word embeddings (AWEs) have gained significant interest in the research community. This is especially true of their application in Query-by-Example Spoken Term Detection (QbE-STD) search and related ...
Denis Shitov   +3 more
doaj   +1 more source

Query-by-example spoken term detection based on phonetic posteriorgram [PDF]

open access: yes, Advances in Social Science, Education and Humanities Research, 2015
Spoken term detection in low-resource situations is a challenging problem, because traditional large vocabulary continuous speech recognition (LVCSR) approaches are often unusable. This paper introduces a method to use deep neural network (DNN) softmax outputs as input features in a query-by-example (QBE) spoken term detection (STD) system.
Michael T. Johnson   +4 more
openaire   +1 more source

Query-by-Example Spoken Term Detection Using Attention-Based Multi-Hop Networks [PDF]

open access: yes, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018
Retrieving spoken content with spoken queries, or query-by-example spoken term detection (STD), is attractive because it makes possible the matching of signals directly on the acoustic level without transcribing them into text. Here, we propose an end-to-end query-by-example STD model based on an attention-based multi-hop network, whose input is a ...
Ao, Chia-Wei, Lee, Hung-yi
openaire   +2 more sources

Spoken content retrieval: A survey of techniques and technologies [PDF]

open access: yes, 2012
Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings.
Ani Nenkova   +3 more
core   +3 more sources

Spoken query processing for interactive information retrieval [PDF]

open access: yes, 2002
It has long been recognised that interactivity improves the effectiveness of information retrieval systems. Speech is the most natural and interactive medium of communication and recent progress in speech recognition is making it possible to build ...
Barnett   +19 more
core   +1 more source

The uncertain representation ranking framework for concept-based video retrieval [PDF]

open access: yes, 2012
Concept based video retrieval often relies on imperfect and uncertain concept detectors. We propose a general ranking framework to define effective and robust ranking functions, through explicitly addressing detector uncertainty.
Aly, Robin   +4 more
core   +4 more sources

Interactive searching and browsing of video archives: using text and using image matching [PDF]

open access: yes, 2006
Over the last number of decades much research work has been done in the general area of video and audio analysis. Initially the applications driving this included capturing video in digital form and then being able to store, transmit and render it ...
Gurrin, Cathal   +2 more
core   +1 more source
