Deep Multimodal Semantic Embeddings for Speech and Images

open access: yes, 2015
In this paper, we present a model which takes as input a corpus of images with relevant spoken captions and finds a correspondence between the two modalities. We employ a pair of convolutional neural networks to model visual objects and speech signals at
Glass, James, Harwath, David
core   +1 more source

Multilingual search for cultural heritage archives via combining multiple translation resources [PDF]

open access: yes, 2007
The linguistic features of material in Cultural Heritage (CH) archives may be in various languages requiring a facility for effective multilingual search.
Debole, Franca   +4 more
core  

Speaker-Dependent Speech Enhancement Using Codebook-based Synthesis for Low SNR Applications

open access: yes, International Journal of Information and Communication Technology Research, 2013
In this paper, speaker-dependent speech enhancement is performed using codebooks. For this purpose, two codebooks are designed from the STFT parameters, one for speech and one for noise.
Roghayeh Doost, Abolghasem Sayadian
doaj  

Chinese Speech Recognition Using Conformer Fused with Max Pooling [PDF]

open access: yes, Jisuanji gongcheng
Speech recognition technology enables machines to understand human speech using advanced algorithms and signal processing technologies, thereby making communication between humans and machines more convenient.
HU Conggang, YANG Lipeng, SUN Yongqi, CHEN Hualong, HAN Keke
doaj   +1 more source

GTH-UPM system for search on speech evaluation

open access: yes, 2014
This paper describes the GTH-UPM system for the Albayzin 2014 Search on Speech Evaluation. The evaluation task consists of searching a list of terms/queries in audio files. The GTH-UPM system we are presenting is based on an LVCSR (Large Vocabulary Continuous Speech Recognition) system.
Echeverry Correa, Julian David   +2 more
openaire   +1 more source

An Optimized Hyperparameter Tuning for Improved Hate Speech Detection with Multilayer Perceptron

open access: yes, Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi)
Hate speech classification is a critical task in the domain of natural language processing, aiming to mitigate the negative impacts of harmful content on digital platforms.
Muhamad Ridwan, Ema Utami
doaj   +1 more source

UTwente does Brave New Tasks for MediaEval 2012: Searching and Hyperlinking [PDF]

open access: yes, 2012
In this paper we report our experiments and results for the brave new searching and hyperlinking tasks for the MediaEval Benchmark Initiative 2012. The searching task involves finding target video segments based on a short natural language sentence query ...
Aly, R.B.N.   +2 more
core   +1 more source

A quick search method for audio signals based on a piecewise linear representation of feature trajectories

open access: yes, 2007
This paper presents a new method for a quick similarity-based search through long unlabeled audio streams to detect and locate audio clips provided by users.
Kashino, Kunio   +3 more
core   +1 more source

A lexical database tool for quantitative phonological research

open access: yes, 1997
A lexical database tool tailored for phonological research is described. Database fields include transcriptions, glosses and hyperlinks to speech files.
Bird, Steven
core   +3 more sources

Dublin City University video track experiments for TREC 2002 [PDF]

open access: yes, 2002
Dublin City University participated in the Feature Extraction task and the Search task of the TREC-2002 Video Track. In the Feature Extraction task, we submitted 3 features: Face, Speech, and Music. In the Search task, we developed an interactive video
Browne, Paul   +10 more
core  
