Results 11 to 20 of about 149,578
We report associations between vowel sounds, graphemes, and colors collected online from over 1,000 Dutch speakers. We also provide open materials, including a Python implementation of the structure measure and code for a single-page web application to ...
C. Cuskley +3 more
semanticscholar +3 more sources
Survey of Visual Question Answering Based on Deep Learning [PDF]
Visual question answering (VQA) is an interdisciplinary research paradigm that involves computer vision and natural language processing. VQA generally requires both image and text data to be encoded, their mappings learned, and their features fused, before ...
LI Xiang, FAN Zhiguang, LI Xuexiang, ZHANG Weixing, YANG Cong, CAO Yangjie
doaj +1 more source
Cross-modal distillation for flood extent mapping
Abstract The increasing intensity and frequency of floods is one of the many consequences of our changing climate. In this work, we explore ML techniques that improve the flood detection module of an operational early flood warning system. Our method exploits an unlabeled dataset of paired multi-spectral and synthetic aperture radar (SAR) imagery to
Shubhika Garg +6 more
openaire +3 more sources
Is Cross-Modal Information Retrieval Possible Without Training? [PDF]
Encoded representations from a pretrained deep learning model (e.g., BERT text embeddings, penultimate CNN layer activations of an image) convey a rich set of features beneficial for information retrieval.
Hyunjin Choi +3 more
semanticscholar +1 more source
Augmented reality flavor: cross-modal mapping across gustation, olfaction, and vision [PDF]
Gustatory display research is still in its infancy, despite gustation being one of the essential everyday senses that humans exercise while eating and drinking. Indeed, the most important and frequent tasks that our brain deals with every day are foraging and feeding.
Osama Halabi, Mohammad Saleh
openaire +4 more sources
People conceptualize auditory pitch as vertical space: low and high pitch correspond to low and high space, respectively. The strength of this cross-modal correspondence, however, seems to vary across cultural contexts, and a debate on the ...
Valentijn Prové
doaj +1 more source
Future or Movement? The L2 Acquisition of Aller + V Forms
This study aims to advance the understanding of the impact of the discursive context in the form-function mappings of aller + V forms by native speakers (NSs) and learners of French (NNSs), and to further knowledge about the developmental patterns of use
Pascale Leclercq
doaj +1 more source
Pix2Map: Cross-Modal Retrieval for Inferring Street Maps from Images
12 pages, 8 ...
Wu, Xindi +4 more
openaire +2 more sources
That sounds sweet: using cross-modal correspondences to communicate gustatory attributes
Klemens Knoeferle +3 more
semanticscholar +3 more sources
Cross-modal Map Learning for Vision and Language Navigation
We consider the problem of Vision-and-Language Navigation (VLN). The majority of current methods for VLN are trained end-to-end using either unstructured memory such as LSTM, or using cross-modal attention over the egocentric observations of the agent.
Georgakis, Georgios +6 more
openaire +2 more sources