Cross-modal mappings - Open Access .click


Some of the following articles may not be open access.

M3R: Masked Token Mixup and Cross-Modal Reconstruction for Zero-Shot Learning

ACM Multimedia, 2023
In zero-shot learning (ZSL), learned representation spaces are often biased toward seen classes, thus limiting the ability to predict previously unseen classes. In this paper, we propose Masked token Mixup and cross-Modal Reconstruction for zero-shot
Peng Zhao, Qiangchang Wang, Yilong Yin
semanticscholar +1 more source

Image Tagging via Cross-Modal Semantic Mapping

Proceedings of the 23rd ACM international conference on Multimedia, 2015
Images without annotations are ubiquitous on the Internet, and recommending tags for them has become a challenging open task in image understanding. A common bottleneck of related work is the semantic gap between the image and text representations.
Zhi-Hong Deng, Hongliang Yu, Yunlun Yang
openaire +1 more source

Deep Semantic Mapping for Cross-Modal Retrieval

2015 IEEE 27th International Conference on Tools with Artificial Intelligence (ICTAI), 2015
Cross-modal mapping plays an essential role in multimedia information retrieval systems. However, most existing work has focused on learning mapping functions while neglecting the exploration of high-level semantic representations of modalities.
Cheng Wang, Haojin Yang, Christoph Meinel +2 more
openaire +1 more source

Audio/visual mapping with cross-modal hidden Markov models

IEEE Transactions on Multimedia, 2005
The audio/visual mapping problem of speech-driven facial animation has intrigued researchers for years. Recent research efforts have demonstrated that hidden Markov model (HMM) techniques, which have been applied successfully to the problem of speech recognition, could achieve a similar level of success in audio/visual mapping problems. A number of HMM-
FU S. +4 more
openaire +2 more sources

Persistent Stereo Visual Localization on Cross-Modal Invariant Map

IEEE Transactions on Intelligent Transportation Systems, 2020
Autonomous mobile vehicles are expected to perform persistent and accurate localization with low-cost equipment. To achieve this goal, we propose a stereo-camera-based visual localization method using a modified laser map, which takes advantage of both the low cost of cameras and the high geometric precision of laser data to achieve long-term ...
Xiaqing Ding +6 more
openaire +2 more sources

Disparity Map Estimation from Cross-Modal Stereo

2018 IEEE Global Conference on Signal and Information Processing (GlobalSIP), 2018
The mono-modal stereo matching problem has been studied for decades. The introduction of cross-modal stereo systems in industrial scenes has increased interest in cross-modal stereo matching. Existing algorithms mostly consider the mono-modal setting, so they do not translate well to the cross-modal setting.
Thapanapong Rukkanchanunt +3 more
openaire +1 more source

Bidirectional Mapping-Based Domain Adaptation for Nucleus Detection in Cross-Modality Microscopy Images

IEEE Transactions on Medical Imaging, 2021
Cell or nucleus detection is a fundamental task in microscopy image analysis and has recently achieved state-of-the-art performance by using deep neural networks. However, training supervised deep models such as convolutional neural networks (CNNs) usually requires sufficient annotated image data, which is prohibitively expensive or unavailable in some
Fuyong Xing +3 more
openaire +2 more sources

Coupled dictionary learning and feature mapping for cross-modal retrieval

2015 IEEE International Conference on Multimedia and Expo (ICME), 2015
In this paper, we investigate the problem of modeling images and associated text for cross-modal retrieval tasks such as text-to-image search and image-to-text search. To make the data from image and text modalities comparable, previous cross-modal retrieval methods directly learn two projection matrices to map the raw features of the two modalities ...
Xing Xu +3 more
openaire +1 more source

Music2Palette: Emotion-aligned Color Palette Generation via Cross-Modal Representation Learning

ACM Multimedia
Emotion alignment between music and palettes is crucial for effective multimedia content, yet misalignment creates confusion that weakens the intended message.
Jiayun Hu +4 more
semanticscholar +1 more source

Cross-Modal Dual Learning for Sentence-to-Video Generation

ACM Multimedia, 2019
Automatic content generation has become an attractive yet challenging topic in the past decade. Generating videos from sentences poses particularly great challenges to the multimedia community due to its inherently multi-modal characteristics, e.g ...
Yue Liu, Xin Wang, Yitian Yuan, Wenwu Zhu +3 more
semanticscholar +1 more source
