Results 41 to 50 of about 149,578
Metaphor: The intertwinement of thought and language
The analysis of this article aims at reflecting on the nature of metaphoricity within the context of thought and language – inspired by the contributions of Elaine Botha in this regard commencing about three decades ago.
D.F.M. Strauss
doaj
Key-Value Mapping-Based Text-to-Image Diffusion Model Backdoor Attacks
Text-to-image (T2I) generation, a core component of generative artificial intelligence (AI), is increasingly important for creative industries and human–computer interaction.
Lujia Chai +3 more
doaj
AVCLNet: Multimodal Multispeaker Tracking Network Using Audio‐Visual Contrastive Learning
Audio‐visual speaker tracking aims to determine the locations of multiple speakers in the scene by leveraging signals captured from multisensor platforms. Multimodal fusion methods can improve both the accuracy and robustness of speaker tracking. However, ...
Yihan Li +5 more
doaj
Research Progress and Prospects of Multi-Agent Large Language Model Applications in Agriculture [PDF]
[Significance] With the rapid advancement of large language models (LLMs) and multi-agent systems, their integration, known as multi-agent large language models, is emerging as a transformative force in modern agriculture. Agricultural production involves complex, ...
ZHAO Yingping, LIANG Jinming, CHEN Beizhang, DENG Xiaoling, ZHANG Yi, XIONG Zheng, PAN Ming, MENG Xiangbao
doaj
Trust, but Verify: Cross-Modality Fusion for HD Map Change Detection
High-definition (HD) map change detection is the task of determining when sensor data and map data are no longer in agreement with one another due to real-world changes. We collect the first dataset for the task, which we entitle the Trust, but Verify (TbV) dataset, by mining thousands of hours of data from over 9 months of autonomous vehicle fleet ...
John Lambert, James Hays
openaire
Deep Latent Space Learning for Cross-Modal Mapping of Audio and Visual Signals [PDF]
We propose a novel deep training algorithm for joint representation of audio and visual information which consists of a single stream network (SSNet) coupled with a novel loss function to learn a shared deep latent space representation of multimodal information.
Shah Nawaz +4 more
openaire
Gradient-Map-Guided Adaptive Domain Generalization for Cross Modality MRI Segmentation
Cross-modal MRI segmentation is of great value for computer-aided medical diagnosis, enabling flexible data acquisition and model generalization. However, most existing methods have difficulty in handling local variations in domain shift and typically require a significant amount of data for training, which hinders their usage in practice.
Bingnan Li, Zhitong Gao, Xuming He
openaire
Generative models for SAR–optical image translation: A systematic review
Growing demands in sustainable development and resource management are driving increasing reliance on remote sensing-based Earth observation and image interpretation.
Zhao Wang +4 more
doaj
Cross-Modal Correspondences Enhance Performance on a Colour-to-Sound Sensory Substitution Device.
Visual sensory substitution devices (SSDs) can represent visual characteristics through distinct patterns of sound, allowing a visually impaired user access to visual information.
Giles Hamilton-Fletcher +2 more
semanticscholar
The connectional brain template (CBT) captures the shared traits across all individuals of a given population of brain connectomes, thereby acting as a fingerprint. Estimating a CBT from a population where brain graphs are derived from diverse neuroimaging modalities (e.g., functional and structural) and at different resolutions (i.e., number of nodes) ...
Ece Cinar +3 more
openaire