Enabling Multimodal Understanding: Lidar Data Meets VQA
This chapter explores the integration of Light Detection and Ranging (LiDAR) data with multimodal systems such as Visual Question Answering (VQA) to enable robust contextual understanding.
Dhananjay Thiruvady +3 more
core
The current research direction in generative models, such as the recently developed GPT4, aims to find relevant knowledge information for multimodal and multilingual inputs to provide answers.
Lim, KyungTae +4 more
core +1 more source
This paper presents an overview of the fourth edition of the Medical Visual Question Answering (VQA-Med) task at ImageCLEF 2021. VQA-Med 2021 includes a task on Visual Question Answering (VQA), where participants are tasked with answering questions from ...
Ben Abacha, Asma +4 more
core
C3-VQA: Cryogenic Counter-Based Coprocessor for Variational Quantum Algorithms
Cryogenic quantum computers play a leading role in demonstrating quantum advantage. Given the severe constraints on the cooling capacity in cryogenic environments, thermal design is crucial for the scalability of these computers.
Satoshi Imamura +7 more
core +1 more source
Evaluating VQA Models' Consistency in the Scientific Domain
Visual Question Answering (VQA) in the scientific domain is a challenging task that requires a high-level understanding of the given image to answer a given question. Although having impressive results on the ScienceQA dataset, both
Guinaudeau, Camille +2 more
core +1 more source
Generating 3D vehicle assets from in-the-wild observations is crucial to autonomous driving. Existing image-to-3D methods cannot well address this problem because they learn generation merely from image RGB information without a deeper understanding of ...
Shan, Jinjun +7 more
core
MISS: A Generative Pretraining and Finetuning Approach for Med-VQA
Medical visual question answering (VQA) is a challenging multimodal task, where Vision-Language Pre-training (VLP) models can effectively improve the generalization performance.
Jiang, Yue +4 more
core
RAM-VQA: Restoration Assisted Multi-Modality Video Quality Assessment
Video Quality Assessment (VQA) strives to computationally emulate human perceptual judgments and has garnered significant attention given its widespread applicability. However, existing methodologies face two primary impediments: (1)
Li, Leida +5 more
core +1 more source
PMA-VQA: Progressive Multi-Scale Feature Fusion with Spatially Adaptive Attention for Remote Sensing Visual Question Answering.
He Y, Qiu C, Gu J.
europepmc +1 more source
The New Frontier of Quality Evaluation for Visual Sensors: A Survey of Large Multimodal Model-Based Methods.
Ge Q, Min X, Wu S, Li Y, Zhai G.
europepmc +1 more source

