Vqa - Open Access .click

Enabling Multimodal Understanding: Lidar Data Meets VQA

This chapter explores the integration of Light Detection and Ranging (LiDAR) data with multimodal systems such as Visual Question Answering (VQA) to enable robust contextual understanding.
Dhananjay Thiruvady +3 more
core

BOK-VQA: Bilingual outside Knowledge-Based Visual Question Answering via Graph Representation Pretraining

The current research direction in generative models, such as the recently developed GPT4, aims to find relevant knowledge information for multimodal and multilingual inputs to provide answers.
Lim, KyungTae +4 more
core +1 more source

Overview of the VQA-Med task at ImageCLEF 2021: visual question answering and generation in the medical domain

2021
This paper presents an overview of the fourth edition of the Medical Visual Question Answering (VQA-Med) task at ImageCLEF 2021. VQA-Med 2021 includes a task on Visual Question Answering (VQA), where participants are tasked with answering questions from ...
Ben Abacha, Asma +4 more
core

C3-VQA: Cryogenic Counter-Based Coprocessor for Variational Quantum Algorithms

Cryogenic quantum computers play a leading role in demonstrating quantum advantage. Given the severe constraints on the cooling capacity in cryogenic environments, thermal design is crucial for the scalability of these computers.
Satoshi Imamura +7 more
core +1 more source

Evaluating VQA Models' Consistency in the Scientific Domain

Visual Question Answering (VQA) in the scientific domain is a challenging task that requires a high-level understanding of the given image to answer a given question. Although having impressive results on the ScienceQA dataset, both
Guinaudeau, Camille, Satoh, Shin'Ichi, Quan, Khanh-An, C +2 more
core +1 more source

VQA-Diff: Exploiting VQA and Diffusion for Zero-Shot Image-to-3D Vehicle Asset Generation in Autonomous Driving

Generating 3D vehicle assets from in-the-wild observations is crucial to autonomous driving. Existing image-to-3D methods cannot address this problem well because they learn generation merely from image RGB information without a deeper understanding of ...
Shan, Jinjun +7 more
core

MISS: A Generative Pretraining and Finetuning Approach for Med-VQA

Medical visual question answering (VQA) is a challenging multimodal task, where Vision-Language Pre-training (VLP) models can effectively improve the generalization performance.
Jiang, Yue +4 more
core

RAM-VQA: Restoration Assisted Multi-Modality Video Quality Assessment

Video Quality Assessment (VQA) strives to computationally emulate human perceptual judgments and has garnered significant attention given its widespread applicability. However, existing methodologies face two primary impediments: (1)
Li, Leida +5 more
core +1 more source

PMA-VQA: Progressive Multi-Scale Feature Fusion with Spatially Adaptive Attention for Remote Sensing Visual Question Answering.

Sensors (Basel)
He Y, Qiu C, Gu J.
europepmc +1 more source

The New Frontier of Quality Evaluation for Visual Sensors: A Survey of Large Multimodal Model-Based Methods.

Sensors (Basel)
Ge Q, Min X, Wu S, Li Y, Zhai G.
europepmc +1 more source
