
Multi-modal adaptive gated mechanism for visual question answering. [PDF]

open access: yes   PLoS ONE, 2023
Visual Question Answering (VQA) is a multimodal task that uses natural language to ask and answer questions based on image content. For multimodal tasks, obtaining accurate modality feature information is crucial.
Yangshuyi Xu, Lin Zhang, Xiang Shen
doaj   +2 more sources

COIN: Counterfactual Image Generation for Visual Question Answering Interpretation [PDF]

open access: yes   Sensors, 2022
Due to the significant advancement of Natural Language Processing and Computer Vision-based models, Visual Question Answering (VQA) systems are becoming more intelligent and advanced.
Zeyd Boukhers   +2 more
doaj   +2 more sources

Adversarial Learning with Bidirectional Attention for Visual Question Answering [PDF]

open access: yes   Sensors, 2021
In this paper, we provide external image features and use the internal attention mechanism to solve the VQA problem given a dataset of textual questions and related images. Most previous models for VQA use a pair of images and questions as input.
Qifeng Li, Xinyi Tang, Yi Jian
doaj   +2 more sources

Deep Modular Bilinear Attention Network for Visual Question Answering [PDF]

open access: yes   Sensors, 2022
Visual Question Answering (VQA) is a multi-modal task: given an image and a question related to it, the model must determine the correct answer. The attention mechanism has become a de facto component of almost all VQA models. Most recent VQA approaches ...
Feng Yan, Wushouer Silamu, Yanbing Li
doaj   +2 more sources

Informed-Learning-Guided Visual Question Answering Model of Crop Disease [PDF]

open access: yes   Plant Phenomics
In contemporary agriculture, experts develop preventative and remedial strategies for various disease stages in diverse crops. Decision-making regarding the stages of disease occurrence exceeds the capabilities of single-image tasks, such as image ...
Yunpeng Zhao   +6 more
doaj   +2 more sources

A visual question answering method based on task decomposition. [PDF]

open access: yes   PLoS ONE
Visual question answering (VQA) is an interdisciplinary task spanning computer vision and natural language processing that evaluates a model's visual reasoning ability; it requires the integration of image information extraction technology and natural ...
Yao Cong, Hongwei Mo
doaj   +2 more sources

Multi-View Visual Question Answering with Active Viewpoint Selection [PDF]

open access: yes   Sensors, 2020
This paper proposes a framework that allows the observation of a scene iteratively to answer a given question about the scene. Conventional visual question answering (VQA) methods are designed to answer given questions based on single-view images ...
Yue Qiu   +4 more
doaj   +2 more sources

BPI-MVQA: a bi-branch model for medical visual question answering [PDF]

open access: yes   BMC Medical Imaging, 2022
Background: Visual question answering in the medical domain (VQA-Med) exhibits great potential for enhancing confidence in diagnosing diseases and helping patients better understand their medical conditions.
Shengyan Liu   +3 more
doaj   +2 more sources

Multi-Modal Explicit Sparse Attention Networks for Visual Question Answering [PDF]

open access: yes   Sensors, 2020
Visual question answering (VQA) is a multi-modal task involving natural language processing (NLP) and computer vision (CV), which requires models to understand both visual and textual information simultaneously to predict the correct ...
Zihan Guo, Dezhi Han
doaj   +2 more sources

Review of Visual Question Answering Technology [PDF]

open access: yes   Jisuanji kexue yu tansuo, 2023
Visual question answering (VQA) is a popular cross-modal task that combines natural language processing and computer vision techniques. The main objective of this task is to enable computers to intelligently recognize and retrieve visual content and ...
WANG Yu, SUN Haichun
doaj   +1 more source
