Data and Knowledge Enhanced Medical Visual Question Answer Network [PDF]
Med-VQA aims to accurately answer clinical questions based on a given medical image, which is key in advancing clinical medical intelligence. Despite some progress in this field, challenges remain in extracting deep multimodal information from both images ...
YAN Yujing, HOU Xia, GUO Yuting, ZHANG Mingliang, SONG Wenfeng
doaj +1 more source
Path-Wise Attention Memory Network for Visual Question Answering
Visual question answering (VQA) is regarded as a multi-modal fine-grained feature fusion task, which requires the construction of multi-level and omnidirectional relations between nodes.
Yingxin Xiang +5 more
doaj +1 more source
Multi-Question Learning for Visual Question Answering
Visual Question Answering (VQA) raises a great challenge for computer vision and natural language processing communities. Most of the existing approaches consider video-question pairs individually during training. However, we observe that there are usually multiple (either sequentially generated or not) questions for the target video in a VQA task, and
Chenyi Lei +6 more
openaire +2 more sources
VISUAL QUESTIONING AND ANSWERING
In general, a VQA system is an algorithm that takes a picture and a natural language query about the image as input and produces a natural language response as output. The nature of a multi-discipline research problem necessitates this. This is a diagnostic dataset that assesses a variety of visual reasoning skills.
Lokesh R +3 more
openaire +1 more source
Answer-Type Prediction for Visual Question Answering
Recently, algorithms for object recognition and related tasks have become sufficiently proficient that new vision tasks can now be pursued. In this paper, we build a system capable of answering open-ended text-based questions about images, which is known as Visual Question Answering (VQA).
Kafle, Kushal, Kanan, Christopher
openaire +2 more sources
Visual Question Generation as Dual Task of Visual Question Answering [PDF]
Visual question answering (VQA) and visual question generation (VQG) are two trending topics in computer vision, but they are usually explored separately despite their intrinsic complementary relationship. In this paper, we propose an end-to-end unified model, the Invertible Question Answering Network (iQAN), to introduce question generation as a ...
Yikang Li +6 more
openaire +1 more source
A Survey on Visual Question Answering Methodologies [PDF]
Understanding visual question-answering (VQA) will be essential for many human tasks. However, it poses significant obstacles at the core of artificial intelligence as a multimodal system.
Aya Al-Zoghby, Aya Saleh, Wael Awad
doaj +1 more source
Fair-VQA: Fairness-Aware Visual Question Answering Through Sensitive Attribute Prediction
Visual Question Answering (VQA) is a task that answers questions on given images. Although previous works achieve a great improvement in VQA performance, they do not consider the fairness of answers in terms of ethically sensitive attributes, such as ...
Sungho Park +3 more
doaj +1 more source
Visual Storytelling Based on Planning Learning [PDF]
Visual storytelling is a growing area of interest for scholars in computer vision and natural language processing. Current models concentrate on enhancing image representation, like using external knowledge and scene diagrams. Despite some advancements have
WANG Yuanlong, ZHANG Ningqian, ZHANG Hu
doaj +1 more source
Visual question answering for medical diagnosis
The use of Artificial Intelligence (AI) in medical diagnosis is a breakthrough in healthcare, improving both accuracy and efficiency. Recently, a significant advancement has been made toward the development of multimodal AI systems that can process and ...
Nawel Ben Chaabane, Mohamed Bal-Ghaoui
doaj +1 more source

