Improving Automatic VQA Evaluation Using Large Language Models

open access: yes
Eight years after the visual question answering (VQA) task was proposed, accuracy remains the primary metric for automatic evaluation. VQA Accuracy has been effective so far in the IID evaluation setting.
Agrawal, Aishwarya   +2 more
core   +1 more source

Question Modifiers in VQA: Evaluating Model Sensitivity

open access: yes, 2022
Visual Question Answering (VQA) is a challenge problem that can advance AI by integrating several important sub-disciplines including natural language understanding and computer vision.
Britton, William Johnstone
core  

Free VQA Models from Knowledge Inertia by Pairwise Inconformity Learning

open access: yes, 2019
In this paper, we uncover the issue of knowledge inertia in visual question answering (VQA), which exists in most VQA models and forces them to rely mainly on the question content to “guess” the answer, without regard to the visual information.
Sun, Xiaoshuai   +4 more
core   +1 more source

Semi-Supervised Implicit Augmentation for Data-Scarce VQA

open access: yes
Vision-language models (VLMs) have demonstrated increasing potency in solving complex vision-language tasks in the recent past. Visual question answering (VQA) is one of the primary downstream tasks for assessing the capability of VLMs, as it helps in ...
Kartik Hegde   +2 more
core   +1 more source

Overcoming Language Priors in VQA via Decomposed Linguistic Representations

open access: yes, 2020
Most existing Visual Question Answering (VQA) models overly rely on language priors between questions and answers. In this paper, we present a novel method of language attention-based VQA that learns decomposed linguistic representations of questions and
Wu, Qi   +9 more
core   +1 more source

Subjective Scoring Framework for VQA Models in Autonomous Driving

open access: yes
The development of vision and language transformer models has paved the way for Visual Question Answering (VQA) models and related research. There are metrics to assess the general accuracy of VQA models but subjective assessment of the answers generated
Abbirah Ahmed   +5 more
core   +1 more source

VQA-LOL: Visual Question Answering Under the Lens of Logic

open access: yes, 2020
16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XXI. Logical connectives and their implications on the meaning of a natural language sentence are a fundamental aspect of understanding. In this paper, we investigate whether visual
Yang, Yezhou   +7 more
core   +1 more source

Estimating semantic structure for the VQA answer space

open access: yes, 2020
Since its appearance, Visual Question Answering (VQA, i.e. answering a question posed over an image) has always been treated as a classification problem over a set of predefined answers.
Baccouche, Moez   +3 more
core  

KnowIT VQA: Answering Knowledge-Based Questions about Videos

open access: yes, 2020
We propose a novel video understanding task by fusing knowledge-based and video question answering. First, we introduce KnowIT VQA, a video dataset with 24,282 human-generated question-answer pairs about a popular sitcom.
Nakashima, Yuta   +3 more
core   +1 more source

What Large Language Models Bring to Text-rich VQA?

open access: yes, 2023
Text-rich VQA, namely Visual Question Answering based on text recognition in the images, is a cross-modal task that requires both image comprehension and text recognition.
Lu, Jinghui   +6 more
core  
