Visual question answering - Open Access .click

Differential performance of large language models in advanced cardiac life support assessment: A comprehensive multi-dimensional analysis of accuracy, consistency, and visual recognition capabilities.

PLoS One
Genc M +9 more
europepmc +1 more source

Specialized foundation models for intelligent operating rooms.

NPJ Digit Med
Özsoy E +5 more
europepmc +1 more source

Collaborative positional attention for image to English question answering.

Sci Rep
Li Y, Teng H.
europepmc +1 more source

Medical visual question answering: A survey

Artificial Intelligence in Medicine, 2023
Medical Visual Question Answering (VQA) is a combination of medical artificial intelligence and popular VQA challenges. Given a medical image and a clinically relevant question in natural language, the medical VQA system is expected to predict a plausible and convincing answer.
Zhihong Lin, Donghao Zhang, Zongyuan Ge +2 more
exaly +4 more sources

Some of the following articles may not be open access.

Related searches:

vqa
attention mechanism
natural language processing

computer vision
deep learning
medicine

question answering

Multiple answers to a question: a new approach for visual question answering

The Visual Computer, 2020
With the advent of deep learning, multi-modal data have been of great interest. One of the multi-modal tasks which can be included in the computer vision domain is visual question answering (VQA). In VQA, a question and an image are entered into the model and the model tries to answer the question according to the image.
Sayedshayan Hashemi Hosseinabad, Mehran Safayani, Abdolreza Mirzaei +2 more
openaire +1 more source

Visual Question Answer Diversity

Proceedings of the AAAI Conference on Human Computation and Crowdsourcing, 2018
Visual questions (VQs) can lead multiple people to respond with different answers rather than a single, agreed-upon response. Moreover, the answers from a crowd can include different numbers of unique answers that arise with different relative frequencies.
Chun-Ju Yang, Kristen Grauman, Danna Gurari +2 more
openaire +1 more source

Visual Question Answering

2024 International Conference on Computing, Networking and Communications (ICNC)
Vision-Language Pre-Training (VLP) significantly improves performance for a variety of multimodal tasks. However, existing models are often specialized in understanding or generation, which limits their versatility. Furthermore, trust in text data for large, loud web text remains the optimal approach for monitoring.
Ahmed Nada, Min Chen
openaire +2 more sources

Answer Distillation for Visual Question Answering

2019
Answering open-ended questions in Visual Question Answering (VQA) is a challenging task. As the answers are totally free-form, the answer space for open-ended questions is infinite in theory. This increases the difficulty for algorithms to predict the correct answers. In this paper, we propose a method named answer distillation to decrease the scale of
Zhiwei Fang +4 more
openaire +1 more source

Sequential Visual Reasoning for Visual Question Answering

2018 5th IEEE International Conference on Cloud Computing and Intelligence Systems (CCIS), 2018
Visual question answering (VQA) is a challenging task that addresses learning and reasoning at the intersection of vision and language. This reasoning requires understanding both the sequential and compositional linguistic structure of questions and the sets of visual objects and their spatial relations in images.
Jinlai Liu +3 more
openaire +1 more source
