A visual question answering method based on task decomposition. [PDF]
Cong Y, Mo H.
europepmc +1 more source
Enhancing accessibility: a multi-level platform for visual question answering in diabetic retinopathy for individuals with disabilities. [PDF]
Alotaibi S, Al-Hadhrami S, Al-Ahmadi S.
europepmc +1 more source
Modularized Zero-shot VQA with Pre-trained Models
Large-scale pre-trained models (PTMs) show great zero-shot capabilities. In this paper, we study how to leverage them for zero-shot visual question answering (VQA). Our approach is motivated by a few observations.
Jiang, Jing, Cao, Rui
core
ECSA: Mitigating Catastrophic Forgetting and Few-Shot Generalization in Medical Visual Question Answering. [PDF]
Jia Q, Liu S, Chen M, Li T, Yang J.
europepmc +1 more source
Determining the Ensemble N-Representability of Reduced Density Matrices. [PDF]
Oña OB +7 more
europepmc +1 more source
A Systematic Literature Review on Integrated Deep Learning and Multiagent Vision-Language Frameworks for Pathology Image Analysis and Report Generation. [PDF]
Ali U +6 more
europepmc +1 more source
ReLaX-VQA: Residual Fragment and Layer Stack Extraction for Enhancing Video Quality Assessment
With the rapid growth of User-Generated Content (UGC) exchanged between users and sharing platforms, the need for video quality assessment in the wild has emerged. UGC is mostly acquired using consumer devices and undergoes multiple rounds of compression
Bull, David +2 more
core
MiniGPT-Pancreas: Multimodal Large Language Model for Pancreas Cancer Observation and Localization in CT Images. [PDF]
Moglia A +3 more
europepmc +1 more source
Generating three-dimensional genome structures with a variational quantum algorithm. [PDF]
Siciliano AJ, Wang Z.
europepmc +1 more source
Say It My Way: Exploring Control in Conversational Visual Question Answering with Blind Users. [PDF]
Zeraati FZ +4 more
europepmc +1 more source

