Vqa - Open Access .click


A multimodal transformer-based visual question answering method integrating local and global information.

PLoS One
Huang C, Hu Z.

Efficient knowledge distillation and alignment for improved KB-VQA.

Sci Rep
Qin X, Pei R, He C, Li F, Zhang X.

Video quality prediction and classification using XGBoost under variable encoding and network conditions.

Sci Rep
Frnda J +4 more

Benchmarking large multimodal models for ophthalmic visual question answering with OphthalWeChat.

Adv Ophthalmol Pract Res
Xu P +9 more

Towards generalist foundation model for radiology by leveraging web-scale 2D&3D medical data.

Nat Commun
Wu C +5 more

M3AE-Distill: An Efficient Distilled Model for Medical Vision-Language Downstream Tasks.

Bioengineering (Basel)
Liang X, Xie J, Zhang M, Bi Z.

Context-Aware Multi-Agent Architecture for Wildfire Insights.

Sensors (Basel)
Sandeep A +5 more

Evaluating the performance of large language & visual-language models in cervical cytology screening.

NPJ Precis Oncol
Hong Q +15 more

Hallucination-Aware Multimodal Benchmark for Gastrointestinal Image Analysis with Large Vision-Language Models.

Med Image Comput Comput Assist Interv
Khanal B +9 more

