Results 61 to 70 of about 2,605 (177)
SplatTalk: 3D VQA with Gaussian Splatting
Language-guided 3D scene understanding is important for advancing applications in robotics, AR/VR, and human-computer interaction, enabling models to comprehend and interact with 3D environments through natural language. While 2D vision-language models (VLMs) have achieved remarkable success in 2D VQA tasks, progress in the 3D domain has been ...
Anh Thai +4 more
openaire +2 more sources
Efficient Few‐Shot Learning in Remote Sensing: Fusing Vision and Vision‐Language Models
Remote sensing has become a vital tool across sectors such as urban planning, environmental monitoring, and disaster response. Although the volume of data generated has increased significantly, traditional vision models are often constrained by the requirement for extensive domain‐specific labelled data and their limited ability to understand ...
Jia Yun Chua +2 more
wiley +1 more source
Curriculum learning effectively improves low data VQA [PDF]
Visual question answering (VQA) models, in particular modular ones, are commonly trained on large-scale datasets to achieve state of the art performance. However, such datasets are sometimes not available.
Askarian, Narjes +4 more
core
Background and purpose: Proton stereotactic body radiotherapy (SBRT) offers superior dose conformity. However, its clinical application remains limited due to uncertainties from setup errors and respiratory motion. This study quantified the dosimetric robustness of proton SBRT for early‐stage non‐small cell lung cancer under combined setup and ...
Akihiro Yamano +9 more
wiley +1 more source
Analysing human vs. neural attention in VQA [PDF]
Visual Question Answering (VQA) has drawn substantial interest in both academic and industrial research fields in recent years. Driven by Vision Transformers (ViT) and the vision-text co-attention mechanism, these models have shown notable performance ...
Ma, Yingpeng
core +1 more source
Zero-Shot Transfer VQA Dataset
Acquiring a large vocabulary is an important aspect of human intelligence. One common approach for humans to build vocabulary is to learn words during reading or listening, and then use them in writing or speaking. This ability to transfer from input to output is natural for humans, but it is difficult for machines. Humans spontaneously perform this ...
Yuanpeng Li +3 more
openaire +2 more sources
Multi‐Channel Convolutional Neural Quantum Embedding
This study presents convolutional neural quantum embedding (CNQE), a framework for optimizing quantum data embeddings for multi‐channel data classification, grounded in quantum state discrimination and Fourier analysis of quantum circuits. CNQE is validated through proof‐of‐principle demonstrations on CIFAR‐10 and Tiny ImageNet, showing improved ...
Yujin Kim +4 more
wiley +1 more source
An adaptive VLLMS‐based control strategy is developed for a PV‐powered UPQC, enabling simultaneous PQ mitigation and distributed power management. Filterless extraction and decoupling of non‐ideal voltage and current components provide fast convergence, improved dynamic response and stable DC‐link voltage under PV power fluctuations.
Pragnyashree Ray +6 more
wiley +1 more source
Optimizing feature pooling and prediction models of VQA algorithms
In this paper, we propose a strategy to optimize feature pooling and prediction models of video quality assessment (VQA) algorithms with a much smaller number of parameters than methods based on machine learning, such as neural ...
Le Callet, Patrick +4 more
core +1 more source
Enhanced VQA : numerical quantification
May 2019. School of Science. While it is plausible to hold that artificial intelligence, AI, has made steady progress since its modern inception in 1956, it seems that over the last 10 years, AI has innovated at a particularly rapid pace.
Wang, Max
core

