
Investigating the Neural Basis of Audiovisual Speech Perception with Intracranial Recordings in Humans [PDF]

open access: yes, 2017
Speech is inherently multisensory, containing auditory information from the voice and visual information from the talker's mouth movements. Hearing the voice is usually sufficient to understand speech; however, in noisy environments or when audition ...
Sertel, Muge O
core

Non-linear and Selective Fusion of Cross-Modal Images [PDF]

open access: yes, arXiv, 2019
The human visual perception system exhibits strong robustness in image fusion. This robustness stems from its capacity for feature selection and for non-linear fusion of different features. To simulate this mechanism in image fusion tasks, we propose a multi-source image fusion framework that ...
arxiv  

MVP-Bench: Can Large Vision-Language Models Conduct Multi-level Visual Perception Like Humans? [PDF]

open access: yes, arXiv
Humans perform visual perception at multiple levels, including low-level object recognition and high-level semantic interpretation such as behavior understanding. Subtle differences in low-level details can lead to substantial changes in high-level perception.
arxiv  

Perception-aware Sampling for Scatterplot Visualizations [PDF]

open access: yes, arXiv
Visualizing data is often a crucial first step in data analytics workflows, but growing data sizes pose challenges due to computational and visual perception limitations. As a result, data analysts commonly down-sample their data and work with subsets. Deriving representative samples, however, remains a challenge.
arxiv  

Multimodal LLM Augmented Reasoning for Interpretable Visual Perception Analysis [PDF]

open access: yes, arXiv
In this paper, we advance the study of AI-augmented reasoning in the context of Human-Computer Interaction (HCI), psychology, and cognitive science, focusing on the critical task of visual perception. Specifically, we investigate the applicability of Multimodal Large Language Models (MLLMs) in this domain. To this end, we leverage established principles ...
arxiv  
