Multimodal Transformer for Unaligned Multimodal Language Sequences [PDF]

open access: yes; Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 2019
Human language is often multimodal, which comprehends a mixture of natural language, facial gestures, and acoustic behaviors. However, two major challenges in modeling such multimodal human language time-series data exist: 1) inherent data non-alignment ...
Yao-Hung Hubert Tsai   +5 more
semanticscholar   +5 more sources

PaLM-E: An Embodied Multimodal Language Model [PDF]

open access: green; International Conference on Machine Learning, 2023
Large language models excel at a wide range of complex tasks. However, enabling general inference in the real world, e.g., for robotics problems, raises the challenge of grounding.
Danny Driess   +21 more
openalex   +3 more sources

Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering [PDF]

open access: yes; Neural Information Processing Systems, 2022
When answering a question, humans utilize the information available across different modalities to synthesize a consistent and complete chain of thought (CoT).
Pan Lu   +8 more
semanticscholar   +1 more source

MMMU: A Massive Multi-Discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI [PDF]

open access: yes; Computer Vision and Pattern Recognition, 2023
We introduce MMMU: a new benchmark designed to evaluate multimodal models on massive multi-discipline tasks demanding college-level subject knowledge and deliberate reasoning.
Xiang Yue   +21 more
semanticscholar   +1 more source

Acknowledgment to Reviewers of Multimodal Technologies and Interaction in 2021

open access: yes; Multimodal Technologies and Interaction, 2022
Rigorous peer-reviews are the basis of high-quality academic publishing [...]
Multimodal Technologies and Interaction Editorial Office
doaj   +1 more source

MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models [PDF]

open access: yes; arXiv.org, 2023
Multimodal Large Language Model (MLLM) relies on the powerful LLM to perform multimodal tasks, showing amazing emergent abilities in recent studies, such as writing poems based on an image. However, it is difficult for these case studies to fully reflect ...
Chaoyou Fu   +12 more
semanticscholar   +1 more source

Acknowledgment to Reviewers of Multimodal Technologies and Interaction in 2020

open access: yes; Multimodal Technologies and Interaction, 2021
Peer review is the driving force of journal development, and reviewers are gatekeepers who ensure that Multimodal Technologies and Interaction maintains its standards for the high quality of its published papers [...]
Multimodal Technologies and Interaction Editorial Office
doaj   +1 more source

A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity [PDF]

open access: yes; International Joint Conference on Natural Language Processing, 2023
This paper proposes a framework for quantitatively evaluating interactive LLMs such as ChatGPT using publicly available data sets. We carry out an extensive technical evaluation of ChatGPT using 23 data sets covering 8 different common NLP application ...
Yejin Bang   +12 more
semanticscholar   +1 more source

Multimode Mamyshev Oscillator [PDF]

open access: yes; Conference on Lasers and Electro-Optics, 2021
Spatiotemporal mode-locking (STML) is demonstrated in a Mamyshev Oscillator. We observe a variety of STML states with different degrees of spatiotemporal coupling. The design allows some control over the multimode output beam profile.
Henry Haig   +6 more
openaire   +3 more sources

nuScenes: A Multimodal Dataset for Autonomous Driving [PDF]

open access: yes; Computer Vision and Pattern Recognition, 2019
Robust detection and tracking of objects is crucial for the deployment of autonomous vehicle technology. Image based benchmark datasets have driven development in computer vision tasks such as object detection, tracking and segmentation of agents in the ...
Holger Caesar   +9 more
semanticscholar   +1 more source
