mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality [PDF]
Large language models (LLMs) have demonstrated impressive zero-shot abilities on a variety of open-ended tasks, while recent research has also explored the use of LLMs for multi-modal generation.
Qinghao Ye +16 more
semanticscholar +1 more source
Generative Pretraining in Multimodality [PDF]
We present Emu, a Transformer-based multimodal foundation model, which can seamlessly generate images and texts in multimodal context. This omnivore model can take in any single-modality or multimodal data input indiscriminately (e.g., interleaved image,
Quan Sun +9 more
semanticscholar +1 more source
The multimodality cell segmentation challenge: toward universal solutions [PDF]
Cell segmentation is a critical step for quantitative single-cell analysis in microscopy images. Existing cell segmentation methods are often tailored to specific modalities or require manual interventions to specify hyper-parameters in different ...
Jun Ma +38 more
semanticscholar +1 more source
Multimodality Helps Unimodality: Cross-Modal Few-Shot Learning with Multimodal Models [PDF]
The ability to quickly learn a new task with minimal instruction - known as few-shot learning - is a central aspect of intelligent agents. Classical few-shot benchmarks make use of few-shot samples from a single modality, but such samples may not be ...
Zhiqiu Lin +4 more
semanticscholar +1 more source
Multimodality Representation Learning: A Survey on Evolution, Pretraining and Its Applications [PDF]
Multimodality Representation Learning, as a technique of learning to embed information from different modalities and their correlations, has achieved remarkable success on a variety of applications, such as Visual Question Answering (VQA), Natural ...
Muhammad Arslan Manzoor +5 more
semanticscholar +1 more source
MVP: Multimodality-guided Visual Pre-training [PDF]
Recently, masked image modeling (MIM) has become a promising direction for visual pre-training. In the context of vision transformers, MIM learns effective visual representation by aligning the token-level features with a pre-defined space (e.g., BEIT ...
Longhui Wei +4 more
semanticscholar +1 more source
Multimodality in VR: A Survey [PDF]
Virtual reality (VR) is rapidly growing, with the potential to change the way we create and consume content. In VR, users integrate multimodal sensory information they receive to create a unified perception of the virtual world. In this survey, we review
Daniel Martin +4 more
semanticscholar +1 more source
Immune checkpoint inhibitors (ICIs) have achieved prominent efficacy in the treatment of numerous cancers, which is the most significant breakthrough in cancer therapy in recent years.
Yi‐hui Li +8 more
doaj +1 more source
The Impact of Coronary Artery Calcification on Long-Term Cardiovascular Outcomes
Decades of research and experimental studies have investigated various strategies to prevent acute coronary events. However, highly effective preventive methods have not been developed, and it remains a challenge to determine if a coronary ...
Mitra Noémi +5 more
doaj +1 more source
Semiotic Multimodality Communication in The Age of New Media
The age of new media is changing the social order of communication. A defining feature of this digital media era is that internet users have the widest possible access to communicate without limits of time or region.
Muhammad Hasyim, Burhanuddin Arafah
semanticscholar +1 more source

