
A Survey on Image-text Multimodal Models

open access: yes; CoRR, 2023
With the significant advancements of Large Language Models (LLMs) in the field of Natural Language Processing (NLP), the development of image-text multimodal models has garnered widespread attention. Current surveys on image-text multimodal models mainly focus on representative models or application domains, but lack a review on how general technical ...
Ruifeng Guo   +9 more
openaire   +3 more sources

Magazine Cover as Multimodal Text

open access: yes; Научный диалог, 2019
A magazine cover is considered as an example of multimodal text. A theoretical review of the linguistic literature on multimodality is given.
O. A. Blinova
doaj   +3 more sources

Multilingual Multimodal Learning with Machine Translated Text

open access: yes; Findings of the Association for Computational Linguistics: EMNLP 2022, 2022
Most vision-and-language pretraining research focuses on English tasks. However, the creation of multilingual multimodal evaluation datasets (e.g. Multi30K, xGQA, XVNLI, and MaRVL) poses a new challenge in finding high-quality training data that is both multilingual and multimodal.
Qiu, Chen   +4 more
openaire   +4 more sources

Multimodal Interactive Transcription of Ancient Text Images

open access: yes; 2012
Work supported by the Spanish Government (MICINN and “Plan E”) under the MITTRAL (TIN2009-14633-C03-01) research project and under the research programme Consolider Ingenio 2010: MIPRCV (CSD2007-00018) and the Generalitat Valenciana under grant Prometeo/2009/14.
Verónica Romero   +3 more
openaire   +4 more sources

Attention-Based Multimodal Deep Learning on Vision-Language Data: Models, Datasets, Tasks, Evaluation Metrics and Applications

open access: yes; IEEE Access, 2023
Multimodal learning has gained immense popularity due to the explosive growth in the volume of image and textual data in various domains. Vision-language heterogeneous multimodal data has been utilized to solve a variety of tasks including classification,
Priyankar Bose   +2 more
doaj   +1 more source

Text-Image Gated Fusion Mechanism for Multimodal Aspect-based Sentiment Analysis

open access: yes; Jisuanji kexue
Multimodal aspect-based sentiment analysis is an emerging task in the field of multimodal sentiment analysis, which aims to identify the sentiment of each given aspect in text and image. Although recent research on multimodal sentiment analysis has made ...
ZHANG Tianzhi, ZHOU Gang, LIU Hongbo, LIU Shuo, CHEN Jing
doaj   +1 more source

Teaching Multimodal Literacy Through Reading and Writing Graphic Novels

open access: yes; Language and Literacy: A Canadian Educational e-journal, 2017
Scholarship suggests that writing teachers and instructors looking to integrate multimodal composition into their secondary or post-secondary classrooms should consider graphic novels as a mentor text for multimodal literacy.
Mike P. Cook, Jeffrey S.J. Kirchoff
doaj   +1 more source

Text-image semantic relevance identification for aspect-based multimodal sentiment analysis

open access: yes; PeerJ Computer Science
Aspect-based multimodal sentiment analysis (ABMSA) is an emerging task in multimodal sentiment analysis research, which aims to identify the sentiment of each aspect mentioned in a multimodal sample.
Tianzhi Zhang   +5 more
doaj   +2 more sources

Covid-19 Pandemic in Political Cartoons of the American Press: An Experience of Multimodal Analysis

open access: yes; Научный диалог, 2021
An attempt is made to analyze the place of political cartoons in the current socio-political media discourse in the United States. The material consists of cartoons published in spring 2020 in USA Today and The Philadelphia Inquirer, the informational ...
E. M. Pozdnyakova, O. A. Blinova
doaj   +1 more source

Multimodal Representation Learning With Text and Images

open access: yes; CoRR, 2022
In recent years, multimodal AI has seen an upward trend as researchers integrate data of different types, such as text, images, and speech, into modelling to get the best results. This project leverages multimodal AI and matrix factorization techniques for representation learning on text and image data simultaneously, thereby employing the widely used
Aishwarya Jayagopal   +3 more
openaire   +2 more sources
