Compositional Generalization in Image Captioning [PDF]
To appear at CoNLL 2019 ...
Nikolaus, Mitja +4 more
openaire +5 more sources
Pre-gen Metrics: Predicting Caption Quality Metrics Without Generating Captions [PDF]
13 pages, 6 figures. This publication will appear in the Proceedings of the First Workshop on Shortcomings in Vision and Language (2018).
Tanti, Marc +2 more
openaire +2 more sources
Middle-Level Attribute-Based Language Retouching for Image Caption Generation
Image caption generation is an attractive research area that focuses on generating natural language sentences to describe the visual content of a given image. It is an interdisciplinary subject combining computer vision (CV) and natural language processing (NLP).
Zhibin Guan +4 more
doaj +1 more source
3M: Multi-style image caption generation using Multi-modality features under Multi-UPDOWN model
In this paper, we build a multi-style generative model for stylish image captioning which uses multi-modality image features, ResNeXt features, and text features generated by DenseCap.
Chengxi Li, Brent Harrison
doaj +1 more source
Long-text caption generation for surgical image with a concept retrieval augmented large multimodal model [PDF]
Jiquan Liu +6 more
doaj +1 more source
Cross-Lingual Image Caption Generation Based on Visual Attention Model
As an interesting and challenging problem, automatically generating image captions has attracted increasing attention in the natural language processing and computer vision communities.
Bin Wang +5 more
doaj +1 more source
The video-based commonsense captioning task aims to add multiple commonsense descriptions to video captions to understand video content better. This paper aims to consider the importance of cross-modal mapping.
Haitao Xiong +4 more
doaj +1 more source
Multi-Band Image Caption Generation Method Based on Feature Fusion [PDF]
This study proposes a multi-band detection image caption generation method based on feature fusion to address the common problem of poor performance in describing nighttime scenes, occluded target scenes, and captured blurred images in existing image ...
HE Shan, LIN Suzhen, WANG Yanbo, LI Dawei
doaj +1 more source
Classification and object recognition in image processing have significantly improved computer vision tasks. These methods are often used for visual problems, especially picture classification with Convolutional Neural Networks (CNNs).
Rifqi Mulyawan +2 more
doaj +1 more source
Sparse Adversarial Examples Attacking on Video Captioning Model [PDF]
Although multi-modal deep learning models such as image captioning models have been shown to be vulnerable to adversarial examples, the adversarial susceptibility of video caption generation is under-examined. There are two main reasons for this. On ...
QIU Jiangxing, TANG Xueming, WANG Tianmei, WANG Chen, CUI Yongquan, LUO Ting
doaj +1 more source

