Generating images from a text description is as challenging as it is interesting. An adversarial network operates competitively, with its component networks acting as rivals to each other. Since the introduction of the Generative Adversarial Network, the field of computer vision has seen a great deal of development.
Mahima Pandya, Prof. Sonal Rami
openaire +1 more source
Self-Learning for Few-Shot Remote Sensing Image Captioning
Large-scale caption-labeled remote sensing image samples are expensive to acquire, and the training samples available in practical application scenarios are generally limited.
Haonan Zhou +3 more
doaj +1 more source
Deconfounded Image Captioning: A Causal Retrospect [PDF]
Dataset bias in vision-language tasks is becoming one of the main problems that hinder the progress of our community. Existing solutions lack a principled analysis of why modern image captioners easily collapse into dataset bias. In this paper, we present a novel perspective: Deconfounded Image Captioning (DIC), to find out the answer of this ...
Yang, Xu, Zhang, Hanwang, Cai, Jianfei
openaire +5 more sources
Multilayer Dense Attention Model for Image Caption
Image captioning is a technology that uses machines to understand the contents of images and generate descriptive text. With the development of deep learning, means of using it to understand image content and generate descriptive text has ...
Ke Wang +4 more
doaj +1 more source
With the rise of user-generated content (UGC) and deep learning technology, more and more researchers construct and measure the tourism destination image (TDI) through online travelogues. However, due to the impact of COVID-19 prevention and control, the
Xin Zhang +3 more
doaj +1 more source
Leveraging Visual Question Answering for Image-Caption Ranking
Visual Question Answering (VQA) is the task of taking as input an image and a free-form natural language question about the image, and producing an accurate answer.
Lin, Xiao, Parikh, Devi
core +1 more source
Learning a Recurrent Visual Representation for Image Caption Generation
In this paper we explore the bi-directional mapping between images and their sentence-based descriptions. We propose learning this mapping using a recurrent neural network.
Chen, Xinlei, Zitnick, C. Lawrence
core +1 more source
Show and Tell: A Neural Image Caption Generator
Automatically describing the content of an image is a fundamental problem in artificial intelligence that connects computer vision and natural language processing.
Bengio, Samy +3 more
core +1 more source
Deconfounded fashion image captioning with transformer and multimodal retrieval
Background: The annotation of fashion images is a significantly important task in the fashion industry as well as social media and e-commerce. However, owing to the complexity and diversity of fashion images, this task entails multiple challenges ...
Tao Peng +4 more
doaj +1 more source
VSRI: Visual Semantic Relational Interactor for Image Caption [PDF]
Image captioning is one of the key objectives of multimodal image understanding. This paper aims to generate detail-rich and accurate image captions. Currently, mainstream image captioning methods focus on the interrelationships between regions, but ignore ...
LIU Jian, YAO Renyuan, GAO Nan, LIANG Ronghua, CHEN Peng
doaj +1 more source

