
DIC-Transformer: interpretation of plant disease classification results using image caption generation technology [PDF]

open access: yes; Frontiers in Plant Science, 2023
Disease image classification systems play a crucial role in identifying disease categories in the field of agricultural diseases. However, current plant disease image classification methods can only predict the disease category and do not offer ...
Qingtian Zeng, Jian Sun, Shansong Wang

Enhancing image caption generation through context-aware attention mechanism [PDF]

open access: yes; Heliyon
Image captioning, the process of generating natural language descriptions based on image content, has garnered attention in AI research for its implications in scene understanding and human-computer interaction.
Ahatesham Bhuiyan   +3 more

VSAM-Based Visual Keyword Generation for Image Caption

open access: yes; IEEE Access, 2021
Image captioning aims to understand and describe visual content, and it is expected to be applied to automatic news reporting in the future. In recent years, there has been increasing interest in the encoder-decoder framework for image captioning: the encoder ...
Suya Zhang   +3 more

PBC-Transformer: Interpreting Poultry Behavior Classification Using Image Caption Generation Techniques [PDF]

open access: yes; Animals
Accurate classification of poultry behavior is critical for assessing welfare and health, yet most existing methods predict behavior categories without providing explanations for the image content. This study introduces the PBC-Transformer model, a novel
Jun Li   +7 more

Fine-Tuning a Small Vision Language Model Using Synthetic Data for Explaining Bacterial Skin Disease Images [PDF]

open access: yes; Diagnostics
Background/Objectives: Vision language models (VLMs) show strong potential for medical image understanding, but their large scale often limits practical deployment. This study investigates whether a compact VLM can be effectively adapted for dermatology,
Shiwan Zhang   +3 more

Review of Image Captioning Methods Based on Encoding-Decoding Technology [PDF]

open access: yes; Jisuanji kexue yu tansuo, 2022
In recent years, image caption generation, a multimodal task in artificial intelligence, has integrated research from computer vision and natural language processing to convert images into text. It plays
GENG Yaogang, MEI Hongyan, ZHANG Xing, LI Xiaohui

A Multimodal Framework for Video Caption Generation

open access: yes; IEEE Access, 2022
Video captioning is a highly challenging computer vision task that automatically describes the video clips using natural language sentences with a clear understanding of the embedded semantics.
Reshmi S. Bhooshan, Suresh K.

Automated Caption Generation for Video Call with Language Translation [PDF]

open access: yes; E3S Web of Conferences, 2023
In the modern era, virtual communication between individuals is common. Many people’s lives have been made simpler in a number of circumstances by providing subtitles, generating automated captions for social media videos, and language translation from a
Polepaka Sanjeeva   +4 more

Parallel Dense Video Caption Generation with Multi-Modal Features

open access: yes; Mathematics, 2023
The task of dense video captioning is to generate detailed natural-language descriptions for an original video, which requires deep analysis and mining of semantic captions to identify events in the video. Existing methods typically follow a localisation-
Xuefei Huang   +3 more

Multifaceted Feature Coding Image Caption Generation Algorithm Based on Transformer [PDF]

open access: yes; Jisuanji gongcheng, 2023
Object features extracted by object detection algorithms play an increasingly critical role in generating image captions. However, using only object detection features as the input to an image captioning task can lead to the loss of other ...
HENG Hongjun, FAN Yuchen, WANG Jialiang
