Results 131 to 140 of about 5,597
Some of the following articles may not be open access.

Dense Video Captioning for Incomplete Videos

2021
Incomplete video or partially-missing video situations are rarely considered in video captioning research. Previous approaches are mainly trained and evaluated on complete video clip datasets where all the events involved are thoroughly observed. In this work, we formulate the issue of video content description for partially-missing videos.
Xuan Dang et al.

ADVC: Adversarial dense video captioning with unsupervised pretraining

Image and Vision Computing
Wangyu Choi, Jiasi Chen, Jongwon Yoon

Event-centric multi-modal fusion method for dense video captioning

Neural Networks, 2022
Dense video captioning aims to automatically describe several events that occur in a given video, which most state-of-the-art models accomplish by locating and describing multiple events in an untrimmed video. Despite much progress in this area, most current approaches only encode visual features in the event location phase and they neglect the ...
Zhi Chang et al.

Event-Centric Hierarchical Representation for Dense Video Captioning

IEEE Transactions on Circuits and Systems for Video Technology, 2021
Dense video captioning aims to localize and describe multiple events in untrimmed videos, which is a challenging task that draws attention recently in computer vision. Although existing methods have achieved impressive performance, most of them only focus on local information of event segments or very simple event-level context, overlooking the ...
Teng Wang et al.

MPP-net: Multi-perspective perception network for dense video captioning

Neurocomputing, 2023
Yiwei Wei et al.

SODA: Story Oriented Dense Video Captioning Evaluation Framework

Lecture Notes in Computer Science, 2020
Dense Video Captioning (DVC) is a challenging task that localizes all events in a short video and describes them with natural language sentences. The main goal of DVC is video story description, that is, to generate a concise video story that supports human video comprehension without watching it. In recent years, DVC has attracted increasing attention
Soichiro Fujita et al.

Transformer and LLM-Based Captioning Module for Dense Video Captioning

Lecture Notes in Networks and Systems
Dvijesh Bhatt, Priyank Thakkar

Attention-based Densely Connected LSTM for Video Captioning

Proceedings of the 27th ACM International Conference on Multimedia, 2019
Recurrent Neural Networks (RNNs), especially the Long Short-Term Memory (LSTM), have been widely used for video captioning, since they can cope with the temporal dependencies within both video frames and the corresponding descriptions. However, as the sequence gets longer, it becomes much harder to handle the temporal dependencies within the sequence ...
Yongqing Zhu, Shuqiang Jiang

Two Uni-directional LSTMs-Based Captioning Module for Dense Video Captioning

Lecture Notes in Networks and Systems
Dvijesh Bhatt, Priyank Thakkar
