Results 131 to 140 of about 5,597
Some of the following articles may not be open access.
Dense Video Captioning for Incomplete Videos
2021
Incomplete video or partially-missing video situations are rarely considered in video captioning research. Previous approaches are mainly trained and evaluated on complete video clip datasets where all the events involved are thoroughly observed. In this work, we formulate the issue of video content description for partially-missing videos.
Xuan Dang +3 more
ADVC: Adversarial dense video captioning with unsupervised pretraining
Image and Vision Computing
Wangyu Choi, Jiasi Chen, Jongwon Yoon
Event-centric multi-modal fusion method for dense video captioning
Neural Networks, 2022
Dense video captioning aims to automatically describe several events that occur in a given video, which most state-of-the-art models accomplish by locating and describing multiple events in an untrimmed video. Despite much progress in this area, most current approaches only encode visual features in the event location phase and they neglect the ...
Zhi Chang +4 more
Event-Centric Hierarchical Representation for Dense Video Captioning
IEEE Transactions on Circuits and Systems for Video Technology, 2021
Dense video captioning aims to localize and describe multiple events in untrimmed videos, a challenging task that has recently drawn attention in computer vision. Although existing methods have achieved impressive performance, most of them focus only on local information of event segments or very simple event-level context, overlooking the ...
Teng Wang +4 more
MPP-net: Multi-perspective perception network for dense video captioning
Neurocomputing, 2023
Yiwei Wei +6 more
SODA: Story Oriented Dense Video Captioning Evaluation Framework
Lecture Notes in Computer Science, 2020
Dense Video Captioning (DVC) is a challenging task that localizes all events in a short video and describes them with natural language sentences. The main goal of DVC is video story description, that is, to generate a concise video story that supports human video comprehension without watching it. In recent years, DVC has attracted increasing attention ...
Soichiro Fujita +4 more
Transformer and LLM-Based Captioning Module for Dense Video Captioning
Lecture Notes in Networks and Systems
Dvijesh Bhatt, Priyank Thakkar
Attention-based Densely Connected LSTM for Video Captioning
Proceedings of the 27th ACM International Conference on Multimedia, 2019
Recurrent Neural Networks (RNNs), especially the Long Short-Term Memory (LSTM), have been widely used for video captioning, since they can cope with the temporal dependencies within both video frames and the corresponding descriptions. However, as the sequence gets longer, it becomes much harder to handle the temporal dependencies within the sequence ...
Yongqing Zhu, Shuqiang Jiang
Two Uni-directional LSTMs-Based Captioning Module for Dense Video Captioning
Lecture Notes in Networks and Systems
Dvijesh Bhatt, Priyank Thakkar
Multimodal representation fusion method for dense video captioning
Knowledge-Based Systems
Yonggang Li