Streamlined Dense Video Captioning
Dense video captioning is an extremely challenging task since accurate and coherent description of events in a video requires holistic understanding of video contents as well as contextual reasoning of individual events. Most existing approaches handle this problem by first detecting event proposals from a video and then captioning on a subset of the ...
Mun, Jonghwan +4 more
openaire +2 more sources
The video-based commonsense captioning task aims to add multiple commonsense descriptions to video captions to understand video content better. This paper aims to consider the importance of cross-modal mapping.
Haitao Xiong +4 more
doaj +1 more source
A Multimodal Framework for Video Caption Generation
Video captioning is a highly challenging computer vision task that automatically describes the video clips using natural language sentences with a clear understanding of the embedded semantics.
Reshmi S. Bhooshan, Suresh K.
doaj +1 more source
The Effects of Captioning Videos Used for Foreign Language Listening Activities
This study investigated the effects of captioning during video-based listening activities. Second- and fourth-year learners of Arabic, Chinese, Spanish, and Russian watched three short videos with and without captioning in randomized order.
Paula Winke +2 more
doaj
CapERA: Captioning Events in Aerial Videos
In this paper, we introduce the CapERA dataset, which upgrades the Event Recognition in Aerial Videos (ERA) dataset to aerial video captioning. The newly proposed dataset aims to advance visual–language-understanding tasks for UAV videos by providing ...
Laila Bashmal +4 more
doaj +1 more source
Dense video captioning involves detecting and describing events within video sequences. Traditional methods operate in an offline setting, assuming the entire video is available for analysis. In contrast, in this work we introduce a groundbreaking paradigm: Live Video Captioning (LVC), where captions must ...
Blanco Fernández, Eduardo +4 more
openaire +4 more sources
Video Captioning via Hierarchical Reinforcement Learning
Video captioning is the task of automatically generating a textual description of the actions in a video. Although previous work (e.g. sequence-to-sequence model) has shown promising results in abstracting a coarse description of a short video, it is ...
Chen, Wenhu +4 more
core +1 more source
Attentive Semantic Video Generation Using Captions
This paper proposes a network architecture to perform variable length semantic video generation using captions. We adopt a new perspective towards video generation where we allow the captions to be combined with the long-term and short-term dependencies between video frames and thus generate a video in an incremental manner. Our experiments demonstrate ...
Marwah, Tanya +2 more
openaire +2 more sources
SoccerNet-Caption: Dense Video Captioning for Soccer Broadcasts Commentaries
Soccer is more than just a game - it is a passion that transcends borders and unites people worldwide. From the roar of the crowds to the excitement of the commentators, every moment of a soccer match is a thrill. Yet, with so many games happening simultaneously, fans cannot watch them all live.
Mkhallati, Hassan +4 more
openaire +2 more sources
The Internet of Things is emerging as a crucial technology in aiding humans and making their lives easier. Among the human population, a large percentage of people suffer from disabilities that result in challenges in everyday life, particularly people with visual disabilities.
Hania Tarik +8 more
wiley +1 more source

