Results 21 to 30 of about 222,124
Dense video captioning based on local attention
Dense video captioning aims to locate multiple events in an untrimmed video and generate captions for each event. Previous methods experienced difficulties in establishing the multimodal feature relationship between frames and captions, resulting in low ...
Yong Qian +5 more
doaj +1 more source
Guiding image captioning models toward more specific captions
Image captioning is conventionally formulated as the task of generating captions for images that match the distribution of reference image-caption pairs. However, reference captions in standard captioning datasets are short and may not uniquely identify the images they describe. These problems are further exacerbated when models are trained directly on …
Kornblith, Simon +3 more
openaire +2 more sources
Standard captioning for deaf and hard-of-hearing people cannot transmit the emotional information that music provides in support of the narrative in audio-visual media.
Maria J. Lucia +6 more
doaj +1 more source
When Malcolm Fraser opened The Australian Captioning Centre in 1982, he emphasised the importance of changing technology in improving the provision of captions: "there is always going to be new technology coming forward, there will always be better ways of doing it if you wait a while."
Katie M Ellis, Mike Kent, Gwyneth Peaty
openaire +1 more source
Imageability- and Length-Controllable Image Captioning
Image captioning can show great performance for generating captions for general purposes, but it remains difficult to adjust the generated captions for different applications.
Marc A. Kastner +8 more
doaj +1 more source
STAIR Captions: Constructing a Large-Scale Japanese Image Caption Dataset
In recent years, automatic generation of image descriptions (captions), that is, image captioning, has attracted a great deal of attention. In this paper, we particularly consider generating Japanese captions for images.
Shigeto, Yutaro +2 more
core +1 more source
Next-to-Leading-Order Corrections to the Production of Heavy-Flavour Jets in e+e- Collisions [PDF]
In this paper we describe the calculation of the process e+e- -> Z/gamma -> QQbar + X, where Q is a heavy quark, at order alpha_s^2. (49 pages, LaTeX, epsfig, 6 figures.)
Nason, P., Oleari, C.
core +3 more sources
Evaluation of Automatic Video Captioning Using Direct Assessment
We present Direct Assessment, a method for manually assessing the quality of automatically-generated captions for video. Evaluating the accuracy of video captions is particularly difficult because for any given video clip there is no definitive ground ...
Awad, George +2 more
core +1 more source
Generating Accurate Caption Units for Figure Captioning
Scientific-style figures are commonly used on the web to present numerical information. Captions that tell accurate figure information and sound natural would significantly improve figure accessibility. In this paper, we present promising results on machine figure captioning.
Xin Qian +7 more
openaire +1 more source
Context-aware Captions from Context-agnostic Supervision
We introduce an inference technique to produce discriminative context-aware image captions (captions that describe differences between images or visual concepts) using only generic context-agnostic training data (captions that describe a concept or an ...
Bengio, Samy +4 more
core +1 more source