Sense Generation and Disambiguation in the Nominal Domain: Contributions to the Generative Lexicon [PDF]
The relationship between the perspective of lexical meaning generation, on the one hand, and that of semantic interpretation, on the other, has been little explored in contemporary models of polysemy.
Andreína Adelstein, Marina Berri
doaj +3 more sources
An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion [PDF]
Text-to-image models offer unprecedented freedom to guide creation through natural language. Yet, it is unclear how such freedom can be exercised to generate images of specific unique concepts, modify their appearance, or compose them in new roles and ...
Rinon Gal +6 more
semanticscholar +1 more source
Zero-Shot Composed Image Retrieval with Textual Inversion [PDF]
Composed Image Retrieval (CIR) aims to retrieve a target image based on a query composed of a reference image and a relative caption that describes the difference between the two images.
Alberto Baldrati +3 more
semanticscholar +1 more source
P+: Extended Textual Conditioning in Text-to-Image Generation [PDF]
We introduce an Extended Textual Conditioning space in text-to-image models, referred to as $P+$. This space consists of multiple textual conditions, derived from per-layer prompts, each corresponding to a layer of the denoising U-net of the diffusion ...
Andrey Voynov +3 more
semanticscholar +1 more source
LaDI-VTON: Latent Diffusion Textual-Inversion Enhanced Virtual Try-On [PDF]
The rapidly evolving fields of e-commerce and metaverse continue to seek innovative approaches to enhance the consumer experience. At the same time, recent advancements in the development of diffusion models have enabled generative networks to create ...
Davide Morelli +5 more
semanticscholar +1 more source
TEMOS: Generating diverse human motions from textual descriptions [PDF]
We address the problem of generating diverse 3D human motions from textual descriptions. This challenging task requires joint modeling of both modalities: understanding and extracting useful human-centric information from the text, and then generating ...
Mathis Petrovich +2 more
semanticscholar +1 more source
Phenaki: Variable Length Video Generation From Open Domain Textual Description [PDF]
We present Phenaki, a model capable of realistic video synthesis, given a sequence of textual prompts. Generating videos from text is particularly challenging due to the computational cost, limited quantities of high quality text-video data and variable ...
Ruben Villegas +8 more
semanticscholar +1 more source
CVT-SLR: Contrastive Visual-Textual Transformation for Sign Language Recognition with Variational Alignment [PDF]
Sign language recognition (SLR) is a weakly supervised task that annotates sign videos as textual glosses. Recent studies show that insufficient training caused by the lack of large-scale available sign datasets becomes the main bottleneck for SLR.
Jiangbin Zheng +7 more
semanticscholar +1 more source
SpaText: Spatio-Textual Representation for Controllable Image Generation [PDF]
Recent text-to-image diffusion models are able to generate convincing results of unprecedented quality. However, it is nearly impossible to control the shapes of different regions/objects or their layout in a fine-grained fashion.
Omri Avrahami +8 more
semanticscholar +1 more source
Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback [PDF]
Despite the seeming success of contemporary grounded text generation systems, they often tend to generate factually inconsistent text with respect to their input. This phenomenon is emphasized in tasks like summarization, in which the generated summaries ...
Paul Roit +18 more
semanticscholar +1 more source

