Image-to-Image Translation with Conditional Adversarial Networks [PDF]

open access: yes | Computer Vision and Pattern Recognition, 2018
We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping.
Efros, Alexei A.   +3 more
core   +2 more sources

High-Resolution Image Synthesis with Latent Diffusion Models [PDF]

open access: yes | Computer Vision and Pattern Recognition, 2021
By decomposing the image formation process into a sequential application of denoising autoencoders, diffusion models (DMs) achieve state-of-the-art synthesis results on image data and beyond. Additionally, their formulation allows for a guiding mechanism ...
Robin Rombach   +4 more
semanticscholar   +1 more source

BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models [PDF]

open access: yes | International Conference on Machine Learning, 2023
The cost of vision-and-language pre-training has become increasingly prohibitive due to end-to-end training of large-scale models. This paper proposes BLIP-2, a generic and efficient pre-training strategy that bootstraps vision-language pre-training from ...
Junnan Li   +3 more
semanticscholar   +1 more source

Hierarchical Text-Conditional Image Generation with CLIP Latents [PDF]

open access: yes | arXiv.org, 2022
Contrastive models like CLIP have been shown to learn robust representations of images that capture both semantics and style. To leverage these representations for image generation, we propose a two-stage model: a prior that generates a CLIP image ...
A. Ramesh   +4 more
semanticscholar   +1 more source

Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding [PDF]

open access: yes | Neural Information Processing Systems, 2022
We present Imagen, a text-to-image diffusion model with an unprecedented degree of photorealism and a deep level of language understanding. Imagen builds on the power of large transformer language models in understanding text and hinges on the strength ...
Chitwan Saharia   +13 more
semanticscholar   +1 more source

Adding Conditional Control to Text-to-Image Diffusion Models [PDF]

open access: yes | IEEE International Conference on Computer Vision, 2023
We present ControlNet, a neural network architecture to add spatial conditioning controls to large, pretrained text-to-image diffusion models. ControlNet locks the production-ready large diffusion models, and reuses their deep and robust encoding layers ...
Lvmin Zhang, Anyi Rao, Maneesh Agrawala
semanticscholar   +1 more source

Analyzing and Improving the Image Quality of StyleGAN [PDF]

open access: yes | Computer Vision and Pattern Recognition, 2019
The style-based GAN architecture (StyleGAN) yields state-of-the-art results in data-driven unconditional generative image modeling. We expose and analyze several of its characteristic artifacts, and propose changes in both model architecture and training ...
Tero Karras   +5 more
semanticscholar   +1 more source

TIAToolbox as an end-to-end library for advanced tissue image analytics

open access: yes | Communications Medicine, 2022
Pocock, Graham et al. present TIAToolbox, a Python toolbox for computational pathology. The extendable library can be used for data loading, pre-processing, model inference, post-processing, and visualization.
Johnathan Pocock   +13 more
doaj   +1 more source

DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation [PDF]

open access: yes | Computer Vision and Pattern Recognition, 2022
Large text-to-image models achieved a remarkable leap in the evolution of AI, enabling high-quality and diverse synthesis of images from a given text prompt.
Nataniel Ruiz   +5 more
semanticscholar   +1 more source

Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network [PDF]

open access: yes | Computer Vision and Pattern Recognition, 2016
Despite the breakthroughs in accuracy and speed of single image super-resolution using faster and deeper convolutional neural networks, one central problem remains largely unsolved: how do we recover the finer texture details when we super-resolve at ...
C. Ledig   +8 more
semanticscholar   +1 more source
