
High-Resolution Image Synthesis with Latent Diffusion Models [PDF]

open access: yes; Computer Vision and Pattern Recognition, 2021
By decomposing the image formation process into a sequential application of denoising autoencoders, diffusion models (DMs) achieve state-of-the-art synthesis results on image data and beyond. Additionally, their formulation allows for a guiding mechanism ...
Robin Rombach (+4 more)
semanticscholar (+1 more source)

Adding Conditional Control to Text-to-Image Diffusion Models [PDF]

open access: yes; IEEE International Conference on Computer Vision, 2023
We present ControlNet, a neural network architecture to add spatial conditioning controls to large, pretrained text-to-image diffusion models. ControlNet locks the production-ready large diffusion models, and reuses their deep and robust encoding layers ...
Lvmin Zhang, Anyi Rao, Maneesh Agrawala
semanticscholar (+1 more source)

Classifier-Free Diffusion Guidance [PDF]

open access: yes; arXiv.org, 2022
Classifier guidance is a recently introduced method to trade off mode coverage and sample fidelity in conditional diffusion models post training, in the same spirit as low temperature sampling or truncation in other types of generative models. Classifier ...
Jonathan Ho
semanticscholar (+1 more source)

Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding [PDF]

open access: yes; Neural Information Processing Systems, 2022
We present Imagen, a text-to-image diffusion model with an unprecedented degree of photorealism and a deep level of language understanding. Imagen builds on the power of large transformer language models in understanding text and hinges on the strength ...
Chitwan Saharia (+13 more)
semanticscholar (+1 more source)

Scalable Diffusion Models with Transformers [PDF]

open access: yes; IEEE International Conference on Computer Vision, 2022
We explore a new class of diffusion models based on the transformer architecture. We train latent diffusion models of images, replacing the commonly-used U-Net backbone with a transformer that operates on latent patches. We analyze the scalability of our ...
William S. Peebles, Saining Xie
semanticscholar (+1 more source)

DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation [PDF]

open access: yes; Computer Vision and Pattern Recognition, 2022
Large text-to-image models achieved a remarkable leap in the evolution of AI, enabling high-quality and diverse synthesis of images from a given text prompt.
Nataniel Ruiz (+5 more)
semanticscholar (+1 more source)

Diffusion policy: Visuomotor policy learning via action diffusion [PDF]

open access: yes; Robotics: Science and Systems, 2023
This paper introduces Diffusion Policy, a new way of generating robot behavior by representing a robot’s visuomotor policy as a conditional denoising diffusion process.
Cheng Chi (+6 more)
semanticscholar (+1 more source)

DreamFusion: Text-to-3D using 2D Diffusion [PDF]

open access: yes; International Conference on Learning Representations, 2022
Recent breakthroughs in text-to-image synthesis have been driven by diffusion models trained on billions of image-text pairs. Adapting this approach to 3D synthesis would require large-scale datasets of labeled 3D data and efficient architectures for ...
Ben Poole (+3 more)
semanticscholar (+1 more source)

Elucidating the Design Space of Diffusion-Based Generative Models [PDF]

open access: yes; Neural Information Processing Systems, 2022
We argue that the theory and practice of diffusion-based generative models are currently unnecessarily convoluted and seek to remedy the situation by presenting a design space that clearly separates the concrete design choices.
Tero Karras (+3 more)
semanticscholar (+1 more source)

DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps [PDF]

open access: yes; Neural Information Processing Systems, 2022
Diffusion probabilistic models (DPMs) are emerging powerful generative models. Despite their high-quality generation performance, DPMs still suffer from their slow sampling as they generally need hundreds or thousands of sequential function evaluations ...
Cheng Lu (+5 more)
semanticscholar (+1 more source)
