Some of the following articles may not be open access.
Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control
International Conference on Computer Graphics and Interactive Techniques
Diffusion models have demonstrated impressive performance in generating high-quality videos from text prompts or images. However, precise control over the video generation process, such as camera manipulation or content editing, remains a significant ...
Zekai Gu +11 more
semanticscholar +1 more source
LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models
arXiv.org
Visual instruction tuning has made considerable strides in enhancing the capabilities of Large Multimodal Models (LMMs). However, existing open LMMs largely focus on single-image tasks; their applications to multi-image scenarios remain less explored ...
Feng Li +7 more
semanticscholar +1 more source
ViPE: Video Pose Engine for 3D Geometric Perception
arXiv.org
Accurate 3D geometric perception is an important prerequisite for a wide range of spatial AI systems. While state-of-the-art methods depend on large-scale training data, acquiring consistent and precise 3D annotations from in-the-wild videos remains a ...
Jiahui Huang +14 more
semanticscholar +1 more source
2018
This chapter presents an overview of different tools used in research and engineering of 3D video delivery systems. These include software tools for 3D video compression and streaming, 3D video players, and their interfaces. Other types of tools widely used in research studies and development of new networking solutions, such as network simulators ...
Dumic, Emil +9 more
openaire +3 more sources
Voyager: Long-Range and World-Consistent Video Diffusion for Explorable 3D Scene Generation
ACM Transactions on Graphics
Real-world applications like video gaming and virtual reality often demand the ability to model 3D scenes that users can explore along custom camera trajectories.
Tianyu Huang +10 more
semanticscholar +1 more source
IEEE transactions on circuits and systems for video technology (Print), 2020
3D high-efficiency video coding (3D-HEVC) is the latest standard for 3D video compression created by the ISO/IEC MPEG and ITU-T Video Coding Experts Group (VCEG) based on a new video format called multiview texture videos plus depth maps (MVDs).
Hamza Hamout, Abderrahmane Elyousfi
semanticscholar +1 more source
SV3D: Novel Multi-view Synthesis and 3D Generation from a Single Image using Latent Video Diffusion
European Conference on Computer Vision
We present Stable Video 3D (SV3D) -- a latent video diffusion model for high-resolution, image-to-multi-view generation of orbital videos around a 3D object.
Vikram S. Voleti +8 more
semanticscholar +1 more source
CineMaster: A 3D-Aware and Controllable Framework for Cinematic Text-to-Video Generation
International Conference on Computer Graphics and Interactive Techniques
In this work, we present CineMaster, a novel framework for 3D-aware and controllable text-to-video generation. Our goal is to empower users with comparable controllability as professional film directors: precise placement of objects within the scene ...
Qinghe Wang +9 more
semanticscholar +1 more source
PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation
European Conference on Computer Vision
Realistic object interactions are crucial for creating immersive virtual experiences, yet synthesizing realistic 3D object dynamics in response to novel interactions remains a significant challenge.
Tianyuan Zhang +7 more
semanticscholar +1 more source
Spatiotemporal Multimodal Learning With 3D CNNs for Video Action Recognition
IEEE transactions on circuits and systems for video technology (Print), 2021
Extracting effective spatial-temporal information is significantly important for video-based action recognition. Recently 3D convolutional neural networks (3D CNNs) that could simultaneously encode spatial and temporal dynamics in videos have made ...
Hanbo Wu, Xin Ma, Yibin Li
semanticscholar +1 more source

