Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training [PDF]
Masked Autoencoders (MAE) have shown great potential in self-supervised pre-training for language and 2D image transformers. However, it remains an open question how to exploit masked autoencoding for learning 3D representations of irregular ...
Renrui Zhang+7 more
semanticscholar +1 more source
Point-BERT: Pre-training 3D Point Cloud Transformers with Masked Point Modeling [PDF]
We present Point-BERT, a new paradigm for learning Transformers to generalize the concept of BERT [8] to 3D point clouds. Inspired by BERT, we devise a Masked Point Modeling (MPM) task to pre-train point cloud Transformers. Specifically, we first divide a
Xumin Yu+5 more
semanticscholar +1 more source
CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding [PDF]
Manual annotation of large-scale point cloud datasets for varying tasks such as 3D object classification, segmentation and detection is often laborious owing to the irregular structure of point clouds.
Mohamed Afham+5 more
semanticscholar +1 more source
Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following [PDF]
We introduce Point-Bind, a 3D multi-modality model aligning point clouds with 2D image, language, audio, and video. Guided by ImageBind, we construct a joint embedding space between 3D and multi-modalities, enabling many promising applications, e.g., any-
Ziyu Guo+10 more
semanticscholar +1 more source
Diffusion Probabilistic Models for 3D Point Cloud Generation [PDF]
We present a probabilistic model for point cloud generation, which is fundamental for various 3D vision tasks such as shape completion, upsampling, synthesis and data augmentation.
Shitong Luo, Wei Hu
semanticscholar +1 more source
REGTR: End-to-end Point Cloud Correspondences with Transformers [PDF]
Despite recent success in incorporating learning into point cloud registration, many works focus on learning feature descriptors and continue to rely on nearest-neighbor feature matching and outlier filtering through RANSAC to obtain the final set of ...
Zi Jian Yew, Gim Hee Lee
semanticscholar +1 more source
Segment Any Point Cloud Sequences by Distilling Vision Foundation Models [PDF]
Recent advancements in vision foundation models (VFMs) have opened up new possibilities for versatile and efficient visual perception. In this work, we introduce Seal, a novel framework that harnesses VFMs for segmenting diverse automotive point cloud ...
Youquan Liu+7 more
semanticscholar +1 more source
PoinTr: Diverse Point Cloud Completion with Geometry-Aware Transformers [PDF]
Point clouds captured in real-world applications are often incomplete due to the limited sensor resolution, single viewpoint, and occlusion. Therefore, recovering the complete point clouds from partial ones becomes an indispensable task in many ...
Xumin Yu+5 more
semanticscholar +1 more source
PointCLIP: Point Cloud Understanding by CLIP [PDF]
Recently, zero-shot and few-shot learning via Contrastive Vision-Language Pre-training (CLIP) have shown inspirational performance on 2D visual recognition, which learns to match images with their corresponding texts in open-vocabulary settings. However,
Renrui Zhang+8 more
semanticscholar +1 more source
Georeferenced Point Clouds: A Survey of Features and Point Cloud Management [PDF]
This paper presents a survey of georeferenced point clouds. On the one hand, it concentrates on features that originate in the measurement process itself and on features derived by processing the point cloud. On the other hand, it reviews approaches for processing georeferenced point clouds.
Johannes Otepka+4 more
openaire +5 more sources