ICLR 2024 Spotlight

PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction

Peng Wang · Hao Tan · Sai Bi · Yinghao Xu · Fujun Luan · Kalyan Sunkavalli · Wenping Wang · Zexiang Xu · Kai Zhang

Abstract:

We propose a Pose-Free Large Reconstruction Model (PF-LRM) that reconstructs a 3D object from a few unposed images, even with little visual overlap, while simultaneously estimating the camera poses, in 1.3 seconds on a single A100 GPU. PF-LRM is a highly scalable method that uses self-attention blocks to exchange information between 3D object tokens and 2D image tokens; we predict coarse geometry for each view and then use a differentiable Perspective-n-Point (PnP) solver to obtain camera poses. When trained on a large amount of multi-view data, PF-LRM shows strong cross-dataset generalization and outperforms baseline methods by a large margin in pose prediction accuracy and 3D reconstruction quality on various evaluation datasets. We also demonstrate our model's robustness to variable numbers of input views and to segmentation mask errors. Our project website is at: https://pf-lrm.github.io/project.
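To illustrate the step in which per-view geometry is turned into a camera pose, the following is a minimal sketch of a PnP solve, assuming the predicted geometry is available as 3D points paired with their 2D image locations. It is not the differentiable solver used by PF-LRM: it swaps in a plain, non-differentiable Direct Linear Transform (DLT) and assumes 2D coordinates are already normalized (camera intrinsics removed). The function names `pnp_dlt` and `synthetic_view` are hypothetical and chosen only for this example.

```python
import numpy as np


def pnp_dlt(points_3d, points_2d):
    """Estimate a world-to-camera pose (R, t) from n >= 6 correspondences.

    points_3d: (n, 3) points in the object/world frame.
    points_2d: (n, 2) projections in normalized image coordinates
               (assumption: intrinsics have already been removed).
    """
    n = points_3d.shape[0]
    Xh = np.hstack([points_3d, np.ones((n, 1))])  # homogeneous 3D points
    u = points_2d[:, 0:1]
    v = points_2d[:, 1:2]
    zeros = np.zeros_like(Xh)

    # Each correspondence gives two linear equations in the 12 entries of
    # the 3x4 matrix P = [R | t]:  u * (P3 . X) - P1 . X = 0, and likewise for v.
    rows_u = np.hstack([-Xh, zeros, u * Xh])
    rows_v = np.hstack([zeros, -Xh, v * Xh])
    A = np.vstack([rows_u, rows_v])

    # The solution (up to scale and sign) is the right singular vector
    # associated with the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    P = Vt[-1].reshape(3, 4)

    # Fix the sign so that the rotation part has a positive determinant.
    if np.linalg.det(P[:, :3]) < 0:
        P = -P

    # Project the 3x3 part onto the nearest rotation and undo the scale.
    U, S, Vt_r = np.linalg.svd(P[:, :3])
    R = U @ Vt_r
    t = P[:, 3] / S.mean()
    return R, t


def synthetic_view(n_points=8, seed=0):
    """Create a noise-free test case: random points, a random pose, and projections."""
    rng = np.random.default_rng(seed)
    points_3d = rng.uniform(-1.0, 1.0, size=(n_points, 3))

    # Random rotation from a QR decomposition, forced to det = +1.
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] = -Q[:, 0]
    t = np.array([0.1, -0.2, 4.0])  # keeps all points in front of the camera

    cam = points_3d @ Q.T + t            # world -> camera frame
    points_2d = cam[:, :2] / cam[:, 2:]  # perspective division
    return points_3d, points_2d, Q, t
```

On noise-free correspondences such as those from `synthetic_view`, the recovered pose should match the ground truth up to numerical precision. With noisy predictions, a robust or weighted solver would be the more realistic choice, and a differentiable formulation is needed if the pose loss is to be backpropagated during training.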
