Detailed Geometry and Appearance from Opportunistic Motion
Reconstructing 3D geometry and appearance from a sparse set of fixed cameras is a foundational task with broad applications, yet it remains fundamentally constrained by the limited viewpoints. We show that this bound can be broken by exploiting opportunistic object motion: as a person manipulates an object (e.g., moving a chair or lifting a mug), the static cameras effectively "orbit" the object in its local coordinate frame, providing additional virtual viewpoints. Harnessing this object motion, however, poses two challenges: the tight coupling of object pose and geometry estimation, and the complex appearance variations of a moving object under static illumination. We address these by formulating a joint pose and shape optimization using 2D Gaussian splatting with alternating minimization of 6DoF trajectories and primitive parameters, and by introducing a novel appearance model that factorizes diffuse and specular components with reflected directional probing within the spherical harmonics space. Extensive experiments on synthetic and real-world datasets with extremely sparse viewpoints demonstrate that our method recovers significantly more accurate geometry and appearance than state-of-the-art baselines.
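The "orbit" intuition can be made concrete with a small sketch: if the object's pose at frame t maps its canonical (object) frame to the world, then composing a fixed camera's extrinsic with that pose yields a virtual camera that moves around the canonical object. The function and variable names below are illustrative, not from the paper.

```python
import numpy as np

def virtual_camera_pose(world_to_cam: np.ndarray, object_to_world: np.ndarray) -> np.ndarray:
    """Compose a fixed camera extrinsic with the object's 6DoF pose.

    If the object at frame t has pose T_t (object frame -> world frame),
    a static camera with extrinsic E (world -> camera) observes the
    canonical object through E @ T_t, i.e. a "virtual" camera that orbits
    the object in its local coordinate frame.
    """
    return world_to_cam @ object_to_world

def rot_z(theta: float) -> np.ndarray:
    """4x4 homogeneous rotation about the world z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    T = np.eye(4)
    T[:2, :2] = [[c, -s], [s, c]]
    return T

# A single fixed camera (identity extrinsic) watching an object that has
# rotated 90 degrees sees the canonical object from a viewpoint rotated
# by the same amount: one camera, two effective views.
E = np.eye(4)
T_t = rot_z(np.pi / 2)
V = virtual_camera_pose(E, T_t)
```

Each observed frame thus contributes one such virtual viewpoint per physical camera, which is what lets sparse static rigs escape their coverage limit.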
💡 Research Summary
This paper tackles the challenging problem of reconstructing high-fidelity 3D geometry and appearance of a rigid object from a very sparse set of static cameras, by exploiting the "opportunistic" motion of the object itself. When a person picks up, slides, or rotates an object, the fixed cameras effectively observe the object from many different viewpoints in the object's local coordinate frame, creating a set of virtual viewpoints that would otherwise require many moving cameras. The authors identify two fundamental obstacles: (1) the tight coupling between object pose and shape: pose estimation relies on the current geometry, while geometry refinement depends on an accurate pose; (2) the standard appearance model used in 3D Gaussian Splatting (3DGS) assumes lighting attached to the object, which breaks down when the object moves relative to static illumination, especially for specular surfaces where highlights shift dramatically.
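The specular failure mode has a simple geometric root: highlights depend on where the mirror-reflected view direction points in the static, world-frame illumination. A minimal sketch of this "probe the environment at the reflected direction" idea is below, using a degree-1 real spherical harmonics expansion as a stand-in for the lighting representation; the function names, SH truncation, and toy environment coefficients are my assumptions, not the paper's exact model.

```python
import numpy as np

def reflect(view_dir: np.ndarray, normal: np.ndarray) -> np.ndarray:
    """Mirror the (unit) view direction about the (unit) surface normal."""
    return 2.0 * np.dot(view_dir, normal) * normal - view_dir

def eval_sh_deg1(coeffs: np.ndarray, d: np.ndarray) -> float:
    """Evaluate a degree-1 real spherical harmonics expansion at direction d.

    coeffs: (4,) array [c00, c1-1, c10, c11]; d: unit direction (x, y, z).
    The constants are the standard real SH basis normalizations.
    """
    x, y, z = d
    basis = np.array([0.28209479, 0.48860251 * y, 0.48860251 * z, 0.48860251 * x])
    return float(coeffs @ basis)

# Probing the world-frame illumination at the reflected direction makes
# highlights track the static lights as the object moves, instead of
# being glued to the object as in vanilla 3DGS view-dependent color.
n = np.array([0.0, 0.0, 1.0])          # surface normal (world frame)
v = np.array([0.0, 0.0, 1.0])          # viewing direction (head-on)
r = reflect(v, n)                       # reflected direction (== v here)
env = np.array([1.0, 0.0, 0.5, 0.0])   # toy SH-encoded environment light
specular = eval_sh_deg1(env, r)
```

Under this view, rotating the object changes its normals in the world frame, which changes r and hence the probed lighting, which is exactly the highlight motion a body-attached SH color model cannot express.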
To overcome these issues, the paper introduces a joint pose-and-shape optimization pipeline built on 2D Gaussian splatting, together with a novel motion-aware appearance model. The pipeline proceeds in three stages. First, an initial canonical set of 2D Gaussians and a soft segmentation mask are obtained from the very first multi-view frame using the MAtCha method (which provides depth and normal priors from sparse views) and the SAM-2 segmentation model. Second, an alternating optimization loop refines object pose and Gaussian parameters across all frames. A key technical contribution is the "soft-masked rigid transformation": each Gaussian i is assigned a soft mask value m_i ∈ [0, 1] that weights how strongly the estimated rigid object transformation is applied to it.
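The summary names the soft-masked rigid transformation but does not spell out its form; a plausible minimal sketch, assuming the mask linearly interpolates each Gaussian center between the rigid object transform and the identity (function name and blend form are my assumptions), is:

```python
import numpy as np

def soft_masked_transform(points: np.ndarray, mask: np.ndarray,
                          R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Blend a rigid transform with the identity, per Gaussian.

    points: (N, 3) Gaussian centers; mask: (N,) soft values in [0, 1];
    R: (3, 3) rotation; t: (3,) translation. A Gaussian with m_i = 1
    moves rigidly with the object, m_i = 0 stays put (background), and
    intermediate values interpolate, so soft segmentation errors degrade
    gracefully instead of hard-assigning boundary Gaussians.
    """
    moved = points @ R.T + t          # rigid transform of every center
    m = mask[:, None]                  # broadcast mask over xyz
    return m * moved + (1.0 - m) * points
```

One appeal of this blend is that it keeps the pose objective differentiable in both the mask and the 6DoF parameters, which is what an alternating pose/shape loop needs.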