Abstract

Indoor scenes are often visually homogeneous or textureless, yet they have inherent structural forms that provide sufficient structural priors for 3D scene reconstruction. Motivated by this fact, we propose a structure-aware online signed distance field (SDF) reconstruction framework for indoor scenes, specifically under the Atlanta world (AW) assumption. We dub this incremental SDF reconstruction for AW AiSDF. Within the online framework, we infer the underlying Atlanta structure of a given scene and then estimate the planar surfel regions that support it. This Atlanta-aware surfel representation provides an explicit planar map of the scene. In addition, based on these Atlanta planar surfel regions, we adaptively sample points and constrain the structural regularity in the SDF reconstruction, which improves reconstruction quality by maintaining the high-level structure while enhancing the details of the scene. We evaluate AiSDF on the ScanNet and ReplicaCAD datasets, where we demonstrate that the framework can reconstruct fine details of objects implicitly, as well as structures explicitly, in room-scale scenes.

Method



Given a stream of posed depth images, AiSDF first selects keyframes and adds them to the keyframe set for continual learning. We update the global Atlanta frame (AF) by extracting the dominant directions from each new keyframe and then generate surfels that represent the planar regions supported by the updated global AF. From the set of keyframes with Atlanta-aware surfels, we sample 3D points while taking the structure of the scene into account. Finally, each sampled point x is fed to an MLP that outputs a signed distance value s, and we optimize the network in a self-supervised manner by measuring the loss between s and a bound b. Note that we intentionally present the intermediate steps of continual learning to show how a new Atlanta direction is extracted and how surfels supported by the updated global AF are generated. In Atlanta-aware sampling (blue box), we use the ground-truth mesh to visualize the sampling effectively. The final mesh is the one reconstructed by AiSDF using all keyframes.
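The last step of the pipeline, comparing a predicted signed distance s against a bound b, can be illustrated with a minimal sketch. The details here are assumptions made for illustration and are not taken from the paper: the bound is the distance along the camera ray from the sample to the observed surface, the loss is an L1 term near the surface plus a one-sided penalty in free space, and the names `sample_along_ray`, `bound_loss`, `margin`, and `truncation` are hypothetical. A toy analytic SDF of a flat wall stands in for the MLP.

```python
import math


def sample_along_ray(origin, direction, surface_depth, n_samples=5, margin=0.05):
    """Sample points on a camera ray, up to slightly behind the observed surface.

    Returns (point, bound) pairs. The bound is an assumed ray bound:
    the remaining distance along the ray to the measured surface depth.
    """
    norm = math.sqrt(sum(c * c for c in direction))
    unit = [c / norm for c in direction]
    max_depth = surface_depth + margin
    samples = []
    for i in range(n_samples):
        d = max_depth * (i + 1) / n_samples
        point = [o + d * u for o, u in zip(origin, unit)]
        bound = surface_depth - d  # positive in front of the surface, negative behind
        samples.append((point, bound))
    return samples


def bound_loss(predicted_sdf, bound, truncation=0.05):
    """Assumed loss between the prediction s and the bound b.

    Near (or just behind) the surface the bound is treated as the target.
    In free space the true distance cannot exceed the ray bound, so only
    predictions larger than the bound are penalized.
    """
    if bound <= truncation:
        return abs(predicted_sdf - bound)
    return max(0.0, predicted_sdf - bound)


def toy_wall_sdf(point, wall_z=2.0):
    """Stand-in for the MLP: exact SDF of a wall at z = wall_z seen from z < wall_z."""
    return wall_z - point[2]


if __name__ == "__main__":
    # A ray looking straight at the wall, which the depth image observes at 2 m.
    samples = sample_along_ray([0.0, 0.0, 0.0], [0.0, 0.0, 1.0], surface_depth=2.0)
    total = sum(bound_loss(toy_wall_sdf(p), b) for p, b in samples)
    print(f"total loss for the exact SDF: {total:.6f}")
```

Because the toy SDF is exact for a ray that hits the wall head-on, every sample already satisfies its bound. In a real system the MLP prediction would replace `toy_wall_sdf`, and the loss would be minimized by gradient descent.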

Results

Comparison: scannet_0031, apt_3_nav
Qualitative result on Atlanta sequence:

AiSDF can also reconstruct more general indoor scenes under the Atlanta world assumption. The right image presents the explicit planar map composed of surfels, where green denotes the surfels supported by the vertical Atlanta direction and the other colors denote the surfels supported by the horizontal Atlanta directions.
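To make the color coding concrete, the sketch below labels a surfel normal with the Atlanta direction it best aligns with. It is a minimal illustration under stated assumptions, not the paper's procedure: the Atlanta frame is given as one vertical direction plus any number of horizontal directions, alignment is measured by the absolute cosine so that opposite-facing normals share a label, and the function name `label_normal` and the 10-degree threshold are hypothetical.

```python
import math


def _unit(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]


def label_normal(normal, vertical, horizontals, max_angle_deg=10.0):
    """Return the label of the best-aligned Atlanta direction, or None.

    A normal is accepted only if it lies within max_angle_deg of a
    direction (or of its opposite); otherwise the surfel is treated
    as non-structural and gets no label.
    """
    n = _unit(normal)
    candidates = [("vertical", vertical)]
    candidates += [(f"horizontal_{i}", h) for i, h in enumerate(horizontals)]
    best_label = None
    best_cos = math.cos(math.radians(max_angle_deg))
    for label, direction in candidates:
        cos = abs(sum(a * b for a, b in zip(n, _unit(direction))))
        if cos >= best_cos:
            best_label, best_cos = label, cos
    return best_label


if __name__ == "__main__":
    vertical = [0.0, 0.0, 1.0]
    # Horizontal Atlanta directions need not be mutually orthogonal.
    horizontals = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
    for normal in ([0.0, 0.0, -1.0], [0.99, 0.05, 0.0], [1.0, 1.0, 0.02], [1.0, 0.5, 0.5]):
        print(normal, "->", label_normal(normal, vertical, horizontals))
```

In this example the third horizontal direction sits at 45 degrees to the first two, which a Manhattan frame could not represent. Normals that match no direction return `None` and would be left out of the planar map.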

Supplementary Video

Citation

Acknowledgements


The website template was borrowed from Dor Verbin.