DogRecon: Canine Prior-Guided Animatable 3D Gaussian Dog Reconstruction From a Single Image

UNIST, DGIST
IJCV 2025

DogRecon creates an animatable 3D dog from a single image without relying on extra training data.

Abstract

We tackle animatable 3D dog reconstruction from a single image, noting the overlooked potential of animal reconstruction. In particular, we focus on dogs and the intrinsic characteristics that complicate their 3D observation. Their quadruped anatomy leads to frequent joint occlusions, compared to humans, when they are captured in a 2D image. This makes 3D reconstruction from 2D observations difficult, and dramatically harder still when constrained to a single image. To this end, our framework consists of two key components: canine-centric novel view synthesis, which uses a canine prior to generate multi-view images of the dog, and a reliable sampling weight strategy with Gaussian Splatting for animatable 3D dog reconstruction. Extensive experiments on GART, DFA, and internet-sourced datasets confirm our framework's state-of-the-art performance in image-to-3D generation. Additionally, we demonstrate novel pose animation and text-to-3D dog reconstruction as applications.

Video

Method Overview

Overview of DogRecon. Given a single dog image, we first predict the canine prior with BITE. Based on this prior, we infer the D-SMAL model and the corresponding silhouette masks in the desired views, which guide the canine-centric NVS to generate canine-centric multi-view images. Finally, we create an animatable 3D Gaussian dog using the Reliable Sampling Weight.
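To give a concrete feel for the reliability-weighting idea, here is a minimal, hypothetical sketch. It is not the paper's formulation: we simply assume generated views angularly closer to the input view are more trustworthy, weight them by a cosine falloff, and use those weights to combine per-view reconstruction losses. All function names and the weighting scheme are illustrative assumptions.

```python
import math

def view_reliability_weights(view_azimuths_deg, input_azimuth_deg=0.0):
    """Hypothetical per-view reliability weights (not the paper's method).

    Assumption: generated views angularly closer to the input view are
    more reliable. Weights follow a cosine falloff from the input view
    and are normalized to sum to 1.
    """
    raw = []
    for az in view_azimuths_deg:
        delta = math.radians((az - input_azimuth_deg) % 360.0)
        # maps to [0, 1]: 1 at the input view, 0 at the opposite view
        raw.append(0.5 * (1.0 + math.cos(delta)))
    total = sum(raw) or 1.0
    return [w / total for w in raw]

def weighted_reconstruction_loss(per_view_losses, weights):
    """Combine per-view photometric losses using reliability weights."""
    return sum(w * l for w, l in zip(weights, per_view_losses))
```

For example, with views at azimuths 0°, 90°, 180°, and 270° relative to the input view, the back view (180°) receives zero weight under this toy scheme, so an unreliable hallucinated view cannot dominate the reconstruction loss.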

Application: Animation With Gaussian Scene

Application: Text-to-Animatable 3D Generation

Additional Results