WildAni4D: Towards 4D Animal Mesh Reconstruction

Abstract

We present WildAni4D, a framework for 4D animal mesh reconstruction from monocular videos. Reconstructing articulated animals in the wild is challenging because of rapid motion, shape ambiguity, frequent self-occlusion, and the lack of large-scale video datasets with 3D supervision. To address these challenges, WildAni4D combines a scalable synthetic video generation pipeline with a temporally consistent reconstruction model that predicts animal shape, pose, and trajectory over time. The approach is video-native rather than frame-wise: it reasons over the whole sequence instead of reconstructing each frame independently, which yields more stable motion recovery and better temporal coherence. We show that the framework is a strong starting point for 4D animal understanding and supports downstream applications such as pseudo-annotation, animatable reconstruction, and motion-driven generation.

Video

Embed the main project video here. You can use YouTube, Google Drive preview, or a local mp4.

Replace YOUR_VIDEO_ID with your uploaded project video ID.

Method Overview

Overview Figure Placeholder
Recommended file: assets/overview.png

Overview. WildAni4D first builds training data at scale through a synthetic video generation pipeline, then trains a temporally consistent model that, given a monocular animal video, recovers shape, pose, and trajectory across frames. The design emphasizes sequence-level coherence, so that the 4D reconstruction is stable and usable in downstream applications.

Qualitative Results

A simple gallery for videos, reconstructions, and applications.

Result Video 1
assets/result_1.mp4

Example reconstruction result on a challenging animal sequence.

Result Video 2
assets/result_2.mp4

Temporal consistency across frames and viewpoints.

Downstream Applications

Application 1
assets/app_1.mp4

Pseudo-annotation. Example of using the reconstructions to generate temporally consistent supervision.

Application 2
assets/app_2.mp4

Animatable reconstruction. Example of using the reconstructions for stable animal animation.

Application 3
assets/app_3.mp4

Motion-driven generation. Example of using the recovered motion for text-to-motion or related tasks.

Additional Results

A place for extra visualizations, failure cases, or comparisons.

Additional Figure / Composite Video
Use assets/additional.png or assets/additional.mp4

You can also replace this with a comparison table screenshot or a failure-case montage.

Citation

@inproceedings{cho2026wildani4d,
  title     = {WildAni4D: Towards 4D Animal Mesh Reconstruction},
  author    = {Cho, Gyeongsu and Hu, Hezhen and Soon, Donghyeon and Kang, Changwoo and Joo, Kyungdon},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Findings},
  year      = {2026}
}