CVPR 2026 Findings · Project Page Draft

WildAni4D: Towards 4D Animal Mesh Reconstruction

A video-native framework for 4D animal mesh reconstruction from monocular videos, combining scalable synthetic video generation with temporally consistent animal motion recovery.

Gyeongsu Cho (UNIST), Hezhen Hu (UT Austin), Donghyeon Soon (DGIST), Changwoo Kang (UNIST), Kyungdon Joo (UNIST)
UNIST · The University of Texas at Austin · DGIST
Synthetic video pipeline · Temporal consistency · World-grounded motion · Animal 4D reconstruction
Teaser Image / Video
Put your main teaser here.
Recommended file: assets/teaser.mp4 or assets/teaser.png

Abstract

A concise summary of the problem, method, and contributions.

We present WildAni4D, a framework for 4D animal mesh reconstruction from monocular videos. Reconstructing articulated animals in the wild is challenging due to rapid motion, shape ambiguity, frequent self-occlusion, and the lack of large-scale video datasets with 3D supervision. To address these challenges, WildAni4D combines a scalable synthetic video generation pipeline with a temporally consistent reconstruction model that predicts animal shape, pose, and trajectory over time. The approach is video-native rather than frame-wise, which enables more stable motion recovery and improved temporal coherence. We show that the framework serves as a strong starting point for 4D animal understanding and supports downstream applications such as pseudo-annotation, animatable reconstruction, and motion-driven generation.

Video

Embed the main project video here. You can use YouTube, Google Drive preview, or a local mp4.

Replace YOUR_VIDEO_ID with your uploaded project video ID.

Method Overview

Show the pipeline figure and briefly explain the key idea.

Overview Figure Placeholder
Recommended file: assets/overview.png

Overview. WildAni4D first builds scalable training data through a synthetic video pipeline, then learns a temporally consistent model that, given a monocular animal video, recovers shape, pose, and motion across frames. The design emphasizes sequence-level coherence for stable 4D reconstruction and downstream use.

Qualitative Results

A simple gallery for videos, reconstructions, and applications.

Result Video 1
assets/result_1.mp4

Example reconstruction result on a challenging animal sequence.

Result Video 2
assets/result_2.mp4

Temporal consistency across frames and viewpoints.

Downstream Applications

Use this block for pseudo-annotation, animatable reconstruction, text-to-motion, or any extra demo.

Application 1
assets/app_1.mp4

Pseudo-annotation. Example use for generating temporally consistent supervision.

Application 2
assets/app_2.mp4

Animatable reconstruction. Example use for stable animal animation.

Application 3
assets/app_3.mp4

Motion-driven generation. Example use for text-to-motion or related tasks.

Additional Results

A place for extra visualizations, failure cases, or comparisons.

Additional Figure / Composite Video
Use assets/additional.png or assets/additional.mp4

You can also replace this with a comparison table screenshot or a failure-case montage.

Citation

Update the year, venue, and page links after the paper is finalized.

@inproceedings{cho2026wildani4d,
  title     = {WildAni4D: Towards 4D Animal Mesh Reconstruction},
  author    = {Cho, Gyeongsu and Hu, Hezhen and Soon, Donghyeon and Kang, Changwoo and Joo, Kyungdon},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Findings},
  year      = {2026}
}