Robust Dual Gaussian Splatting for Immersive Human-centric Volumetric Videos


SIGGRAPH Asia 2024 (TOG)

Yuheng Jiang, Zhehao Shen, Yu Hong, Chengcheng Guo, Yize Wu, Yingliang Zhang, Jingyi Yu, Lan Xu

Paper | Animation Sequences (PLY format) | Dataset
We present DualGS, a novel Gaussian-based representation for volumetric videos that achieves robust human performance tracking and high-fidelity rendering.

We implement a DualGS player that enables real-time rendering on low-end mobile devices and VR headsets, offering a more user-friendly and interactive experience.

Overview Video



DualGS serves as a ticket to a virtual world, offering immersive, high-fidelity viewing of multiple musicians performing.

Pipeline


We propose a novel Dual Gaussian representation to capture challenging human performances from multi-view inputs. We first optimize joint Gaussians from a random point cloud and then use them to initialize skin Gaussians, whose motion is expressed through interpolation of the joint Gaussians. The subsequent optimization follows a coarse-to-fine strategy: a coarse alignment predicts the overall motion, and fine-grained optimization refines it for robust tracking and high-fidelity rendering.

Compression


Illustration of our hybrid compression strategy. We compress joint Gaussian motions using residual vector quantization, encode opacity and scaling via codec compression, and represent spherical harmonics with a persistent codebook. Our approach achieves a compression ratio of up to 120:1.

Comparison


Side-by-side comparisons of our method (Ours) against HumanRF, NeuS2, Spacetime Gaussians, and HiFi4G.

Result Gallery


Acknowledgements


The authors would like to thank Zitong Hu, Shengkun Zhu, and Zhengxiao Yu from ShanghaiTech University for insightful suggestions. We also thank the reviewers for their feedback. This work was supported by the National Key R&D Program of China (2022YFF0902301) and the Shanghai Local College Capacity Building Program (22010502800). We also acknowledge support from the Shanghai Frontiers Science Center of Human-centered Artificial Intelligence (ShangHAI).