Scaling Spatial Intelligence with Multimodal Foundation Models Paper • 2511.13719 • Published 14 days ago • 42
Phased DMD: Few-step Distribution Matching Distillation via Score Matching within Subintervals Paper • 2510.27684 • Published Oct 31 • 22
SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation Paper • 2411.19921 • Published Nov 29, 2024
TokensGen: Harnessing Condensed Tokens for Long Video Generation Paper • 2507.15728 • Published Jul 21 • 7
DPoser-X: Diffusion Model as Robust 3D Whole-body Human Pose Prior Paper • 2508.00599 • Published Aug 1 • 7
Has GPT-5 Achieved Spatial Intelligence? An Empirical Study Paper • 2508.13142 • Published Aug 18 • 34
OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation Paper • 2301.07525 • Published Jan 18, 2023
IT3D: Improved Text-to-3D Generation with Explicit View Synthesis Paper • 2308.11473 • Published Aug 22, 2023
DNA-Rendering: A Diverse Neural Actor Repository for High-Fidelity Human-centric Rendering Paper • 2307.10173 • Published Jul 19, 2023 • 6
HuMMan: Multi-Modal 4D Human Dataset for Versatile Sensing and Modeling Paper • 2204.13686 • Published Apr 28, 2022
Large Motion Model for Unified Multi-Modal Motion Generation Paper • 2404.01284 • Published Apr 1, 2024
MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model Paper • 2208.15001 • Published Aug 31, 2022
HumanLiff: Layer-wise 3D Human Generation with Diffusion Model Paper • 2308.09712 • Published Aug 18, 2023 • 1
Disco4D: Disentangled 4D Human Generation and Animation from a Single Image Paper • 2409.17280 • Published Sep 25, 2024 • 11
ReliTalk: Relightable Talking Portrait Generation from a Single Video Paper • 2309.02434 • Published Sep 5, 2023 • 1
AiOS: All-in-One-Stage Expressive Human Pose and Shape Estimation Paper • 2403.17934 • Published Mar 26, 2024
MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers Paper • 2406.10163 • Published Jun 14, 2024 • 33