Dreaming the Sound of Contact: Leveraging Video and Audio Generation for Zero-Shot Force-Aware Manipulation
In Submission
Workshop on Manipulation Robustness: Towards Human-Level Robustness under Real-World Challenges, ICRA 2026
We present a pipeline that jointly leverages generated video and audio to recover both motion trajectories and contact force profiles from a single task description, enabling zero-shot force-aware manipulation in settings where a kinematic-only baseline fails.