UC Berkeley
ICON Lab

DDAT: Diffusion Policies Enforcing Dynamically Admissible Robot Trajectories

Unitree GO2 zero-shot hardware deployment of open-loop trajectories generated by a vanilla diffusion policy (left) and our DDAT model (right).
The vanilla diffusion policy fails to walk through the cones in open loop. By accounting for the quadruped's dynamics, our open-loop diffusion policy succeeds in following the corridor.

Abstract

The stochastic nature of diffusion models prevents them from generating trajectories exactly satisfying the equations of motion of robots. To alleviate this issue, we introduce DDAT: Diffusion policies for Dynamically Admissible Trajectories. A trajectory is dynamically admissible if each state belongs to the reachable set of its predecessor by the robot's equations of motion. To generate such trajectories, our diffusion policies project their predictions onto a dynamically admissible manifold during both training and inference to align the objective of the denoiser neural network with the dynamical admissibility constraint. Due to the auto-regressive nature of such projections as well as the black-box nature of robot dynamics, exact projections are challenging. We instead sample a polytopic under-approximation of the reachable set onto which we project the predicted successor, before iterating this process with the projected successor. By producing accurate trajectories, this projection eliminates the need for diffusion models to continually replan, enabling one-shot long-horizon trajectory planning. We demonstrate our framework through extensive simulations on a quadcopter and various MuJoCo environments, along with real-world experiments on a Unitree GO1 and GO2.
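
The admissibility condition and the projection step described above can be written compactly. This is only a sketch: the symbols $\mathcal{R}$ (reachable set), $\mathcal{A}$ (admissible actions), $\mathcal{C}$ (sampled polytope), $a^{(i)}$ (sampled actions), $N$ (number of samples), and the choice of the Euclidean norm are notation assumed here for illustration and do not appear in the abstract.

$$\mathcal{R}(s_t) := \{ f(s_t, a) : a \in \mathcal{A} \}, \qquad \tau = (s_0, \dots, s_T) \ \text{is dynamically admissible} \iff s_{t+1} \in \mathcal{R}(s_t) \ \text{for all } t.$$

$$\mathcal{C}(\hat{s}_t) := \mathrm{conv}\big\{ f(\hat{s}_t, a^{(i)}) \big\}_{i=1}^{N}, \qquad \hat{s}_{t+1} = \arg\min_{s \in \mathcal{C}(\hat{s}_t)} \| s - \tilde{s}_{t+1} \|,$$

where $\tilde{s}_{t+1}$ is the successor predicted by the diffusion model and $\hat{s}_{t+1}$ is its projection, which then serves as the base state for the next step.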

Schematic illustration of DDAT. Diffusion model $D_\theta$ is trained to predict a trajectory $\tilde{\tau}$ given a trajectory $\tau$ from the training dataset corrupted by noise $\varepsilon$. If the noise level $\sigma$ of signal $\varepsilon$ is sufficiently small, $\mathcal{P}_\sigma$ projects $\tilde{\tau}$ to the dynamically admissible trajectory $\tau_p$. The loss $\|\tau_p - \tau\|$ is used to update $D_\theta$.
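
The noise-gated loss in the schematic can be sketched in a few lines of NumPy. This is an illustration under assumptions, not the training code: the function name `projected_loss`, the threshold `sigma_proj` and its default value, and the `project` callable are all hypothetical placeholders.

```python
import numpy as np


def projected_loss(tau, tau_pred, sigma, project, sigma_proj=0.1):
    """Loss ||tau_p - tau|| with the projection applied only at low noise.

    tau        : clean trajectory from the dataset, shape (T, d)
    tau_pred   : denoiser prediction, same shape
    sigma      : noise level used to corrupt tau
    project    : callable mapping a trajectory to an admissible one (assumed)
    sigma_proj : hypothetical threshold below which the projection is applied
    """
    # Project only when the noise level is small enough; otherwise keep
    # the raw prediction.
    tau_p = project(tau_pred) if sigma <= sigma_proj else tau_pred
    # Norm of the residual over the whole trajectory.
    return float(np.linalg.norm(tau_p - tau))
```

In actual training the projection would have to be written in an automatic-differentiation framework so that gradients can reach $D_\theta$; NumPy is used here only to show the control flow.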

Auto-regressive trajectory projection by sampling convex under-approximations of reachable sets of black-box dynamics $s_{t+1} = f(s_t, a_t)$.
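
A minimal sketch of this auto-regressive projection, assuming a state-only trajectory and a user-supplied set of sampled actions. The function names, the projected-gradient solver, and the iteration count are choices made for this illustration; the solver is a generic stand-in and is not claimed to be the one used in DDAT.

```python
import numpy as np


def project_simplex(w):
    """Euclidean projection of w onto the probability simplex (sort-based)."""
    u = np.sort(w)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(w) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    return np.maximum(w - css[rho] / (rho + 1), 0.0)


def project_onto_hull(x, vertices, iters=200):
    """Closest point to x in conv(vertices), found by projected gradient
    descent on the convex-combination weights (stand-in solver)."""
    V = np.asarray(vertices, dtype=float)
    center = V.mean(axis=0)
    Vc = V - center                      # centering improves conditioning
    xc = np.asarray(x, dtype=float) - center
    L = np.linalg.norm(Vc, 2) ** 2       # Lipschitz constant of the gradient
    if L == 0.0:                         # all sampled successors coincide
        return center
    lam = np.full(len(V), 1.0 / len(V))  # start from uniform weights
    for _ in range(iters):
        grad = Vc @ (lam @ Vc - xc)
        lam = project_simplex(lam - grad / L)
    return center + lam @ Vc


def project_trajectory(s0, predicted, f, actions):
    """Project each predicted successor onto the hull of sampled successors
    of the previously projected state, then iterate from that projection."""
    s = np.asarray(s0, dtype=float)
    projected = []
    for s_pred in predicted:
        successors = [f(s, a) for a in actions]  # black-box dynamics queries
        s = project_onto_hull(s_pred, successors)
        projected.append(s)
    return np.array(projected)
```

Because each hull is built from the previously projected state rather than the predicted one, the steps cannot be run in parallel, which is the auto-regressive cost mentioned in the abstract.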

Unitree GO1 zero-shot hardware deployment of an open-loop trajectory generated by our DDAT model.

Projections make trajectories admissible and better

Unitree GO2 open-loop trajectories tasked with going straight.
Both models without projections deviate from walking straight,
whereas our DDAT model follows the command perfectly.

Projections should only occur at low noise levels

Each diffusion model generates state-action trajectories under one of three projection schedules: projecting from the beginning of inference (i.e., at all noise levels), projecting from mid-inference onward, or projecting only once after inference.
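
The three projection schedules compared here can be sketched as one sampling loop with a switch. The function name, the schedule labels `"all"`, `"mid"`, and `"final"`, the threshold `sigma_proj`, and the `denoise_step` and `project` callables are hypothetical names introduced for this illustration only.

```python
def sample_with_schedule(denoise_step, project, tau, sigmas,
                         schedule="mid", sigma_proj=0.1):
    """Run a denoising loop, applying `project` according to `schedule`.

    sigmas   : noise levels in decreasing order
    schedule : "all"   -> project after every denoising step
               "mid"   -> project only once sigma <= sigma_proj
               "final" -> project once, after the last denoising step
    """
    if schedule not in ("all", "mid", "final"):
        raise ValueError(f"unknown schedule: {schedule!r}")
    for sigma in sigmas:
        tau = denoise_step(tau, sigma)
        if schedule == "all" or (schedule == "mid" and sigma <= sigma_proj):
            tau = project(tau)
    if schedule == "final":
        tau = project(tau)
    return tau
```

With four noise levels `[1.0, 0.5, 0.1, 0.01]` and `sigma_proj=0.2`, the `"all"` schedule calls the projection four times, `"mid"` twice, and `"final"` once.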

Generating quadcopter trajectories

Objective: slalom between obstacles to reach the target.

MuJoCo Hopper open-loop trajectories.
Without projections, the Hopper falls earlier than with our projections.

BibTeX

@inproceedings{bouvier2025ddat,
        title = {DDAT: Diffusion Policies Enforcing Dynamically Admissible Robot Trajectories},
        author = {Bouvier, Jean-Baptiste and Ryu, Kanghyun and Nagpal, Kartik and Liao, Qiayuan and Sreenath, Koushil and Mehr, Negar},
        booktitle = {under review},
        year = {2025}
      }