Hierarchical Diffusion Policy for Kinematics-Aware Multi-Task Robotic Manipulation

Hierarchical Diffusion Policy (HDP) factorises the policy space into 1) a high-level task-planning agent and 2) a low-level goal-conditioned diffusion policy, achieving both task-level generalisation and flexible low-level control.


This paper introduces Hierarchical Diffusion Policy (HDP), a hierarchical agent for multi-task robotic manipulation.

HDP factorises a manipulation policy into a hierarchical structure: a high-level task-planning agent that predicts a distant next-best end-effector pose (NBP), and a low-level goal-conditioned diffusion policy that generates optimal motion trajectories. The factorised policy representation allows HDP to tackle long-horizon task planning while also generating fine-grained low-level actions. To generate context-aware motion trajectories while satisfying robot kinematics constraints, we present a novel kinematics-aware goal-conditioned control agent, Robot Kinematics Diffuser (RK-Diffuser). Specifically, RK-Diffuser learns to generate both end-effector pose and joint position trajectories, and distills the accurate but kinematics-unaware end-effector pose diffuser into the kinematics-aware but less accurate joint position diffuser via differentiable kinematics.
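As a rough illustration of the role differentiable kinematics plays here, the sketch below refines a joint-position trajectory so that its forward kinematics matches a reference end-effector trajectory. It is a toy under stated assumptions, not the RK-Diffuser implementation: the arm is a hypothetical two-link planar arm, poses are reduced to 2D positions, and plain gradient descent on a fixed trajectory stands in for training a diffusion model. All function and variable names are illustrative.

```python
import numpy as np

# Assumption: a hypothetical two-link planar arm with unit link lengths.
LINK_LENGTHS = np.array([1.0, 1.0])


def forward_kinematics(q):
    """Map joint angles of shape (T, 2) to end-effector positions of shape (T, 2)."""
    l1, l2 = LINK_LENGTHS
    a1 = q[:, 0]
    a12 = q[:, 0] + q[:, 1]
    x = l1 * np.cos(a1) + l2 * np.cos(a12)
    y = l1 * np.sin(a1) + l2 * np.sin(a12)
    return np.stack([x, y], axis=1)


def jacobian(q):
    """Analytic Jacobian d(position)/d(joints) per waypoint, shape (T, 2, 2)."""
    l1, l2 = LINK_LENGTHS
    a1 = q[:, 0]
    a12 = q[:, 0] + q[:, 1]
    jac = np.empty((q.shape[0], 2, 2))
    jac[:, 0, 0] = -l1 * np.sin(a1) - l2 * np.sin(a12)
    jac[:, 0, 1] = -l2 * np.sin(a12)
    jac[:, 1, 0] = l1 * np.cos(a1) + l2 * np.cos(a12)
    jac[:, 1, 1] = l2 * np.cos(a12)
    return jac


def refine_joint_trajectory(q_init, ee_target, steps=500, lr=0.1):
    """Gradient descent on 0.5 * ||FK(q) - ee_target||^2, taken through the kinematics."""
    q = q_init.copy()
    for _ in range(steps):
        err = forward_kinematics(q) - ee_target            # (T, 2)
        grad = np.einsum("tij,ti->tj", jacobian(q), err)   # J^T err per waypoint
        q -= lr * grad
    return q


def max_position_error(q, ee_target):
    """Largest end-effector position error over the trajectory."""
    return np.linalg.norm(forward_kinematics(q) - ee_target, axis=1).max()


# A reference end-effector trajectory and an inaccurate joint trajectory.
q_true = np.stack([np.linspace(0.2, 0.8, 16), np.linspace(1.2, 1.8, 16)], axis=1)
ee_target = forward_kinematics(q_true)
q_init = q_true + 0.1  # stand-in for a less accurate joint-space prediction

initial_error = max_position_error(q_init, ee_target)
q_refined = refine_joint_trajectory(q_init, ee_target)
final_error = max_position_error(q_refined, ee_target)
```

Because the refined trajectory lives in joint space, it remains executable on the arm by construction, while the end-effector reference supplies the accuracy signal.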

Empirically, we show that HDP achieves a significantly higher success rate than state-of-the-art methods in both simulation and the real world.

Hierarchical Diffusion Policy


Overview of Hierarchical Diffusion Policy (HDP). HDP is a multi-task hierarchical agent for kinematics-aware robotic manipulation. HDP consists of two levels: a high-level language-guided agent and a low-level goal-conditioned diffusion policy. From left to right, the high-level agent takes in 3D environment observations and language instructions, then predicts the next-best end-effector pose. This pose guides the low-level RK-Diffuser. The RK-Diffuser subsequently generates a continuous joint-position trajectory by conditional sampling and trajectory inpainting given the next-best pose and environment observations. To generate kinematics-aware trajectories, RK-Diffuser distills the accurate but less flexible end-effector pose trajectories into joint position space via differentiable robot kinematics.
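Trajectory inpainting can be sketched as follows: during iterative denoising, waypoints whose values are already known are overwritten at every step, so the sampled trajectory stays consistent with them. This is a minimal sketch under stated assumptions, not the RK-Diffuser sampler. `toy_denoise_step` is a hypothetical placeholder for a learned, observation-conditioned denoising network, and fixing both the first and the last waypoint is an assumption made for illustration.

```python
import numpy as np


def toy_denoise_step(traj, noise_scale, rng):
    """Hypothetical stand-in for one reverse step of a learned diffusion model.

    Replaces each interior waypoint with the mean of its two neighbours,
    then adds Gaussian noise of the given scale to every waypoint.
    """
    out = traj.copy()
    out[1:-1] = 0.5 * (traj[:-2] + traj[2:])
    return out + noise_scale * rng.standard_normal(traj.shape)


def inpaint(traj, known):
    """Overwrite waypoints whose values are already known.

    `known` maps a waypoint index to its fixed value.
    """
    out = traj.copy()
    for index, value in known.items():
        out[index] = value
    return out


def sample_trajectory(known, horizon, dim, num_steps=200, seed=0):
    """Denoise from pure noise while re-imposing the known waypoints each step."""
    rng = np.random.default_rng(seed)
    traj = rng.standard_normal((horizon, dim))
    for step in reversed(range(num_steps)):
        traj = inpaint(traj, known)  # the denoiser always sees the known waypoints
        traj = toy_denoise_step(traj, 0.05 * step / num_steps, rng)
    return inpaint(traj, known)  # final overwrite so the constraints hold exactly


# Assumption for illustration: a 7-joint arm with known first and last waypoints.
start = np.zeros(7)
goal = np.linspace(0.1, 0.7, 7)
trajectory = sample_trajectory({0: start, -1: goal}, horizon=16, dim=7)
```

The final overwrite after the loop matters: the last denoising step is free to move every waypoint, so the constraints are re-imposed once more before the trajectory is returned.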


