S<sup>2</sup>-Diffusion

S²-Diffusion: Generalizing from Instance-level to Category-level Skills in Robot Manipulation

Quantao Yang^∗,1, Michael C. Welle^∗,1,2, Danica Kragic¹, and Olov Andersson¹

Recent advances in skill learning has propelled robot manipulation to new heights by enabling it to learn complex manipulation tasks from a practical number of demonstrations. However, these skills are often limited to the particular action, object, and environment \textit{instances} that are shown in the training data, and have trouble transferring to other instances of the same category. In this work we present an open-vocabulary Spatial-Semantic Diffusion policy (S$^2$-Diffusion) which enables generalization from instance-level training data to category-level, enabling skills to be transferable between instances of the same category. We show that functional aspects of skills can be captured via a promptable semantic module combined with a spatial representation. We further propose leveraging depth estimation networks to allow the use of only a single RGB camera. Our approach is evaluated and compared on a diverse number of robot manipulation tasks, both in simulation and in the real world. Our results show that S$^2$-Diffusion is invariant to changes in category-irrelevant factors as well as enables satisfying performance on other instances within the same category, even if it was not trained on that specific instance.

S²-Diffusion

Download Preprint

^∗ These authors contributed equally.
¹ KTH Royal Institute of Technology, Stockholm, Sweden
² INCAR Robotics AB, Stockholm, Sweden

S²-Diffusion: Generalizing from Instance-level to Category-level Skills in Robot Manipulation

Quantao Yang^∗,1, Michael C. Welle^∗,1,2, Danica Kragic¹, and Olov Andersson¹

Expert Demonstrations

Experiments

S2-Diffusion: Generalizing from Instance-level to Category-level Skills in Robot Manipulation

Quantao Yang∗,1, Michael C. Welle∗,1,2, Danica Kragic1, and Olov Andersson1

Expert Demonstrations

Experiments

S²-Diffusion: Generalizing from Instance-level to Category-level Skills in Robot Manipulation

Quantao Yang^∗,1, Michael C. Welle^∗,1,2, Danica Kragic¹, and Olov Andersson¹