Learning to Pour using Deep Deterministic Policy Gradients

Pouring is a fundamental skill for robots in both domestic and industrial environments. Ideally, a robot should be able to pour accurately to specific, pre-defined heights without spilling. However, the complex dynamics of liquids make it difficult to learn a pouring policy that achieves these goals. In this paper, we present an approach to learning a policy for pouring using Deep Deterministic Policy Gradients (DDPG). By training in a state-of-the-art liquid simulator, which allows the liquid dynamics to be learned, we remove the need to collect training experiences on a real robot. Our experiments with a PR2 robot show that the learned policy can be transferred successfully to a real robot and even applied to different liquids.