Simon Stepputtis

Imitation Learning of Robot Policies by Combining Language, Vision, and Demonstration

NeurIPS Workshop on Robot Learning: Control and Interaction in the Real World (NeurIPS-WRL), 2019

In this work, we propose a novel end-to-end imitation learning approach that combines natural language, vision, and motion information to produce an abstract representation of a task, which in turn is used to synthesize specific motion controllers at run time. This multimodal approach enables generalization to a wide variety of environmental conditions and allows an end user to direct a robot policy through verbal communication. We empirically validate our approach with an extensive set of simulations and show that it achieves a high task success rate across these conditions while remaining amenable to probabilistic interpretation.
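
The data flow described in the abstract (language and vision fused into a task representation, which then parameterizes a motion controller) could be sketched roughly as follows. This is only an illustrative toy under stated assumptions, not the authors' model: the bag-of-words `embed_command` encoder, the randomly initialized linear layers, the layer sizes, and the proportional controller form are all hypothetical placeholders, and no training on demonstrations is shown.

```python
import math
import random


def linear(weights, bias, x):
    """Apply an affine map: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def random_layer(rng, n_out, n_in):
    """Untrained placeholder layer; a real system would learn these weights."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def embed_command(command, dim):
    """Hypothetical bag-of-words language encoder (assumption, not from the paper)."""
    vec = [0.0] * dim
    for word in command.lower().split():
        vec[sum(ord(c) for c in word) % dim] += 1.0
    return vec


class MultimodalPolicy:
    """Toy pipeline: (command, image features) -> task embedding -> controller."""

    def __init__(self, lang_dim=4, vision_dim=3, task_dim=5, state_dim=2, seed=0):
        rng = random.Random(seed)
        self.lang_dim = lang_dim
        self.state_dim = state_dim
        self.encoder = random_layer(rng, task_dim, lang_dim + vision_dim)
        # The generator emits a flattened gain matrix plus a target state.
        n_params = state_dim * state_dim + state_dim
        self.generator = random_layer(rng, n_params, task_dim)

    def task_embedding(self, command, image_features):
        """Fuse both modalities into one abstract task vector."""
        fused = embed_command(command, self.lang_dim) + list(image_features)
        return [math.tanh(v) for v in linear(*self.encoder, fused)]

    def synthesize_controller(self, command, image_features):
        """Generate controller parameters from the task embedding."""
        params = linear(*self.generator,
                        self.task_embedding(command, image_features))
        n = self.state_dim
        gains = [params[i * n:(i + 1) * n] for i in range(n)]
        target = params[n * n:]

        def controller(state):
            # Proportional control toward the generated target state.
            error = [t - s for t, s in zip(target, state)]
            return [sum(g * e for g, e in zip(row, error)) for row in gains]

        return controller


policy = MultimodalPolicy()
controller = policy.synthesize_controller("pick up the red cup", [0.2, 0.7, 0.1])
action = controller([0.0, 0.0])
```

The point of the sketch is the separation of stages: the verbal command and the visual features only influence the task embedding, and the controller that acts on the robot state is generated from that embedding at run time rather than being fixed in advance.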