Imitation Learning of Robot Policies by Combining Language, Vision and Demonstration

Year
2019
Type(s)
Workshop Paper
Author(s)
Simon Stepputtis, Joseph Campbell, Mariano Phielipp, Chitta Baral, Heni Ben Amor
Source
NeurIPS 2019 Workshop on Robot Learning: Control and Interaction in the Real World

In this work, we propose a novel end-to-end imitation learning approach that combines natural language, vision, and motion information to produce an abstract representation of a task, which is in turn used to synthesize specific motion controllers at run-time. This multimodal approach enables generalization to a wide variety of environmental conditions and allows an end user to direct a robot policy through verbal communication. We empirically validate our approach with an extensive set of simulations and show that it achieves a high task success rate over a variety of conditions while remaining amenable to probabilistic interpretability.
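
The sketch below illustrates the general idea described in the abstract: a multimodal task encoder fuses a language instruction and a camera observation into a task embedding, which then conditions (here, by generating the weights of) a low-level motion controller. This is a minimal, hypothetical sketch, not the authors' implementation; all class names, dimensions, and the specific fusion and controller-generation scheme are illustrative assumptions.

```python
# Hypothetical sketch: language + vision -> task embedding -> generated controller.
import torch
import torch.nn as nn


class TaskEncoder(nn.Module):
    """Fuses a tokenized instruction and an RGB observation into a task embedding."""

    def __init__(self, vocab_size=1000, embed_dim=64, task_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lang_rnn = nn.GRU(embed_dim, embed_dim, batch_first=True)
        self.vision = nn.Sequential(          # tiny CNN over the observation
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.fuse = nn.Linear(embed_dim + 32, task_dim)

    def forward(self, tokens, image):
        _, h = self.lang_rnn(self.embed(tokens))   # final hidden state: (1, B, embed_dim)
        lang = h.squeeze(0)
        vis = self.vision(image)                   # (B, 32)
        return torch.tanh(self.fuse(torch.cat([lang, vis], dim=-1)))


class ControllerGenerator(nn.Module):
    """Maps the task embedding to the parameters of a linear motion controller
    that maps the current joint state to the next motor command."""

    def __init__(self, task_dim=128, state_dim=7, action_dim=7):
        super().__init__()
        self.state_dim, self.action_dim = state_dim, action_dim
        self.gen = nn.Linear(task_dim, action_dim * (state_dim + 1))

    def forward(self, task_emb, state):
        params = self.gen(task_emb)                # (B, action_dim * (state_dim + 1))
        W = params[:, : self.action_dim * self.state_dim]
        b = params[:, self.action_dim * self.state_dim:]
        W = W.view(-1, self.action_dim, self.state_dim)
        return torch.bmm(W, state.unsqueeze(-1)).squeeze(-1) + b


if __name__ == "__main__":
    enc, gen = TaskEncoder(), ControllerGenerator()
    tokens = torch.randint(0, 1000, (2, 6))        # e.g. a tokenized "pick up the red cup"
    image = torch.rand(2, 3, 64, 64)               # camera observation
    state = torch.rand(2, 7)                       # current joint angles
    action = gen(enc(tokens, image), state)
    print(action.shape)                            # torch.Size([2, 7])
```

In this sketch, the controller itself is deliberately simple (a linear map from joint state to motor command) so that the task embedding alone determines its behavior; a practical system would use a richer controller and train the whole pipeline end-to-end on demonstrations.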