Simon Stepputtis

Publications

  • Explainable Action Advising for Multi-Agent Reinforcement Learning

    Yue Guo, Joseph Campbell, Simon Stepputtis, Ruiyu Li, Dana Hughes, Fei Fang, Katia Sycara. International Conference on Robotics and Automation, 2023.

    Action advising is a knowledge transfer technique for reinforcement learning based on the teacher-student paradigm. An expert teacher provides advice to a student during training in order to improve the student’s sample efficiency and policy performance. Such advice is commonly given in the form of state-action pairs, a form that makes it difficult for the student to reason about the advice and apply it to novel states. We introduce Explainable Action Advising, in which the teacher provides action advice as well as associated explanations indicating why the action was chosen. This allows the student to self-reflect on what it has learned, enabling advice generalization and leading to improved sample efficiency and learning performance, even in environments where the teacher is sub-optimal. We empirically show that our framework is effective in both single-agent and multi-agent scenarios, yielding improved policy returns and convergence rates when compared to state-of-the-art methods.

  • Modularity through Attention: Efficient Training and Transfer of Language-Conditioned Policies for Robot Manipulation

    Yifan Zhou, Shubham Sonawani, Mariano Phielipp, Simon Stepputtis, Heni Ben Amor. Conference on Robot Learning, 2022.

    Language-conditioned policies allow robots to interpret and execute human instructions. Learning such policies requires a substantial investment of time and compute resources. Moreover, the resulting controllers are highly device-specific and cannot easily be transferred to a robot with different morphology, capability, appearance or dynamics. In this paper, we propose a sample-efficient approach for training language-conditioned manipulation policies that allows for rapid transfer across different types of robots. By introducing a novel method, namely Hierarchical Modularity, and adopting supervised attention across multiple sub-modules, we bridge the divide between modular and end-to-end learning and enable the reuse of functional building blocks. In both simulated and real-world robot manipulation experiments, we demonstrate that our method outperforms the current state-of-the-art methods and can transfer policies across different robots in a sample-efficient manner. Finally, we show that the functionality of learned sub-modules is maintained beyond the training process and can be used to introspect the robot decision-making process.

  • Concept Learning for Interpretable Multi-Agent Reinforcement Learning

    Renos Zabounidis, Joseph Campbell, Simon Stepputtis, Dana Hughes, Katia P. Sycara. Conference on Robot Learning, 2022.

    Multi-agent robotic systems are increasingly operating in real-world environments in close proximity to humans, yet are largely controlled by policy models with inscrutable deep neural network representations. We introduce a method for incorporating interpretable concepts from a domain expert into models trained through multi-agent reinforcement learning, by requiring the model to first predict such concepts and then utilize them for decision making. This allows an expert both to reason about the resulting concept policy models in terms of these high-level concepts at run-time and to intervene and correct mispredictions to improve performance. We show that this yields improved interpretability and training stability, with benefits to policy performance and sample efficiency in a simulated and real-world cooperative-competitive multi-agent game.

  • A System for Imitation Learning of Contact-Rich Bimanual Manipulation Policies

    Simon Stepputtis, Maryam Bandari, Stefan Schaal, Heni Ben Amor. International Conference on Intelligent Robots and Systems, 2022.

    In this paper, we discuss a framework for teaching bimanual manipulation tasks by imitation. To this end, we present a system and algorithms for learning compliant and contact-rich robot behavior from human demonstrations. The presented system combines insights from admittance control and machine learning to extract control policies that can (a) recover from and adapt to a variety of disturbances in time and space, while also (b) effectively leveraging physical contact with the environment. We demonstrate the effectiveness of our approach using a real-world insertion task involving multiple simultaneous contacts between a manipulated object and insertion pegs. We also investigate efficient means of collecting training data for such bimanual settings. To this end, we conduct a human-subject study and analyze the effort and mental demand as reported by the users. Our experiments show that, while harder to provide, the additional force/torque information available in teleoperated demonstrations is crucial for phase estimation and task success. Ultimately, force/torque data substantially improves manipulation robustness, resulting in a 90% success rate in a multipoint insertion task.

  • Language-Conditioned Human-Agent Teaming

    Simon Stepputtis. Robotics: Science and Systems Pioneers Workshop, 2022.

    In my work, I am focusing on multimodal techniques for robot learning and motor skill acquisition. Most intuitively, humans communicate desires to a robot using natural language, rooted in an internal mental state that encodes their beliefs and intentions. Modeling this mental state is a crucial part of truly understanding the human partner, as it is not captured by natural language alone. To this end, my research focuses on creating robotic systems that truly understand the human partner by combining Theory of Mind, natural language processing, and vision to complete various manipulation tasks.

  • Language-Conditioned Imitation Learning for Robot Manipulation Tasks

    Simon Stepputtis, Joseph Campbell, Mariano Phielipp, Stefan Lee, Chitta Baral, Heni Ben Amor. Conference on Neural Information Processing Systems, 2020.

    Imitation learning is a popular approach for teaching motor skills to robots. However, most approaches focus on extracting policy parameters from execution traces alone (i.e., motion trajectories and perceptual data). No adequate communication channel exists between the human expert and the robot to describe critical aspects of the task, such as the properties of the target object or the intended shape of the motion. Motivated by insights into the human teaching process, we introduce a method for incorporating unstructured natural language into imitation learning. At training time, the expert can provide demonstrations along with verbal descriptions in order to describe the underlying intent (e.g., “go to the large green bowl”). The training process then interrelates these two modalities to encode the correlations between language, perception, and motion. The resulting language-conditioned visuomotor policies can be conditioned at runtime on new human commands and instructions, which allows for more fine-grained control over the trained policies while also reducing situational ambiguity. We demonstrate in a set of simulation experiments how our approach can learn language-conditioned manipulation policies for a seven-degree-of-freedom robot arm and compare the results to a variety of alternative methods.

  • Imitation Learning of Robot Policies by Combining Language, Vision, and Demonstration

    Simon Stepputtis, Joseph Campbell, Mariano Phielipp, Chitta Baral, Heni Ben Amor. NeurIPS Workshop on Robot Learning: Control and Interaction in the Real World, 2019.

    In this work we propose a novel end-to-end imitation learning approach that combines natural language, vision, and motion information to produce an abstract representation of a task, which in turn is used to synthesize specific motion controllers at run-time. This multimodal approach enables generalization to a wide variety of environmental conditions and allows an end-user to direct a robot policy through verbal communication. We empirically validate our approach with an extensive set of simulations and show that it achieves a high task success rate over a variety of conditions while remaining amenable to probabilistic interpretability.

  • Improved Exploration Through Latent Trajectory Optimization in Deep Deterministic Policy Gradient

    Kevin Sebastian Luck, Mel Vecerik, Simon Stepputtis, Heni Ben Amor, Jonathan Scholz. International Conference on Intelligent Robots and Systems, 2019.

    Model-free reinforcement learning algorithms such as Deep Deterministic Policy Gradient (DDPG) often require additional exploration strategies, especially if the actor is deterministic. This work evaluates the use of model-based trajectory optimization methods for exploration in Deep Deterministic Policy Gradient when trained on a latent image embedding. In addition, an extension of DDPG is derived using a value function as critic, making use of a learned deep dynamics model to compute the policy gradient. This approach leads to a symbiotic relationship between the deep reinforcement learning algorithm and the latent trajectory optimizer. The trajectory optimizer benefits from the critic learned by the RL algorithm, and the latter from the enhanced exploration generated by the planner. The developed methods are evaluated on two continuous control tasks, one in simulation and one in the real world. In particular, a Baxter robot is trained to perform an insertion task, while only receiving sparse rewards and images as observations from the environment.

  • Learning Interactive Behaviors for Musculoskeletal Robots Using Bayesian Interaction Primitives

    Joseph Campbell, Arne Hitzmann, Simon Stepputtis, Shuhei Ikemoto, Koh Hosoda, Heni Ben Amor. International Conference on Intelligent Robots and Systems, 2019.

    Musculoskeletal robots that are based on pneumatic actuation have a variety of properties, such as compliance and back-drivability, that render them particularly appealing for human-robot collaboration. However, programming interactive and responsive behaviors for such systems is extremely challenging due to the nonlinearity and uncertainty inherent to their control. In this paper, we propose an approach for learning Bayesian Interaction Primitives for musculoskeletal robots given a limited set of example demonstrations. We show that this approach is capable of real-time state estimation and response generation for interaction with a robot for which no analytical model exists. Human-robot interaction experiments on a ‘handshake’ task show that the approach generalizes to new positions, interaction partners, and movement velocities.

  • Probabilistic Multimodal Modeling for Human-Robot Interaction Tasks

    Joseph Campbell, Simon Stepputtis, Heni Ben Amor. Conference on Robot Learning, 2019.

    Human-robot interaction benefits greatly from multimodal sensor inputs as they enable increased robustness and generalization accuracy. Despite this observation, few HRI methods are capable of efficiently performing inference for multimodal systems. In this work, we introduce a reformulation of Interaction Primitives which allows for learning from demonstration of interaction tasks, while also gracefully handling non-linearities inherent to multimodal inference in such scenarios. We also empirically show that our method results in more accurate, more robust, and faster inference than standard Interaction Primitives and other common methods in challenging HRI scenarios.

  • Neural Policy Translation for Robot Control

    Simon Stepputtis, Chitta Baral, Heni Ben Amor. Southwest Robotics Symposium, 2019.

    Teaching new skills to robots is usually a tedious process that requires expert knowledge and a substantial amount of time, depending on the complexity of the new task. Especially when used for imitation learning, rapid and intuitive ways of teaching novel tasks are needed. In this work, we outline Neural Policy Translation (NPT) – a novel approach that enables robots to directly learn a new skill by translating natural language and kinesthetic demonstrations into neural network policies.

  • Extrinsic Dexterity Through Active Slip Control Using Deep Predictive Models

    Simon Stepputtis, Yezhou Yang, Heni Ben Amor. International Conference on Robotics and Automation, 2018.

    We present a machine learning methodology for actively controlling slip, in order to increase robot dexterity. Leveraging recent insights in deep learning, we propose a Deep Predictive Model that uses tactile sensor information to reason about slip and its future influence on the manipulated object. The obtained information is then used to precisely manipulate objects within a robot end-effector using external perturbations imposed by gravity or acceleration. We show in a set of experiments that this approach can be used to increase a robot’s repertoire of motor skills.

  • Towards Semantic Policies for Human-Robot Collaboration

    Simon Stepputtis, Chitta Baral, Heni Ben Amor. Southwest Robotics Symposium, 2018.

    As the application domain of robots moves closer to our daily lives, algorithms and methods are needed to ensure safe and meaningful human-machine interaction. Robots need to be able to understand human body movements, as well as the semantic meaning of these actions. To overcome this challenge, this research aims to create novel ways of teaching complex tasks to a robot by combining traditional learning from demonstration with natural language processing and semantic analysis.

  • Speech Enhanced Imitation Learning and Task Abstraction for Human-Robot Interaction

    Simon Stepputtis, Chitta Baral, Heni Ben Amor. IROS Workshop on Synergies Between Learning and Interaction, 2017.

    In this short paper, we show how to learn interaction primitives and networks from long interactions by taking advantage of language and speech markers. The speech markers are obtained from free speech that accompanies the demonstration. We perform experiments to show the value of using speech markers for learning interaction primitives.

  • Active Slip Control for In-Hand Object Manipulation using Deep Predictive Models

    Simon Stepputtis, Heni Ben Amor. RSS Workshop on Tactile Sensing for Manipulation: Hardware, Modeling, and Learning, 2017.

    We discuss a machine learning methodology for actively controlling slip, in order to increase robot dexterity. Leveraging recent insights in Deep Learning, we propose a Deep Predictive Model that uses tactile sensor information to reason about slip and its future influence on the manipulated object. We show in a set of experiments that this approach can be used to increase a robot’s repertoire of skills.

  • Deep Predictive Models for Active Slip Control

    Simon Stepputtis, Heni Ben Amor. RSS Workshop on (Empirically) Data-Driven Robotic Manipulation, 2017.

    We discuss a machine learning methodology for actively controlling slip, in order to increase robot dexterity. Leveraging recent insights in Deep Learning, we propose a Deep Predictive Model that uses tactile sensor information to reason about slip and its future influence on the manipulated object. We show in a set of experiments that this approach can be used to increase a robot’s repertoire of skills.

  • A System for Learning Continuous Human-Robot Interactions from Human-Human Demonstrations

    David Vogt, Simon Stepputtis, Steve Grehl, Bernhard Jung, Heni Ben Amor. International Conference on Robotics and Automation, 2017.

    We present a data-driven imitation learning system for learning human-robot interactions from human-human demonstrations. During training, the movements of two interaction partners are recorded through motion capture and an interaction model is learned. At runtime, the interaction model is used to continuously adapt the robot’s motion, both spatially and temporally, to the movements of the human interaction partner. We show the effectiveness of the approach on complex, sequential tasks by presenting two applications involving collaborative human-robot assembly. Experiments with varied object hand-over positions and task execution speeds confirm the capabilities for spatio-temporal adaptation of the demonstrated behavior to the current situation.

  • Learning Human-Robot Interactions from Human-Human Demonstrations (With Applications in Lego Rocket Assembly)

    David Vogt, Simon Stepputtis, Richard Weinhold, Bernhard Jung, Heni Ben Amor. International Conference on Humanoid Robots, 2016.

    This video demonstrates a novel imitation learning approach for learning human-robot interactions from human-human demonstrations. During training, the movements of two human interaction partners are recorded via motion capture. From this, an interaction model is learned that inherently captures important spatial relationships as well as temporal synchrony of body movements between the two interacting partners. The interaction model is based on interaction meshes, which were first adopted by the computer graphics community for the offline animation of interacting virtual characters. We developed a variant of interaction meshes that is suitable for real-time human-robot interaction scenarios. During human-robot collaboration, the learned interaction model allows for adequate spatio-temporal adaptation of the robot’s behavior to the movements of the human cooperation partner. Thus, the presented approach is well suited for collaborative tasks requiring continuous body movement coordination of a human and a robot. The feasibility of the approach is demonstrated with the example of a cooperative Lego rocket assembly task.

  • One-Shot Learning of Human–Robot Handovers with Triadic Interaction Meshes

    David Vogt, Simon Stepputtis, Bernhard Jung, Heni Ben Amor. Autonomous Robots, 2018.

    We propose an imitation learning methodology that allows robots to seamlessly retrieve and pass objects to and from human users. Instead of hand-coding interaction parameters, we extract relevant information such as joint correlations and spatial relationships from a single task demonstration of two humans. At the center of our approach is an interaction model that enables a robot to generalize an observed demonstration spatially and temporally to new situations. To this end, we propose a data-driven method for generating interaction meshes that link both interaction partners to the manipulated object. The feasibility of the approach is evaluated in a within-subjects user study, which shows that human–human task demonstration can lead to more natural and intuitive interactions with the robot.