DOI: 10.5555/2484920.2485064
research-article

Using Informative Behavior to Increase Engagement in the TAMER Framework

Published: 6 May 2013

ABSTRACT

In this paper, we address a relatively unexplored aspect of designing agents that learn from human training by investigating how the agent's non-task behavior can elicit human feedback of higher quality and quantity. We use the TAMER framework, which facilitates the training of agents by human-generated reward signals, i.e., judgments of the quality of the agent's actions, as the foundation for our investigation. We then propose two new training interfaces to increase active involvement in the training process and thereby improve the agent's task performance: one provides information on the agent's uncertainty, the other on its performance. Our results from a 51-subject user study show that these interfaces can induce trainers to train longer and give more feedback. The agent's performance, however, increases only in response to the addition of performance-oriented information, not to the sharing of uncertainty levels. Subsequent analysis of our results suggests that the organizational maxim about human behavior, "you get what you measure" - i.e., sharing metrics with people causes them to focus on maximizing or minimizing those metrics while de-emphasizing other objectives - also applies to the training of agents, providing a powerful guiding principle for human-agent interface design in general.
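The core TAMER idea the abstract builds on - an agent that models the human trainer's reward signal directly and acts greedily on that model, rather than optimizing a discounted environmental return - can be illustrated with a minimal, hypothetical Python sketch. This is not the authors' implementation; the class name, tabular representation, and learning rate are invented for illustration.

```python
class TamerSketch:
    """Toy TAMER-style learner: model human reward H(s, a), act greedily on it.

    Hypothetical illustration only. Uses a tabular model; the actual TAMER
    framework typically uses function approximation over state features.
    """

    def __init__(self, n_states, n_actions, lr=0.1):
        self.lr = lr
        self.n_actions = n_actions
        # Tabular estimate of predicted human reward H(s, a), initialized to 0.
        self.h = [[0.0] * n_actions for _ in range(n_states)]

    def act(self, state):
        # Greedy with respect to predicted *human* reward - no discounting,
        # no environmental reward, which is what distinguishes TAMER from
        # standard reinforcement learning.
        values = self.h[state]
        return max(range(self.n_actions), key=lambda a: values[a])

    def give_feedback(self, state, action, human_reward):
        # Supervised update: move H(s, a) toward the trainer's signal.
        error = human_reward - self.h[state][action]
        self.h[state][action] += self.lr * error


if __name__ == "__main__":
    agent = TamerSketch(n_states=2, n_actions=2)
    # Trainer repeatedly rewards action 1 in state 0 ...
    for _ in range(50):
        agent.give_feedback(0, 1, 1.0)
    # ... so the agent comes to prefer it there.
    print(agent.act(0))
```

The sketch makes the paper's premise concrete: because the agent learns only from the trainer's judgments, the quantity and quality of that feedback - precisely what the proposed interfaces aim to increase - directly bound what the agent can learn.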


Published in: AAMAS '13: Proceedings of the 2013 International Conference on Autonomous Agents and Multi-agent Systems, May 2013, 1500 pages. ISBN: 9781450319935.

Publisher: International Foundation for Autonomous Agents and Multiagent Systems, Richland, SC.

Acceptance rates: AAMAS '13 paper acceptance rate: 140 of 599 submissions (23%). Overall acceptance rate: 1,155 of 5,036 submissions (23%).
