ABSTRACT
In this paper, we address a relatively unexplored aspect of designing agents that learn from human training: how an agent's non-task behavior can elicit human feedback of higher quality and quantity. As the foundation for our investigation, we use the TAMER framework, in which agents are trained by human-generated reward signals, i.e., judgements of the quality of the agent's actions. We propose two new training interfaces intended to increase the trainer's active involvement in the training process and thereby improve the agent's task performance: one displays the agent's uncertainty, the other its performance. Results from a 51-subject user study show that these interfaces can induce trainers to train longer and give more feedback. The agent's performance, however, improves only with the addition of performance-oriented information, not with the sharing of uncertainty levels. Subsequent analysis suggests that the organizational maxim about human behavior, "you get what you measure" - i.e., sharing metrics with people causes them to focus on maximizing or minimizing those metrics while de-emphasizing other objectives - also applies to the training of agents, providing a powerful guiding principle for human-agent interface design in general.
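For readers unfamiliar with TAMER, the sketch below illustrates the general idea of learning directly from human reward: the agent fits a model of the trainer's feedback signal and acts greedily with respect to that model. This is a minimal, illustrative sketch only; the class name, the linear function approximator, and the learning rate are assumptions for exposition and do not reproduce the paper's implementation or task-specific features.

```python
import numpy as np

class TamerSketchAgent:
    """Minimal TAMER-style learner (illustrative, not the paper's implementation)."""

    def __init__(self, n_features, n_actions, lr=0.01):
        # One linear weight vector per action over state features
        # (a simplifying assumption about the function approximator).
        self.weights = np.zeros((n_actions, n_features))
        self.lr = lr

    def predict_reward(self, state, action):
        # Predicted human reward H(s, a) under the learned model.
        return self.weights[action] @ state

    def select_action(self, state):
        # Act greedily with respect to the learned human-reward model.
        values = [self.predict_reward(state, a) for a in range(len(self.weights))]
        return int(np.argmax(values))

    def update(self, state, action, human_reward):
        # Supervised update: move H(s, a) toward the trainer's feedback
        # (e.g., +1 for approval, -1 for disapproval of the last action).
        error = human_reward - self.predict_reward(state, action)
        self.weights[action] += self.lr * error * state
```

The training interfaces studied in the paper change only what the agent displays to the trainer (its uncertainty or its performance), not this underlying learning rule.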