ABSTRACT
We consider the problem of multi-task reinforcement learning, where the agent must solve a sequence of Markov Decision Processes (MDPs) drawn randomly from a fixed but unknown distribution. We model the distribution over MDPs with a hierarchical Bayesian infinite mixture model. For each novel MDP, we use the previously learned distribution as an informed prior for model-based Bayesian reinforcement learning. The hierarchical Bayesian framework provides a strong prior that allows us to rapidly infer the characteristics of new environments from previous ones, while the use of a nonparametric model allows us to adapt quickly to environments we have not encountered before. In addition, the use of infinite mixtures allows the model to learn the number of underlying MDP components automatically. We evaluate our approach and show that it leads to significant speedups in convergence to an optimal policy after observing only a small number of tasks.
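The infinite mixture over MDPs rests on a Dirichlet process prior, whose clustering behavior is commonly described by the Chinese restaurant process: each new task joins an existing MDP component in proportion to its size, or opens a fresh component in proportion to a concentration parameter. A minimal sketch of this sampling step (the function name, seeding, and parameterization are illustrative assumptions, not the paper's implementation):

```python
import random

def crp_assignments(num_tasks, alpha, seed=0):
    """Sample component assignments for a sequence of tasks under a
    Chinese restaurant process with concentration parameter alpha.

    A new task joins an existing component with probability proportional
    to that component's size, or starts a new component with probability
    proportional to alpha -- so the number of MDP components is inferred
    rather than fixed in advance.
    """
    rng = random.Random(seed)
    counts = []       # counts[k] = number of tasks assigned to component k
    assignments = []
    for n in range(num_tasks):
        weights = counts + [alpha]   # existing components, plus a new one
        total = n + alpha
        r = rng.random() * total
        k, acc = 0, 0.0
        for k, w in enumerate(weights):
            acc += w
            if r < acc:
                break
        if k == len(counts):
            counts.append(1)         # open a new MDP component
        else:
            counts[k] += 1
        assignments.append(k)
    return assignments

print(crp_assignments(10, alpha=1.0))
```

Larger `alpha` yields more components for the same number of tasks; in the full model, each component would carry its own transition and reward parameters, and assignments would be resampled by Gibbs sampling (as in Neal, 2000) rather than drawn once from the prior.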
Multi-task reinforcement learning: a hierarchical Bayesian approach