This thesis addresses the open problem of automatically discovering hierarchical structure in reinforcement learning. Current algorithms for reinforcement learning fail to scale as problems become more complex. Many complex environments empirically exhibit hierarchy and can be modelled as interrelated subsystems, each in turn with hierarchical structure. Subsystems are often repetitive in time and space, meaning that they reoccur as components of different tasks or occur multiple times in different circumstances in the environment. A learning agent may sometimes scale to larger problems if it successfully exploits this repetition. Evidence suggests that a bottom-up approach, which repeatedly finds building blocks at one level of abstraction and uses them as background knowledge at the next level of abstraction, makes learning in many complex environments tractable. An algorithm, called HEXQ, is described that automatically decomposes and solves a multi-dimensional Markov decision problem (MDP) by constructing a multi-level hierarchy of interlinked subtasks without being given the model beforehand. The effectiveness and efficiency of the HEXQ decomposition depend largely on the choice of representation in terms of the variables, their temporal relationships and whether the problem exhibits a type of constrained stochasticity. The algorithm is first developed for stochastic shortest path problems and then extended to infinite horizon problems. The operation of the algorithm is demonstrated using a number of examples including a taxi domain, various navigation tasks, the Towers of Hanoi and a larger sporting problem. The main contributions of the thesis are the automation of (1) decomposition, (2) sub-goal identification, and (3) discovery of hierarchical structure for MDPs with states described by a number of variables or features. It points the way to further scaling opportunities that encompass approximations, partial observability, selective perception, relational representations and planning. The longer term research aim is to train rather than program intelligent agents.
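To make the bottom-up decomposition idea concrete, the following is a minimal sketch of the kind of first step HEXQ takes: ordering state variables by how frequently they change (the fastest-changing variable forms the lowest level) and flagging transitions where a slower variable also changes as candidate "exits" that must be handed up to the next level. The data layout, names and the simple exit test here are illustrative assumptions, not the thesis's exact implementation.

```python
# Sketch only: a simplified take on HEXQ's variable ordering and exit detection,
# assuming transitions are logged as (state, action, next_state) with states
# given as tuples of variable values.
from collections import Counter
from typing import Sequence, Tuple

State = Tuple[int, ...]                  # one value per state variable
Transition = Tuple[State, int, State]    # (state, action, next_state)

def order_variables_by_change_frequency(transitions: Sequence[Transition]) -> list:
    """Rank state variables from most to least frequently changing.

    HEXQ builds its hierarchy bottom-up, assigning the fastest-changing
    variable to the lowest level of the decomposition.
    """
    counts = Counter()
    for s, _, s2 in transitions:
        for i, (v, v2) in enumerate(zip(s, s2)):
            if v != v2:
                counts[i] += 1
    return [i for i, _ in counts.most_common()]

def candidate_exits(transitions: Sequence[Transition], level_var: int) -> set:
    """Collect (value, action) pairs that act as 'exits' for one variable:
    transitions in which some other (slower-changing) variable also changes,
    so the effect cannot be modelled within this level in isolation."""
    exits = set()
    for s, a, s2 in transitions:
        other_changed = any(
            s[i] != s2[i] for i in range(len(s)) if i != level_var
        )
        if other_changed:
            exits.add((s[level_var], a))
    return exits
```

In the taxi domain used in the thesis, for example, the taxi-location variable changes on almost every step and would be placed at the lowest level by this ordering, while pickup and put-down actions at the special depot locations change the passenger variable and would surface as exits.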