RL-Glue Core

Important Note: You must have THIS COMPONENT installed to use any of the other codecs.

Official Documentation

Main Overview

This document should be your first resource if you are new to RL-Glue. RL-Glue is not particularly complicated or tricksy, but you will be much more effective using it if you have a high-level understanding of how the project works, and how it interacts with related projects (RL-Glue Extensions).

  • RL-Glue what? learning about RL-Glue at an abstract level
  • Compatibility: making existing C/C++ agents and environments work with RL-Glue
  • Plugging agents and environments together: how to write experiment programs
  • What function do I use to... quick function reference for RL-Glue
  • High-level changes from RL-Glue 2.x to 3.x

Technical Manual

This manual will help you download RL-Glue and install it on your local machine.

After that, if you decide that you want to write agents, environments, or experiments in C or C++, all of the details you need are here as well.

  • Where to download RL-Glue
  • How to install RL-Glue
  • C/C++ Changes since RL-Glue 2.x
  • Details about C/C++ data types and function prototypes
  • Memory Management Pointers and Suggestions


Frequently Asked Questions

Can I write my agents in < insert language here >?

Yes! Maybe! Writing agents/environments/experiments in a given language requires a codec for that language. As of writing, there are codecs for C/C++, Java, Matlab, Python, and Lisp.

Does RL-Glue support multi-agent reinforcement learning?

No. RL-Glue is designed for single agent reinforcement learning. At present we are not planning a multi-agent extension of RL-Glue. We envision that this would be a separate project with a different audience and different objectives.

Update: There has recently been a proposal (from Gabor Balazs) about how to make some simple updates to RL-Glue that would allow it to work with multiple agents. If you would really like RL-Glue to support multiple agents, please let us know on the discussion list (http://groups.google.com/group/rl-glue).

Why isn't the RL-Glue interface object oriented?

RL-Glue is meant to be a low-level protocol for connecting agents, environments, and experiments. These interactions can easily be described by the simple, flat function calls of RL-Glue. We don't feel that it is useful to overcomplicate things in that respect. However, there is no reason that an implementation of an agent or environment shouldn't be designed using an object-oriented approach. In fact, many of the contributors to this project have their own object-oriented libraries of agents that they use with RL-Glue. Some of the codecs even have an OO flavor (Python, Java, Lisp).

Some might argue that it makes sense to create a codec that takes object orientation very seriously, with a hierarchy of observation and action types, where you create an instance of RL-Glue instead of calling static methods on it, and so on. This would not be hard; it's just a matter of someone interested picking up the project and doing it. Personally, we've found it easy enough to write a small bridge between the existing codecs and our personal OO hierarchies.
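To give a rough idea of what such a bridge can look like, here is a minimal sketch in Python. It does not use any RL-Glue codec: the `Agent` base class, the `AlwaysLeft` subclass, and the `register` helper are hypothetical names made up for this illustration, and the flat `agent_start` / `agent_step` functions only mimic the shape of the flat interface.

```python
class Agent:
    """Hypothetical base class of a personal OO agent hierarchy."""

    def start(self, observation):
        raise NotImplementedError

    def step(self, reward, observation):
        raise NotImplementedError


class AlwaysLeft(Agent):
    """Trivial agent that ignores its input and always chooses "left"."""

    def start(self, observation):
        return "left"

    def step(self, reward, observation):
        return "left"


# The bridge: one module-level agent object, with flat functions delegating to it.
_agent = None


def register(agent):
    """Hypothetical helper: choose which object the flat functions forward to."""
    global _agent
    _agent = agent


def agent_start(observation):
    return _agent.start(observation)


def agent_step(reward, observation):
    return _agent.step(reward, observation)


register(AlwaysLeft())
first_action = agent_start(0)
next_action = agent_step(0.0, 1)
```

The flat functions stay fixed while the object behind them can be swapped, which is the whole point of keeping the bridge small.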

What does Observation mean? Why does RL-Glue not use states?

If the state of an environment is fully observable, then you can often use the terms state and observation interchangeably. However, observation is a more general term that is meant to mean the perceptions that the agent receives. This can be different from the concept of state, which corresponds to some truth about the environment. For example, in partially observable environments, the observations may be aliased: the environment may be in different states, but the agent receives the same observation.
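The aliasing case can be made concrete with a toy example. The sketch below is plain Python and does not depend on any RL-Glue codec; the `AliasedCorridor` class and its wall colors are invented for illustration, and `env_step` returns a simple (reward, observation, terminal) tuple rather than the codec's own return type.

```python
class AliasedCorridor:
    """Toy partially observable environment (hypothetical, for illustration only).

    The true state is a cell index 0..3, but the agent only sees a wall color,
    and cells 1 and 2 share the same color.
    """

    OBSERVATION_OF_STATE = {0: "red", 1: "blue", 2: "blue", 3: "green"}

    def __init__(self):
        self.state = 0

    def env_start(self):
        self.state = 0
        return self.OBSERVATION_OF_STATE[self.state]

    def env_step(self, action):
        # action is +1 (move right) or -1 (move left); walls clamp the position.
        self.state = min(3, max(0, self.state + action))
        terminal = self.state == 3
        reward = 1.0 if terminal else 0.0
        return reward, self.OBSERVATION_OF_STATE[self.state], terminal


env = AliasedCorridor()
env.env_start()
_, obs_a, _ = env.env_step(+1)
state_a = env.state  # true state is cell 1
_, obs_b, _ = env.env_step(+1)
state_b = env.state  # true state is cell 2
# obs_a and obs_b are both "blue" even though state_a != state_b
```

An agent that only receives "blue" cannot tell which of the two cells it is in, which is why the interface talks about observations rather than states.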

Where is the environmental state stored in RL-Glue?

In other systems, such as CLSquare, the old state is passed to the environment step function.

The environment in RL-Glue is responsible for keeping track of the current state and computing the next state given an action. The old state does not need to be passed outside of the environment; it stays within the environment. The next_state method in CLSquare is basically the same as env_step in RL-Glue.
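The difference between the two styles can be sketched in a few lines of Python. Nothing here uses a real codec: the free function `next_state(old_state, action)` only illustrates the "state is passed in" style and is not CLSquare's actual signature, and `WalkEnvironment` is a made-up environment whose `env_step` returns a plain (reward, observation, terminal) tuple.

```python
def next_state(old_state, action):
    """Stateless style: the caller owns the state and passes it in each time."""
    return old_state + action


class WalkEnvironment:
    """Stateful style: the environment owns the state, as described above."""

    def env_start(self):
        self._state = 0
        return self._state  # fully observable here, so observation == state

    def env_step(self, action):
        # Only the action comes in; the old state is looked up internally.
        self._state = next_state(self._state, action)
        terminal = abs(self._state) >= 3
        reward = 1.0 if self._state >= 3 else 0.0
        return reward, self._state, terminal


env = WalkEnvironment()
env.env_start()
results = [env.env_step(+1) for _ in range(3)]
```

The transition logic is the same in both styles; the only change is who holds on to the state between calls.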

Can RL-Glue handle sampling the same trajectory a number of times consecutively?

This can be done in RL-Glue by using env_message and agent_message and coordinating the two from your own experiment program. It's not trivial, and there are many different ways and reasons that you might want to do this, so it's hard to come up with a single clear example. If you are interested in this, please contact us; we would be glad to work out an example with you that we can share with everyone else.
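As one narrow illustration (not a general recipe), the sketch below replays a trajectory by saving and restoring the environment's random number generator through messages. It is plain Python without any codec; the `NoisyWalk` class, the message strings "save-rng" and "restore-rng", and the `run_trajectory` helper are all assumptions invented for this example.

```python
import random


class NoisyWalk:
    """Toy stochastic environment (hypothetical, for illustration only)."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self._saved_rng_state = None
        self._state = 0

    def env_start(self):
        self._state = 0
        return self._state

    def env_step(self, action):
        # The chosen action is perturbed by random noise.
        self._state += action + self._rng.choice([-1, 0, 1])
        return 0.0, self._state, False

    def env_message(self, message):
        # Message strings are made up; any agreed-upon protocol would do.
        if message == "save-rng":
            self._saved_rng_state = self._rng.getstate()
            return "saved"
        if message == "restore-rng":
            if self._saved_rng_state is None:
                return "nothing saved"
            self._rng.setstate(self._saved_rng_state)
            return "restored"
        return "unknown message"


def run_trajectory(env, actions):
    """Stand-in for an experiment program: play fixed actions, record observations."""
    observations = [env.env_start()]
    for action in actions:
        _, observation, _ = env.env_step(action)
        observations.append(observation)
    return observations


env = NoisyWalk(seed=42)
env.env_message("save-rng")
first = run_trajectory(env, [1, 1, 1, 1, 1])
env.env_message("restore-rng")
second = run_trajectory(env, [1, 1, 1, 1, 1])
```

Here the actions are fixed by the experiment. If the agent's policy were stochastic too, the agent would need to handle a matching pair of messages in agent_message so that both sides rewind together.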

Why is there no RL_freeze, RL_unfreeze or RL_frozen?

The functionality of Freeze can easily be replicated through RL_agent_message and agent_message. There are literally a hundred similar methods that would be desirable to one person or another. To avoid the RL-Glue interface becoming bloated, we are trying to avoid adding too many redundant functions for the sake of convenience.
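A minimal sketch of how freezing might be handled on the agent side is shown below. It is plain Python without any codec; the `FreezableAgent` class and the message strings ("freeze learning", "unfreeze learning", "is frozen?") are assumptions chosen for this example, not a fixed RL-Glue convention. In a real experiment the string would be sent with RL_agent_message and arrive in the agent's agent_message.

```python
class FreezableAgent:
    """Toy agent that keeps a running estimate of the reward (illustration only)."""

    def __init__(self, step_size=0.5):
        self.value = 0.0
        self.step_size = step_size
        self.frozen = False

    def agent_step(self, reward, observation):
        # Learning update is skipped while frozen; acting continues as usual.
        if not self.frozen:
            self.value += self.step_size * (reward - self.value)
        return 0  # this toy agent always takes action 0

    def agent_message(self, message):
        # Message strings are hypothetical; agent and experiment just need to agree.
        if message == "freeze learning":
            self.frozen = True
            return "frozen"
        if message == "unfreeze learning":
            self.frozen = False
            return "unfrozen"
        if message == "is frozen?":
            return "yes" if self.frozen else "no"
        return "unknown message"


agent = FreezableAgent()
agent.agent_step(1.0, 0)
value_before_freeze = agent.value
agent.agent_message("freeze learning")
agent.agent_step(1.0, 0)
value_while_frozen = agent.value
agent.agent_message("unfreeze learning")
agent.agent_step(1.0, 0)
value_after_unfreeze = agent.value
```

Because the behavior lives entirely in the message handler, each project can define exactly the variant of freezing it needs without any change to the interface.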
