RL-Glue Changes (old page -- see the manual for current info)

This has all been included in the RL-Glue overview manual. This page is now deprecated.

The Codec Split

The C codec has been split from the main rl-glue project. Each package offers different things to the user, but, as always, the codecs layer on top of RL-Glue. The root page for all of the codecs is:

http://glue.rl-community.org/Home/codecs

The other good news is that, as usual, you don't need to change your code depending on how it will be run. The code for an agent, environment, or experiment is identical whether you run it over sockets or compile everything together directly. The only difference is which library you link against.

RL-Glue Project

The idea of the RL-Glue project is that it will very rarely change. We're trying to make all the changes on our wishlist now, so that RL-Glue can become a library that is counted on for doing reinforcement learning experiments. It is always there, and it always just works.

This project is written entirely in C and can be linked from C or C++ code.

When you download and install the RL-Glue project, you get three artifacts:

RL_glue executable

This is the server for running socket-based experiments.

librlglue

This is a C library that can be linked against for creating executables where the agent, environment, and experiment program are all written in C or C++. These experiments have virtually no rl-glue calling overhead.

Agent/Environment/Experiment Headers

Four header files: Agent_common.h, Environment_common.h, Experiment_common.h, and RL_common.h.

Include them in your agents, environments, and experiments like this:

#include <rlglue/RL_common.h> /* Data structures */

#include <rlglue/Agent_common.h> /* Agent functions; includes RL_common.h */

#include <rlglue/Environment_common.h> /* Environment functions; includes RL_common.h */

#include <rlglue/Experiment_common.h> /* Formerly RL_glue.h. Experiment (RL_) functions; includes RL_common.h */

Generally, each agent, environment, or experiment should only have to include one of these files. You'll probably never include RL_common.h directly, but it is needed by the others.

RL-Glue-Extensions Project :: C Codec

The C codec gives you libraries that can be used to build stand-alone socket-based agents, environments, and experiments, as well as some utilities for copying data structures and parsing task specs. The C Codec is in the rl-glue-ext project and is expected to change more frequently than the rl-glue project as the task spec (and the task spec parser) evolves over time.

This project is written entirely in C and can be linked from C or C++ code.

The artifacts of the C Codec are:

librlagent

Write an agent and link to librlagent to give it all it needs to connect to the rl_glue server over sockets.

librlenvironment

Write an environment and link to librlenvironment to give it all it needs to connect to the rl_glue server over sockets.

librlexperiment

Write an experiment and link to librlexperiment to give it all it needs to connect to the rl_glue server over sockets.

librlutils

Provides a C task spec parser and utilities for copying rl-glue structures.

Utility Headers

Headers for the functionality in librlutils.

Build Changes

We're not manually writing Makefiles anymore. We've moved both RL-Glue and the C Codec to a GNU autotools system. You should be able to build like:

$>./configure

$>make

$>sudo make install

API Changes

RL_Freeze and Agent_Freeze

The freeze methods were the first in a long line of "special" methods that some people wanted. The long-term solution to special methods is the messaging system, RL_agent_message and RL_env_message. With the messaging system you can create any protocol you want between your experiment and agent or experiment and environment.

So, to reduce the clutter and cruft of the API, I've removed Freeze. How do you unfreeze anyways?
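As a rough illustration of how freeze (and unfreeze) could be rebuilt on top of the messaging system, here is a minimal sketch of an agent-side message handler. Everything in it is an assumption for illustration: the function names handle_agent_message and is_learning_frozen, the message strings, and the replies are made up and are not part of RL-Glue. The exact signature of the agent's message function is not shown on this page, so the sketch uses plain C strings.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical agent state: nonzero while learning is frozen. */
static int learning_frozen = 0;

/* Hypothetical handler the agent's message function could delegate to.
   The message strings form a made-up protocol agreed on between the
   experiment and the agent; RL-Glue itself does not interpret them. */
const char *handle_agent_message(const char *message)
{
    assert(message != NULL);

    if (strcmp(message, "freeze learning") == 0) {
        learning_frozen = 1;
        return "ok: learning frozen";
    }
    if (strcmp(message, "unfreeze learning") == 0) {
        learning_frozen = 0;
        return "ok: learning unfrozen";
    }
    if (strcmp(message, "is learning frozen?") == 0) {
        return learning_frozen ? "yes" : "no";
    }
    return "error: unknown message";
}

/* The agent's step function would check this before updating its policy. */
int is_learning_frozen(void)
{
    return learning_frozen;
}
```

The experiment would then send "freeze learning" through RL_agent_message before an evaluation phase and "unfreeze learning" afterwards, which also answers the unfreeze question.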

RL_Episode

Csaba requested at some point that RL_Episode(unsigned int T) should let you know whether the episode ended because time expired or because it ended normally. We now return the value of the terminal flag from the last env_step of the episode. If it's set, the episode terminated normally; if not, the timeout ended the episode. Again, "roat.terminal==1" means that the episode completed normally and was not cut off.
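An experiment program might use the returned flag to keep separate counts of finished and cut-off episodes. The sketch below is self-contained and does not call RL-Glue; the type episode_tally_t and the function record_episode are hypothetical names invented for this example, and the flag is assumed to be usable as an int.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tally of how episodes ended; not part of RL-Glue. */
typedef struct {
    unsigned int completed; /* terminal flag was set: episode ended normally */
    unsigned int timed_out; /* terminal flag was clear: step limit cut it off */
} episode_tally_t;

/* Record one episode, given the terminal flag returned for that episode. */
void record_episode(episode_tally_t *tally, int terminal)
{
    assert(tally != NULL);
    if (terminal) {
        tally->completed++;
    } else {
        tally->timed_out++;
    }
}
```

In an experiment this would be called with the return value of the episode call, for example record_episode(&tally, terminal) right after each episode finishes.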

RL_Init

It made sense to us that RL_init should return the task spec, in case the experiment program wants to know it, make a note of it, etc. The RL_glue specification has now been updated to handle this.

Typedef Changes

This is a big one. We revamped all of the type names. We made them all lower case, and added "_t" to them to identify them as types. This should reduce confusion so there is no more code like:

Observation observation;

Instead it'll be:

observation_t observation;

I think the latter is easier to read.

This is a pain to change if you have a lot of existing code, so we've made it easy to transition. There is a file you can include which will define all the old, ugly type names. This is good news because it means you can migrate to the new type names at your leisure.

#include <rlglue/legacy_types.h>

You can find all the old and new type names here:

http://code.google.com/p/rl-glue/source/browse/trunk/src/rlglue/legacy_types.h

String Observations/Actions/etc.

Some people have found the abstract types, which are arrays of doubles and ints, a little too constricting. We've added a third array, this time of chars. Now people can push strings, char arrays, or anything else they want through observations, actions, state_key types, etc.

The rl_abstract_type_t now looks like:

typedef struct rl_abstract_t_struct
{
    unsigned int numInts;
    unsigned int numDoubles;
    unsigned int numChars;
    int* intArray;
    double* doubleArray;
    char* charArray;
} rl_abstract_type_t;

Keep in mind that charArray is an array of characters. It is not necessarily null terminated, and we don't enforce null termination. Remember, 3 chars take up 3 array spots, but the string "123" takes up 4 (with '\0' at the end).

If you do the following, bad things will probably happen if the char array is not null terminated:

printf("My char array is %s\n",observation.charArray);
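One way to avoid the problem is to pass the length explicitly instead of relying on a terminator. The sketch below is not part of RL-Glue: it redeclares the struct shown above so that it compiles on its own (real code would include the RL-Glue header instead), and the helper names format_chars and chars_to_cstring are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Local copy of the struct from this page, only so the sketch compiles
   without the RL-Glue headers. */
typedef struct rl_abstract_t_struct
{
    unsigned int numInts;
    unsigned int numDoubles;
    unsigned int numChars;
    int* intArray;
    double* doubleArray;
    char* charArray;
} rl_abstract_type_t;

/* Hypothetical helper: format the char array into buf. The "%.*s"
   precision tells snprintf to read at most numChars bytes, so no
   terminating '\0' is required in charArray. */
int format_chars(char *buf, size_t bufSize, const rl_abstract_type_t *t)
{
    assert(buf != NULL && t != NULL);
    return snprintf(buf, bufSize, "My char array is %.*s\n",
                    (int)t->numChars,
                    t->numChars > 0 ? t->charArray : "");
}

/* Hypothetical helper: return a malloc'd, null-terminated copy of the
   char array. The caller must free() the result. Returns NULL if the
   allocation fails. */
char *chars_to_cstring(const rl_abstract_type_t *t)
{
    char *s;
    assert(t != NULL);
    s = malloc((size_t)t->numChars + 1);
    if (s == NULL) {
        return NULL;
    }
    if (t->numChars > 0) {
        memcpy(s, t->charArray, t->numChars);
    }
    s[t->numChars] = '\0';
    return s;
}
```

The same precision trick works directly with printf, for example printf("My char array is %.*s\n", (int)observation.numChars, observation.charArray), as long as numChars is greater than zero and fits in an int.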