Work Related to HQ
RoboCup
RoboCup, the Robot World Cup Iniative, is an international effort to
advance AI and robotics using the domain of soccer. Although the emphasis
is on building physical robotic soccer players, there is a
Soccer Simulator available. The simulator allows client programs
to control simulated robotic players, and simulates the information such
players would collect to send to the client programs. The simulator is
written in C++ and clients communicate using UDP socket connections.
RoboCup is similar to HQ in that it is a two-player game
where computer programs control multiple pieces (soccer players). It
is also an imperfect information game, since each piece has a restricted
view of the game.
In RoboCup, it is difficult to evaluate the strength of higher-level control
algorithms since success depends heavily on how well pieces dribble, kick,
intercept, etc. HQ was designed to minimize the effects of such low-level
abilities to concentrate on high-level control issues.
The robotics focus of RoboCup places severe time restrictions on the contestant
programs. Although both RoboCup and HQ have the same discrete model of time,
HQ allows contestants minutes versus seconds in which to make a move. This
allows more sophisticated strategies to by investigated.
The RoboCup simulator moves all the pieces and allows all the pieces to
change their motion at each time step. HQ moves all the pieces at each time
step, but allows only one piece to make a motion change per time step. The
RoboCup approach is necessary in soccer due to the autonomous decision-making
of soccer players. However, the HQ approach is more appropriate for the
isolated study of control strategies.
Writing a successful RoboCup client program requires a significant investment
of time attending to low-level details. This impedes progress on the
high-level group control issues tested by HQ.
Finally, RoboCup does not give a learning program time to learn about its
contestants before it is rated in a tournament. HQ allows programs to learn
from practice games with contestant programs before they are rated.
RARS
RARS is a Robot Auto Racing Simulator written in C++. It is an environment
for testing robot control strategies using the domain of auto racing. Like
Robocup , computer programs control multiple pieces
(race cars) in an imperfect information game. Pieces have a restricted
view of the race track and other pieces.
Like Robocup , there is limited communication between
pieces. In fact, it is discouraged. There are no time restrictions on the
contestant programs.
Of particular interest is the allowance for learning contestants. Each
contestant can read and write to a data file. Practice runs on a track can
be made. However, the emphasis of the game is on the static performance
of a contestant.
The UMASS Machine Learning Lab
The Machine Learning Lab at the University of Massachusetts at Amherst has
written some C code that allows computer programs to play each other on the
net in the games of Othello, Checkers, or Hearts. The code will run on UNIX
machines and allows a program to be available to any contestant using a
daemon process that listens to a port for other players. The protocol for
communicating moves is defined. This allows tournaments between
programs to be held at any time. This approach is described in the following
techreport .
Of particular interest is the algorithm used to assign rankings to the
different programs. A tournament is held once a day. It basically consists
of a single bubble sort pass through a sorted list of the programs. Two
programs are compared by playing a match. The order of the programs on
the list is swapped depending on the results of the match.
After this pass of matches, the ranking of each program is adjusted by
for (i=0; i less than n_players; i++)
player[i].rank = player[i].rank * DECAY + (1.0 - DECAY)*(i+1);
where DECAY = 0.5.
JavaBots
The JavaBots system was used to develop JavaSoccer, a Java version of the
RoboCup system. In JavaSoccer the individual
soccer players can learn different behaviors. Each piece
learns independently of the other pieces. The focus of the research is
on the coordinated strategy that emerges (some pieces behave as offensive
players and others as defensive). The focus of HQ is in learning
effective coordinated strategies directly.
Robot Warfare 1
RW1 is related to HQ in that users submit contestant programs, and contestants
have incomplete information about the state of the game. It differs
substantially from HQ. RW1 programs are written in a restricted language that
does not support learning. Contestants are single pieces, limiting the
range of interesting possible strategies. The board is only a discrete 10x20
grid. The tournament rules do not favor programs that learn.
C++Robots
An improvement over Robot Warfare 1
in that contestant programs are written
in C++ and the board is continuous. However, contestants are still single
pieces, limiting the complexity of possible strategies. The tournament
rules do not favor programs that learn.
C-Robots
Similar to C++Robots except that the game is
real-time, whereas C++Robots is turn based.
RoShamBo
A tournament for computer programs to compete in the game of
Rock-Paper-Scissors. Contestants are C code functions that simply
return an integer value: 0 = rock, 1 = paper, or 2 = scissors. Contestants
have access to two history arrays for a match that record the history
of their choices and the history of the other contestant choices. A
match consists of from 100 to 10,000 turns.
A simple count is kept for each
turn to determine the winner of the match. A tournament pairs all
contestants. Contestants are emailed to the tournament organizers and
are compiled locally.
This tournament is of interest because it is an incomplete information
game and allows learning. It would be more interesting if contestants had
access to the history of every match played, not just the current match.
Also, the speed of a contestant is important, which hinders some learning
methods.
The Glicko Rating System
The Glicko rating system extends the Elo rating system by incorporating a
measure of uncertainty of a player's rating. The Glicko system is now
used on the
Free Internet Chess Server .
It accounts for the increasing
uncertainty of a player's rating over time between rated matches. Since
HQ allows a program to choose its competitor and whether the resulting
match is rated, it is possible that some time will pass between rated
matches.