Sunday, July 12, 2009

Dominoes

Playing dominoes is hard. Like playing card games there are many games you can play with dominoes and last night I was exposed to Muggins/Fives. Muggins has a points system and unlike the kiddie version of dominoes that I grew up with is not a game of chance (think War versus Bridge). The big tipoffs that it is a hard problem are that A) it is a partial information game [you can only see the board and your own hand] and B) grown men play against each other for money [I'm told the variant we were playing is very popular in the Bahamas].

The first thing I did was look for AI research papers on solving dominoes. After an hour of feeding terms to search engines I can say: there is no research. This is strange because Checkers (a boring perfect information game) was still interesting enough to researchers that it was solved only 10 years ago. People don't bet on checkers so the fact that dominoes is a research orphan left me intrigued. Of course there's a personal angle too: If it was an easy problem to solve I could solve it and then travel while grifting strangers out of their money; If it was a hard problem to solve (like Bridge or Go) then I'd add it to my list of "fun things to play."

In Muggins you get points when all of the exposed end tiles add up to a multiple of five. If you go out first you also get bonus points equal to the sum of the pips in your opponents' hands.

After a couple hours I had a working 150 line simulator and two simple strategies: play any legal move (dumb strategy) and play the highest scoring legal move (less dumb). I use the term "working" loosely. A number of nefarious bugs related to non-randomness lurked in the code. These caused the last player added to win ties in scoring, who goes first, and some other spots. You can view the final dominoes.py source here.

To implement the strategies I went for the simplest thing that could possibly work: generator coroutines. They have a simple interface and they keep state so even somewhat complicated strategies are possible without writing a big interface class. You just write a function with a couple breakpoints and everything just works. Here is the generator for "play the first move that is legal"

def dumb_player(player_no, board, hand):
''' play the first domino that is legal '''
yield None # signal that we are setup
while True:
for a, b in pairs_and_reverse(hand): # (a,b)+(b, a)
try:
board.play(player_no, a, b)
# if we get here it is a legal play
yield None
break
except IllegalPlay: pass
else:
# draw a possibly legal play
board.draw(player_no)
return


The boiler plate of setup and try/except Illegal play was identical over all the simple strategies so I refactored so that the strategy is just a scoring function applied to all legal moves. The simple strategy becomes ("a" and "b" are the two ends of the domino, "a" is the side that matches the board and "b" is the open end left after play):

def dumb_player(board, player_no, a, b):
''' randomly choose a legal play '''
return None # all plays are equal

.. and the strategy for playing the highest scoring move possible is

def score_player(board, player_no, a, b):
''' always plays the immediately highest scoring tile '''
sc = board.play(player_no, a, b)
board.undo()
return sc

Add a function that does a round-robin tourney of all scoring functions 5000 times each and you have a quick fitness test.

Well, it turns out dominoes is easy (or at least simple). This blew away all my assumptions; If you look at the source I have a dozen scoring functions some of which consider the pips on the already played tiles and the secret pips in the players hand. The best of them beats the very simple "score_player" just 50.5% of the time. 50.5% is a solid money maker if you are playing blackjack against the house at hundreds of hands an hour, but peanuts if you are playing 10 games of dominoes an hour against someone who is using the simple score_player strategy.

The source is out there and understandable (500 lines, 300 of which are short strategy functions). If anyone can consistently beat my best attempt "score_blocker6" then post the source in the comments and I'll buy you a beer next PyCon.

Sunday, July 5, 2009

ICFP Contest 2009

The ICFP 2009 programming contest started a couple weeks earlier than last year, and unfortunately I only found out about it as it was ending. The challenge is to solve as best as possible in 72 hours a problem that can't be brute forced in 72 hours. I love this challenge; I've been participating in the contest (in python) since 2002 and even left EuroPython early last year just to compete. I don't do writeups every year but I did for 2007, 2004, and 2002. This year I didn't participate officially but I did tackle the program after the fact.

The best resource for writeups and implementations is the FUN team page. They did their entry in python as did an amazingly large number of contestants this year. Their page has links to a dozen writeups, a sub-reddit, and some good pages on the maths of orbital mechanics.

The contest is a great chance to try out new tools and methodologies. I tried Unit Testing for the first time during an ICFP weekend (many moons ago) and I've been sold ever since. If Unit Testing helps you finish a time-limited competition faster then how could it possibly hurt during normal dev? I've also used the contest to try out pair programming, Test Driven Development, and pyrex [I won't talk about those more unless someone asks].

This year I used the contest as a chance to try out the ctypes module and Pygame gui library. More on that below, but first to the ICFP problem:

Problem Defined

This year, like many past years, the challenge was to implement a virtual machine that runs the binaries provided by the organizers. The VM binaries simulated a bunch of orbital mechanics. You then had to write programs that interacted with the simulations to push a satellite from one orbit to another, meet up with other satellites in orbit, and more complicated variations on the theme.
My strategy was the same as in past years:
  1. Read the problem description, write a reference implementation in pure python, and test the hell out of the reference implementation.
    [actual time: 1 hour]

  2. Use that to write a visualizer and explore the problem mechanics and horribly underspecified written problem description.
    [actual time
    [actual time: 2 hours. I had to install pygame and read the docs first]

  3. Write a work-alike in screaming fast C so as much time as possible can be spent solving the problem instead of waiting on simulations to finish.
    [actual time: 3 hours. I had to learn the ctypes module and fool around with it.


The VM was very simple; it consisted of add/subtract/multiply/copy/if-test-else operations. Knocking out a working and tested version took about an hour. The tests were important because even my 150-line python implementation had a couple bugs in it. They also exposed some really shitty bugs in the written problem description. For instance a table listed 10 bits in one opcode as the 'imm' value but it should have been 3 bits of value with 7 bits of padding. The unit tests picked up bad values and a re-examination of the problem description led me to footnote 1.5 which said the value is 3-bits (why didn't they update the table too? no idea).
The nature of the VM lent itself to a fast C implementation. The executable cannot change itself so if the 30th opcode adds memory locations 101 and 102 it always adds memory locations 101 and 102. It never does a conditional jump or anything else funky. Writing a C version of the inner loop was almost as simple as adding print statements to the python version and then throwing the output at gcc. To understand the VM a little better (and test it even more) I added my own opcodes that did asserts and wrote a self-test binary that was a translation of my python unit tests into VM code. I could be confident in the C translation because it ran the self-check the same as the python version. The C version runs 1000 times faster than the pure python version, which is nice. Oddly, adding -Ox compiler flags makes the self test completely shit the bed; I say odd because the program is extremely deterministic so I guessed it would optimize nicely.

Writing a visualizer early (in pygame) was invaluable. The first draft just drew the current orbit, the target orbit, and the satellite's current position (state was printed to stdout). I assigned the arrow keys to manipulate the thrusters and discovered another bug in the spec -- setting thrust dx/dy points the thrusters in that direction so if you want to increase your speed in direction X you need to fire in direction negative X. Playing with the visualizer also answered some other ambiguities in the spec -- the simulation keeps track of your relative position to the Earth but it really means your relative position to the CENTER of the Earth adn not the surface. That isn't just important to know it also makes all the maths much easier.

Solving the level 1 problems was easy. Included in the problem description was maths for calculating thrust vectors for moving an object from one orbit to another. The maths, however, were for doing it using a minimum of fuel but the score for your solution is maximized by doing it as quickly as possible and using all your available fuel. Thanks to the first and simplier solutions I understood enough orbital mechanics to know you always wanted to fire thrusters either perpendicular or parallel to the tangent of the orbit at your current position. Thanks to the screaming fast C implementation I could brute force how much and which way to fire to get a high scoring solution. Having a fast state push/pop was a huge advantage too. Because the simulator works in discrete 1-second intervals the "real" mathematic solution is off by a little; so instead you want to do things a little early or a little later than the "ideal." That discrete solution is easily brute forced once you have the "real" solution.

Writing the ctypes/so library was interesting. The VM is specified as having a program execution area (a list of opcode, arg1, arg2 tuples), a memeory area (a finite array of doubles), a boolean status flag, a double score value, and a couple short double arrays for IO. Because the program loop is deterministic it goes away when you translate it to C. Then you are left with a bunch of double arrays of known size, one boolean, and one other double. The organizers left a big hint that you could implement this as a single array of doubles by leaving the first two values in one of the arrays undefined. So the obvious solution is to stick the one special double and the boolean flag in there (as a double) and then just concat all the arrays-of-doubles together. The max combined size is finite and under 3000 * sizeof(double). Pre-allocating a big array of these and memcpy'ing them for push/pop of state becomes dirt cheap. As a bonus it makes the ctypes interface stupid simple too because the struct is just a single array of 3000 doubles. Kinda: python doesn't have a native double type so ctypes converts to float. In order to get and set the raw double values of the VM I made the ctypes definition a union of 3000 doubles and 3000 unsigned long longs; when deciding what to do floats were close enough but when initializing the VM data or writing the trace I could set/get the 8-byte ulonglongs (hurray for c's type ignorance!).

ctypes

In the past if I wanted C speed I always hand rolled CPython extension modules. It is a little bit of extra work but the speed is unbeatable (2x faster than pyrex in my experience). ctypes is so useful I don't ever think I'll ever hand-write an extension module again. The only stdlib module that currently uses ctypes is uuid, but I expect many new modules to use ctypes instead of doing it the hard way.

The VM operates on doubles but python only has a native float type. I was very tempted to create a new core datatype by copying Objects/floatobject.c and search/replace'ing every 'float' to 'double' but because I was using ctypes I opted for the quicker and simpler casting of those doubles a ulonglongs when I needed to set/get. I bet Numpy has a way to deal with all this but I'd already hit my limit on new-tools-learned-per-hour. I understand raw C

ctypes makes easy things easy and hard things possible. If your .so (*NIX .dll) has functions that take an int/void and return an int/void you don't even need to provide a prototype -- it just works. So instead of writing a full featured Python/C wrapper for my basic datatype (an array of 64bit values) I just wrote 6 lines of python that defined the struct layout and then a C library that had a bunch of manipulation functions that returned 1/0 success/failure values and 100 lines of python that mapped property names to assignments/reads of the memory chunk. It was much less work for 90% of the speed. It also meant it was easy to apply the unit tests for my pure-python solution to the hybrid solution because the only difference was the underlying storage - a dict for the pure python and an C-array for the ctypes wrapped .so.

pygame

In the past my goto-GUI has been Tkinter. I've been using Tk every since reading Learning Python in 2001 which is half a python book and half a Tk book. I've accumulated a personal library of Tk elements that do everything from menus to graph plotting to shape drawing. I threw it all out for this year for pygame and ended up with a very decent visualizer that was just a couple hundred lines of python code. I won't be going back to Tk for graphical GUIs in the future (I still like it for text).

[More later]

ps, I use "maths" plural like the British commonwealths simply because I like it; I lived in Australia for a year and it grew on me. However, you won't hear me singularizing "sports" or saying someone is "in hospital" - they are "in the hospital." English is the best language ever because what is legal is whatever works. To paraphrase a variously attributed quote: "English doesn't borrow from other languages. It takes them down dark alleys, bashes them on the head, and rifles through their pockets." Some people will tell you English has a giant number of rules, but really that someone is just trying to make sense of a system where anything goes.