Sunday, July 12, 2009

Dominoes

Playing dominoes is hard. Like playing card games there are many games you can play with dominoes and last night I was exposed to Muggins/Fives. Muggins has a points system and unlike the kiddie version of dominoes that I grew up with is not a game of chance (think War versus Bridge). The big tipoffs that it is a hard problem are that A) it is a partial information game [you can only see the board and your own hand] and B) grown men play against each other for money [I'm told the variant we were playing is very popular in the Bahamas].

The first thing I did was look for AI research papers on solving dominoes. After an hour of feeding terms to search engines I can say: there is no research. This is strange because Checkers (a boring perfect information game) was still interesting enough to researchers that it was solved only 10 years ago. People don't bet on checkers so the fact that dominoes is a research orphan left me intrigued. Of course there's a personal angle too: If it was an easy problem to solve I could solve it and then travel while grifting strangers out of their money; If it was a hard problem to solve (like Bridge or Go) then I'd add it to my list of "fun things to play."

In Muggins you get points when all of the exposed end tiles add up to a multiple of five. If you go out first you also get bonus points equal to the sum of the pips in your opponents' hands.

After a couple hours I had a working 150 line simulator and two simple strategies: play any legal move (dumb strategy) and play the highest scoring legal move (less dumb). I use the term "working" loosely. A number of nefarious bugs related to non-randomness lurked in the code. These caused the last player added to win ties in scoring, who goes first, and some other spots. You can view the final dominoes.py source here.

To implement the strategies I went for the simplest thing that could possibly work: generator coroutines. They have a simple interface and they keep state so even somewhat complicated strategies are possible without writing a big interface class. You just write a function with a couple breakpoints and everything just works. Here is the generator for "play the first move that is legal"

def dumb_player(player_no, board, hand):
''' play the first domino that is legal '''
yield None # signal that we are setup
while True:
for a, b in pairs_and_reverse(hand): # (a,b)+(b, a)
try:
board.play(player_no, a, b)
# if we get here it is a legal play
yield None
break
except IllegalPlay: pass
else:
# draw a possibly legal play
board.draw(player_no)
return


The boiler plate of setup and try/except Illegal play was identical over all the simple strategies so I refactored so that the strategy is just a scoring function applied to all legal moves. The simple strategy becomes ("a" and "b" are the two ends of the domino, "a" is the side that matches the board and "b" is the open end left after play):

def dumb_player(board, player_no, a, b):
''' randomly choose a legal play '''
return None # all plays are equal

.. and the strategy for playing the highest scoring move possible is

def score_player(board, player_no, a, b):
''' always plays the immediately highest scoring tile '''
sc = board.play(player_no, a, b)
board.undo()
return sc

Add a function that does a round-robin tourney of all scoring functions 5000 times each and you have a quick fitness test.

Well, it turns out dominoes is easy (or at least simple). This blew away all my assumptions; If you look at the source I have a dozen scoring functions some of which consider the pips on the already played tiles and the secret pips in the players hand. The best of them beats the very simple "score_player" just 50.5% of the time. 50.5% is a solid money maker if you are playing blackjack against the house at hundreds of hands an hour, but peanuts if you are playing 10 games of dominoes an hour against someone who is using the simple score_player strategy.

The source is out there and understandable (500 lines, 300 of which are short strategy functions). If anyone can consistently beat my best attempt "score_blocker6" then post the source in the comments and I'll buy you a beer next PyCon.

Sunday, July 5, 2009

ICFP Contest 2009

The ICFP 2009 programming contest started a couple weeks earlier than last year, and unfortunately I only found out about it as it was ending. The challenge is to solve as best as possible in 72 hours a problem that can't be brute forced in 72 hours. I love this challenge; I've been participating in the contest (in python) since 2002 and even left EuroPython early last year just to compete. I don't do writeups every year but I did for 2007, 2004, and 2002. This year I didn't participate officially but I did tackle the program after the fact.

The best resource for writeups and implementations is the FUN team page. They did their entry in python as did an amazingly large number of contestants this year. Their page has links to a dozen writeups, a sub-reddit, and some good pages on the maths of orbital mechanics.

The contest is a great chance to try out new tools and methodologies. I tried Unit Testing for the first time during an ICFP weekend (many moons ago) and I've been sold ever since. If Unit Testing helps you finish a time-limited competition faster then how could it possibly hurt during normal dev? I've also used the contest to try out pair programming, Test Driven Development, and pyrex [I won't talk about those more unless someone asks].

This year I used the contest as a chance to try out the ctypes module and Pygame gui library. More on that below, but first to the ICFP problem:

Problem Defined

This year, like many past years, the challenge was to implement a virtual machine that runs the binaries provided by the organizers. The VM binaries simulated a bunch of orbital mechanics. You then had to write programs that interacted with the simulations to push a satellite from one orbit to another, meet up with other satellites in orbit, and more complicated variations on the theme.
My strategy was the same as in past years:
  1. Read the problem description, write a reference implementation in pure python, and test the hell out of the reference implementation.
    [actual time: 1 hour]

  2. Use that to write a visualizer and explore the problem mechanics and horribly underspecified written problem description.
    [actual time
    [actual time: 2 hours. I had to install pygame and read the docs first]

  3. Write a work-alike in screaming fast C so as much time as possible can be spent solving the problem instead of waiting on simulations to finish.
    [actual time: 3 hours. I had to learn the ctypes module and fool around with it.


The VM was very simple; it consisted of add/subtract/multiply/copy/if-test-else operations. Knocking out a working and tested version took about an hour. The tests were important because even my 150-line python implementation had a couple bugs in it. They also exposed some really shitty bugs in the written problem description. For instance a table listed 10 bits in one opcode as the 'imm' value but it should have been 3 bits of value with 7 bits of padding. The unit tests picked up bad values and a re-examination of the problem description led me to footnote 1.5 which said the value is 3-bits (why didn't they update the table too? no idea).
The nature of the VM lent itself to a fast C implementation. The executable cannot change itself so if the 30th opcode adds memory locations 101 and 102 it always adds memory locations 101 and 102. It never does a conditional jump or anything else funky. Writing a C version of the inner loop was almost as simple as adding print statements to the python version and then throwing the output at gcc. To understand the VM a little better (and test it even more) I added my own opcodes that did asserts and wrote a self-test binary that was a translation of my python unit tests into VM code. I could be confident in the C translation because it ran the self-check the same as the python version. The C version runs 1000 times faster than the pure python version, which is nice. Oddly, adding -Ox compiler flags makes the self test completely shit the bed; I say odd because the program is extremely deterministic so I guessed it would optimize nicely.

Writing a visualizer early (in pygame) was invaluable. The first draft just drew the current orbit, the target orbit, and the satellite's current position (state was printed to stdout). I assigned the arrow keys to manipulate the thrusters and discovered another bug in the spec -- setting thrust dx/dy points the thrusters in that direction so if you want to increase your speed in direction X you need to fire in direction negative X. Playing with the visualizer also answered some other ambiguities in the spec -- the simulation keeps track of your relative position to the Earth but it really means your relative position to the CENTER of the Earth adn not the surface. That isn't just important to know it also makes all the maths much easier.

Solving the level 1 problems was easy. Included in the problem description was maths for calculating thrust vectors for moving an object from one orbit to another. The maths, however, were for doing it using a minimum of fuel but the score for your solution is maximized by doing it as quickly as possible and using all your available fuel. Thanks to the first and simplier solutions I understood enough orbital mechanics to know you always wanted to fire thrusters either perpendicular or parallel to the tangent of the orbit at your current position. Thanks to the screaming fast C implementation I could brute force how much and which way to fire to get a high scoring solution. Having a fast state push/pop was a huge advantage too. Because the simulator works in discrete 1-second intervals the "real" mathematic solution is off by a little; so instead you want to do things a little early or a little later than the "ideal." That discrete solution is easily brute forced once you have the "real" solution.

Writing the ctypes/so library was interesting. The VM is specified as having a program execution area (a list of opcode, arg1, arg2 tuples), a memeory area (a finite array of doubles), a boolean status flag, a double score value, and a couple short double arrays for IO. Because the program loop is deterministic it goes away when you translate it to C. Then you are left with a bunch of double arrays of known size, one boolean, and one other double. The organizers left a big hint that you could implement this as a single array of doubles by leaving the first two values in one of the arrays undefined. So the obvious solution is to stick the one special double and the boolean flag in there (as a double) and then just concat all the arrays-of-doubles together. The max combined size is finite and under 3000 * sizeof(double). Pre-allocating a big array of these and memcpy'ing them for push/pop of state becomes dirt cheap. As a bonus it makes the ctypes interface stupid simple too because the struct is just a single array of 3000 doubles. Kinda: python doesn't have a native double type so ctypes converts to float. In order to get and set the raw double values of the VM I made the ctypes definition a union of 3000 doubles and 3000 unsigned long longs; when deciding what to do floats were close enough but when initializing the VM data or writing the trace I could set/get the 8-byte ulonglongs (hurray for c's type ignorance!).

ctypes

In the past if I wanted C speed I always hand rolled CPython extension modules. It is a little bit of extra work but the speed is unbeatable (2x faster than pyrex in my experience). ctypes is so useful I don't ever think I'll ever hand-write an extension module again. The only stdlib module that currently uses ctypes is uuid, but I expect many new modules to use ctypes instead of doing it the hard way.

The VM operates on doubles but python only has a native float type. I was very tempted to create a new core datatype by copying Objects/floatobject.c and search/replace'ing every 'float' to 'double' but because I was using ctypes I opted for the quicker and simpler casting of those doubles a ulonglongs when I needed to set/get. I bet Numpy has a way to deal with all this but I'd already hit my limit on new-tools-learned-per-hour. I understand raw C

ctypes makes easy things easy and hard things possible. If your .so (*NIX .dll) has functions that take an int/void and return an int/void you don't even need to provide a prototype -- it just works. So instead of writing a full featured Python/C wrapper for my basic datatype (an array of 64bit values) I just wrote 6 lines of python that defined the struct layout and then a C library that had a bunch of manipulation functions that returned 1/0 success/failure values and 100 lines of python that mapped property names to assignments/reads of the memory chunk. It was much less work for 90% of the speed. It also meant it was easy to apply the unit tests for my pure-python solution to the hybrid solution because the only difference was the underlying storage - a dict for the pure python and an C-array for the ctypes wrapped .so.

pygame

In the past my goto-GUI has been Tkinter. I've been using Tk every since reading Learning Python in 2001 which is half a python book and half a Tk book. I've accumulated a personal library of Tk elements that do everything from menus to graph plotting to shape drawing. I threw it all out for this year for pygame and ended up with a very decent visualizer that was just a couple hundred lines of python code. I won't be going back to Tk for graphical GUIs in the future (I still like it for text).

[More later]

ps, I use "maths" plural like the British commonwealths simply because I like it; I lived in Australia for a year and it grew on me. However, you won't hear me singularizing "sports" or saying someone is "in hospital" - they are "in the hospital." English is the best language ever because what is legal is whatever works. To paraphrase a variously attributed quote: "English doesn't borrow from other languages. It takes them down dark alleys, bashes them on the head, and rifles through their pockets." Some people will tell you English has a giant number of rules, but really that someone is just trying to make sense of a system where anything goes.

Tuesday, April 7, 2009

HOWTO: get useful information out of the buildbot

The CPython core has a raft of machines that do nothing but pull updates from subversion (the code repository) and run the unit tests. You can see the full and somewhat cryptic list of all the boxes and their status on the buildbot webpage. I had to relearn how to read all the output because I had failing tests that only failed on other people's boxen. So here's the HOWTO.

Find your branch
Ten minutes after your checkin reload the buildbot page and find the machines running the branch you checked into. The machines are titled with the codebase being run, currently either "trunk" (aka 2.7), 2.6 (the maintenance branch), 3.0 (another maintenance branch), or 3.x (the py3k trunk). The other words in the name are some combination of hardware, operating system, and compiler.

Open a bunch of tabs
Each vertical column below the name is a time series of builds and statuses with the most recent at the top. The items are either Green (completed, OK), Red (completed, catastrophe), or Yellow (either still running, ambiguous success, or informational). Open a tab by clicking on the "Build NNN" links on all the machines running the branch you care about. Your checkin is listed in the leftmost column so only pick builds than start above (afterwards) that checkin. Then wait an hour or two. [what does the build number mean? I have no idea but I'm guessing the Nth build for that machine]

Check the builds
Most of the builds should have finished so go ahead and reload all the tabs for the individual machines. If the build is still in progress you can tell by the giant header that says "Build In Progress." If it is done you will see a series of little headers and links. Each header is for the different stages: update from svn, run ./configure, recompile the source, and run the test suite. The link titled "stdio" after each of these should be renamed more plainly "view ./configure output," "view test output" etc. This is what you want to see.

Find the output you care about
Search to find the tests and failures that apply to you. Especially on the trunk there may be failures that aren't your fault. Someone elses' checkin might even be causing an abort before your stuff even gets run. If you stuff works, great! If not..

Checkin, rinse, and repeat
Based on the output you may need to make another checkin and let all the buldbots run again. If the failure isn't verbose enough then you will have to checkin some debugging output and wait for them to run again.

.. and that's all there is to it.

telnetlib progress

The first item in my Fixing Telnetlib TODO ("#1 test the hell out of telnet") is nearly done. The unit tests now test IAC handling and SB negotiation in addition to the read_* methods. As a bonus it looks like I fixed all the race conditions in the read tests'cause the builbots are going greener. (aside: did you know about Queue.join()? I didn't, very handy).

The only remaining nit is that the SB data tests are creating an uncollectable GC cycle. The Telnet object has a reference to the negotiation callback. The negotiation callback needs to call telnetob.read_sb_data() to get at the SB data. So I have a nego_collector class that looks like

class nego_collector(object):
def __init__(self, sb_getter=None):
self.seen = ''
self.sb_getter = sb_getter # cycle, this is a Telnet.read_sb_data bound method
self.sb_seen = ''

def do_nego(self, sock, cmd, opt):
self.seen += cmd + opt
if cmd == tl.SE and self.sb_getter:
sb_data = self.sb_getter()
self.sb_seen += sb_data

The nego_collector either needs to keep a weakref to the function or we have to break the cycle manually. Consider this just another crufty corner in telnetlib.

[woops]. I spoke too soon. Not all the buildbots are passing so I now have a machine running the telnetlib tests in an infinite loop with the CPU heavily loaded. Hopefully I can smoke out the remaining race conditions locally. If not I'll have to sign up to use the Snakebite testing farm.
[later] Fixed. Almost certainly. We now allow a margin of error of 100% (a whopping 0.3 seconds) in our timing assertions and we do fewer of them.

Saturday, April 4, 2009

Speaking about Speaking

AMK's talk How to Give a Python Talk is very informative, you should watch it even if you aren't planning on giving a talk. Why should you watch it? partly because it gives you an idea of what goes into a talk and partly because it demystifies giving a talk enough that it might prompt you into giving one. Lots of solid advice.

Andrew's talk itself is a nice illustration of some of his points. No one would mistake Andrew for a motivation speaker; you don't walk away from that talk with an inexplicable need to buy what he's selling and given the audience you might actually be pissed off if you thought he was trying to sell something. (talk->content != NULL) ? Good_talk : Bad_talk. PyCon attendees care more about red meat than glitter and are very forgiving on presentation if the red meat is there.

How I do it what I do to prepare has heavy overlap with what Andrew recommends. Practice is king. When I step on the stage I'm not nervous per se, but when speaking in front of a large audience I do tend to read the slides much more than I talk about them in practice. So my rule of thumb is to practice a talk where I spend three minutes per slide knowing that I'll drop most of my segues and only spend one minute live talking per slide. Figure out your own constant and practice against that. I was amazed at Ned Batchelder's talk because the the video of his talk matched so closely with his text explication of his slides. The prepared text is almost 1-to-1 which I personally just can't do.

Narrative, Narrative, Narrative: Pick a theme and stick with it. If you don't talk to your premise once every couple minutes then you have failed. My talk was "Class Decorators: Radically Simple" and I tried to say on every example that a decorator was a callable that took one argument and returned something. Raymond Hettinger's talk was "Easy AI in Python" and he started and finished every example emphasizing that a novice could do it. Alex Martelli's talk was "Abstractions as Leverage" and he introduced every slide with a quote from a very dead (and sometimes white) male who had made the same point back when writing was a novelty. It seems odd but part of your job as a speaker is to repeat yourself, repeatedly.

Don't drink coffee: This sucks, but you can't drink your normal amount of coffee before your talk. I was hoping to drink a few cups and balance it out with a bloody mary but my talk was in the AM and the hotel bar wasn't open. Instead I drank only a little coffee so I wouldn't be humming on stage. I'm told Beta Blockers work to suppress the nerves (symphony orchestras use them) but I haven't tried it myself.

Practice is free and Plentiful: It is a not-so-secret fact that user groups, PIGs, and even Cons are starved for presenters. My most recent talk started as a lightning talk and then I gave it at a local user's group and a couple Cons that had 90%+ acceptance rates before giving it at PyCon. Practice is good and the opportunities for practice are many.

You already know something to talk about At the Boston PIG talk-dry-run (all the PyCon presenters gave their talk to 30 people a week before they gave it to 300+) I spent the first five minutes talking about talking. You do know something you can do a talk about and it sounds like "what is something I wish I knew about one year ago?" It's that easy. Try one or three ideas on the local group as a lightning talk and then grow the best one into a proper talk proposal.

It isn't complicated, see you with a speaker's badge next year!

Small test_telnetlib progress

My first patch of test_telnetlib is up. It tests most of the guarantees that the various Telnet.read_* methods make (I'm sure I missed a couple). The only problem is that every single test theoretically has a race condition. In actual practice the chances of a race are 0.0%, but theoretically it isn't sound. I posted it as a patch (as opposed to just committing it) to see if anyone has an opinion.

For the next round of tests I'll be writing unit tests for the out-of-band negotiations parser.

Friday, April 3, 2009

PyCon Errata

Old and New Faces It was good to see everyone, too many names to mention. That includes all the other Boston pythoneers who I tend to see just once a year and in a city not named "Boston." There is never enough time to time to talk to everyone but I did try. I also did my usual thing which is to purposely eat lunch with no one I know [it's my fifth PyCon so this rule has been relaxed to "as few people I know as possible"]. A few mentions: somehow I'd never met Jesse Noller before (despite many PyCons and him being in Boston); Georg Brandl made it over to the US for PyCon for the first time; I didn't run into Martin Blais until day five when he was sitting next to me at sprints; a sixteen year old (who is senior to me on py-dev) thanked me for contributing a patch; and David Mertz (whom I had never met in person) ran up, introduced himself, and disappeared into the ether (far too brief: I have to invite him over for dinner or something).

Limited Excess In a down economy attendance and freebies were also down. Almost no speakers ended their talk with a "and we'ere hiring!" slide as opposed to the past standard of 100%. To my shock and horror I actually had to pay for most of my own dinners and drinks. CCP/EVE Online was a standout in this respect [If you're wondering how a company in Iceland can afford to be generous remember that their subscribers pay in dollars and euros, not kronas].

EVE Fan-Fest I learned about EVE Fan-Fest not from the CCP guys but from a husband/wife team of players. 1500+ gamers descend on Reykjavik annually. This is such a large number of extra people for a country of 300k that the conference has to be closely coordinated with the government, hotels, and airlines. The mind reels.

Code Blindness By the end of sprints I was suffering from the geek equivalent of snow blindness. Throughout sprints I traded bug reports, emails, and checkins with Hiro Yamamoto (the "John Smith" of Japan). He'd miss something and I'd whargarbl his name under my breath. I'd miss something and know he was grumbling half way across the world. I pretty clearly lost that battle when I committed a patch that checked to see if unsigned longs were less than zero (oh sure, the compiler can optimize it out, but still..). Which reminds me, I still need to revert that.

We have a prodigy on our critical path. Python's release manager is Benjamin Peterson and Benjamin is sixteen years old. On the internet nobody knows you're a dog and in open source no one cares if you're in High School. He gets stuff done, end of story. There is a small amount of cognitive dissonance involved, but not much. For instance he gave me an attaboy for a patch I submitted last year - and while I have shoes that are older than he is - he sincerely meant it as a compliment and I took it as such. He's good people to have around - though if he gets a driver's license or a girlfriend we're in a spot of trouble. [I talked to his mother only briefly but she treated his hobby as casually as if he was on a sports team.]

Benjamin is not without precedent. Our now somewhat older prodigy is named Georg Brandl. The idea of prolonged adolescence is pretty new in cultural terms (less than 60 years old). Both men are sterling illustrations that when you treat "kids" like adults, they behave like adults (heck, they were adults in the first place but just not acknowledged as so). Let's have more of this please.

Twitter Twitter was the breakout story of the year at PyCon. I've peeked at it several times but never seen the point. I'm so old school I still refer to IM as "talk." Twitter was nowhere to be seen last year but this year it was pervasive. Sure, most of the tweets were mindless blather but they fill the mindless blather niche very well. "bourbon in the Kennedy room" is useful when broadcast but not the kind of thing you'd send an email about. Michael Foord (aka voidspace) gained 50 followers a day during the conference. I have reluctantly broken down and signed up too. Oddly one of my first tweets was answering the question "do I need stitches for this?" which is something I know much about (I had a very full childhood and I have the scars to prove it).

My Talk Video of my talk Class Decorators: Radically Simple is now online. I was pleased with my performance until I saw the video. Thankfully attendees care more about content that presentation because there are a dozen things I would like to do over; I don't have a future as a motivational speaker. I have done a talk on that same topic several times now and this time was a giant rewrite. The night before I was in bed by midnight but tossed and turned. I ended up giving up and rewriting large portions until 5am. I slept for three hours and what you see was me looking at the slides for the second time. All the ridiculous example slides were what people [unsolicited!] came up and told me is what made class decorators "click" for them. Go figure.

There is a raft of little things I would change about the presentation. Unfortunately I won't ever give it again so I'll have to apply them to my next talk (after I think one up). Bloused shirt? gone, starch that thing and make sure it is tucked in. Conversational voice? gone, I have a separate speaker's voice and I didn't use it (lack of sleep?). USB remote slide dongle? gone, I spent as much time aiming the laser pointer at the screen as I did talking to the room. Wireless mike? keep, standing at the podium sucks [I lucked out - I was in the only room that had a wireless mike and I only got to use it because I asked].

Oh, and the perenial "pause between sentences." For the first five minutes I talked like I was reading a teleprompter. There isn't much you can do about this other than practice.

[and then some more errata]

International As I've mentioned before PyCon is the inverse of EuroPython in that it is 75% American and 25% European (eyeball numbers: I'd love to see hard data on this). The speakers list is somewhat more static because there is a subset of people who go to conventions for fun (myself included). To confuse things further there are a number of Americans who weren't born here and some "Americans" who are American but not in name (Alex Martelli is still Italian for sentimental reasons despite living in and literally marrying into to America).

Martelli's Slides Alex Martelli's slides are immediately recognizable because he uses the same background and the same quirky font on all of them, always. I got the scoop from Anna Ravenscroft (a sometimes PyCon speaker and AKA Mrs Alex Martelli). He is fond of the background and font because they remind him of a blackboard. No one has complained so that's all there is to it.

Sprints are Magic Two days of sprints generated the same amount of python-checkin traffic as a regular month. Questions are just so much cheaper in person than in email that it couldn't be otherwise. Raise you hand and say "can anyone tell me about [interface]" and you get an answer. Person-to-person social pressures also lead to quicker bug resolution. Jesse Noller said something like "I assigned a pickle functools bug to you while you were in the can, it seemed up your alley." It wasn't up my alley but a few hours later I had read the pickle docs and checked in a patch to make functools.partial instances pickle-able.