Tuesday, April 7, 2009

HOWTO: get useful information out of the buildbot

The CPython core has a raft of machines that do nothing but pull updates from subversion (the code repository) and run the unit tests. You can see the full and somewhat cryptic list of all the boxes and their status on the buildbot webpage. I had to relearn how to read all the output because I had failing tests that only failed on other people's boxen. So here's the HOWTO.

Find your branch
Ten minutes after your checkin reload the buildbot page and find the machines running the branch you checked into. The machines are titled with the codebase being run, currently either "trunk" (aka 2.7), 2.6 (the maintenance branch), 3.0 (another maintenance branch), or 3.x (the py3k trunk). The other words in the name are some combination of hardware, operating system, and compiler.

Open a bunch of tabs
Each vertical column below the name is a time series of builds and statuses with the most recent at the top. The items are either Green (completed, OK), Red (completed, catastrophe), or Yellow (either still running, ambiguous success, or informational). Open a tab by clicking on the "Build NNN" links on all the machines running the branch you care about. Your checkin is listed in the leftmost column so only pick builds that start above (afterwards) that checkin. Then wait an hour or two. [what does the build number mean? I have no idea but I'm guessing the Nth build for that machine]

Check the builds
Most of the builds should have finished so go ahead and reload all the tabs for the individual machines. If the build is still in progress you can tell by the giant header that says "Build In Progress." If it is done you will see a series of little headers and links. Each header is for the different stages: update from svn, run ./configure, recompile the source, and run the test suite. The link titled "stdio" after each of these should be renamed more plainly "view ./configure output," "view test output" etc. This is what you want to see.

Find the output you care about
Search to find the tests and failures that apply to you. Especially on the trunk there may be failures that aren't your fault. Someone else's checkin might even be causing an abort before your stuff even gets run. If your stuff works, great! If not..

Checkin, rinse, and repeat
Based on the output you may need to make another checkin and let all the buildbots run again. If the failure isn't verbose enough then you will have to check in some debugging output and wait for them to run again.

.. and that's all there is to it.

telnetlib progress

The first item in my Fixing Telnetlib TODO ("#1 test the hell out of telnet") is nearly done. The unit tests now test IAC handling and SB negotiation in addition to the read_* methods. As a bonus it looks like I fixed all the race conditions in the read tests 'cause the buildbots are going greener. (aside: did you know about Queue.join()? I didn't, very handy).
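For anyone else who hadn't met Queue.join(), here is a minimal sketch of how it pairs with task_done(). It uses the Python 3 module name `queue` (the module was spelled `Queue` in 2.x), and the `worker` function and doubling "work" are made up for illustration; this is not the code in test_telnetlib.

```python
import queue
import threading

def worker(q, results):
    """Pull items until the None sentinel arrives (hypothetical worker)."""
    while True:
        item = q.get()
        if item is None:
            q.task_done()      # the sentinel counts as a queued item too
            break
        results.append(item * 2)
        q.task_done()          # mark this item as fully processed

q = queue.Queue()
results = []
t = threading.Thread(target=worker, args=(q, results))
t.start()

for n in range(5):
    q.put(n)
q.put(None)

q.join()   # blocks until task_done() has been called once per put()
t.join()
```

The point is that join() waits for the work to be finished, not merely for the queue to be empty, which is the distinction that matters when a test thread and a server thread have to agree on "done."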

The only remaining nit is that the SB data tests are creating an uncollectable GC cycle. The Telnet object has a reference to the negotiation callback. The negotiation callback needs to call telnetob.read_sb_data() to get at the SB data. So I have a nego_collector class that looks like

class nego_collector(object):
    def __init__(self, sb_getter=None):
        self.seen = ''
        self.sb_getter = sb_getter # cycle, this is a Telnet.read_sb_data bound method
        self.sb_seen = ''

    def do_nego(self, sock, cmd, opt):
        self.seen += cmd + opt
        if cmd == tl.SE and self.sb_getter:
            sb_data = self.sb_getter()
            self.sb_seen += sb_data

The nego_collector either needs to keep a weakref to the function or we have to break the cycle manually. Consider this just another crufty corner in telnetlib.
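A rough sketch of the weakref option, under some loud assumptions: it uses weakref.WeakMethod, which only exists in Python 3.4 and later, and `FakeTelnet` and `WeakNegoCollector` are hypothetical stand-ins written for this example rather than anything in telnetlib or its tests. A plain weakref.ref to a bound method dies immediately because the bound method object is temporary, which is why WeakMethod is used here.

```python
import weakref

SE = b'\xf0'  # Telnet "subnegotiation end" byte (240)

class FakeTelnet(object):
    """Hypothetical stand-in for telnetlib.Telnet; not the real class."""
    def __init__(self):
        self.option_callback = None
        self.sbdataq = b'sb-bytes'

    def set_option_negotiation_callback(self, callback):
        self.option_callback = callback

    def read_sb_data(self):
        data, self.sbdataq = self.sbdataq, b''
        return data

class WeakNegoCollector(object):
    def __init__(self, sb_getter=None):
        self.sb_seen = b''
        # Hold the bound method weakly: collector -> Telnet is no longer
        # a strong reference, so there is no cycle to collect.
        self._sb_getter = weakref.WeakMethod(sb_getter) if sb_getter else None

    def do_nego(self, sock, cmd, opt):
        getter = self._sb_getter() if self._sb_getter else None
        if cmd == SE and getter is not None:  # None once the Telnet is gone
            self.sb_seen += getter()

tn = FakeTelnet()
nego = WeakNegoCollector(tn.read_sb_data)
tn.set_option_negotiation_callback(nego.do_nego)
nego.do_nego(None, SE, b'\x00')

tn_ref = weakref.ref(tn)
del tn
# On CPython the object is freed by reference counting alone.
alive_after_del = tn_ref() is not None
```

Breaking the cycle manually is the other choice: set the collector's getter (or the Telnet object's callback) to None in the test's tearDown.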

[woops]. I spoke too soon. Not all the buildbots are passing so I now have a machine running the telnetlib tests in an infinite loop with the CPU heavily loaded. Hopefully I can smoke out the remaining race conditions locally. If not I'll have to sign up to use the Snakebite testing farm.
[later] Fixed. Almost certainly. We now allow a margin of error of 100% (a whopping 0.3 seconds) in our timing assertions and we do fewer of them.

Saturday, April 4, 2009

Speaking about Speaking

AMK's talk How to Give a Python Talk is very informative; you should watch it even if you aren't planning on giving a talk. Why should you watch it? Partly because it gives you an idea of what goes into a talk and partly because it demystifies giving a talk enough that it might prompt you into giving one. Lots of solid advice.

Andrew's talk itself is a nice illustration of some of his points. No one would mistake Andrew for a motivational speaker; you don't walk away from that talk with an inexplicable need to buy what he's selling and given the audience you might actually be pissed off if you thought he was trying to sell something. (talk->content != NULL) ? Good_talk : Bad_talk. PyCon attendees care more about red meat than glitter and are very forgiving on presentation if the red meat is there.

How I do it: What I do to prepare has heavy overlap with what Andrew recommends. Practice is king. When I step on the stage I'm not nervous per se, but when speaking in front of a large audience I do tend to read the slides much more than I talk about them in practice. So my rule of thumb is to practice a talk where I spend three minutes per slide knowing that I'll drop most of my segues and only spend one minute live talking per slide. Figure out your own constant and practice against that. I was amazed at Ned Batchelder's talk because the video of his talk matched so closely with his text explication of his slides. The prepared text is almost 1-to-1 which I personally just can't do.

Narrative, Narrative, Narrative: Pick a theme and stick with it. If you don't talk to your premise once every couple minutes then you have failed. My talk was "Class Decorators: Radically Simple" and I tried to say on every example that a decorator was a callable that took one argument and returned something. Raymond Hettinger's talk was "Easy AI in Python" and he started and finished every example emphasizing that a novice could do it. Alex Martelli's talk was "Abstractions as Leverage" and he introduced every slide with a quote from a very dead (and sometimes white) male who had made the same point back when writing was a novelty. It seems odd but part of your job as a speaker is to repeat yourself, repeatedly.

Don't drink coffee: This sucks, but you can't drink your normal amount of coffee before your talk. I was hoping to drink a few cups and balance it out with a bloody mary but my talk was in the AM and the hotel bar wasn't open. Instead I drank only a little coffee so I wouldn't be humming on stage. I'm told Beta Blockers work to suppress the nerves (symphony orchestras use them) but I haven't tried it myself.

Practice is free and Plentiful: It is a not-so-secret fact that user groups, PIGs, and even Cons are starved for presenters. My most recent talk started as a lightning talk and then I gave it at a local user's group and a couple Cons that had 90%+ acceptance rates before giving it at PyCon. Practice is good and the opportunities for practice are many.

You already know something to talk about: At the Boston PIG talk-dry-run (all the PyCon presenters gave their talk to 30 people a week before they gave it to 300+) I spent the first five minutes talking about talking. You do know something you can do a talk about and it sounds like "what is something I wish I knew about one year ago?" It's that easy. Try one or three ideas on the local group as a lightning talk and then grow the best one into a proper talk proposal.

It isn't complicated, see you with a speaker's badge next year!

Small test_telnetlib progress

My first patch of test_telnetlib is up. It tests most of the guarantees that the various Telnet.read_* methods make (I'm sure I missed a couple). The only problem is that every single test theoretically has a race condition. In actual practice the chances of a race are 0.0%, but theoretically it isn't sound. I posted it as a patch (as opposed to just committing it) to see if anyone has an opinion.

For the next round of tests I'll be writing unit tests for the out-of-band negotiations parser.

Friday, April 3, 2009

PyCon Errata

Old and New Faces: It was good to see everyone, too many names to mention. That includes all the other Boston pythoneers who I tend to see just once a year and in a city not named "Boston." There is never enough time to talk to everyone but I did try. I also did my usual thing which is to purposely eat lunch with no one I know [it's my fifth PyCon so this rule has been relaxed to "as few people I know as possible"]. A few mentions: somehow I'd never met Jesse Noller before (despite many PyCons and him being in Boston); Georg Brandl made it over to the US for PyCon for the first time; I didn't run into Martin Blais until day five when he was sitting next to me at sprints; a sixteen year old (who is senior to me on py-dev) thanked me for contributing a patch; and David Mertz (whom I had never met in person) ran up, introduced himself, and disappeared into the ether (far too brief: I have to invite him over for dinner or something).

Limited Excess: In a down economy attendance and freebies were also down. Almost no speakers ended their talk with an "and we're hiring!" slide as opposed to the past standard of 100%. To my shock and horror I actually had to pay for most of my own dinners and drinks. CCP/EVE Online was a standout in this respect [If you're wondering how a company in Iceland can afford to be generous remember that their subscribers pay in dollars and euros, not kronas].

EVE Fan-Fest: I learned about EVE Fan-Fest not from the CCP guys but from a husband/wife team of players. 1500+ gamers descend on Reykjavik annually. This is such a large number of extra people for a country of 300k that the conference has to be closely coordinated with the government, hotels, and airlines. The mind reels.

Code Blindness: By the end of sprints I was suffering from the geek equivalent of snow blindness. Throughout sprints I traded bug reports, emails, and checkins with Hiro Yamamoto (the "John Smith" of Japan). He'd miss something and I'd whargarbl his name under my breath. I'd miss something and know he was grumbling half way across the world. I pretty clearly lost that battle when I committed a patch that checked to see if unsigned longs were less than zero (oh sure, the compiler can optimize it out, but still..). Which reminds me, I still need to revert that.

We have a prodigy on our critical path. Python's release manager is Benjamin Peterson and Benjamin is sixteen years old. On the internet nobody knows you're a dog and in open source no one cares if you're in High School. He gets stuff done, end of story. There is a small amount of cognitive dissonance involved, but not much. For instance he gave me an attaboy for a patch I submitted last year - and while I have shoes that are older than he is - he sincerely meant it as a compliment and I took it as such. He's good people to have around - though if he gets a driver's license or a girlfriend we're in a spot of trouble. [I talked to his mother only briefly but she treated his hobby as casually as if he was on a sports team.]

Benjamin is not without precedent. Our now somewhat older prodigy is named Georg Brandl. The idea of prolonged adolescence is pretty new in cultural terms (less than 60 years old). Both men are sterling illustrations that when you treat "kids" like adults, they behave like adults (heck, they were adults in the first place but just not acknowledged as so). Let's have more of this please.

Twitter: Twitter was the breakout story of the year at PyCon. I've peeked at it several times but never seen the point. I'm so old school I still refer to IM as "talk." Twitter was nowhere to be seen last year but this year it was pervasive. Sure, most of the tweets were mindless blather but they fill the mindless blather niche very well. "bourbon in the Kennedy room" is useful when broadcast but not the kind of thing you'd send an email about. Michael Foord (aka voidspace) gained 50 followers a day during the conference. I have reluctantly broken down and signed up too. Oddly one of my first tweets was answering the question "do I need stitches for this?" which is something I know much about (I had a very full childhood and I have the scars to prove it).

My Talk: Video of my talk Class Decorators: Radically Simple is now online. I was pleased with my performance until I saw the video. Thankfully attendees care more about content than presentation because there are a dozen things I would like to do over; I don't have a future as a motivational speaker. I have done a talk on that same topic several times now and this time was a giant rewrite. The night before I was in bed by midnight but tossed and turned. I ended up giving up and rewriting large portions until 5am. I slept for three hours and what you see was me looking at the slides for the second time. All the ridiculous example slides were what people [unsolicited!] came up and told me is what made class decorators "click" for them. Go figure.

There is a raft of little things I would change about the presentation. Unfortunately I won't ever give it again so I'll have to apply them to my next talk (after I think one up). Bloused shirt? gone, starch that thing and make sure it is tucked in. Conversational voice? gone, I have a separate speaker's voice and I didn't use it (lack of sleep?). USB remote slide dongle? gone, I spent as much time aiming the laser pointer at the screen as I did talking to the room. Wireless mike? keep, standing at the podium sucks [I lucked out - I was in the only room that had a wireless mike and I only got to use it because I asked].

Oh, and the perennial "pause between sentences." For the first five minutes I talked like I was reading a teleprompter. There isn't much you can do about this other than practice.

[and then some more errata]

International: As I've mentioned before PyCon is the inverse of EuroPython in that it is 75% American and 25% European (eyeball numbers: I'd love to see hard data on this). The speakers list is somewhat more static because there is a subset of people who go to conventions for fun (myself included). To confuse things further there are a number of Americans who weren't born here and some "Americans" who are American but not in name (Alex Martelli is still Italian for sentimental reasons despite living in and literally marrying into America).

Martelli's Slides: Alex Martelli's slides are immediately recognizable because he uses the same background and the same quirky font on all of them, always. I got the scoop from Anna Ravenscroft (a sometimes PyCon speaker and AKA Mrs Alex Martelli). He is fond of the background and font because they remind him of a blackboard. No one has complained so that's all there is to it.

Sprints are Magic: Two days of sprints generated the same amount of python-checkin traffic as a regular month. Questions are just so much cheaper in person than in email that it couldn't be otherwise. Raise your hand and say "can anyone tell me about [interface]" and you get an answer. Person-to-person social pressures also lead to quicker bug resolution. Jesse Noller said something like "I assigned a pickle functools bug to you while you were in the can, it seemed up your alley." It wasn't up my alley but a few hours later I had read the pickle docs and checked in a patch to make functools.partial instances pickle-able.
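For the curious, this is roughly what "pickle-able" buys you, shown as a small sketch on a current Python 3 rather than the patch itself. The `parse_binary` name is just an example; the wrapped callable has to be something pickle can find by name (here the builtin `int`).

```python
import functools
import pickle

# A partial that pre-binds a keyword argument of a builtin.
parse_binary = functools.partial(int, base=2)

# Round-trip through pickle; the copy keeps func and keywords.
clone = pickle.loads(pickle.dumps(parse_binary))
value = clone('101')
```

The round trip preserves the wrapped function and its frozen arguments, so partials can be handed to things like multiprocessing that pickle their work items.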

Fixing telnetlib

During the PyCon sprints I re-assigned all open (and unclaimed) telnetlib bugs to myself. The biggest longstanding complaint about telnetlib is that non-trivial negotiations aren't possible because the negotiation callback is very bare bones. The biggest problem with telnetlib is that there is almost no test suite - which is why some bugs have been open for seven years. So my priorities are first to test the hell out of telnetlib and second to improve negotiation.

The negotiation problem is clearest when dealing with two-way communications like NAWS (Negotiate About Window Size). The first time the server asks DO NAWS the client can reply WILL NAWS and include its current window size. The current negotiation callback supports this just fine. But when the client resizes its window it needs to be able to tell the server, which means Telnet needs a hook for a pending negotiations queue. And forget about the STATUS code which asks the other end of the connection to say what options it thinks have been negotiated - the current Telnet has no notion of state.
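To make the NAWS exchange concrete, here is a sketch of the bytes involved as RFC 1073 describes them: a WILL NAWS reply, then a subnegotiation carrying width and height as two 16-bit big-endian numbers. The constant values match the Telnet option codes, but the helper names `will_naws` and `naws_subnegotiation` are invented for this example and are not part of telnetlib.

```python
import struct

# Telnet command and option codes (RFC 854 / RFC 1073).
IAC, SB, SE = 255, 250, 240
WILL, DO = 251, 253
NAWS = 31

def will_naws():
    """Reply to the server's IAC DO NAWS."""
    return bytes([IAC, WILL, NAWS])

def naws_subnegotiation(width, height):
    """Build IAC SB NAWS <width> <height> IAC SE for the given size."""
    payload = struct.pack('!HH', width, height)
    # Any 255 byte inside the payload must be doubled so it is not
    # mistaken for the start of a Telnet command.
    payload = payload.replace(bytes([IAC]), bytes([IAC, IAC]))
    return bytes([IAC, SB, NAWS]) + payload + bytes([IAC, SE])
```

Building the bytes is the easy half. The hard half is the one described above: the client has to be able to send a fresh subnegotiation whenever the window changes, not only in response to something the server said.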

Below are the raw TODO and research notes I put together in a few hours at sprints. I used google code search to find some of the attempts to fix telnetlib by either subclassing it or writing a semi-compatible Telnet-alike from scratch (these are harder to grep for, for obvious reasons). The RFCs section marks each RFC as Must/Will/Won't implement. "Must implement" means core stuff for the Telnet class, "Will implement" means the telnetlib should include a negotiation implementation for that RFC, and "Won't implement" means it won't (because the RFC is either archaic or otherwise unused in the wild). The BUGS list includes all open bugs and the closed bugs I want to revisit or double-check.

---- TESTING TELNETLIB ----
* Testing
- test the read_* guarantees
- test timeouts (already implemented?)
- test the sb handling
* make real negotiation possible
* add real timeout and prompt exceptions
* make Telnet objects context managers
* process_rawq is a train wreck. Make sure we do something compatible but less icky.
* figure out where the hell they found all those constants.
* Why is chr(17)/"\021" blindly filtered out of the stream?


---- BUGS ----

OPEN

http://bugs.python.org/issue5188
telnetlib process_rawq buffer handling is confused

http://bugs.python.org/issue2550
SO_REUSEADDR doesn't have the same semantics on Windows as on Unix

http://bugs.python.org/issue1360221
telnetlib expect() and read_until() do not time out properly

http://bugs.python.org/issue1252001
Issue with telnetlib read_until not timing out

http://bugs.python.org/issue1049450
Solaris: EINTR exception in select/socket calls in telnetlib

http://bugs.python.org/issue708007
TelnetPopen3, TelnetBase, Expect split
[THIS, a rewrite of telnetlib. Mine for good stuff]

http://bugs.python.org/issue1678077
improve telnetlib.Telnet so option negotiation becomes easier

http://bugs.python.org/issue1772788
chr(128) in u'only ascii' -> TypeError with misleading msg

http://bugs.python.org/issue1737737
telnetlib.Telnet does not process DATA MARK (DM)

http://bugs.python.org/issue1772794
Telnetlib dosn't accept u'only ascii'

CLOSED

http://bugs.python.org/issue2451
No way to disable socket timeouts in httplib, etc.

http://bugs.python.org/issue822974
Telnet.read_until() timeout parameter misleading

http://bugs.python.org/issue630829
telnetlib.py: don't block on IAC and enhancement

http://bugs.python.org/issue723312
ability to pass a timeout to underlying socket

http://bugs.python.org/issue1520081
telnetlib.py change to ease option handling.

http://bugs.python.org/issue664020
telnetlib option subnegotiation fix

http://bugs.python.org/issue723364
terminal type option subnegotiation in telnetlib

---- RFCs ----

http://en.wikipedia.org/wiki/Telnet
Wikipedia lists all the relevant RFCs at the bottom.

[--FORMAT--]
URL
Short Description
Will/Won't implement
[--FORMAT--]

http://www.iana.org/assignments/telnet-options
List of officially assigned option codes
Must implement.

http://tools.ietf.org/html/rfc854
(1983) Telnet protocol definition.
Must implement.

http://tools.ietf.org/html/rfc855
(1983) Telnet negotiation.
Must implement.

http://tools.ietf.org/html/rfc856
(1983) Telnet binary protocol.
Won't implement. This was obviated by Kermit, Zmodem, and the like.

http://tools.ietf.org/html/rfc857
(1983) Telnet ECHO negotiation.
Will implement.

http://tools.ietf.org/html/rfc858
(1983) Suppress Go-Ahead. Nego suppression of "your turn" messages for full duplex connections.
Won't implement.

http://tools.ietf.org/html/rfc859 (Obsoletes http://tools.ietf.org/html/rfc651)
(1983) Telnet status. Ask other party to retransmit what they think the current negotiated options are.
Will implement.

http://tools.ietf.org/html/rfc860
(1983) Timing mark. A work around for servers that can't read the socket as fast as people type (!!!).
Won't implement.

http://tools.ietf.org/html/rfc861
(1983) negotiating about negotiating
Probably won't. Doubtful this is still in effect.

http://tools.ietf.org/html/rfc885
(1983) End-of-Record code.
Might. I have a vague recollection that this is used as a prompt sigil.

http://tools.ietf.org/html/rfc1073
(1988) NAWS (Negotiate About Window Size)
Will implement.

http://tools.ietf.org/html/rfc1079
(1988) Baud rate negotiation
Won't implement.

http://tools.ietf.org/html/rfc1091 (Obsoletes http://tools.ietf.org/html/rfc930)
(1989) Terminal type negotiation
Will implement.

http://tools.ietf.org/html/rfc1184 (Obsoletes http://tools.ietf.org/html/rfc1116)
(1990) Telnet linemode nego. Basically save packets by being less interactive.
Won't implement.

http://tools.ietf.org/html/rfc1372 (Obsoletes http://tools.ietf.org/html/rfc1080)
(1992) Terminal flow control. Local terminal stuff.
Won't implement.

http://tools.ietf.org/html/rfc2217
(1997) SLIP-lite protocol for sharing a modem.
Won't implement.

http://tools.ietf.org/html/rfc2946
(2000) Telnet Encryption nego.
Won't implement (does anyone actually use this?)

http://tools.ietf.org/html/rfc4777
(2006) IBM iSeries hardware telnet extensions.
Won't implement (strangely, the RFC argues against implementing itself)

---- Alternate Implementations ----
[found using google code search]
a hacky ECHO negotiator

subclass-and-patch NAWS negotiator

a from-scratch wrapper

a from-scratch reimplementation w/ better (but unpythonic) negotiating.