If you’ve ever wondered what a programming competition would look like if it were more like a game than a textbook exercise, then the ICFP 2006 contest may interest you.
Every year the International Conference on Functional Programming runs a programming competition, announcing the winners at the conference. Open to all, it’s a chance for enthusiasts to show off their favourite programming language, no matter how popular or obscure their choice. Or perhaps an honest propaganda exercise to show that functional languages can work as well as their mainstream counterparts.
The challenges normally involve writing a bot, writing a compiler, or a mixture of both. Past years have involved solving boardgames on square and hexagonal grids, finding small and efficient programs for simple virtual machines, and other strategy and optimization tasks. In 2006, instead of one gigantic problem, they went for a fun mixture of smaller, simple puzzles all in different languages.
The 2006 contest takes place in the context of a backstory about an ancient cult of programming language researchers known as the Cult of the Bound Variable. According to this story, the CBV was active in Pittsburgh, Pennsylvania thousands of years ago, carrying out computations on primitive devices powered by falling sand. Thirty years ago, excavators discovered an artifact of the Cult’s scholarship, an extensive codex that was written in no known language. Archaeologists soon gave up on deciphering the Codex, and, until this year, the Codex was stored away in the basement of the Carnegie Museum of Natural History.
Contestants were provided with a disk image and specification for a ‘historic computer’, and asked to implement a virtual machine. Upon booting it up for the first time, they were presented with a somewhat familiar login screen for a ‘UMIX’ system.
12:00:00 1/1/19100
Welcome to Universal Machine IX (UMIX).
This machine is a shared resource. Please do not log in to multiple simultaneous UMIX servers. No game playing is allowed.
Please log in (use 'guest' for visitor access).
;login:
Inside the guest account you’ll find a broken password cracker. Written in a quirky BASIC clone that uses Roman numerals, it crashes on any non-zero result. Your reward for editing the program and getting it to run is a new account to use. Each of the accounts on the machine has a programming puzzle inside (some of which parody earlier ICFP challenges).
For three days in July, the contestants raced each other to find accounts and solve the puzzles inside. For the first two days the team scores were public, but the final scores would not be revealed until September.
The talk follows the competition as it unfolded, revealing the puzzles one by one as the teams discovered them. These puzzles included ASCII art, reverse engineering, theorem proving, term rewriting and an adventure game. Before announcing the winners, the organisers explain some of the interesting parts of creating and running the challenge. The accompanying report covers the puzzles in more depth, but gives only the final scores. If you have the time, I’d recommend watching the talk first.
The contest was a mix of computation and humour that hadn’t been seen in earlier ICFP contests, and hasn’t been matched since. If you’d like to try it, the support materials are still available at http://www.boundvariable.org.
Earlier this week I was lucky enough to see Joe Armstrong talk about the principles behind Erlang. During the talk he mentioned one of my favourite technical reports—Why Do Computers Stop? by Jim Gray for Tandem Computers.
Tandem Computers built fault-tolerant, high-availability machines before the web existed. The uptime of their machines was legendary, as an apocryphal technical support call demonstrates:
Hi, is this Support? We have a problem with our Tandem: A car bomb exploded outside the bank, and the machine has fallen over.
….
No, No it hasn’t crashed, it’s still running, just on its side. We were wondering if we can move it without breaking it.
The report offers some insight as to how these legends were born—isolation, failing fast, transactional updates, process pairs and supervision—as well as some interesting statistics and observations.
For a little more analysis, I’d recommend @mononcqc’s excellent summary, especially if you want to skip straight to the good bits of the report.
If you want to parse Ruby, I wish you luck—there isn’t any documentation or a formal grammar that fully describes the language. Instead, you have to look at the source code to truly understand Ruby’s syntax. This is where the nightmare starts.
Take a look at Ruby’s parser, defined in parse.y. I’ll wait for your eyes to stop bleeding, and then we’ll try to understand how it got that way.
Parsing code is traditionally a two-stage process: first transform the stream of characters into a stream of tokens (lexing), then build trees out of those tokens (parsing). In other words, first you turn “1 + 2” into num(1), op(+), num(2), and then you build a tree, i.e. op(+, [num(1), num(2)]).
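For concreteness, here’s that two-stage shape as a toy Python sketch. It has nothing to do with Ruby’s actual implementation; it’s just the textbook version, assuming a grammar of integers and ‘+’.

    import re

    # Stage one: turn characters into (kind, value) tokens.
    def lex(source):
        tokens = []
        for text in re.findall(r"\d+|\S", source):
            if text.isdigit():
                tokens.append(("num", int(text)))
            elif text == "+":
                tokens.append(("op", "+"))
            else:
                raise SyntaxError("unexpected character: " + text)
        return tokens

    # Stage two: fold the token stream into a tree, num (op num)*.
    def parse(tokens):
        tree = tokens[0]
        for i in range(1, len(tokens), 2):
            op, right = tokens[i], tokens[i + 1]
            tree = ("op", op[1], [tree, right])
        return tree

    print(lex("1 + 2"))         # [('num', 1), ('op', '+'), ('num', 2)]
    print(parse(lex("1 + 2")))  # ('op', '+', [('num', 1), ('num', 2)])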
A little more trickery is involved in parsing Ruby, because of Ruby’s (locally) ambiguous syntax. For example, x could be a variable or a method call, and so x +1 could mean ‘add 1 to x’ or it could mean ‘call x with the argument +1’. To parse Ruby unambiguously, you must know which variables are in scope.
Parsing Ruby successfully with this two-stage process means dealing with the ambiguity in either the parser or the lexer: emit ambiguous tokens and let the tree builder work out the right meaning, or emit distinct tokens so the tree building is easy and straightforward. The former option is slow and memory hungry, so Ruby went for the latter.
To parse the earlier example, x +1, Ruby’s parser tracks the local variables in scope. After reading x, different flags are set depending on whether x is a local variable. On reading +, Ruby checks these flags, and emits an infix operator if it comes after a local variable, and a unary one otherwise.
Similarly, distinguishing puts ([1]).map {|x| x+1} from puts([1]).map {|x| x+1} is handled by setting flags after skipping whitespace, and later checking these flags to emit an LPAREN or an LPAREN_ARG token. Flags are used for blocks, class definitions and method names too, not just semantic whitespace.
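To make the flag trick concrete, here’s a deliberately tiny Python sketch of the x +1 case. It is an illustration of the idea only, nothing like the real parse.y: the lexer keeps a set of known local variables, and the tokens it emits for “+1” depend on what came before it.

    # Toy lexer: the meaning of "+1" depends on whether the previous
    # identifier is a known local variable.
    def lex(source, local_vars):
        tokens = []
        for text in source.split():
            if text.isidentifier():
                kind = "lvar" if text in local_vars else "method_call"
                tokens.append((kind, text))
            elif text.startswith("+") and text[1:].isdigit():
                if tokens and tokens[-1][0] == "lvar":
                    # after a local variable: infix plus, then a number
                    tokens.append(("op", "+"))
                    tokens.append(("num", int(text[1:])))
                else:
                    # after a method name: a single unary-plus argument
                    tokens.append(("num", int(text)))
        return tokens

    print(lex("x +1", local_vars={"x"}))  # [('lvar', 'x'), ('op', '+'), ('num', 1)]
    print(lex("x +1", local_vars=set()))  # [('method_call', 'x'), ('num', 1)]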
It might take a few days of digging, but once you understand the interplay between the flags, the tree builder and the tokenizer, you can finally start to extract the rules of Ruby syntax. Perhaps you can write a cleaner parser this time, but in the end it is just easier to copy parse.y than to try to re-implement it. So far, porting parse.y is what every successful Ruby implementation has done.
Aside: in MRI’s parse.y, JRuby’s Ruby19Parser.y and IronRuby’s Parser.y you can find the lexer state ‘EXPR_MID’, and as far as I can tell, it doesn’t do anything.
Although I’ve talked about Ruby parsers, I haven’t really explained why anyone sane would want to write one. By luck and various cabal-like connections, I stumbled across a rather interesting project to write a new Ruby interpreter using PyPy. I wanted to help, and started looking at fixing the parser.
In the end, all I could offer was the advice I’ve written here. There are two ways to parse Ruby: do exactly what Matz did, or slowly lose your sanity, gazing into the abyss that is parse.y.
Postel’s Principle is an adage from the specification of TCP, aimed at implementors, commonly cited and worshipped:
Be conservative in what you do, be liberal in what you accept from others
Postel’s Principle is wrong, or perhaps wrongly applied. The problem is that although implementations will handle well formed messages consistently, they all handle errors differently. If some data means two different things to different parts of your program or network, it can be exploited—Interoperability is achieved at the expense of security.
These problems exist in TCP, the poster child for Postel’s principle. It is possible to make different machines see different input, by building packets that one machine accepts and the other rejects. In Insertion, Evasion, and Denial of Service: Eluding Network Intrusion Detection, the authors use features like IP fragmentation, corrupt packets, and other ambiguous bits of the standard, to smuggle attacks through firewalls and early warning systems.
Postel’s Principle creates these problems by encouraging people to accept faulty input, without enforcing consistency. These problems aren’t unique to Postel’s Principle—similar notions underpin the confused deputy and cross-site scripting attacks.
In early versions of Perl there was the poison null byte attack. By passing parameters like ‘file.cgi%00file.jpg’ to a CGI script (where %00 is a URL-encoded null byte), Perl would see one string, but the C library underneath would see another. Perl would see ‘jpg’ as the extension, but C would see ‘cgi’.
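Here’s the same trap reconstructed in Python, as a sketch rather than the original Perl bug: one parameter, two interpretations.

    from urllib.parse import unquote

    param = unquote("file.cgi%00file.jpg")   # "file.cgi\x00file.jpg"

    # A length-aware string sees all of it, so an extension check passes...
    print(param.rsplit(".", 1)[-1])          # "jpg"

    # ...while anything treating the bytes as NUL-terminated (like C's
    # open()) stops at the first null byte and uses a different name.
    c_view = param.split("\x00", 1)[0]
    print(c_view)                            # "file.cgi"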
Similarly, HTTPS certificates suffered from problems with null bytes and corrupt input: the person issuing the certificate and the person validating it saw different names, thanks to a null byte. When one implementor is a little more liberal than the next, or a little less, these discrepancies appear. I am not blaming Postel for these bugs, but the Principle is often invoked to defend them.
Treat valid or expected inputs as formal languages, accept them with a matching computational power, and generate their recognizer from their grammar.
Treat input-handling computational power as a privilege, and reduce it whenever possible.
The paper, and other work and talks from the LANGSEC group, outlines a manifesto for language-based security: be precise and consistent in the face of ambiguity, and use simpler parsers to shrink your attack surface. You should be liberal in what you accept, but this should be formalised and standardised, not left to the implementor.
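As a minimal sketch of what the recognizer idea looks like in practice (the port-number grammar here is my example, not the paper’s): decide what the expected input is, write down its grammar, and accept exactly that, instead of leaning on a liberal general-purpose parser.

    import re

    # Expected input: a port number. The grammar is one to five ASCII
    # digits, so the recognizer accepts exactly that and nothing else.
    def parse_port(text):
        if not re.fullmatch(r"[0-9]{1,5}", text):
            raise ValueError("not a port number: %r" % text)
        port = int(text)
        if not 0 < port < 65536:
            raise ValueError("port out of range: %d" % port)
        return port

    # A liberal parser accepts far more than the grammar ever said:
    # Python's int() happily takes whitespace, a sign, and underscores.
    print(int(" +8_0 "))     # 80
    print(parse_port("80"))  # 80
    parse_port(" +8_0 ")     # raises ValueError, the same way everywhere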
Instead of just specifying a grammar, you should specify a parsing algorithm, including the error correction behaviour (if any). Security requires being able to validate or sanitise input consistently across all implementations, but this doesn’t have to come at the expense of interoperability. Notably, HTML5 gained a standardized parser, not for security reasons, but to ensure that bad HTML looks the same on every browser.
Interoperability doesn’t just mean working in the same way, but failing in the same way too. Implementation specific behaviour is the root of all evil, and invoking Postel’s Principle will not redeem your soul.
I had the joy of attending EMF2012, an outdoor festival of tech, design and play. It’s not every festival you go to that has immaculate wireless, high speed internet, a power grid, clean toilets, or for that matter a barista, a blacksmith, and a bar under a motorway—but strange and wonderful things happen when you have a critical mass of enthusiasm.
I’d like to thank Jonty and Russ and the rest of the EMF crew and attendees for an awesome weekend. There were the good parts and the not-so-good parts, or in other words, I gave a talk.
Entitled “Programming is terrible—lessons learned from a life wasted”, the name and content might be a little familiar to those who have read this tumblr, but even avid viewers may find something in it I haven’t recycled yet.
The feedback engaged me: as a result of sharing my ideas and troubles, I encountered new ideas, support and criticism, which in turn encouraged my narcissistic quest to publish to a larger audience, i.e. this tumblr.
How Complex Systems Fail (Being a Short Treatise on the Nature of Failure; How Failure is Evaluated; How Failure is Attributed to Proximate Cause; and the Resulting New Understanding of Patient Safety)—Richard I. Cook, MD
If you’re one of the many who wake up screaming “WHY IS EVERYTHING BROKEN”, this might be quite cathartic. Richard Cook isn’t a programmer, but he has a fascinating insight into what makes complex systems work, and occasionally fail. Although grounded in his experience of healthcare, the paper perfectly describes the crazed world many developers fear: Operations.
Quicksort is one of the best-known sorting algorithms, judging by responses to interview questions. Everyone loves Quicksort because it is so straightforward to explain (sketched in code after the recipe)–
Pick an element from the list, call it the pivot
Split the list into parts greater and smaller than the pivot
For these two sublists, repeat the procedure until you have one element or less
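In Python, the recipe is only a few lines. This is the naive version, picking the first element as its pivot, which is exactly the habit the rest of this post complains about:

    # A direct transcription of the three steps above.
    def quicksort(items):
        if len(items) <= 1:                       # one element or less: done
            return items
        pivot, rest = items[0], items[1:]         # pick a pivot
        smaller = [x for x in rest if x < pivot]  # split around it
        greater = [x for x in rest if x >= pivot]
        return quicksort(smaller) + [pivot] + quicksort(greater)

    print(quicksort([3, 1, 4, 1, 5, 9, 2, 6]))    # [1, 1, 2, 3, 4, 5, 6, 9]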
As much as I like quicksort, it is very hard to implement well but easy to implement poorly. As Bentley and McIlroy’s classic paper “Engineering a Sort Function” explains, there are many common implementation problems, including a lack of thread safety and terrible performance on certain inputs–
Yet in the summer of 1991 our colleagues Allan Wilks and Rick Becker found that a qsort run that should have taken a few minutes was chewing up hours of CPU time. Had they not interrupted it, it would have gone on for weeks. They found that it took n² comparisons to sort an ‘organ-pipe’ array of 2n integers: 1 2 3 … n n … 3 2 1.
Unable to find a good enough qsort, we set out to build a better one.
Bentley and McIlroy show how to mitigate the worst case by careful selection of a pivot, amongst other tricks. Instead of picking the first, or a random pivot, they argue for a dynamic scheme based on the array size–
With more effort finding a central element, we can push the number of comparisons closer to n lg n. […] Our final code therefore chooses the middle element of smaller arrays, the median of the first, middle and last elements of a mid-sized array, and the pseudo-median of nine evenly spaced elements of a large array.
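Sketched in Python, that scheme looks something like this; the size cutoffs here are illustrative, not the paper’s tuned constants.

    # Index of the median of a[i], a[j], a[k].
    def median3(a, i, j, k):
        x, y, z = a[i], a[j], a[k]
        if x < y:
            return j if y < z else (k if x < z else i)
        return i if x < z else (k if y < z else j)

    # Middle element for small arrays, median of three for mid-sized ones,
    # and the pseudo-median of nine ("ninther") for large ones.
    def choose_pivot(a, lo, hi):
        n = hi - lo + 1
        mid = lo + n // 2
        if n < 8:
            return mid
        if n < 40:
            return median3(a, lo, mid, hi)
        step = n // 8
        return median3(a,
                       median3(a, lo, lo + step, lo + 2 * step),
                       median3(a, mid - step, mid, mid + step),
                       median3(a, hi - 2 * step, hi - step, hi))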
There is one final improvement for quicksort. Almost twenty years later, Vladimir Yaroslavskiy found a beautiful trick to improve both the average and the worst-case timings: instead of one pivot, pick two. Dual-Pivot Quicksort. Even Jon Bentley was impressed–
I think that Vladimir’s contributions to Quicksort go way beyond anything that I’ve ever done, and rank up there with Hoare’s original design and Sedgewick’s analysis. — Jon Bentley, from the OpenJDK discussion
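Here is a bare-bones sketch of the dual-pivot idea; the real implementation in the JDK switches to insertion sort for small ranges and adds plenty of other tuning. Pick two pivots p ≤ q, then split the array into three parts in a single pass: less than p, between the two, and greater than q.

    # Minimal dual-pivot quicksort: sorts the list in place.
    def dual_pivot_quicksort(a, lo=0, hi=None):
        if hi is None:
            hi = len(a) - 1
        if lo >= hi:
            return
        if a[lo] > a[hi]:
            a[lo], a[hi] = a[hi], a[lo]
        p, q = a[lo], a[hi]                   # the two pivots, p <= q
        lt, i, gt = lo + 1, lo + 1, hi - 1
        while i <= gt:
            if a[i] < p:                      # belongs in the "< p" part
                a[i], a[lt] = a[lt], a[i]
                lt += 1
                i += 1
            elif a[i] > q:                    # belongs in the "> q" part
                a[i], a[gt] = a[gt], a[i]
                gt -= 1
            else:                             # stays in the middle part
                i += 1
        lt -= 1
        gt += 1
        a[lo], a[lt] = a[lt], a[lo]           # drop the pivots into place
        a[hi], a[gt] = a[gt], a[hi]
        dual_pivot_quicksort(a, lo, lt - 1)
        dual_pivot_quicksort(a, lt + 1, gt - 1)
        dual_pivot_quicksort(a, gt + 1, hi)

    xs = [5, 2, 9, 1, 5, 6, 3]
    dual_pivot_quicksort(xs)
    print(xs)                                 # [1, 2, 3, 5, 5, 6, 9]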
It is reassuring to know we can always find new tricks for old algorithms. Using two elements instead of one is new to quicksort, but the trick surfaces in other places too. “The Power of Two Random Choices: A Survey of Techniques and Results” shows that moving from one choice to two has a significant effect on load balancing–
….even a small amount of choice can lead to drastically different results in load balancing. Indeed, having just two random choices yields a large reduction in the maximum load over having one choice, while each additional choice beyond two decreases the maximum load by just a constant factor.
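It’s an easy effect to see for yourself with a short simulation: throw n balls into n bins, choosing either one random bin or the emptier of two random bins, and look at the fullest bin.

    import random

    def max_load(n, choices):
        bins = [0] * n
        for _ in range(n):
            picks = random.sample(range(n), choices)
            bins[min(picks, key=lambda b: bins[b])] += 1
        return max(bins)

    print(max_load(100_000, choices=1))  # grows like log n / log log n
    print(max_load(100_000, choices=2))  # grows like log log n: usually 4 or so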
Cuckoo hashing uses a similar trick, with two hash functions instead of one to improve the worst-case lookup performance. Still, if we are worried about performance, sometimes it’s better to choose a different algorithm than to tune another, as Bentley and McIlroy remark–
if worst-case performance is important, Quicksort is the wrong algorithm. (A quick memory fault might even be preferred to wasting weeks of cycles on a worst-case run.)
Kernighan talks about good and bad programming style at the Institute for Advanced Study in 2009, with old and new examples, building on the books “The Elements of Programming Style” and “The Practice of Programming”.
(Yes, the slides *do* appear on the video, after a few minutes)
Years ago I joined a small company, spun out from another startup, to turn a patent into profit. The patent was about a better way to run a chat service, similar to IRC.
Chat servers have to keep track of clients and the chatrooms they are in, and deliver messages between clients. Normally, when multiple servers are involved, each of them must keep track of all the clients, chatrooms and messages too. The patent offered a way to spread the load across all the machines, rather than duplicating the work on each.
To distribute the work, each room would be managed by one machine in particular, and rooms could be moved from one machine to another. To make these migrations invisible, routers sat between clients and the rooms, buffering messages while a room was in transit. This meant that you could add new machines or remove them without interrupting the conversations.
Although I’ve talked about chatrooms and clients, the patent was really about migrating processes. A process could work as a chatroom, or it could implement some special logic and keep track of extra information. When the process moved, the extra information was passed along too.
The patent solved a hard problem and promised easy scaling, but it came at a price—the extra routers introduced latency, processes could not communicate with each other, and there was no real way to split one large process across two servers. If your problem didn’t neatly break into isolated pieces, it was a very tough fit. Despite these drawbacks, and because Cloud, Elastic and Scale were fashionable, we went in search of applications to build.
I asked my Erlang-loving friends for help. They thought process migration was a neat idea, but instead of processes with special data, they wrote processes without any. This elimination of state means that if a process dies, you can just restart it. To them, a migration was really killing one process and starting another somewhere else. When I relayed my findings, optimism prevailed–
We will just need to make those things stateful to take advantage of our system
With state it isn’t so easy to be reliable—you need to run multiple copies of the process, so when one copy dies you can continue using the others. The patent covered migrating one process, not multiple ones, so we added backup servers. Before a process moved, the backups would be terminated, and after the move new backups would be created. This didn’t work so well in practice, and for a while turning on backups made the system crash. Eventually I realised that the problem was much harder—migration and reliability were conflicting goals.
Migration implies that the service can only exist in one place. Reliability requires keeping multiple copies in sync. Instead of moving processes from one machine to another, we were really making new copies and terminating old ones. However, the patent covered migration, so no matter what, migration was the solution. We wrote the framework around the patent, and the demo applications were constrained in turn.
We demoed an airplane booking system. By using one process per flight you could elastically scale with demand, with the minor drawback that you couldn’t search across multiple flights. The chat room demo we had couldn’t create new rooms.
Without a real application, features proliferated. Instead of making choices, we made options. If you encounter a framework that wasn’t extracted from a product, run away screaming, no matter how extensible it claims to be.
Patented or not, if the developers haven’t made anything useful using their framework, neither will you.
For some of you, the fact that it was produced by Edward Tufte will be enough to convince you to watch. For those who enjoyed Gary Hustwit’s Design Trilogy (Helvetica, Objectified, Urbanized), this is a similar documentary.
I’m not qualified to answer this question (I burnt out), but many other programmers I have met have managed to sustain employment and increase their pay. I will share with you their winning strategies–
Although you will be forced to document your software, don’t be afraid to write ugly prose, and ensure you leave out failure cases, types or arguments. Hopefully you will always be too busy to document and test the code. You have important bugs to fix.
Write lots of code. Lots of code. Autocomplete helps. Use your own ad-hoc naming scheme. Write your own wrappers around standard library functions. Reinvent liberally. Learn to use the advanced features of your IDE and language, and use them everywhere. Don’t be afraid to separate everything out into modules that only make sense when combined.
Fix problems by creating new ones. Ensure that if you close the bug for now, someone new will re-open it. You can create an equilibrium by constantly shifting the bug around from person to person. Never raise a bug, always fix the problem—A workaround in your code is quicker than fixing the underlying faults.
Ensure your tests only pass some of the time. Better if only on your machine with some elaborate setup. Become the central point of failure for the development—those who aren’t will be passed over or lose their job.
When I started at the company, I was told about the best programmer they’d found. They’d singlehandedly ported the founder’s code from one Microsoft product to the next, growing through the ranks to become an Architect: a Hero everyone looked up to, unbeaten in the race to close bugs.
Except sometimes they’d reopen. Roughly, we listed things for sale, all with slightly different tax rates. We’d check the prices and occasionally one would be wrong; the bug was raised to the top and pronounced fixed. Meanwhile, someone else was raising a bug: the tax was wrong on a different item. Eventually the real problem surfaced—tax was set globally across the entire site, not per item. Each bug filed and closed changed every price on the site, breaking the rest.
When I left the company, my code was documented, I’d trained the team, and other people could actually change the build system. That, and being one of the youngest-serving employees, made it easy for me to be made redundant. Unfortunately for the team, the operational overhead was low, and so the remaining members were left to fight each other for the jobs that remained.
Years on, the architect is still there, although defrocked. So is much of their code, which many are still afraid to touch.
Job security comes from constant creation of work only you can do. If you act like you are the only programmer and this is the only bug you have, you will be rewarded for sabotaging the product.
I’m not bitter. If you write yourself out of a job, it probably wasn’t worth keeping.