programming is terrible
lessons learned from a life wasted

Hello, I am tef. I am available for hire. (twentygototen.org)

If anyone is looking for a no-good, terrible programmer, I will be available for work soon. I like working on algorithms and data structures, so I’m better at writing code for machines to speak to each other than user-facing code.

I’ve come to the end of my time at Code Club, as I managed to solve more problems than I caused. We’re hiring someone to take on the web development role, and soon we’ll be hiring people to write more projects too.

You’ll note by my spartan CV that I’m not the best fit for front-end-heavy work, so I’ve opted to take a step back from Code Club to let someone else lead the way forward. It’s been awfully fun at Code Club, and I will miss it.

I am excited to find out what’s next for me, and for Code Club too. I still deeply care about education, but it’s not the only thing I want to change.

Startup Ketamines — Demotivational motivational posters for “startup culture” (tomscott.com)

Earlier today I saw a website (Startup Vitamins) selling “motivational posters” for startups. I figured we needed a little more honesty in the posters, before giving up and playing with a cat.

Tom Scott has transformed them into lovely A4-sized posters, which are free to print if you use the office printer.

miniKanren is the principal member of an eponymous family of relational (logic) programming languages. Many of its critical design decisions are a reaction to those of Prolog and other well-known 5th-generation languages. One of the differences is that, while a typical Prolog implementation might be thousands of lines of C code, a miniKanren language is usually implemented in somewhere under 1000 lines. Though there are miniKanren languages of varied sizes and feature sets, the original published implementation was 265 lines of Scheme code. In those few lines, it provides an expressiveness comparable to that of an implementation of a pure subset of Prolog.

We argue, though, that deeply buried within that 265-line miniKanren implementation is a small, beautiful, relational programming language seeking to get out.

μKanren: A Minimal Functional Core for Relational Programming
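To give a flavour of how little machinery a relational language needs, here is a toy sketch in Python (deliberately not the Scheme from the paper): a unifier, logic variables, and four goal constructors. The names walk, unify, eq, call_fresh, disj and conj follow the paper loosely, and the lazy, interleaving streams of the original are simplified to plain generators, so treat this as an illustration rather than a faithful μKanren.

```python
class Var:
    """A logic variable, identified by the counter value at creation time."""
    def __init__(self, index):
        self.index = index

def walk(term, subst):
    """Chase a variable through the substitution until it is free or bound to a value."""
    while isinstance(term, Var) and term in subst:
        term = subst[term]
    return term

def unify(u, v, subst):
    """Return an extended substitution if u and v unify under subst, else None."""
    u, v = walk(u, subst), walk(v, subst)
    if isinstance(u, Var) and isinstance(v, Var) and u is v:
        return subst
    if isinstance(u, Var):
        return {**subst, u: v}
    if isinstance(v, Var):
        return {**subst, v: u}
    if isinstance(u, tuple) and isinstance(v, tuple) and len(u) == len(v):
        for a, b in zip(u, v):
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return subst if u == v else None

# A goal takes a state (substitution, fresh-variable counter) and yields the
# states in which it succeeds.

def eq(u, v):
    def goal(state):
        subst, count = state
        extended = unify(u, v, subst)
        if extended is not None:
            yield (extended, count)
    return goal

def call_fresh(f):
    """Allocate a fresh logic variable and hand it to the goal constructor f."""
    def goal(state):
        subst, count = state
        return f(Var(count))((subst, count + 1))
    return goal

def disj(g1, g2):
    def goal(state):
        yield from g1(state)   # the paper interleaves these streams; we don't
        yield from g2(state)
    return goal

def conj(g1, g2):
    def goal(state):
        for successor in g1(state):
            yield from g2(successor)
    return goal

# "a is 5, or a is 6"
five_or_six = call_fresh(lambda a: disj(eq(a, 5), eq(a, 6)))
for subst, _ in five_or_six(({}, 0)):
    print({var.index: value for var, value in subst.items()})  # {0: 5} then {0: 6}
```

Sixty-odd lines gets you both answers to “a is 5 or a is 6”, which is roughly the paper’s point: the relational core is tiny, and everything else is sugar.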

Papert’s Dreams and our Grim Meathook Reality

In “Meanwhile, at code.org”, Bret Victor juxtaposes the ideals of Seymour Papert with the dreams of entrepreneurs and venture capitalists. Papert wanted to use programming as a way to let children explore powerful ideas and let their imagination run wild. The agenda of the political, wealthy, and powerful is to build a new generation of worker bees to fuel their startups. One sees code as liberation, the other as a vocation.

I’ve talked about these sorts of things before. I see code as a medium for design, engineering, science, art and play, and a computer as a lever long enough to move the world. Which is why I’m thankful for tools like Scratch and the work of the Lifelong Kindergarten group at MIT. They’re not the only ones trying to Revive Papert’s Dream:

[In Papert’s first article about LOGO “Twenty Things to Do with a Computer” he] described how children might program computers to control robots, compose music, create games, draw recursive pictures, and do many other creative activities.

It was a radical vision. At the time, in 1971, computers still cost tens of thousands of dollars, if not more. The first personal computers would not become commercially available for another five years. Yet Papert foresaw that computers would eventually become accessible for everyone, even children, and he wanted to lay the intellectual foundation for how computing could transform the ways children learn and play.

Some aspects of Papert’s dream have become a reality. […] At the same time, important elements of Papert’s dream remain unfulfilled. Papert envisioned a world in which children not only learn to use new technologies, but become truly fluent with new technologies. In Papert’s view, children should be able to design, create, and express themselves with new technologies. Rather than just interacting with animations, games, and simulations, children should learn to program their own animations, games, and simulations — and, in the process, learn important problem-solving skills and project-design strategies.

Despite the naysayers, to me programming is the ultimate sandbox game. Which is why I want to put these tools in the hands of children, just to see what wonders they create, following in the footsteps of Seymour Papert. Just like Mitch Resnick:

[After a] keynote presentation at a major educational technology conference, someone asked: “Wasn’t Seymour Papert trying to do the same things 20 years ago?” The comment was meant as a critique; I took it as a compliment. I answered simply: “Yes.” For me, Seymour’s ideas remain as important today as when he published his first article about Logo in this magazine in 1971. His ideas continue to provide a vision and a direction for my research. I will be happy and proud to spend the rest of my life trying to turn Seymour’s dreams into a reality.

I want to dream bigger — code is just one way to revolutionise education, putting powerful ideas in the hands of the next generation. It’s never been about the code, but about learning through play.

Getting away with rewriting code from scratch.

Joel Spolsky’s oft-cited tribute to the sunk cost fallacy, Things You Should Never Do, Part I, extols the follies of starting from scratch. With a length less than one percent of a Steve Yegge ramble, and with almost as much nuance as a tweet, he warns people about the dangers of starting afresh–

They did it by making the single worst strategic mistake that any software company can make: They decided to rewrite the code from scratch.

When you throw away code and start from scratch, you are throwing away all that knowledge. All those collected bug fixes. Years of programming work.

You are throwing away your market leadership. You are giving a gift of two or three years to your competitors, and believe me, that is a long time in software years.

The reality of Netscape’s demise isn’t so simple, as jwz elaborates in the excellent piece Groupware Bad:

See, there were essentially two things that killed Netscape (and the real answer is book length, so I’m simplifying greatly, but)

  1. The one that got most of the press was Microsoft’s illegal use of their monopoly in one market (operating systems) to destroy an existing market (web browsers) by driving the market price for browsers to zero, instantaneously eliminating something like 60% of Netscape’s revenue. Which was, you know, bad.
  2. But the other one is that Netscape 4 was a really crappy product. We had built this really nice entry-level mail reader in Netscape 2.0, and it was a smashing success. Our punishment for that success was that management saw this general-purpose mail reader and said, “since this mail reader is popular with normal people, we must now pimp it out to `The Enterprise’, call it Groupware, and try to compete with Lotus Notes!”

    To do this, they bought a company called Collabra who had tried (and, mostly, failed) to do something similar to what we had accomplished. They bought this company and spliced 4 layers of management in above us. Somehow, Collabra managed to completely take control of Netscape: it was like Netscape had gotten acquired instead of the other way around.

    And then they went off into the weeds so badly that the Collabra-driven “3.0” release was obviously going to be so mind-blowingly late that “2.1” became “3.0” and “3.0” became “4.0”. (So yeah, 3.0 didn’t just seem like the bugfix patch-release for 2.0: it was.)

Ignoring Microsoft’s sneaky tricks, the root problem Netscape had was abandoning code that worked, not rewriting it. Although a rewrite helped bring about Netscape’s demise, Microsoft made a similar mistake by letting Internet Explorer languish while a rewrite (Firefox) gained traction.

You can rewrite old code, but the old code still needs to be maintained, and migrations should be slow and steady. In my short life as a programmer, I’ve managed to rewrite two codebases without destroying the future of the company by following this simple dogma.

There are good and bad reasons for rewriting code. jwz’s CADT model aptly sums up the bad, but sometimes rewrites are good: it is too expensive to add a feature to the existing code, or the changes required are *highly* invasive. Sometimes the old code is simply a Lovecraftian horror.

The reason is that they think the old code is a mess. And here is the interesting observation: they are probably wrong.

Sometimes, the code is a genuine mess. I replaced a VB6/XSLT horror with something less grotesque in Python, with the added benefit that we could now test code before deploying it.

In another instance, the code relied on Amazon Web Services and on now-obsolete, unmaintained libraries. The project needed to work outside of Amazon, and on a new platform. The code itself was littered with customer-specific hacks and fixes which weren’t necessary for the new project. Starting afresh with hindsight allowed us to build a system where we could keep these one-off tweaks contained and separated.

In both cases, the old code was still maintained, and many years on, the old code is still running in production. However, the new code now does the overwhelming majority of the work. Migrations are done slowly, one at a time, and usually when something breaks in such a way that only the new version can handle it.

Total rewrites can often be better than rewriting a substantial chunk of your code, too. In Interesting bits from “An Analysis of Errors in a Reuse-Oriented Development Environment”, MononcQc (or Fred) neatly sums up some studies on the effectiveness of rewrites–

if you need to rewrite more than 25% of a piece of code, rewriting it from scratch may be as good of an option when it comes to errors and defects.

Rewriting your code from scratch could be the single biggest mistake you make, but equally, not rewriting your code could lead to the same result. The old saying “There are only two types of software, failures and legacy code” still has some truth in it. Even if you do decide to rewrite things, the old code won’t disappear overnight.

Programming the weird machine — About 31 minutes in, there is a demonstration of a fun glitch in Super Mario World. By connecting up a bot to a console and faking the controller inputs, one rather canny person manages to execute arbitrary code.

James Mickens, the funniest person in Microsoft Research.

Slack off from your job today, and spend some quality time giggling.


On passwords and security — This World of Ours

In general, I think that security researchers have a problem with public relations. Security people are like smarmy teenagers who listen to goth music: they are full of morbid and detailed monologues about the pervasive catastrophes that surround us, but they are much less interested in the practical topic of what people should do before we’re inevitably killed by ravens or a shortage of black mascara. It’s like, websites are amazing BUT DON’T CLICK ON THAT LINK, and your phone can run all of these amazing apps BUT MANY OF YOUR APPS ARE EVIL, and if you order a Russian bride on Craigslist YOU MAY GET A CONFUSED FILIPINO MAN WHO DOES NOT LIKE BEING SHIPPED IN A BOX. It’s not clear what else there is to do with computers besides click on things, run applications, and fill spiritual voids using destitute mail-ordered foreigners. If the security people are correct, then the only provably safe activity is to stare at a horseshoe whose integrity has been verified by a quorum of Rivest, Shamir, and Adleman.

On the true horror of system programming — The Night Watch

The main thing that I ponder is who will be in my gang, because the likelihood of post-apocalyptic survival is directly related to the size and quality of your rag-tag group of associates. […] The most important person in my gang will be a systems programmer. A person who can debug a device driver or a distributed system is a person who can be trusted in a Hobbesian nightmare of breathtaking scope; a systems programmer has seen the terrors of the world and understood the intrinsic horror of existence. […]

A systems programmer will know what to do when society breaks down, because the systems programmer already lives in a world without law.


On consensus protocols — The Saddest Moment

Whenever I go to a conference and I discover that there will be a presentation about Byzantine fault tolerance, I always feel an immediate, unshakable sense of sadness, kind of like when you realize that bad things can happen to good people, or that Keanu Reeves will almost certainly make more money than you over arbitrary time scales. Watching a presentation on Byzantine fault tolerance is similar to watching a foreign film from a depressing nation that used to be controlled by the Soviets—the only difference is that computers and networks are constantly failing instead of young Kapruskin being unable to reunite with the girl he fell in love with while he was working in a coal mine beneath an orphanage that was atop a prison that was inside the abstract concept of World War II.

Mobile Computing Research Is a Hornet’s Nest of Deception and Chicanery

Mobile computing researchers are a special kind of menace. They don’t smuggle rockets to Hezbollah, or clone baby seals and then make them work in sweatshops for pennies a day. That’s not the problem with mobile computing people. The problem with mobile computing people is that they have no shame. They write research papers with titles like “Crowdsourced Geolocation-based Energy Profiling for Mobile Devices,” as if the most urgent deficiency of smartphones is an insufficient composition of buzzwords. The real problem with mobile devices is that they are composed of Satan. They crash all of the time, ignore our basic commands, and spend most of their time sullen, quiet, and confused, draining their batteries and converting the energy into waste heat and thwarted dreams.

On the rise and fall of hardware design — The Slow Winter

Unfortunately for John, the branches made a pact with Satan and quantum mechanics during a midnight screening of “Weekend at Bernie’s II.” In exchange for their last remaining bits of entropy, the branches cast evil spells on future generations of processors. Those evil spells had names like “scaling-induced voltage leaks” and “increasing levels of waste heat” and “Pauly Shore, who is only loosely connected to computer architecture, but who will continue to produce a new movie every three years until he sublimates into an empty bag of Cheetos and a pair of those running shoes that have individual toes and that make you look like you received a foot transplant from a Hobbit, Sasquatch, or an infertile Hobbit/Sasquatch hybrid.” Once again, I digress. The point is that the branches, those vanquished foes from long ago, would have the last laugh.

I used Prolog in a comparative languages course. The biggest program we did was a map-coloring one (color a map with only four colors so that no bordering items have the same color, given a mapping of things that border each other). I say biggest because we were given the most time with it. I started out like most people in my class trying to hack the language into letting me code a stinking algorithm to color a stinking map. Then I wrote a test function to check if the map was colored and, in a flash of prolog, realized that that was really all I needed to code.
http://c2.com/cgi/wiki?PrologLanguage
Back when PHP had less than 100 functions, the function hashing mechanism was strlen(). In order to get a nice hash distribution of function names across the various function name lengths, names were picked specifically to make them fit into a specific length bucket.
Rasmus on php.internals, via phpmanualmasterpieces
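
To make the anecdote concrete, here is a toy Python sketch (not PHP’s actual code) of a hash table whose hash function is just the string length: every function name of the same length collides into one bucket, which is why spreading names across different lengths kept lookups fast.

```python
# A toy illustration of strlen() as a hash function: all names of the same
# length land in the same bucket.
from collections import defaultdict

def strlen_hash_table(names):
    buckets = defaultdict(list)
    for name in names:
        buckets[len(name)].append(name)  # the "hash" is just the length
    return buckets

# A handful of real PHP function names, for illustration only.
names = ["strlen", "substr", "sprintf", "explode", "implode", "strtolower"]
for length, bucket in sorted(strlen_hash_table(names).items()):
    print(length, bucket)
# 6 ['strlen', 'substr']
# 7 ['sprintf', 'explode', 'implode']
# 10 ['strtolower']
```
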
Parsing Techniques — A Practical Guide is simply one of the best computer science books I’ve read, if not the best book on the theory and practice of parsing.

It’s a magnificent tome of work, broad in scope and detailed in depth, with at least 400 references in print, and an astounding 1600 references online. Accessible, clear, and thorough, it is of use for both the novice and the expert.

If you’ve ever been a little curious about parsing, or wanted to improve on a technique, this is the book for you — it is the definitive work on parsing.

The trouble with TCP — It’s good but we’re stuck with it

TCP is possibly one of the most admired and least loathed protocols; you just have to find a bitter systems researcher and ask them what doesn’t suck that much. Sometimes they’ll say UTF-8, but TCP is also at the top of the list (which is good, because most of the internet is built atop it).

TCP carries email, webpages, and a whole slew of data between computers, providing a reliable, ordered stream of data atop an unreliable, unordered IP network — but TCP isn’t perfect, and has a long history of unused or broken features.

For example: when the network became congested, older versions of TCP would retransmit aggressively, causing congestion collapse and bringing the entire internet to a halt.

Another problem was SYN flooding, where a computer could be tricked into exhausting its queue of half-open connections, but this was eventually solved by adding SYN cookies. Some features, such as TCP urgent data, have never worked in practice.

TCP has some design flaws, but sometimes the problems are with how TCP is implemented and used — TCP is a reliable, ordered stream carried over a series of packets, but many protocols are a series of messages carried over TCP — forcing implementations to work around or re-implement TCP’s features.

Although TCP provides reliable delivery, an acknowledgement only says that the computer has received a message, not processed it. Applications must layer their own acknowledgements on top to ensure that the data has been processed.

Some protocols attempt multiplexing or pipelining too, issuing concurrent commands over a single connection, and encounter head-of-line blocking — where ordered delivery gets in the way of multiplexing the messages. They also have to implement their own flow control, framing, and timeouts.
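
As a rough sketch of what the framing (and the application-level acknowledgements mentioned above) look like when rebuilt over TCP, here is a minimal length-prefixed message layer in Python, using only the standard socket and struct modules; the 4-byte prefix and the function names are my own choices rather than any particular protocol’s.

```python
import socket
import struct

# A minimal framing layer over a TCP socket: each message is sent with a
# 4-byte big-endian length prefix, and the reader reassembles whole messages
# no matter how recv() happens to split the byte stream.

def send_message(sock: socket.socket, payload: bytes) -> None:
    sock.sendall(struct.pack(">I", len(payload)) + payload)

def recv_exactly(sock: socket.socket, n: int) -> bytes:
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)

def recv_message(sock: socket.socket) -> bytes:
    (length,) = struct.unpack(">I", recv_exactly(sock, 4))
    return recv_exactly(sock, length)

# An application-level acknowledgement is then just another framed message:
# the receiver calls recv_message(), does the work, and only then replies with
# send_message(sock, b"OK"), so the sender knows the data was processed,
# not merely delivered.
```

SCTP, mentioned below, gives you those message boundaries for free, which is a large part of its appeal.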

There is an alternative protocol, SCTP, which promises to be a better transport for these messages, built around messages rather than streams, but it hasn’t taken off. Why? We’re stuck with TCP.

TCP’s greatest design decision was the end-to-end principle, in that only the computers communicating had to worry about reliability and ordering, and the computers in-between could pass around messages in delightful ignorance. This is no longer true.

Now that TCP is burnt into the routers, firewalls, and home equipment, it’s really hard to send something that isn’t TCP (or UDP) over the network. TCP is also burnt into the operating systems, so it’s effectively impossible for applications to change TCP’s behaviour to suit their needs.

If we admit TCP is fossilised, is this admitting defeat? Not yet. The TCP Minion project attempts to work with TCP as it is found in the wild, evolving the protocol even while the wire format stays the same. Alternatively, we can just re-implement TCP over UDP, over and over again.

A few of my favourite theses

In the heady days of my earlier years, I would come home from the pub and immerse myself in academia. I have selected for you four delectable theses if you wish to indulge in the same.


Joe Armstrong’s Making reliable distributed systems in the presence of software errors.

This is about the development and design of the Erlang language and runtime—the core principles involved in making software more reliable and more robust. The ideas presented here have standing outside of distributed systems, and can be appropriated for many other worlds of software. Although easy to state (keep components isolated, let them crash, and build a supervision hierarchy to handle failures), the thesis explains these ideas vividly and examines them in depth.

Aside: When I was lucky enough to talk to Joe Armstrong, I asked him about his thesis (well, almost). The colophon laments that his typesetting software was not mature enough to render his work, and we shared a brief respite from the world of system design to wax lyrical about typography.


Hàn Thế Thành’s Micro-typographic extensions to the TeX typesetting system.

Programming and typography have been deeply entwined since Knuth’s seminal work “Computers and Typesetting”. It should come as no surprise that many programmers have been entranced by typesetting, as a fastidious attention to detail is required by both.

In this thesis, Hàn Thế Thành outlines refinements to TeX’s line breaking through the use of optical typography: simply, making margins look cleaner but technically less aligned. Instead of bluntly pushing letters against a hard line, he extends TeX to gently allow some letters and punctuation to saunter over the margins to give a pleasing result. You can take advantage of this with a simple \usepackage{microtype}, in blissful ignorance of the inner workings, but I find it fascinating nonetheless.


Ben Lippmeier’s Type Inference and Optimisation for an Impure World.

I cannot think of anything more appropriate than citing Aleister Crowley in a work about impurity in functional programming. This is a clever argument to embrace rather than reject the notion of destructive update in programming languages—rather than smuggle side effects through the type checker, extend it so it can reason about what your program is doing.

Unfortunately this flexibility comes at a price: lazy evaluation, which many Haskell programmers will give up only when it is prised from their cold, dead hands. For those less versed in wizardry, fear not! The thesis only breaks into hieroglyphics for type inference deep inside, and by then, you’re hooked.


Sylvain Schmitz’ Approximating Context-Free Grammars for Parsing and Verification.

Unfortunately less accessible than the others, but still interesting. A formalism for grammars is examined and used to generate a novel parsing contraption, as well as to detect ambiguity in grammars — if you have suffered under shift/reduce conflicts, this may interest you.

The parser cheats in an interesting fashion. When it encounters local ambiguity, it just ploughs on in a superset of the grammar until it finds something that disambiguates, rewinds, and then continues as if nothing had ever happened. This trick gives broader coverage of grammars while maintaining linear time.