Getting away with rewriting code from scratch.
Joel Spolsky’s oft-cited tribute to the sunk cost fallacy, Things You Should Never Do, Part I, expounds on the folly of starting from scratch. With a length less than one percent of a Steve Yegge ramble, and with almost as much nuance as a tweet, he warns people about the dangers of starting afresh–
They did it by making the single worst strategic mistake that any software company can make: They decided to rewrite the code from scratch.
When you throw away code and start from scratch, you are throwing away all that knowledge. All those collected bug fixes. Years of programming work.
You are throwing away your market leadership. You are giving a gift of two or three years to your competitors, and believe me, that is a long time in software years.
The reality of Netscape’s demise isn’t so simple, as jwz elaborates in the excellent piece Groupware Bad–
See, there were essentially two things that killed Netscape (and the real answer is book length, so I’m simplifying greatly, but)
- The one that got most of the press was Microsoft’s illegal use of their monopoly in one market (operating systems) to destroy an existing market (web browsers) by driving the market price for browsers to zero, instantaneously eliminating something like 60% of Netscape’s revenue. Which was, you know, bad.
But the other one is that Netscape 4 was a really crappy product. We had built this really nice entry-level mail reader in Netscape 2.0, and it was a smashing success. Our punishment for that success was that management saw this general-purpose mail reader and said, “since this mail reader is popular with normal people, we must now pimp it out to `The Enterprise’, call it Groupware, and try to compete with Lotus Notes!”
To do this, they bought a company called Collabra who had tried (and, mostly, failed) to do something similar to what we had accomplished. They bought this company and spliced 4 layers of management in above us. Somehow, Collabra managed to completely take control of Netscape: it was like Netscape had gotten acquired instead of the other way around.
And then they went off into the weeds so badly that the Collabra-driven “3.0” release was obviously going to be so mind-blowingly late that “2.1” became “3.0” and “3.0” became “4.0”. (So yeah, 3.0 didn’t just seem like the bugfix patch-release for 2.0: it was.)
Microsoft’s sneaky tricks aside, Netscape’s root problem was abandoning code that worked, not rewriting it. Although a rewrite contributed to Netscape’s demise, Microsoft made a similar mistake in letting Internet Explorer languish while a rewrite (Firefox) gained traction.
You can rewrite old code, but the old code still needs to be maintained, and migrations should be slow and steady. In my short life as a programmer, I’ve managed to rewrite two codebases without destroying the future of the company by following this simple dogma.
There are good and bad reasons for rewriting code. jwz’s CADT model aptly sums up the bad, but sometimes a rewrite is the right call because it is too expensive to add a feature to the existing code, or because the changes needed are *highly* invasive. And sometimes the old code is simply a Lovecraftian horror.
Spolsky has his own explanation for why programmers want to start over:
The reason is that they think the old code is a mess. And here is the interesting observation: they are probably wrong.
Sometimes, the code is a genuine mess. I replaced a VB6/XSLT horror with something less grotesque in Python, with the added benefit that we could now test code before deploying it.
In another instance, the code relied on Amazon Web Services and on now-obsolete, unmaintained libraries. The project needed to work outside of Amazon, and on a new platform. The code itself was littered with customer-specific hacks and fixes which weren’t necessary for the new project. Starting afresh with hindsight allowed us to build a system where we could keep these one-off tweaks contained and separated.
In both cases, the old code was still maintained, and many years on, the old code is still running in production. However, the new code now does the overwhelming majority of the work. Migrations are done slowly, one at a time, and usually when the old code breaks in such a way that only the new version can handle the job.
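A slow, one-at-a-time migration like this can be as simple as a router that sends each customer to the old or the new system. The following is a minimal sketch under assumed names (`Router`, `old_system`, `new_system`), not the code used in either rewrite.

```python
from typing import Callable, Set

Handler = Callable[[str], str]


class Router:
    """Send each customer's work to the old or new system, one migration at a time."""

    def __init__(self, old: Handler, new: Handler) -> None:
        self.old = old
        self.new = new
        self.migrated: Set[str] = set()

    def migrate(self, customer: str) -> None:
        # Moving a customer is a one-line, easily reversed change.
        self.migrated.add(customer)

    def handle(self, customer: str, job: str) -> str:
        # Anyone not explicitly migrated stays on the old code.
        handler = self.new if customer in self.migrated else self.old
        return handler(job)


# Hypothetical stand-ins for the two codebases.
def old_system(job: str) -> str:
    return f"old:{job}"


def new_system(job: str) -> str:
    return f"new:{job}"


router = Router(old_system, new_system)
router.migrate("customer-a")
```

Both systems stay in production, and the old one keeps serving everyone who has not yet been moved.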
Total rewrites can often be better than rewriting a substantial chunk of your code, too. In Interesting bits from “An Analysis of Errors in a Reuse-Oriented Development Environment”, MononcQc (or Fred) neatly sums up some studies on the effectiveness of rewrites–
if you need to rewrite more than 25% of a piece of code, rewriting it from scratch may be as good of an option when it comes to errors and defects.
Rewriting your code from scratch could be the single biggest mistake you make, but equally, not rewriting your code could lead to the same result. The old saying “There are only two types of software, failures and legacy code” still has some truth in it. Even if you do decide to rewrite things, the old code won’t disappear overnight.