Earlier this year, we released secure messaging clients across iOS, Android, and the browser. Over the last few months, however, it became clear that the web client was shipping releases more slowly than the other platforms while also hitting some performance ceilings. We decided it was finally time to fix these issues, but as we got deeper into the code, we realized there was little we could reuse. The abstractions and implementation needed to change.
When you commit to rewriting a large chunk of code, you’re most likely about to make a huge mistake. Joel Spolsky argued this in his classic essay, “Things You Should Never Do, Part I”. Rewrites fail so often in the software industry that when we released the rewritten web client on schedule last week, I was frankly surprised. Since then, we’ve been thinking about the lessons we learned and how we managed scope creep. Here are some of those lessons:
- Know exactly why a rewrite will help the user on day one. This is probably the most important lesson for keeping things on the right track. Whether it’s better responsiveness, better features, or fewer bugs, have concrete and quantifiable goals that must be addressed. The rewrite is not for the developer; it’s for the user. Making the code “cleaner” is not good enough.
- Redefine the problem. Many stated requirements turn out not to be required at all. Insurmountable problems in the old code can become trivial if you question the right assumptions. I’ve found “hammock-driven development” helps at this stage.
- Simplify the data model. How you store your data, on-disk and in memory, makes all the difference in what happens next. The fewer copies and immutable versions floating around the better, but avoid puritanical thoughts; denormalized data can be a beautiful thing.
- New infrastructure is expensive. Even if you outsource servers or services to somewhere in the cloud, depending on and managing these new resources adds complexity (and fragility) you may not foresee. Designing simpler but still performant infrastructure may be harder, but it is better.
- You probably don’t need a caching layer. Given modern hardware and the fact that databases already do some forms of caching for you, many real-world applications can avoid introducing more complexity into their stack. Scrutinize whether something needs to be cached and at what level. Just remember Phil Karlton’s wise words: “There are only two hard things in Computer Science: cache invalidation and naming things.”
- You probably need a caching layer. If done right, caching can open a lot of doors. Yet this shouldn’t always entail introducing some new layer into your stack. Sometimes a good old static HashMap will do. There’s a cache for every budget.
- You can optimize that SQL query a lot more than you think. It never ceases to amaze me how much faster a SQL query can be made after you’ve told everyone it can’t be made any faster. These days it’s so easy to blame the relational backend and argue for a NoSQL solution. Be creative; you might be pleasantly surprised.
- If you’re deleting more code than you’re adding, it means one of two things. Either you are losing features and reintroducing old bugs, or you’re actually doing something right. Make sure you know which one you’re doing at all times.
- The chance of a successful rewrite is inversely proportional to the age of the product. Code complexity is often a function of time: rebuild it when it’s young, and humor it when it’s old.
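
To make the "good old static HashMap" from the caching lesson concrete, here is a minimal sketch in Python, where a module-level dict plays the role of the static map. This is not our actual code: the names `get_display_name`, `invalidate`, and `TTL_SECONDS`, and the 30-second freshness window, are all hypothetical and chosen only for illustration.

```python
import time
from typing import Callable, Dict, Tuple

# Hypothetical in-process cache: a module-level dict standing in for a
# static HashMap. Each entry maps a key to (time stored, cached value).
_CACHE: Dict[str, Tuple[float, str]] = {}

# Assumed freshness window in seconds; the right value depends on your data.
TTL_SECONDS = 30.0


def get_display_name(
    user_id: str,
    load: Callable[[str], str],
    now: Callable[[], float] = time.monotonic,
) -> str:
    """Return the cached value if it is still fresh, otherwise reload it."""
    current = now()
    entry = _CACHE.get(user_id)
    if entry is not None and current - entry[0] < TTL_SECONDS:
        return entry[1]
    # Cache miss or stale entry: call the (expensive) loader and remember it.
    value = load(user_id)
    _CACHE[user_id] = (current, value)
    return value


def invalidate(user_id: str) -> None:
    """Drop one entry, for example after the underlying record changes."""
    _CACHE.pop(user_id, None)
```

Even a cache this small carries the hard part of Karlton's warning: something has to call `invalidate` (or the TTL has to be short enough) whenever the underlying data changes. The clock is passed in as a parameter so that expiry can be tested without sleeping.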
We didn’t rebuild our entire charge capture app, but we did end up rewriting a significant piece of the secure messaging feature. Whenever you get a sinking feeling of scope creep on a refactor, I think that Joel’s warnings and a few lessons can go a long way toward a successful strategy.