Tuesday, October 1, 2019

The Single Worst Mistake any Software Product Based Company Can Make

I am a longtime follower and reader of Joel Spolsky. Way back in 2000, he penned what has become a seminal blog post.

Things You Should Never Do, Part 1

It should be required reading for all leadership teams in all software companies, especially the small ones, because those are the ones most likely to commit hara-kiri in the way described in the above post. I will quote it.

We’re programmers. Programmers are, in their hearts, architects, and the first thing they want to do when they get to a site is to bulldoze the place flat and build something grand. We’re not excited by incremental renovation: tinkering, improving, planting flower beds. There’s a subtle reason that programmers always want to throw away the code and start over. The reason is that they think the old code is a mess. And here is the interesting observation: they are probably wrong. The reason that they think the old code is a mess is because of a cardinal, fundamental law of programming: It’s harder to read code than to write it. This is why code reuse is so hard. This is why everybody on your team has a different function they like to use for splitting strings into arrays of strings. They write their own function because it’s easier and more fun than figuring out how the old function works.

Here's something I've seen happen time and time again in my career.

Manager Dave:   Steve, why did we ship version 3.2 and break all our customers' ability to do business again? I thought once we hired QA people that would stop happening.
Team Lead Steve (haggard, has had 3 hours' sleep):   Um, Dave, the regressions in 3.2 were caused by the requirement in that release to rewrite a core module of our system to please one of our customers. That broke all three hundred installed customers.
Manager Dave:  Can you explain to me why that is?
Team Lead Steve:  Goes on to explain the concept of technical debt, makes reference to configuration spaces (parameter 310 is on, parameter 311 is off, parameter 312 is set to X, parameter 313 is set to Y).
Manager Dave (cradling his head in his hands): So what do we do, Steve?
Team Lead Steve:  Well, we could rewrite everything, get rid of all the technical debt, and refuse to add parameters to turn features on and off in 900 ways.
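
To put rough numbers on Steve's point about configuration spaces, here is a back-of-the-envelope sketch in Python. The function name and the parameter counts are made up for illustration and are not taken from any real product; the only assumption is that each parameter can be set independently of the others.

```python
import math


def count_configurations(option_counts):
    """Return how many distinct configurations exist when each parameter
    independently takes one of its allowed values.

    option_counts: one integer per parameter, giving the number of values
    that parameter can take (2 for a plain on/off switch).
    """
    return math.prod(option_counts)


# Hypothetical example: ten independent on/off switches.
ten_switches = count_configurations([2] * 10)
print(ten_switches)  # 1024

# Hypothetical example: two on/off switches plus two settings
# that can each take three values (like "set to X", "set to Y").
mixed = count_configurations([2, 2, 3, 3])
print(mixed)  # 36
```

Every new on/off switch doubles the number of combinations that QA would, in principle, have to cover, which is why a release that touches a core module can break customers whose particular combination nobody ever tested.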

In my view, the thing that saves people from making the single worst mistake (the ground-up rewrite, discussed in the blog post from Joel Spolsky) is that most people realize the economic disaster before they commit all their resources to it. Wait, we can't ship any minor updates to our customers for THREE YEARS? We won't HAVE any customers in three years.

Another way the rewrite conversation comes up is when it's time to grow your team from four Delphi devs to five, and you can't find one who will move to Spooksville, Indiana, and join your local team, and you aren't going to add remote team members. If you moved to C# and rewrote, the thinking goes, all your problems would be over: there are reams of unemployed C# developers everywhere, and you could probably grow your team to 20 people, no problem. You just need to rewrite 9 million lines of code in C# that are currently in Delphi.

There are still teams running on Delphi 7 and Delphi 5 who think they can't manage the transition to modern Unicode Delphi, but who believe they can successfully rewrite everything in C#.

Note that there ARE times when I think you can rewrite, ground up, same language or different, same database or different. They are when all the following stars align:

1. When you can keep your existing team working on your classic product, start a parallel effort, and afford to develop both in parallel for the years to a decade it will take before the new product can replace the old.

2. When you don't just think you know your requirements, your designs, and your use cases, but you actually do know them. Most of the time you have a confirmation bias telling you that you know this stuff, but guess what, you're probably wrong here.

If the economics don't kill you, the next thing that's gonna kill you in a rewrite is the second-system effect. Second-system effect is a term coined by Fred Brooks, and it is the title of one of the essays/chapters in the seminal book "The Mythical Man-Month", which every senior developer or software team leader/manager should read.

Joel says the same thing thus:

It’s important to remember that when you start from scratch there is absolutely no reason to believe that you are going to do a better job than you did the first time. First of all, you probably don’t even have the same programming team that worked on version one, so you don’t actually have “more experience”. You’re just going to make most of the old mistakes again, and introduce some new problems that weren’t in the original version.

I have noticed that old code is scarred with bug fixes. We don't like the look of it; we wish it looked clean, rather like pseudo-code. The thing is, the real world has lock conflicts, network timeouts, retries, user errors, Windows Defender locking and even altering your on-disk files, disk performance and corruption issues, video card bugs, and USB driver glitches. Need I go on?
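
As an illustration of what those scars look like, here is a small Python sketch of the same job written twice: once the way we wish it looked, and once the way it tends to look after a few years of bug reports. Everything in it is hypothetical (the function names, the retry count, the delay, and the `opener` and `sleep` parameters, which exist only so the retry path can be exercised without a real locked file); it is not taken from any real product.

```python
import time


def save_report_clean(path, text):
    # The version we wish we had: it reads almost like pseudo-code.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


def save_report_scarred(path, text, attempts=3, delay_seconds=0.1,
                        opener=open, sleep=time.sleep):
    # The version that survives in production. Each extra line stands in
    # for a past bug report.
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            with opener(path, "w", encoding="utf-8") as f:
                f.write(text)
            return True
        except PermissionError as err:
            # Something else (a virus scanner, another process) is holding
            # the file. Wait a little longer each time, then try again.
            last_error = err
            sleep(delay_seconds * (attempt + 1))
    # Every attempt failed: report the last error instead of hiding it.
    raise last_error
```

A rewrite that starts from the clean version throws away the retry loop, and with it the memory of the customer whose files were locked every morning at nine.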

The real world is a mess, and if your product works at all for your customers, it's amazing that you got that far. Don't blow it now. Fix your bugs, and clean your messes up. And you'd better stop digging new holes (increasing technical debt) before you can expect to see the old technical debts get paid off.

One of the leading causes of technical debt is crazy features for ONE customer. You could rethink when and where you add custom hacks for one customer who is loud and persistent, and who perhaps could accomplish their goals another way without breaking your product in half.

Thinking carefully about risk management and doing careful, heads-up planning are necessary to avoid killing your software products or your whole software company.

Don't rewrite your code ground up.



7 comments:

  1. Spot on post. Let me guess, problems finding team members or someone has suggested a re-write in C#?

    1. Well, that may be an issue for very small teams, but in a bigger team that is the least of your issues, provided:
      - Your team is schooled in more than one language (language neutral)
      - You have written proper functional documentation
      - You have written proper technical documentation

      In such a case (been there done that with about 40 colleagues) conversion to another language, indeed, C#, is the lesser of your issues.

  2. That's an anti-pattern. From the point of view of risk management you would not want to use tools whose existence is hanging by a very thin thread. I would have written otherwise, but we are discussing Delphi here.

  3. The biggest problem with rewrites is that they are almost always done by people who do not know why the current code is the way it is. Code is a type of capital that depends upon another type of capital for its creation. You need human and intellectual capital to CREATE code, but then that code embodies capital that exists nowhere else, namely all the little things learned during the maintenance and actual USE of the code. So when you hand off a rewrite to a new team and don't even show them the old code, you are DESTROYING far more capital than you are creating. You are losing the human capital that knows the existing code and why it is the way it is, and you are losing the knowledge embedded in the code itself.

    If a firm is large enough and has enough profit to survive several years without significant income, it can likely survive such a blow, but companies on the margin cannot. I worked at Webroot when they had just won PC Magazine's Editors' Choice Award three years in a row for their anti-malware product. They decided that they needed to rewrite the entire product, because the new highly paid hotshot VP from Austria thought it was unacceptable that it was written in Delphi instead of a "professional language like C++". So first they rewrote the engine, the guts of the antimalware/antivirus logic. The bigwigs in California boasted that whereas a major competitor, Norton, had taken two whole years to do a rewrite, Webroot was going to do it in six months. Webroot fired all the Delphi programmers who had written the existing award-winning engine, hired new C++ programmers, and was unable to add any new features to the engine for over two years, in fact losing functionality, as the C++ programmers tried desperately to reproduce what had taken roughly 8 years to perfect, in six months. Then twelve months. Then two years. Then they were all fired, the C++ engine was completely thrown out, and Webroot bought an antimalware engine from Sophos. In the meantime, Webroot lost its market position, and lost all its intellectual capital. Including one of the world's top experts in software security and reverse-engineering viruses, worms and malware.

    The guy who initiated all that nonsense and ruined the company got millions of dollars a year in salary and was promoted vigorously, until he left for even more money at another firm.

  4. This comment has been removed by the author.

  5. "Note that there ARE times when I think you can rewrite, ground up, same language or different, same database or different. They are when all the following stars align..."

    Ground up complete rewrites from scratch never work. There are no stars to align. If you want to rewrite a large complex system you need to rewrite piece by piece. Break the system down into its component parts, connect them by interfaces, then fulfill those interfaces by new pieces as you find the time and resources to write them. But first and foremost, if you have paying customers and you are making a profit, FIX what is broken in the existing system first. Don't be a hero and think you are going to reproduce several years of effort in just a few months. You aren't and won't.
