Why we suck at estimates

Let’s start with a waterfall cautionary tale

Bob was a software engineer in the mid 90’s.  He was working for one of those really large companies that had the lion’s share of the market.  In theory, they released a new product once a year.  In practice, they released product once every 18 months or two years.  Bob’s team had just finished v4 of their popular document creation and management software.  This iteration was only 7 months late from when it was promised.  All in all, not too far off of their average.

The team was getting together to discuss version 5 with all of the key stakeholders.  This time was going to be different.  This time they were going to account for everything and release on time and under budget.  All of the managers vowed that they had learned from their mistakes and they ‘had this’.  Bob inwardly rolled his eyes but did his best to hop on board and toe the company line.  He sat through all of the meetings and gave his input on estimates, research required, design methodologies and architecture.  He thought they were underestimating many of the innovative new elements and said so.  The team increased those estimates by a token amount and signed off on the yearly project plan.

The first four months seemed to be going ok.  They were spending a little too much time on fixing defects coming in from v4 out in the field.  They had already released three hot fixes to deal with the more serious ones.  They were also struggling with two new technologies that they were trying to bring into the product.  One was a new open source library and the other was based on a functional language that the team didn’t have any experience with.  Their architect, Roger, said they were only days away from cracking both of those particular nuts so the atmosphere was still  optimistic.

Going into month 6, the road started to get a lot bumpier.  Turns out, the open source library they chose doesn’t scale.  They were going to have to either scrap all the functionality that relied upon it or rewrite those pieces of the library themselves.  The functional programming piece, on the other hand, was going wonderfully.  They put one of their younger aces on it and he was well ahead of schedule.  Unfortunately, he liked coding in this new language so much that he had just accepted a new position where he could code in it full time.  It was going to take at least three months to replace him and the team only had two weeks to do the knowledge transfer.

Bob’s immediate manager had put on at least 10 pounds and looked like he was working hard on his first ulcer.  When the sales team came in month 8 to say that their competitors had just released a new menu system that was killing it in the marketplace, the team knew they were in for a major scope change.  Those silver tongue bastards in sales had convinced the leadership team that they had to have this functionality or risk ruin.  The CEO, a salesman by trade, agreed.  Bob’s manager was told that they could scrap one of the smaller pieces of functionality to replace it with this.   Bob’s manager’s arguments of, the technology was not built to support the new menu system, fell on deaf ears.  The sales team got their way.

Bob and the team worked diligently to duct tape this new functionality on to the existing architecture.  They were building mountains of technical debt, but who cared, they were getting close to go time.  They hit their code complete date at month 11.  Everybody celebrated for about a day until QA got their hands on it.

Come month 15, QA declared the product a disaster.  Some of the bugs they were finding were so unwieldy that the team had to rewrite major core elements of the code base that had worked just fine in v4.  There were also rumors that the new menu system was not doing as well in the market as they had first heard and there was a large group of customers who hated it.  They were so committed at this point to making it work that they pushed through anyway.

The product was released in month 22 to mixed reviews.  95% of the customers who bought it turned off the new menu system in the config screen.  The sales guy who championed the idea left six months ago so there was no one to point the finger at but the leadership team.  Most of the developers working on the project left because they were so demoralized about spending so much time on something no one used.

Well duh, that’s why everyone has switched to agile methodologies, right?

Sadly, agile does not totally fix this problem.  It certainly doesn’t if implemented poorly.

Jump forward 15 years and meet Karen, an engineer for an ERP company.  Her company had just hired a new VP of Engineering for the sole reason of bringing agile into the company.  Their product was an enterprise product that still had to be installed on premise.  They had distant plans to move it to a SaaS product but not anytime soon.

This ERP company had about as good of a track record as Bob’s company when it came to delivering on time.  Their new VP of Engineering had a ton of energy and was well liked by everyone.  He was hugely successful in his last job where he built a bunch of custom websites for Fortune 500 companies.

Karen’s last product release was over two years late, so the team was ready for a change.  They soaked up the idea of agile and its two week sprints like sponges out of water.  The simple user story was such a relief after wading through 200 page design docs for so long.  Plus, QA was going to be working right next to them the entire time!  This had to work, right?

The trouble began in preparing for the very first release.  Their typical release schedule was every two years, plus or minus a year.  The VP fought for monthly releases and lost.  He then fought for quarterly releases and lost.  Our customers will never take releases that often, he was told.  Finally, he convinced the leadership team to do single year releases instead of every two years.  He explained to the team that they would be doing quarterly releases internally.

Sadly, the product team did not report to our new VP of Engineering.  So the product team built a yearlong product spec that did not have any adaptability for change.  The spec also required some serious innovation, new architecture and design patterns that the team had never used before.  The product team was also unwilling to split this spec into quarterly parts knowing that would probably require them to trim some of the functionality set that they had labored so hard over.

In the initial product meetings, the VP of Engineering got bullied into certain estimate limitations by the ‘experts’ in the company who were so adept at delivering product a year or two late.  This bullying happened after he and his team said the initial spec would take at least 18 months.  He got product to cut a couple of big features but no more.  His estimates still had them at 16 months.  He finally acquiesced when the CEO paid him a visit and said that all change had to happen gradually, even agile.

After the first quarterly internal release it was clear that there was no way they were going to make their annual release.  The new architecture was still very unsteady and had a host of unknowns.  Our VP was praised by the leadership team when he brought up these issues this early in the process.  “Most of the time we don’t find out about these type of problems until a month before the release,” said the CEO.  The VP of Engineering gave him a pained grin.

Come the second quarterly release, the team had figured out the architectural issues but in twice the time they had estimated.  The product team was already pissed that this meant a lot of feature cuts.  However, they were exuberant that they already had something to play with so early.  Naturally, they took the MVP out in the field and got some responses.  Not all of it was positive so they started changing scope.

Our VP of Engineering was excited about the changed scope because this was how agile was supposed to happen!  However, he lost the leadership team when he said that the team would have to reevaluate all of their previous estimates.  He tried again for monthly releases, cautioning that there would be even more scope changes as the customer feedback came in, so why not get rid of the whole idea of an annual release?  This just would not sink in with the old guard.

Our VP was being courted by other companies throughout and when he lost this last battle he gave up and took another position.  Karen and four other members of the team left with him.  They heard a year later that the product was released only 5 months late.  The company was celebrating that as a win.

These failures are so common in software development and a lot of that failure comes down to our reliance upon estimates.

Why is estimation so hard?

There are several principles of estimation that I have found to be universal.  If you don’t account for these principles, chances are you’re going to pay for it later.

  • Almost all estimates are given as best case scenarios
    • When we are asked how long something will take, we always think, “how long with this take if I have no interruptions?”
    • Having no interruptions is a fantasy.  We are constantly being put into context switching scenarios where we rarely find the ‘zone’.
    • Story points are your friend here.  Story points are relative based off of past history of the team and they do a good job of accounting for interruptions.
    • If you are forced into giving hourly estimates, try to push for days as your smallest unit of time and pad those estimates based on previous results.
  • We can only estimate as if the world around us is static
    • If there is one constant to this existence it is that change will happen.  Entropy is the way of the universe.
    • We can take a bunch of guesses about what may happen over a long period of time, but they will just be guesses and most of them will be wrong.
    • The further you look in the future, the more wrong you will be.  Keep your releases as short as possible.
  • We cannot estimate things we have never done before with any degree of accuracy
    • If you have unknowns or are working with technology that you or your team has never worked with before, your only true answer to how long it will take is: I don’t know.  It’s OK to say I don’t know.
    • Don’t fake it.  You need to de-risk these unknowns by prototyping or bringing in experts that have done these things before.
  • People who don’t understand technology assume that all technology is similar
    • Many execs assume that if they can talk intelligently about a product, they know how long it should take to build it.  This is so painfully wrong.
    • There is some willful ignorance here but most of it is that tech folks do a poor job of educating others.
    • Chances are you will hear this question many times: that program took only a month to build, why does this one take a year?  Don’t take offense, answer it honestly.
    • There is also an assumption that all technologists are the same.  This has gotten better over the last couple of years but it’s amazing how many people believe that a full stack developer means that that person knows about every technology.
    • There’s enough tech out there these days to require specialists.  Asking a UI developer to build a data layer is like asking an ophthalmologist to perform open heart surgery.  They could probably figure it out given enough time, but would you want them to?
    • Educate these folks as often as possible.  Don’t assume they understand technology like you do.
  • Your code will be far buggier than you think
    • Getting QA involved early and often helps dramatically
    • Unit test, unit test, unit test.  Look at TDD if the team can stomach it.
    • Automated testing helps a ton here too.
    • You have to budget for fixing defects.  So many developers give estimates as if everything will work perfectly the first time.
  • No one thinks about nonfunctional requirements while estimating until reminded
    • Nonfunctional requirements are things like scalability and performance.  Nobody thinks about this until the product is delivered and the CEO asks, “is it normal for the loading screen to take 5 minutes?”
    • Optimization takes a lot of time.  Plan for it.

Regardless of how progressively agile your company is, chances are you will be asked for estimates sometime along the way. Keep in mind that all estimates are guesses and that the further out you are the more likely you are to be wrong.  If you keep these estimation principles in mind, you will have the best chance of not making a complete ass of yourself.

Leave a comment