Estimation is one of the hardest parts of software projects. Scrum does not scale well for estimation beyond a single team. Declining to provide estimates is not a reasonable option for anyone in a significant position in a well-run business[1]. Where feasible, provide estimates with confidence intervals and update both as the project progresses[2]. This is akin to expressing Confidence in knowledge. An estimate of time without an estimate of confidence implicitly signals high confidence[3].

Techniques for technical project estimation

One Approach[1]

  1. Gather information on what to build, whether from specs, user stories, or market research.
  2. Read all available information so you understand what you are being asked to build.
  3. Break down your requirements into logical feature groups, keeping an eye out for out-of-scope asks.
  4. Identify dependencies, and decide whether they are true dependencies at a given time. For example, can you mock something out while work proceeds on the dependent item?
  5. Estimate: use rough calculations and bring in the leads and principal engineers.
    1. Use time boxing at this phase to avoid getting bogged down in detail.
    2. Challenge estimates that surprise you.
  6. Combining the dependency graph with the estimates gives you a PERT chart. You can then use forward pass and backward pass scheduling to produce a critical path and, from there, a No Earlier Than (NET) date.
  7. Compare your NET date with your deadline. Make hiring and contracting decisions based on that comparison, and balance your total (or available) team weeks against the work.
  8. Allow teams to begin working on zero-dependency items as early as possible to get them familiar with the project space.
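
Step 6 can be sketched in code. The example below is a minimal illustration, not the method from the cited source: it assumes each task has a single duration estimate in working days, and the task names and numbers are made up. It runs a forward pass (earliest start and finish), a backward pass (latest start and finish), and reports the tasks with zero slack as the critical path. The project length is the number of working days that, added to the start date, gives the NET date.

```python
from collections import defaultdict, deque


def critical_path(durations, depends_on):
    """Return (project_length, critical_tasks) for a task dependency graph.

    durations:  task -> estimated duration in working days
    depends_on: task -> list of tasks that must finish first
    """
    # Build successor lists and in-degree counts for a topological sort.
    successors = defaultdict(list)
    indegree = {task: 0 for task in durations}
    for task, preds in depends_on.items():
        for pred in preds:
            successors[pred].append(task)
            indegree[task] += 1

    # Kahn's algorithm: repeatedly take tasks whose dependencies are all done.
    order = []
    ready = deque(task for task, degree in indegree.items() if degree == 0)
    while ready:
        task = ready.popleft()
        order.append(task)
        for nxt in successors[task]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(durations):
        raise ValueError("dependency graph contains a cycle")

    # Forward pass: earliest start is the latest earliest-finish of any dependency.
    earliest_start, earliest_finish = {}, {}
    for task in order:
        earliest_start[task] = max(
            (earliest_finish[p] for p in depends_on.get(task, [])), default=0
        )
        earliest_finish[task] = earliest_start[task] + durations[task]
    project_length = max(earliest_finish.values())

    # Backward pass: latest finish is the earliest latest-start of any successor.
    latest_start, latest_finish = {}, {}
    for task in reversed(order):
        latest_finish[task] = min(
            (latest_start[s] for s in successors[task]), default=project_length
        )
        latest_start[task] = latest_finish[task] - durations[task]

    # Tasks with zero slack form the critical path.
    critical = [t for t in order if latest_start[t] == earliest_start[t]]
    return project_length, critical


# Hypothetical tasks: the UI is built against a mocked API, so it has no dependency.
durations = {"schema": 3, "api": 5, "ui": 4, "integration": 2}
depends_on = {"api": ["schema"], "integration": ["api", "ui"]}
length, critical = critical_path(durations, depends_on)
print(f"NET is {length} working days after the start; critical path: {critical}")
```

Mocking out a dependency, as in step 4, shows up here as removing an edge from `depends_on`, which can shorten the critical path.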

Jacob Kaplan-Moss Estimation Technique

Jacob Kaplan-Moss uses a technique that accounts for both the estimated time and the level of uncertainty. Estimate a likely time value for each task using the table below, then estimate an uncertainty factor. Repeat for all tasks, breaking down larger tasks as you go. For high-uncertainty tasks, you can drive the uncertainty down by researching, spiking, or diving into the work. Sum the best-case and worst-case values across all tasks and express the estimate in the form[3]:

“I expect it’ll take about 3 weeks. Worst-case it could take up to 8 weeks.”

 Spreadsheet for Project Estimation

Time Estimates

| Complexity  | Time              |
|-------------|-------------------|
| small       | 1 day             |
| medium      | 3 days            |
| large       | 1 week (5 days)   |
| extra-large | 2 weeks (10 days) |

Uncertainty

| Uncertainty Level | Multiplier |
|-------------------|------------|
| low               | 1.1        |
| moderate          | 1.5        |
| high              | 2.0        |
| extreme           | 5.0        |
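
The arithmetic can be sketched as follows. This is an illustration under one stated assumption: a task's worst case is its time estimate multiplied by its uncertainty multiplier, which the tables imply but the text does not spell out. The first sum is the figure the quote calls "I expect". The task names are hypothetical.

```python
# Values copied from the two tables above.
TIME_DAYS = {"small": 1, "medium": 3, "large": 5, "extra-large": 10}
UNCERTAINTY = {"low": 1.1, "moderate": 1.5, "high": 2.0, "extreme": 5.0}


def estimate(tasks):
    """Sum expected and worst-case days over (name, complexity, uncertainty) tuples.

    Assumption: worst case for a task = time estimate * uncertainty multiplier.
    """
    tasks = list(tasks)  # allow any iterable, since we walk it twice
    expected = sum(TIME_DAYS[complexity] for _, complexity, _ in tasks)
    worst = sum(
        TIME_DAYS[complexity] * UNCERTAINTY[uncertainty]
        for _, complexity, uncertainty in tasks
    )
    return expected, worst


# Hypothetical task list, for illustration only.
tasks = [
    ("login form", "small", "low"),
    ("payment integration", "large", "high"),
    ("reporting page", "medium", "moderate"),
]
expected, worst = estimate(tasks)
print(f"I expect it'll take about {expected} days. "
      f"Worst-case it could take up to {worst:.1f} days.")
```

Re-running the calculation after a spike moves a task from "high" to "low" uncertainty shows how much research narrows the gap between the two numbers.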

Using Monte Carlo simulations

The Monte Carlo method can be used to predict whether a project will be done before a given date. At a high level, you first sample the team’s throughput (items completed per day) over a given set of days. Assuming you know how many items (story cards) you need to complete the project, you then draw a throughput value from that sample for each working day from now to the date in question and check whether the total covers the remaining items. You repeat this simulation for a large number of cycles (say, a million), and the fraction of runs that finish in time gives you a rough probability. As the project proceeds, you repeat this process and re-forecast. As time passes, there will be less left to do and your sampling history will be larger, so your estimate should begin to converge on reality. Note that you are not accounting for the sizing of stories, because this throughput model assumes that work items are roughly the same size. You could therefore view backlog grooming as an attempt to make each story card roughly the same amount of work (breaking it down as much as reasonably possible).

This method assumes that the future will look like the past: your team will be consistent in throughput (e.g. if you have a 30-day sample, they are expected to behave somewhat similarly to that sample), and your team will be consistent in size[4].
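
A minimal sketch of this simulation is shown below. It assumes the throughput history is a list of items completed per working day and that each simulated day is drawn from it uniformly at random with replacement; the function name, parameters, and sample numbers are illustrative, not taken from the cited source.

```python
import random


def probability_done_by(throughput_history, items_remaining, working_days,
                        runs=10_000, seed=None):
    """Estimate the probability of finishing the remaining items in time.

    throughput_history: items completed on each past working day
    items_remaining:    story cards still to be done
    working_days:       working days between now and the target date
    """
    if not throughput_history:
        raise ValueError("throughput_history must not be empty")
    rng = random.Random(seed)  # seedable so a forecast can be reproduced
    successes = 0
    for _ in range(runs):
        # One simulated future: draw a past day's throughput for every remaining day.
        completed = sum(rng.choice(throughput_history) for _ in range(working_days))
        if completed >= items_remaining:
            successes += 1
    return successes / runs


# Hypothetical ten-day throughput sample.
history = [0, 2, 1, 3, 0, 1, 2, 1, 4, 0]
chance = probability_done_by(history, items_remaining=40, working_days=30, seed=42)
print(f"Estimated chance of finishing in 30 working days: {chance:.0%}")
```

Re-forecasting is a matter of calling the function again with the longer history, the smaller `items_remaining`, and the fewer `working_days` left.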

Histograms and Confidence intervals

While the above approach works when you are trying to find the probability of finishing by a given date, more often we want to know the date by which we are likely to finish. This can be accomplished by running each simulation, sampling throughput day by day, until it finishes the project, and then recording the finish date. To get a 95% confidence level (strictly, the 95th percentile of the simulated finish dates), find the date on or before which 95% of the simulations finish[4].
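
The variant described above might look like the sketch below. It makes the same assumptions as the throughput model (uniform sampling from past daily throughput, similar-sized items) and additionally assumes a Monday-to-Friday working week with no holidays. All names and numbers are illustrative.

```python
import math
import random
from datetime import date, timedelta


def simulate_finish_days(throughput_history, items_remaining, runs=10_000, seed=None):
    """Return the sorted number of working days each simulated run needed."""
    if not throughput_history or max(throughput_history) <= 0:
        raise ValueError("history needs at least one day with positive throughput")
    rng = random.Random(seed)
    results = []
    for _ in range(runs):
        remaining, days = items_remaining, 0
        while remaining > 0:  # keep sampling days until the backlog is empty
            remaining -= rng.choice(throughput_history)
            days += 1
        results.append(days)
    return sorted(results)


def percentile_days(sorted_days, confidence=0.95):
    """Smallest day count by which at least `confidence` of the runs had finished."""
    index = math.ceil(confidence * len(sorted_days)) - 1
    return sorted_days[max(index, 0)]


def add_working_days(start, working_days):
    """Advance a date by working days, skipping Saturdays and Sundays (assumption)."""
    current = start
    while working_days > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            working_days -= 1
    return current


# Hypothetical ten-day throughput sample.
history = [0, 2, 1, 3, 0, 1, 2, 1, 4, 0]
finish_days = simulate_finish_days(history, items_remaining=40, seed=42)
days_95 = percentile_days(finish_days, 0.95)
print("95% of runs finish on or before:", add_working_days(date.today(), days_95))
```

The sorted `finish_days` list is also the raw data for a histogram of finish dates, and other percentiles (for example 50% or 85%) can be read from it with the same function.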