Friday, April 15, 2016

The #NoEstimates Game

Introduction

The #NoEstimates game is a group activity about the power of using data to forecast. It gives practitioners and stakeholders a better way to understand and allow for the effects of variation on their project planning efforts.

The game uses historical data about team performance not only to forecast the duration of a future project, but also to quantify the uncertainty around that forecast. This is a big step up from relying on a rule of thumb, and an even bigger improvement on wishing that forecasts were exact!

This activity does not teach you everything that you need to know about #NoEstimates, but it does form an excellent jumping off point, with the simulation and surrounding discussion giving participants a real feel for the spirit of #NoEstimates.

Yes: dice-rolling will be part of the activity!

Once you've played the game, the next step is to start using the simple and convenient Skillfire project forecasting app on your actual projects. In fact, the #NoEstimates game doubles as training in the use of the app.

Please: If you want to use the output of the app to discuss project deadlines, don't just present the results to your stakeholders: play the game with them first!

Credits: This work derives from Adrian Fittolani's approach to Monte Carlo forecasting of projects. The Skillfire forecaster was a collaboration between myself and Tim Newbold.

Objectives

Participants will learn
  1. That uncertainty in project duration is unavoidable for any team (or process) that has variable output
  2. How to roughly estimate the degree of uncertainty for a new project based on historical data
  3. How to use the Skillfire project forecasting app and interpret its output
Bonus: Playing the game leads nicely into a broader discussion of the value and limitations of estimation and forecasting, and how other Agile and allied #NoEstimates techniques can be used to deliver value and coordinate effectively in the presence of uncertainty.

Duration

  • 20 to 45 minutes (depending on length of discussion)

Equipment

  • Whiteboard and markers
  • Lots of six-sided dice: one per participant
  • Each participant will need a pen and paper

Procedure

1. Introduce the Historical Data

In a typical #NoEstimates approach, rather than trying to estimate the size or duration of individual chunks of work (here we talk about stories, but these could be tasks, items, whatever), we assume an experienced, stable team that knows how to decompose work into small pieces.

For those used to working in story points, this corresponds to a team that is skilled enough to split everything into (say) 1 to 3 point stories. At this point, perhaps we can stop bothering to estimate, but we have to keep slicing thinly. For those who work in time-based estimation, a similar state of affairs is achieved when no item takes longer than (say) a person-day to complete.

Now ...
Consider two teams, team P and team V. Both teams delivered their most recent project (each consisting of 48 stories) in 6 sprints, but their delivery patterns differ significantly.

Whiteboard

Sprint    Team P        Team V
1         8             5
2         8             9
3         8             14
4         8             5
5         8             5
6         8             10
Total     48 stories    48 stories
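
For facilitators who like to see the numbers, here is a minimal Python sketch that summarises the two delivery histories from the whiteboard. The variable names are my own, and the population standard deviation is just one possible way to put a number on "variation".

```python
from statistics import mean, pstdev

# Stories delivered per sprint, copied from the whiteboard table.
team_p = [8, 8, 8, 8, 8, 8]
team_v = [5, 9, 14, 5, 5, 10]

for name, history in (("P", team_p), ("V", team_v)):
    # Same total and same average, but very different spread.
    print(f"Team {name}: total={sum(history)}, mean={mean(history)}, "
          f"spread={pstdev(history):.2f}, "
          f"range={min(history)} to {max(history)}")
```

Both teams come out with the same total and the same average, so only the spread and the range tell them apart.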

Questions for the group / discussion

  1. Does your real-life team (or teams) resemble Team P or Team V?
  2. What are the full names of Team P and Team V? [Answer: Predictability and Variation.] 
  3. How does the degree of variability in output reflect internal team factors, external factors, and the nature of the team's work?

2. Back-of-the-envelope calculations

The next project has been estimated to require 80 stories for completion.
Scientists and engineers often make calculations under simplifying assumptions on the back of an envelope. We can do that here.

Questions for the group / discussion

  1. How many sprints do you estimate that team P will take? Team V? [Just give a number.]
  2. What's the shortest and longest duration that you expect for each team?

Commentary

Under simple assumptions team P can be expected to keep delivering at 8 stories per sprint and deliver in exactly 10 sprints. On average, team V will do the same, but could get lucky or unlucky.

Most experienced practitioners will add some margin to an unbiased estimate such as this, say +/-30%. But this is rather arbitrary, and does not reflect the variability of the team's delivery patterns.
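
The back-of-the-envelope arithmetic can be written out as a small Python sketch. The variable names are illustrative, and the 30% margin is simply the rule-of-thumb figure mentioned above.

```python
project_size = 80       # stories in the next project
mean_throughput = 8     # stories per sprint: the historical average of both teams
margin = 0.30           # rule-of-thumb margin, not derived from any data

point_estimate = project_size / mean_throughput
low = point_estimate * (1 - margin)
high = point_estimate * (1 + margin)

print(f"point estimate: {point_estimate:.0f} sprints")
print(f"with +/-30% margin: {low:.0f} to {high:.0f} sprints")
```

Note that this calculation gives the same answer for team P and team V, because it only ever looks at the average. That is exactly the weakness the game is designed to expose.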

Questions for the group / discussion

  1. If you ran this project for real, what factors could increase the actual project duration for either team?
  2. What factors could decrease the duration?

Sample responses

Increase
  • External dependencies and delays
  • New stories emerge
  • Re-doing / re-working / iterating necessary
  • Change in technology
  • Change in team composition
  • Team members distracted / diverted / leave
  • Team members added late in the project
Decrease
  • Effective cutting of scope (some stories are cut)
  • Capable team members added early in the project
  • Shortcuts found

3. Play the game

Now we simulate a single sprint conducted by team V, first as a group, by rolling a single die.

E.g. rolling a 1 corresponds to a sprint in which 5 stories were delivered, a 2 implies 9, and so on.

Here's a complete simulation of an 80-story project:

Example: 8 and a bit sprints (call it 9) to deliver the project
Sprint 1: rolled 3, delivered 14, stories remaining 66
Sprint 2: rolled 2, delivered 9, stories remaining 57
Sprint 3: rolled 6, delivered 10, stories remaining 47
Sprint 4: rolled 4, delivered 5, stories remaining 42
Sprint 5: rolled 2, delivered 9, stories remaining 33
Sprint 6: rolled 5, delivered 5, stories remaining 28
Sprint 7: rolled 1, delivered 5, stories remaining 23
Sprint 8: rolled 3, delivered 14, stories remaining 9
Sprint 9: rolled 3, delivered 14, stories remaining -5
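
If you'd like to generate extra example games before the session, the dice procedure can be sketched in a few lines of Python. The function name play_game and the lookup table DIE_TO_STORIES are my own names; the mapping itself is just team V's six historical sprints.

```python
import random

# Die face -> stories delivered, taken from team V's six historical sprints.
DIE_TO_STORIES = {1: 5, 2: 9, 3: 14, 4: 5, 5: 5, 6: 10}

def play_game(project_size=80, rng=random):
    """Roll one die per sprint until no stories remain; return the sprint log."""
    remaining = project_size
    log = []
    while remaining > 0:
        roll = rng.randint(1, 6)              # one roll of a six-sided die
        delivered = DIE_TO_STORIES[roll]
        remaining -= delivered
        log.append((roll, delivered, remaining))
    return log

for sprint, (roll, delivered, remaining) in enumerate(play_game(), start=1):
    print(f"Sprint {sprint}: rolled {roll}, delivered {delivered}, "
          f"stories remaining {remaining}")
```

Each run prints one game in the same format as the example above. As in the dice game, the final sprint may overshoot, leaving a negative number of stories remaining.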

Distribute dice and get participants to roll a die, look up the result, and keep a running tally on the board to show how to simulate an entire project.

Now, have everyone play the game independently and note their own results. How many sprints does each simulation take?

Meanwhile, draw up the beginnings of a chart for the class results on the board. [See next figure.] Have participants stack a dot for each of their games. About 30 simulations gives a nice visualisation of the distribution of project durations, measured in sprints:



Here each dot corresponds to a single simulation's duration.

In the worst case, every sprint delivers 5 stories and the project takes 16 sprints (which happened once). In the theoretically best case the team delivers 14 stories every sprint (which didn't happen this time) and finishes in (just under) 6 sprints.
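
The two extremes follow directly from team V's slowest and fastest historical sprints. A quick sketch of that arithmetic in Python (variable names are mine):

```python
import math

team_v = [5, 9, 14, 5, 5, 10]   # stories delivered in each historical sprint
project_size = 80

# Worst case: every sprint matches the slowest sprint on record.
worst_case = math.ceil(project_size / min(team_v))

# Best case: every sprint matches the fastest sprint on record.
best_case_exact = project_size / max(team_v)    # a fractional number of sprints
best_case_whole = math.ceil(best_case_exact)    # whole sprints actually needed

print(f"worst case: {worst_case} sprints")
print(f"best case: {best_case_exact:.2f} sprints, i.e. {best_case_whole} whole sprints")
```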

Empirically, we see that the likely delivery time is 10 to 13 sprints for team V, but it could be worse!

More formally: 
  • half the simulations take 12 or fewer sprints, making 12 the median result.
  • discarding the bottom 3 (10%) and top 3 (10%) results, we can forecast that the project should take between 10 and 13 sprints with 80% confidence. [But we really should run a lot more simulations to stabilise the distribution.]
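
The same reading can be done in code. This is a rough sketch that plays 30 games with a seeded random-number generator and then trims 10% of the results from each end. The seed and function names are arbitrary choices of mine, and the numbers it prints will generally differ from the class results described above.

```python
import random
from statistics import median

# Die face -> stories delivered, taken from team V's six historical sprints.
DIE_TO_STORIES = {1: 5, 2: 9, 3: 14, 4: 5, 5: 5, 6: 10}

def sprints_needed(project_size, rng):
    """Play one game and return how many whole sprints it took."""
    remaining, sprints = project_size, 0
    while remaining > 0:
        remaining -= DIE_TO_STORIES[rng.randint(1, 6)]
        sprints += 1
    return sprints

rng = random.Random(2016)   # arbitrary seed, only so the run is repeatable
results = sorted(sprints_needed(80, rng) for _ in range(30))

trim = len(results) // 10                    # 10% of 30 games = 3 per tail
kept = results[trim:len(results) - trim]     # the middle 80% of the results

print("median:", median(results))
print(f"80% range: {kept[0]} to {kept[-1]} sprints")
```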

To be really safe we need to assign 16 sprints, but a better strategy would be to make sure we're delivering stories in priority-order and start de-scoping if things are going slowly or especially if the last few stories just aren't that valuable.

4. The Cumulative Distribution

An alternative way to read off the median and a confidence range is via the cumulative distribution. We obtain this by adding dots as we move to the right (cumulating).



You can attempt to draw this on the whiteboard, but be sure to rescale. I usually just sketch the idea before moving on to the Skillfire app.
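
For anyone who prefers code to whiteboard sketches, here is a small Python sketch of the cumulating step. The function name is my own, and the ten results in sample are made up purely for illustration; they are not the class data.

```python
from collections import Counter
from itertools import accumulate

def cumulative_distribution(results):
    """Return (sprints, fraction of games finished within that many sprints) pairs."""
    counts = Counter(results)
    durations = sorted(counts)
    running = accumulate(counts[d] for d in durations)   # add dots moving right
    return [(d, total / len(results)) for d, total in zip(durations, running)]

# Hypothetical tally of ten games, for illustration only.
sample = [9, 10, 10, 11, 11, 11, 12, 12, 13, 16]

for duration, fraction in cumulative_distribution(sample):
    print(f"{duration:>2} sprints or fewer: {fraction:.0%}")
```

The median is the first duration at which the cumulative fraction reaches 50%, and a confidence range is read off the same way at, say, the 10% and 90% marks.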

5. The forecasting app

Now, for a stable forecast we really need a lot more than 30 simulations, so let's switch from dice to computer-based random-number generation and conduct 10,000 trials with the Skillfire project forecasting app: skillfire.co/project-forecaster



Just enter team V's historical data and the estimate for the new project size, press Simulate, and the app does the rest. The continuous graph is achieved by counting a partial sprint at the end, but otherwise this is simply what you would get if you sat around rolling a lot of dice.



Notice that the Median has shifted back down to 10 sprints (30 rolls was insufficient!) and the confidence interval is now set to 90% rather than 80%.

A great use of this tool is to help explain why further analysis or discovery won't eliminate uncertainty if team output is variable.

While we used six sprints of historical data for convenience (sides on a die), when using the app there's no need to restrict yourself to just six sprints of data.
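
To make the idea concrete, here is a minimal Monte Carlo sketch in Python. To be clear, this is not the Skillfire app's code: it is my own illustration of the general technique, under the assumptions that each simulated sprint is drawn at random (with replacement) from the history and that the final sprint is counted as a fraction. The function names and the choice of 5% and 95% cut points for a 90% interval are assumptions too.

```python
import random
from statistics import quantiles

def simulate_duration(history, project_size, rng):
    """One trial: resample past sprints; count the final sprint fractionally."""
    remaining, sprints = project_size, 0.0
    while remaining > 0:
        delivered = rng.choice(history)   # works for a history of any length
        # A full sprint unless this one finishes the project part-way through.
        sprints += 1.0 if delivered < remaining else remaining / delivered
        remaining -= delivered
    return sprints

def forecast(history, project_size, trials=10_000, seed=None):
    """Run many trials; return the median and a 90% interval, in sprints."""
    rng = random.Random(seed)
    durations = [simulate_duration(history, project_size, rng)
                 for _ in range(trials)]
    cuts = quantiles(durations, n=20)     # cut points at 5%, 10%, ..., 95%
    return {"p5": cuts[0], "median": cuts[9], "p95": cuts[-1]}

# Assumes at least one past sprint delivered something, or the loop never ends.
team_v = [5, 9, 14, 5, 5, 10]
print(forecast(team_v, project_size=80, seed=1))
```

Swapping in a longer history is just a matter of passing a longer list, which is the code equivalent of no longer being limited to the six sides of a die.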

6. Closing

Be sure to re-cap: we now have the technology to quantify uncertainty based on history, but this is only part of the conversation. Recall all the different ways project duration can get inflated (and the few by which it gets shortened!)

Remember: don't just use the app and present the results: play the game with colleagues and stakeholders first.

Let me know how you go!

More Q & A

Q: Where can I find out more about #NoEstimates?
A: Start by reading this interview with Neil Killick.

Q: Can I measure story points and use this as an #Estimates technique instead?
A: Absolutely. You can even try both!

Q: If scope typically increases by 20%, how do I factor that in?
A: Run simulations with both the planned and an inflated number of stories, and report both.
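
As a rough Python sketch of that advice (my own illustration, with made-up function names, not the app's code), run the same simulation at the planned size and again at the size inflated by 20%:

```python
import random
from statistics import median

def sprints_needed(history, project_size, rng):
    """One simulated project: whole sprints until all stories are delivered."""
    remaining, sprints = project_size, 0
    while remaining > 0:
        remaining -= rng.choice(history)
        sprints += 1
    return sprints

def median_forecast(history, project_size, trials=10_000, seed=None):
    """Median number of sprints over many simulated projects."""
    rng = random.Random(seed)
    return median(sprints_needed(history, project_size, rng)
                  for _ in range(trials))

team_v = [5, 9, 14, 5, 5, 10]
planned = 80
inflated = round(planned * 1.20)    # 20% typical scope growth -> 96 stories

for label, size in (("planned", planned), ("with 20% growth", inflated)):
    result = median_forecast(team_v, size, seed=1)
    print(f"{label}: {size} stories, median {result} sprints")
```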

Q: What else can I do with a lot of dice?
A: Play Tenzi! A standard Tenzi set comes with lots of dice.

Q: Can you show more simulations?
A: Sure ...

Game 2: 7 sprints exactly (fast delivery)

Sprint 1: rolled 6, delivered 10, stories remaining 70
Sprint 2: rolled 6, delivered 10, stories remaining 60
Sprint 3: rolled 3, delivered 14, stories remaining 46
Sprint 4: rolled 3, delivered 14, stories remaining 32
Sprint 5: rolled 3, delivered 14, stories remaining 18
Sprint 6: rolled 2, delivered 9, stories remaining 9
Sprint 7: rolled 2, delivered 9, stories remaining 0

Game 3: 11 sprints (slower delivery)
Sprint 1: rolled 2, delivered 9, stories remaining 71
Sprint 2: rolled 6, delivered 10, stories remaining 61
Sprint 3: rolled 1, delivered 5, stories remaining 56
Sprint 4: rolled 6, delivered 10, stories remaining 46
Sprint 5: rolled 5, delivered 5, stories remaining 41
Sprint 6: rolled 6, delivered 10, stories remaining 31
Sprint 7: rolled 2, delivered 9, stories remaining 22
Sprint 8: rolled 2, delivered 9, stories remaining 13
Sprint 9: rolled 1, delivered 5, stories remaining 8
Sprint 10: rolled 1, delivered 5, stories remaining 3
Sprint 11: rolled 2, delivered 9, stories remaining -6

