Estimating Software Delivery
Why it’s so difficult, plus a proposal for a loosely time-based story point scale.
Accurately estimating the complexity of a proposed software feature and the time necessary to deliver it to customers is an essential part of managing a successful team. There are many ways a team can calculate estimates that, once successfully adopted, allow managers to make commitments confidently and deliver on them reliably to stakeholders. However these estimates are derived, one important factor is that they are normalized: that is, the estimate given by one team member means the same thing as the same estimate given by another team member. By mitigating the different interpretations of complexity between team members and across different development platforms and skillsets, the team, its management and its stakeholders all speak the same language. This is a prerequisite for generating and recording metrics about a team’s performance, which can be analyzed over time to spot problems or make incremental improvements to efficiency and throughput.
But therein lies the problem: How can you get a whole team to start speaking the same estimate language? The SCRUM framework provides little help in that it advises not to attach any real-world values—such as time—to the numbers used to represent an estimate. One common strategy is to use numbers from the Fibonacci sequence as “story points” that each represent abstract complexity. The combined total of story points from the whole team in each sprint represents the total complexity that can be accomplished within the sprint’s time frame. This, of course, is velocity, a measurement of team throughput that is key for analyzing team performance and capabilities. Another common strategy simplifies this even further by using “T-shirt sizes”—small, medium, large, etc.—in order to measure complexity.
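As a rough illustration of the arithmetic, velocity can be sketched as the sum of the story points of the tasks a team actually completed in a sprint. The `Task` type, its fields and the sample tasks below are assumptions made for this example; they do not come from any particular Scrum tool.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """A hypothetical sprint task; the field names are illustrative only."""
    title: str
    points: int       # story point estimate given during planning
    completed: bool   # whether the task was finished within the sprint


def sprint_velocity(tasks):
    """Sum the story points of completed tasks only."""
    return sum(task.points for task in tasks if task.completed)


sprint = [
    Task("Update configuration variable", 1, True),
    Task("Fix login bug", 2, True),
    Task("New export feature", 5, False),  # unfinished work does not count
]

print(sprint_velocity(sprint))  # 3
```

Counting only completed tasks is one common convention; a team could reasonably choose a different rule for partially finished work.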
The problem with these strategies is that, while clever, they do not achieve the desired normalization. One engineer’s idea of complexity could vary from that of another engineer. Even if they agreed on complexity, a more experienced engineer might be able to deliver the task faster than a less experienced engineer. The accuracy of the estimate then depends to some degree on a task’s final assignee, who may not be the one who provided the initial estimate. A team of senior engineers that takes on less-experienced members is no longer calibrated.
The goal is to accurately measure velocity, a number that represents a team’s relative capabilities within a fixed time period, such as a two-week sprint. Sprint after sprint, this can be analyzed in order to spot problems, such as a sprint in which too little work was successfully accomplished. Over time it can be used to measure improvements, i.e. how much more work can be accomplished, or how much more efficiently the same amount of work can be completed and delivered to customers. But in order to make this cumulative measurement accurate, the components from which it is derived must also be accurate. To achieve that, a task must only encapsulate work that can be executed from start to finish by a single team member. This task can then be estimated by that team member, who can take into account all of the known and unknown complexity as well as his or her own capabilities and limitations.
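One way to picture this sprint-over-sprint analysis is to compare each sprint’s velocity against the average of the sprints just before it. The sketch below is only an illustration: the function name, the three-sprint window, the 75% threshold and the sample history are all assumptions, not values prescribed by this article or by Scrum.

```python
from statistics import mean


def flag_low_sprints(velocities, window=3, threshold=0.75):
    """Return the indices of sprints whose velocity fell below
    `threshold` times the mean of the previous `window` sprints.

    Both `window` and `threshold` are arbitrary illustrative defaults.
    """
    flagged = []
    for i in range(window, len(velocities)):
        baseline = mean(velocities[i - window:i])
        if velocities[i] < threshold * baseline:
            flagged.append(i)
    return flagged


# Hypothetical velocity history, one number per sprint.
history = [21, 23, 22, 14, 24]
print(flag_low_sprints(history))  # [3]
```

A flagged sprint is only a prompt for a conversation; the number alone does not say why less work was completed.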
Now that individuals are giving estimates, differences between skillsets and experience are mitigated. But because the work varies between developers on different platforms, as well as between designers, quality assurance testers, product owners and other team members, there is still a lack of normalization in these estimates.
Story Point Scale
The last step towards achieving the desired normalization is to use some shared, quantifiable measurement of complexity. The most obvious candidate is time: how long it will take a given individual to execute and deliver a given task of a given discipline. While SCRUM advises not to attach real-world values to what should be purely abstract estimations of complexity, this practical adaptation is a tradeoff made in order to make more sense of the numbers.
During sprint planning, each team member should provide an estimate using one of the following story point values:
A simple task that involves some time to accomplish, but less than a few hours. Updating a configuration variable, refactoring a few lines of code or uploading new assets might be good examples. This is also the minimum estimate.
A somewhat complex task that cannot be accomplished in less than a few hours, but will not take much more than a single day to finish and ship. For myself and my teams in the past, this value has also been the default estimate for bug fixes whose root cause is not yet known. Depending on your platform, another value may be more appropriate.
A moderately complex task that will take more than one or two days to accomplish, but can be completed in much less than a full week. This is common for introducing new features that are relatively simple and aren’t expected to require much experimentation as they are implemented.
Less than or equal to one week.
A complex task that could take up to one whole week to complete. This is a more complex feature that may take some experimentation during its implementation. There could also be some unknown factors involved, such as mitigating the side effects of changes to a large or unfamiliar code base.
Greater than one week.
A highly complex task that requires at least one week to accomplish, but will not require an entire two-week sprint. This kind of task may also involve external factors that are not as easy to control, such as the coordination of dependencies like consuming new internal services or third-party libraries.
Greater than a two-week sprint.
This is the maximum estimate that can be given, and it marks a task as too large to be actionable at the moment. Such a task is hugely complex or contains many unknown factors, and could not be completed within a single two-week sprint. The solution for these tasks is to break them down into smaller tasks that can be accomplished individually.
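The scale above could be sketched as a lookup from a rough time estimate to a story point value. Everything numeric here is an assumption made for illustration: the point values follow the Fibonacci sequence mentioned earlier (1, 2, 3, 5, 8, 13), a working day is taken as 8 hours, “a few hours” as 4, and the remaining boundaries are one reading of the descriptions. A team adopting the scale would choose its own numbers.

```python
# All thresholds and point values are illustrative assumptions.
HOURS_PER_DAY = 8                    # assumed working day
HOURS_PER_WEEK = 5 * HOURS_PER_DAY   # assumed five-day week
HOURS_PER_SPRINT = 2 * HOURS_PER_WEEK

MAX_POINTS = 13  # assumed value for "greater than a two-week sprint"


def story_points(estimated_hours):
    """Map a rough time estimate (in working hours) to a story point tier."""
    if estimated_hours <= 0:
        raise ValueError("estimate must be positive")
    if estimated_hours < 4:                   # less than a few hours
        return 1
    if estimated_hours <= 1.5 * HOURS_PER_DAY:  # not much more than a day
        return 2
    if estimated_hours <= 3 * HOURS_PER_DAY:    # well under a full week
        return 3
    if estimated_hours <= HOURS_PER_WEEK:       # up to one whole week
        return 5
    if estimated_hours < HOURS_PER_SPRINT:      # over a week, under a sprint
        return 8
    return MAX_POINTS                           # too large for one sprint


def needs_breakdown(points):
    """Tasks at the maximum estimate should be split into smaller tasks."""
    return points >= MAX_POINTS


for hours in (2, 8, 20, 40, 60, 100):
    points = story_points(hours)
    print(hours, points, needs_breakdown(points))
```

Because each person estimates only their own tasks, the hours fed into a mapping like this would already reflect that individual’s experience and discipline.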