enthiosys agile product management
agile product management, motivated from within

Agile Teamwork - MORE THAN Iterations, Time-Boxes, and Estimates

Thu, 05/19/2011 - 06:56

Sandy Walsh, a developer at Rackspace who also works on the OpenStack cloud development project, has written a blog post asserting that Iterations and Time-boxing are (Mostly) Useless. While I enjoy Sandy’s thought-provoking content, I feel compelled to respond with some alternative points of view, as my company is currently providing Scrum-centric Agile coaching services to Rackspace. I’ve also enjoyed working directly with J.B. Rainsberger, who commented on Sandy’s post, as a Board member of the Agile Alliance, and I think JB makes many fine comments. You can also see Glen Campbell’s post as a reply to Sandy here. Given that I like to write, I hope you find a lot of value in this rather long post.
Agile and Code Management
Let’s handle the easiest item that Sandy addresses – code management. Sandy correctly points out that Agile and Kanban stress traditional code management practices and that modern code management tools help enable agility. Yup. Just like we’ve always known that test automation is a good thing, but we’re starting to see a lot more of it because of better test automation frameworks (aka tools).
XP, Agile, and Experience
XP is the Agile method that plays best to highly accomplished, experienced developers. To understand why, we need to understand what a cognitive psychologist means when we say that someone has “experience”. We mean that this person has a deep, thorough, and accurate cognitive library of stereotypical plans (you can think of a cognitive plan as similar to a software pattern) that they can use to efficiently and competently solve complex problems.

Expertise, therefore, is the ability to “know” the “right” way to solve a problem because you’ve solved this or similar problems many times before. It is this notion of “expertise” that causes individual “expert” developers to complain loudly when asked to decompose backlog items into 3hr or less tasks in a Sprint planning meeting: “Dangit – I just know how to get this done – why do I have to task it out?” In reality, experts do know how to get these things done, and task decomposition can be really painful (this is a well-known problem in knowledge engineering; the greater the expertise, the harder it is for the expert to explicate their cognitive library).

The reality, however, is that most development teams are composed not of superstar experts, but of a mix of people with different skills and experiences. As such, the Sprint planning process of decomposing backlog items into 3hr (my target recommendation) tasks helps ensure that thorough planning is accomplished (as JB recommends), enables the team to work as a team (because tasks can be done by various individuals within the team), and increases confidence that the team will accomplish their work as a team with measurable progress.

You can find a detailed discussion of cognitive libraries and the interesting effects of different levels of expertise in Chapter 1 of my book “Journey of the Software Professional”. As I said earlier, of all the Agile methods, XP is the method that plays most to experience, and the greater the experience of the development team, the more likely they are to gravitate towards efficient execution of XP practices.

I’ve coached mature Agile teams who have earned the right to simply decompose tasks without hour-based estimates, because they’ve proven that their tasks are such that reliable progress can be made. The proof is that when they plan for a task to be accomplished within a day… it is. Over time, this becomes a very powerful habit – learning to plan so that, more often than not, teams can move tasks from “work-in-progress” to “done” and from “to-do” to “work-in-progress” every day. However, I do believe that this is something a team must earn, and until they’re reliably creating and delivering against small tasks, they should be doing hours-based estimates.
I Don’t Want “Command and Control” (Unless I’m Commanding and Controlling)
I find arguments from developers against fine-grained task decomposition very interesting. On one hand, most developers dislike being told by management what they should be doing. This aspect of a self-managed team – that the team chooses how tasks are consumed – is one of the single most liberating aspects of Scrum-based Agile. And yet, in the same conversation, these developers often rail against detailed task decomposition because they “know” who is going to do the task.

Huh?

Let’s get this straight. The Sprint planning process is pretty straightforward: The team decomposes tasks into the work that needs to be done. The team chooses who works on the tasks. If the team chooses to have one person do the bulk of the decomposition of a single user story – then great – do it. If the team chooses to have one person do the bulk – or even all – of the tasks associated with a backlog item – then great.

I can’t put my finger on why this concept appears so challenging to some of the teams I’ve coached.

There is, however, a lot more to this story. If you’re really motivated to create high performing teams, read on.
Team Performance and Shared Transactional Memory
A team that chooses to forego fine-grained task decomposition is also choosing to forego an amazing opportunity to create a higher performance team. I discuss this extensively in chapter 7 of Journey of the Software Professional, so here are some highlights. (See also What’s Collaboration and Some Answers to “What’s Collaboration?”).

I’d like to specifically focus on an important aspect of team performance based on something referred to as a collective mind or a shared transactional memory. Here is how I described it roughly 15 years ago – a description I believe is still valid today.

Suppose you are comparing two teams, Team A, and Team B. Team A is obviously more effective than Team B. The question is “Why?” Lots of potential variables come to mind. Team A might have better tools, a better working environment, more experienced developers, and so forth. But what if you could hold all these variables equal? What could now cause Team A to be more effective than Team B?

The answer lies in how the members of Team A have molded their collective experience into a sum greater than the parts. Somehow, Team A is more effective when working together than when working apart. But how can this happen?

Think about some of your earliest interactions with your colleagues. What did you know about them when you first met? What do you know about them now? As we work together, we learn about each other. We learn Jill is an excellent analyst, Raji is a great designer, and Ruth is a powerful motivator. The team forms a collective mind, in which the interdependent actions of the team create “a separate transactional memory system, complete with differentiated responsibility for remembering different portions of common experience” [Weick 1993]. More plainly, we not only know Jill is an excellent analyst, we rely on our knowledge of Jill being an excellent analyst, and begin to assign her tasks capitalizing on her skills. We remember the tasks she has been given, and rely on her memory when we need information about those tasks. A collective mind enables the team to become more effective in problem solving precisely because each member of the team can rely on other members to provide experience and skills we do not possess as individuals.

Conversely, just because the raw potential of the team exceeds the individual does not mean it will be realized. There are times when a team can perform much worse than any single individual. One way this happens is when teams fail to account for their own poor performance. Instead of working to identify what is wrong and fix it, effort is spent identifying other groups that can “take the blame” [Kahn 1995]. Another way this can happen is through groupthink. Groupthink occurs when each member of the team stops critically examining decisions in order to make them better. Instead, effort is spent finding ways to justify a poor decision [Janis 1971]. Collective mind and groupthink represent two extremes: one optimal and the other to be avoided at all costs.

What are the implications of collective mind? First, the potential of each team is unique, initially based on the individual ability of each team member. When harnessed, this can produce incredible performance, as when the Macintosh team created the first Macintosh or when the Chicago Bulls won three NBA championships in a row. Second, individual talents alone do not imply success. A collective mind is based on the collective skills and experience of the team. No matter how good the star, he or she needs a supporting cast. Third, a collective mind can only be formed as the result of interactions among team members. This process takes time. How long did it take the Bulls to win their first NBA championship, even with Michael Jordan leading the team? Fourth, a collective mind is a fragile thing, easily lost and not easily recreated. Did the Bulls immediately win the title when Michael Jordan returned to the team in 1995? Finally, because team members rely on each other for information, skills, experience, and ability, changing the composition of the team can have substantial repercussions.

Note: Send me an email if you want the references cited above.

The alert reader will see the implications of Sprint task decomposition in helping form high-performing teams: It helps form the collective mind. And it keeps this mind fresh, because team members can choose to perform different tasks.
Some Thoughts on Estimating
My experience with estimating is that estimating large, poorly specified, inter-dependent backlog items is very hard. That’s different than saying we estimate future things more poorly. That’s why agilists like backlog items that follow Bill Wake’s acronym INVEST. The “S” means that the backlog items are “small enough” to be estimated. More generally, humans estimate smaller things more accurately than larger things, and we’re better at estimating things within our experience base (cognitive library) than outside it. That is why we lean heavily on the people who own the backlog to create “small” backlog items. That is also hard, because customers want big chunky innovative things that make your product awesome, but to create these big things in a reliable manner, we need to decompose them into smaller things. Sandy recommends 1-3 days, which is fine, but I suspect that Sandy would agree that Innovation with a capital “I” doesn’t happen in 1-3 days. That’s OK – but it means that we have to have different levels of estimating to serve the needs of the business. I recommend three levels of estimates: shirt sizes, points, and hours. Here is how I like to use them.

Shirt sizes are created by one or two trusted senior technical leaders for roadmap items, “epics”, and backlog items. While these items don’t typically meet the requirements of the INVEST acronym, shirt sizes, as non-binding estimates of time, become invaluable tools for product managers and product owners to understand the implications of certain business decisions.

Since Sandy references Barry Boehm’s spiral method, I’m guessing that Sandy is also familiar with Wide-Band Delphi as an estimating practice. Points-based estimates are just Wide-Band Delphi, repackaged with a bit of fun.

We use points-based estimates as the next level of estimates of the amount of work that the team expects to do to complete all of the backlog items that the business believes are required to release software to customers. It is a way to provide a more accurate schedule estimate than a WAG, because by tracking work-completed-per-unit-of-time, we can create a velocity. Mike Cohn does a good job of explaining this in his many varied writings. And yes, items that are subjected to points-based estimates should indeed follow the INVEST acronym, which is part of the art of Agile Product Management – learning to “split” large, innovative, market-changing roadmap items into collections of small, INVESTable, Sprint-items. Of course, Scrum isn’t new in this regard – task decomposition is a fundamental activity in any project regardless of method.
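To make the velocity idea concrete, here is a minimal sketch in Python. It assumes velocity is simply the average number of points completed per Sprint, and the function names and sample numbers are hypothetical, chosen only for illustration. Real teams often use a range or a rolling average instead.

```python
from math import ceil
from statistics import mean


def velocity(points_per_sprint):
    """Average points completed per Sprint (an assumed, simple definition)."""
    if not points_per_sprint:
        raise ValueError("need at least one completed Sprint to compute velocity")
    return mean(points_per_sprint)


def sprints_remaining(backlog_points, points_per_sprint):
    """Whole Sprints needed to finish the remaining backlog at the current velocity."""
    v = velocity(points_per_sprint)
    if v <= 0:
        raise ValueError("velocity must be positive to project a schedule")
    return ceil(backlog_points / v)


# Hypothetical history: points finished to "Done, Done" in three Sprints.
history = [29, 33, 31]
print(velocity(history))                 # 31
print(sprints_remaining(120, history))   # 4
```

The projection is only as good as the points estimates and the consistency of "Done, Done" behind the history, which is why the backlog items feeding it should follow INVEST.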

Wide-Band Delphi has other benefits. It helps teams feel like, well, teams, because they were involved as a team in creating the estimates. It provides an opportunity for significant organizational learning as various team members clarify their understanding of the backlog item being estimated. And it provides real-time opportunities to have critical conversations about the story before work begins.

Lastly, we use hours when planning a Sprint to provide a high degree of confidence that the specific backlog items undertaken for the Sprint will be completed by the end of the Sprint. This is tracked in the “burn down”, and is useful for “early warning” during the Sprint. This may not be needed in a Kanban-style model, but I’m in agreement with JB – thorough tasking of a story helps ensure it is completed effectively.
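The "early warning" role of the burn down can be sketched as a comparison between the hours actually remaining and a straight-line ideal. This is a minimal Python sketch under stated assumptions: the linear ideal line, the 10% tolerance, and the function names are all hypothetical choices for illustration, not part of any particular Scrum tool.

```python
def ideal_remaining(total_hours, sprint_days, day):
    """Hours that would remain after `day` days if work burned down linearly."""
    return total_hours * (sprint_days - day) / sprint_days


def behind_schedule(total_hours, sprint_days, day, actual_remaining, tolerance=0.10):
    """True when actual remaining hours exceed the ideal line by more than
    `tolerance` (a fraction of the Sprint's total planned hours)."""
    ideal = ideal_remaining(total_hours, sprint_days, day)
    return actual_remaining > ideal + tolerance * total_hours


# Hypothetical Sprint: 200 task hours planned over 10 working days.
print(behind_schedule(200, 10, day=5, actual_remaining=130))  # True
print(behind_schedule(200, 10, day=5, actual_remaining=110))  # False
```

A flag like this is a prompt for a conversation in the daily stand-up, not a verdict on the team.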

Of course, we agree on the value of responding to change rather than following a plan. But we still plan, at the roadmap, release, and Sprint levels.
What About End of Sprint Waste?
Sandy expresses what appears to be grave concern over the potential for “waste” at the end of the Sprint. In nearly a decade of coaching Agile teams, I’ve never seen a team consistently have “waste” at the end of the Sprint. Here’s why. A Scrum team that engages in release planning will have, as Sandy illustrated, created a release plan based on a number of Sprints and the expected number of stories – points – they will complete in each Sprint. Let’s say that in Sprint one the team estimates that they’ll deliver between 25 and 35 points. It is, after all, their first sprint, so they have no historical data on their own performance. Let’s assume the team agrees to tackle 29 points and at the end of Sprint one they complete all 29 points and have time left over.

First, this team should be celebrating. They delivered “Done, Done” work within their estimated velocity. This is outstanding.

Every team I’ve coached will adjust their plan for Sprint two and pull in additional backlog items. So, while the original plan for Sprint two might have been another 29 points, the adjusted plan for Sprint two will have something like 33 points. Note that the team still hasn’t adjusted their estimated range of points – they have no data that suggests they’re off.

This process continues until the team reaches a reasonable number of points that they can consume on a regular basis. If they increase their performance because of thorough planning or just better development, then they’ll gradually raise their range of estimated points – perhaps from 25 to 35 points / Sprint to something like 35 to 45 points / Sprint. That’s a marvelous improvement, and ultimately, increasing team velocity is everyone’s job. It won’t result in indefinite waste in every Scrum cycle because the team is adjusting their work based on the reality of what they’re delivering.

And, for the record, I feel that Sandy is (perhaps inadvertently) painting a somewhat optimistic picture of Scrum teams. Most Scrum teams, especially those who are new to Scrum, underestimate how hard it can be to get a backlog item to “Done, Done”. They don’t usually have free time at the end of the Sprint. Instead, they’re hustling to meet their commitments, and usually downgrading the number of points in the next Sprint.

And on those rare occasions when a team does have a bit of extra time in their Sprint I am completely comfortable letting them decide how to spend it. (See also: Entropy Reduction and Sustainable Agile Development).
Predicting the Future – Always A Challenge!
I appreciate that Sandy has a sensitivity to helping provide an estimate of when the team will be finished. He correctly identifies the realistic needs for business to understand this. However, his solution will only really work for certain contexts, and we need other solutions for other contexts.

One context is a Silicon-Valley software startup like The Innovation Games Company. We make a fantastic brand engagement platform called Knowsy. And we often operate more like the Kanban model that Sandy espouses than the typical Scrum multi-sprint release model I usually teach to my enterprise clients. That makes sense – we’re not yet driven by market forces and we’re a small, co-located team. We release software when we’re ready. And we have zero complex team coordination issues. So, we release when we’re ready.

Most enterprise / business software, however, doesn’t fit this context. As Sandy points out, there are complex, inter-related dependencies with other teams. And these teams need to move from “How fast can you ship this?” to “When should we ship this to maximize our profit?” Note that the former is a question usually driven by product teams who are unclear about the market events and market rhythms of the market segments they’re serving, while the latter question is driven by product teams who understand when it is most advantageous to ship bits. In these cases, a release plan, which I define as the expected number of Sprints required to consume the number of required backlog items to create enough business value to motivate the release, is a critical component of the overall product development process.
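Read as arithmetic, that definition of a release plan can be sketched with an estimated velocity range, such as the 25 to 35 points per Sprint used in the earlier example. The Python below is a rough illustration only: the function name and the 210-point backlog are hypothetical, and it assumes the velocity range holds for every Sprint in the release.

```python
from math import ceil


def release_plan(required_points, low_velocity, high_velocity):
    """Return (best_case, worst_case) Sprint counts for a release, given the
    points the release needs and the team's estimated velocity range."""
    if not 0 < low_velocity <= high_velocity:
        raise ValueError("expected 0 < low_velocity <= high_velocity")
    best_case = ceil(required_points / high_velocity)
    worst_case = ceil(required_points / low_velocity)
    return best_case, worst_case


# Hypothetical release: 210 points of required backlog items,
# with a team estimating 25 to 35 points per Sprint.
print(release_plan(210, 25, 35))  # (6, 9)
```

Reporting the plan as a range rather than a single number keeps the conversation with the business honest about the uncertainty in the estimates.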

Release planning is hard work, and I find that teams who have enjoyed the benefits of running a few Sprints and have a sense of “Done, Done” typically do a better job. Of course, that’s not a requirement, because estimating effort using Wide-Band Delphi doesn’t require a team having Sprint experience. It is just nice, and that’s why we try to sequence transition projects by having teams get good at Sprinting before being expected to develop a release plan. Regardless, release planning is a grand tradition in Agile (re-read the Agile Manifesto if you’re not sure).

All of this said, it is important that we, as a global community, continue to experiment with ways to estimate the future. One area that I consider promising, but have not formally explored with any client, is the use of a prediction market. Microsoft, for example, has reported some success with prediction markets (see, for example, this rather dated article from Business Week). In my ideal world, I’d like to compare a prediction market model with the release planning process I advocate to see if there is convergence. In both cases, I suspect that as the items in the release become more INVESTable the accuracy of both the release plan and the prediction market will increase. But maybe I’m wrong. Maybe the key to getting actionable estimates on relatively imprecisely specified work is through a prediction market.

Lastly, I want to recognize that there are other approaches to estimating. Function point analysis, for example, is based on mapping new work to a known standard, and is certainly appropriate in certain contexts. However, function point analysis is often too slow for relatively small, fast-moving Agile teams, and, since I’m not qualified to use it, I don’t.
Are You Optimizing for Individual or Organizational Efficiency?
This is not an easy question. Before jumping to any conclusions, I ask you to consider the nature of the organization that is creating the code and the responsibility that this organization has to its respective constituents. Brooks pointed out ages ago in The Mythical Man-Month that a small team is dramatically more productive on a per-person basis than a large team. But you couldn’t create OS/360 with a 9-person team, and companies like Rackspace can’t create new industries and amazing products like Cloud Servers or Cloud Files without large groups of developers.

So, yes, I’ll readily admit that any process that deals with macro planning and multi-team coordination will introduce a certain degree of process overhead that detracts from “writing code”. All of my clients know that I recommend that they create 2-3 year roadmaps, representations of system architectures, and diagrams of system / team dependencies. None of these are “writing code” and all of them, to some level, introduce managerial overhead. And while we seek to keep these to a minimum, we do them because the coordination of large groups of people requires certain forms of managerial overhead.

I’ll conclude by answering Sandy’s actual questions:
Sandy’s Question: “What do you think? Would your daily development process be better if you didn’t have to break down tasks to super-fine resolution?”
Luke’s Answer: It depends on the nature of the task. If you’re asking me to “go solo” and work alone in tackling a backlog item that is within my cognitive library and one that I’ve proven that I can successfully perform multiple times in the past, then, no, my personal daily development process will not be more efficient with a “super-fine” resolution task decomposition. However, if you’re asking me to tackle a backlog item that is new, or to work with a team that may not have a cognitive library reasonably similar to my own, or to honor the idea that a Scrum team chooses which team member(s) will work on tasks, then yes, my team will be more efficient with a “super-fine” resolution task decomposition, where super-fine is defined as tasks with a 3hr or less estimate.
Sandy’s Question: “As a manager, could you better estimate ability-to-deliver based on higher-level sentiment vs. tasks completed?”
Luke’s Answer: No. The best way to estimate the ability to deliver future work is to look at similarly sized actual work completed by the team doing the work. Teams who undertake serious and thorough Wide-Band Delphi estimation and then rigorously track their work to a consistent standard of “Done, Done” create the best “ability-to-deliver” estimates. (See Jeff Sutherland’s excellent data on such teams).
Sandy, thanks for giving the global community something good to discuss.
