Handling Bugs in an Agile Context

I was honored to be included on the lunch and learn panel at the Software Quality Association of Denver (SQuAD) conference this week. One of the questions that came up had to do with triaging bugs in an Agile context. Here’s my answer, in a bit more detail than I could give at the panel.

The short answer is that there should be so few bugs that triaging them doesn’t make sense. After all, if you only have 2 bugs, how much time do you need to spend discussing whether or not to fix them?

When I say that, people usually shake their heads. “Yeah right,” they say. “You obviously don’t live in the real world.” I do live in the real world. Truly, I do. The problem, I suspect, is one of definition. When is a bug counted as a bug?

In an Agile context, I define a bug as behavior in a “Done” story that violates valid expectations of the Product Owner.

There’s plenty of ambiguity in that statement, of course. So let me elaborate a little further.

Let’s start with the Product Owner. Not all Agile teams use this term. So where my definition says “Product Owner,” substitute in the title or name of the person who, in your organization, is responsible for defining what the software should do. This person might be a Business Analyst, a Product Manager, or some other Business Stakeholder.

This person is not anyone on the implementation team. Yes, the testers or programmers may have opinions about what’s a bug and what’s not. The implementation team can advise the Product Owner. But the Product Owner decides.

This person is also not the end user or customer. When end users or customers encounter problems in the field, we listen to them. The Product Owner takes their opinions and preferences and needs into account. But the Product Owner is the person who ultimately decides if the customer has found something that violates valid expectations of the behavior of the system.

Yes, that does put a lot of responsibility on the shoulders of the Product Owner, but that’s where the responsibility belongs. Defining what the software should and should not do is a business decision, not a technical decision.

Speaking of expectations, let’s talk about that a little more.

When the Product Owner defines stories, they have expectations about what the story will look like when it’s done. The implementation team collaborates with the Product Owner on articulating those expectations in the form of Acceptance Criteria or Acceptance Tests.

It’s easy to tell if the software violates those explicit expectations. However, implicit expectations are a little more difficult. And the Product Owner will have implicit expectations that are perfectly valid. There is no way to capture every nuance of every expectation in an Acceptance Test.

Further, there are some expectations that cannot be captured completely. “It should never corrupt data or lose the user’s work,” the Product Owner may say, or “It should never jeopardize the safety of the user.” We cannot possibly create a comprehensive enough set of Acceptance Tests to cover every possibility. So we attend to both the letter of the Acceptance Tests and the spirit, and we use Exploratory Testing to look for unforeseen conditions in which the system misbehaves.

Finally, let’s talk about “Done.” Done means implemented, tested, integrated, explored, and ready to ship or deploy. Done doesn’t just mean coded; Done means finished, complete, ready, polished.

Before we declare a story “Done,” if we find something that would violate the Product Owner’s expectations, we fix it. We don’t argue about it, we don’t debate or triage, we just fix it. This is what it means to have a zero tolerance for bugs. This is how we keep the code base clean and malleable and maintainable. That’s how we avoid accumulating technical debt. We do not tolerate broken windows in our code. And we make sure that there are one or more automated tests that would cover that same case so the problem won’t creep back in. Ever.
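
As a minimal sketch of what “fix it and add a test” can look like, suppose the team found, while exploring a Login story before calling it Done, that usernames with surrounding whitespace were rejected. The story, the `normalize_username` helper, and the test names below are all hypothetical, invented purely for illustration; the point is only that the fix and the test that pins it down land together.

```python
# Hypothetical example: a problem found before the story was "Done".
# The fix and its regression tests are committed together.


def normalize_username(raw: str) -> str:
    """Return the canonical form of a username (hypothetical helper)."""
    # The fix: strip surrounding whitespace before lowercasing.
    return raw.strip().lower()


def test_username_with_surrounding_whitespace_is_normalized() -> None:
    # Pins down the exact case that misbehaved so it cannot creep back in.
    assert normalize_username("  Alice ") == "alice"


def test_plain_username_is_unchanged() -> None:
    # Guards the behavior that already worked before the fix.
    assert normalize_username("alice") == "alice"


if __name__ == "__main__":
    test_username_with_surrounding_whitespace_is_normalized()
    test_plain_username_is_unchanged()
    print("all regression tests passed")
```

Once tests like these run on every build, the automated suite itself records what was fixed, which is one reason a separate tracking entry adds little.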

And since we just fix them as we find them, we don’t need a name for these things. We don’t need to prioritize them. We don’t need to track them in a bug tracking system. We just take care of them right away.

At this point someone inevitably asks, “But don’t we need to track the history of the things we fix? Don’t we want to collect metrics about them?” To that I answer “Whatever for? We’ve caught it, fixed it, and added a test for it. What possible business value would it have to keep a record of it? Our process obviously worked, so analyzing the data would yield no actionable improvements.”

If we are ever unsure whether something violates the Product Owner’s expectations, we ask. We don’t guess. We show the Product Owner. The Product Owner will say one of three things: “Wow, that’s a problem,” or “That’s outside the scope of this story, I’ll add it to the backlog,” or “Cool! It’s working exactly as I want it to!” If the Product Owner says it’s a problem, we fix it.

If the Product Owner says “Technically, that’s a bug, but I would rather have more features than have you fix that bug, so make a note of it but leave it alone for now” then we tell the Product Owner that it belongs on the backlog. And we explain to the Product Owner that it is not a bug because it does not violate their current expectations of the behavior of the software.

Someone else usually says at this point, “But even if the Product Owner says it’s not a problem, shouldn’t we keep a record of it?” Usually the motivation for wanting to keep a record of things we won’t fix is to cover our backsides so that when the Product Owner comes back and says “Hey! Why didn’t you catch this?” we can point to the bug database and say “We did too catch it and you said not to fix it. Neener neener neener.” If an Agile team needs to keep CYA records, they have problems that bug tracking won’t fix.

Further, there is a high cost to such record keeping.

Many of the traditional teams I worked with (back before I started working with Agile teams) had bug databases that were overflowing with bugs that would never be fixed. Usually these were things that had been reported by people on the team, generally testers, and prioritized as “cosmetic” or “low priority.”

Such collections of low priority issues never added value: we never did anything with all that information. And yet we lugged that data forward from release to release in the mistaken belief that there was value in tracking every single time someone reported some nit picky thing that the business just didn’t care about.

The database became more like a security blanket than a project asset. We spent hours and hours in meetings discussing the issues, making lists of issues to fix, and tweaking the severity and priority settings, only to have all that decision making undone when the next critical feature request or bug came in. If that sounds familiar, it’s time to admit it: that information is not helping move the project forward. So stop carrying it around. It’s costing you more than it’s gaining you.

So when do we report bugs in an Agile context?

After the story is Done and Accepted, we may learn about circumstances in which the completed stories don’t live up to the Product Owner’s expectations. That’s when we have a bug.

If we’re doing things right, there should not be very many of those things. Triaging and tracking bugs in a fancy bug database does not make sense if there are something like 5 open issues at any given time. The Product Owner will prioritize fixing those bugs against other items in the product backlog and the team will move on.

And if we’re not doing things right, we may find out that there are an overwhelming number of the little critters escaping. That’s when we know that we have a real problem with our process. Rather than wasting all that time trying to manage the escaping bugs, we need to step back and figure out what’s causing the infestation. Stop the bugs at the source instead of trying to corral and manage the little critters.

16 Responses to Handling Bugs in an Agile Context

  1. Michael Dubakov March 13, 2009 at 4:59 pm #

    I agree with most points in the article (good one!). There are several things that are ‘discussable’ (like whether you need a bug database; in my opinion, it depends).

    Elisabeth responds: Oh, I agree. If, for whatever reason (legacy system, lots of escapees, whatever), there are lots of bugs, then the team needs a tracking system to manage them.

    I do have one question. Imagine a team on quite a large project that is applying agile development. They already have, let’s say, 200 open bugs; half of them are worth fixing, and about 20% of them MUST be fixed. What is the best strategy to handle this situation?

    Elisabeth answers: it sounds like this is a question about how to handle technical debt. If so, my answer is the same as for paying down any kind of debt: budget to pay it down, and pay it down as fast as you can afford to, and in the meantime don’t accrue any more.

  2. James Martin March 13, 2009 at 7:52 pm #

    There has been a bit of a struggle on the software-craftmanship mailing list this week to define a “line we won’t cross” when it comes to shipping software with bugs in. To me, you’ve hit the nail squarely on the head with this paragraph:

    “Before we declare a story “Done,” if we find something that would violate the Product Owner’s expectations, we fix it. We don’t argue about it, we don’t debate or triage, we just fix it. This is what it means to have a zero tolerance for bugs.”

    Even outside of an Agile context (something we’re trying to be too prescriptive about) I think it applies to anyone who cares about software – shipping something that violates the product owner’s expectations is “the line we won’t cross”.

    Thanks for this post, it’s very valuable!

  3. Markus Gärtner March 14, 2009 at 4:53 am #

    By replacing Product Owner with the customer, your point can be generalized. The problem most teams and projects struggle with is getting a single point-of-contact customer who gives consistent answers. A product owner might be seen as a customer proxy that decides “yeah, this is as I expect it to be” or not.

    The problem for most teams is deciding whether to fix bugs right when they occur or to wait for feedback from a remote customer. Waiting for feedback means noting the bug and getting back to it when you have the answer. Agile teams try to avoid this problem with on-site customers and easy access to domain experts. Michael Bolton made me aware that, as a tester, I should raise any point that occurs to me with the one who will use the software – and that’s the customer or the product owner or …

    Elisabeth responds:

    Thanks for raising this point! However, I think you’ve highlighted why I didn’t use the word “customer” in this post. In Extreme Programming, the Customer is a role, and usually not the actual customer for the software. The term “Customer” was an unfortunate naming choice; many are confused by the distinction between the customer and the Customer. So let me clarify: by Product Owner, I mean the person who has the authority to tell the implementation team what to build. That’s usually not a customer or a user. The tester might show the user an issue to get feedback, but the Product Owner still makes the ultimate decision about what the software should do. A good Product Owner will take the user’s opinion into account, but will also have to take other things into account like other users’ opinions, business processes/strategy, etc. That’s why a single user – or tester – cannot be the one to decide what the software should or should not do. The decision involves many more factors than just one person’s opinion. And that’s also why it is so very difficult to be a good Product Owner.

  4. Shrini March 14, 2009 at 10:43 am #

    >>> In an Agile context, I define a bug as behavior in a “Done” story that violates valid expectations of the Product Owner.

    I think you did not elaborate on the “valid” qualifier that you used for “expectations”. What makes a product owner’s expectation “valid”? I am asking because you used that qualifier.

    What about “invalid” expectations? Is “valid” the same as “reasonable”?

    If I omit “valid” from your definition of bug, will it make any difference?

    Shrini

    Elisabeth responds:

    The expectation is valid if it is within the spirit of the scope of the original Acceptance Criteria or Acceptance Tests for the story. Sometimes people, including Product Owners, have unreasonable expectations. “When I specified the Login story,” an unreasonable Product Owner might say, “I meant that you should implement a full Role-Based Access Control security scheme. The lack of Roles and Groups is a bug.” That would be ridiculous, of course. And most Product Managers wouldn’t do that. But the “valid” qualifier is intended to convey the idea that the Product Owner is not free to make up random expectations on the fly. This raises the question: who decides what’s valid and what’s not? That’s a discussion, and a decision, for the whole team.

  5. Erik Petersen March 14, 2009 at 7:48 pm #

    You said <>
    One of the issues I (and others I’ve spoken to) encounter is the edge of Done. If the product is showcasable, often that is seen as Done. If the product is for public use over the web, and we have bugs in one common browser, Test sees that as not Done yet, and we have a possible issue for discussion. Your thoughts?

    Elisabeth responds:

    It’s either Done, or not, according to the team’s shared definition of Done. And for web apps, the definition of Done must include a list of browsers to be supported (or we can’t ever really be done because there’s always another browser). If the team claims “Done” when the story doesn’t work under a supported browser, doesn’t tell the PO about the browser limitation, and just hopes not to be caught, then the team is cheating. However, if the team discusses the limitation with the PO in advance, the PO may decide to Accept the story as-is, and add a new story to the backlog for the recalcitrant browser (hi, IE6). Story splitting is OK. Cheating is not.

    On needing bug databases, I had a long discussion on this with Jim Shore et al while we were reviewing the drafts of Agile Development. I argued that while I don’t need to know what the bugs were, I want to know where they were to get clues on bug clusters (especially major bugs). There may also be highly technical configuration/setup bugs that you know will reoccur so it is good to save the knowledge for next time rather than lose it. Lastly, there is the issue of empowering users. Often they have an issue or a concern that you don’t want to raise as a story, but it can go in the bug database as a record of concern. This helps counter user distrust of being forced into accepting something when they have genuine concerns…..

    Elisabeth responds:

    It sounds like you feel a need for a place to put information for future use. A bug database is one solution, however I prefer to use other systems for capturing information: wikis, text files/READMEs in the source control system, collaborative forums, wherever the team captures other kinds of info. Bug tracking systems typically make lousy knowledge bases because they usually are not optimized for finding information, but rather for managing work items to be done.

    As for user concerns, it is the PO’s job to manage that. If the PO wants to use a bug tracking system to manage user concerns, that’s fine. But no one should usurp the PO’s authority by using a separate system to store user feedback. If the PO is dropping balls, the team can raise the issue to the PO and offer to help. But logging issues in a bug tracking system when the PO has a different process for managing user concerns is not helpful, and in fact can be very damaging.

  6. Lisa Crispin March 15, 2009 at 7:10 pm #

    Yeah! The focus should be on bug prevention, not bug tracking or triage. Right on!

  7. Ted M. Young March 15, 2009 at 10:23 pm #

    Interesting blog post. It sounds like in some cases it’s a bug (i.e., after acceptance and possibly delivery, something’s not right), and sometimes it’s a backlog story, e.g., the PO says “That’s not right [it doesn't completely match their expectation], but it’s not important right now [it's an edge case or a rarely used feature], so let’s move on to the next story and call this one done”. Is that what you’re saying?

    btw, this is why we have “acceptance sessions” every week where the PO (along with the rest of the team) gets to see the newly coded story (not done, but where the developer thinks they’ve coded everything) and then makes comments or adjustments, e.g., “change that wording”, “that total looks wrong”, etc. Sometimes we’ll track those comments as bugs (really, stories to be done later), or the developer will just go fix it/them after the session.

  8. Marta G.F. March 16, 2009 at 2:41 am #

    One question I have is how to handle UI defects. Sometimes it’s not entirely easy to add an automated test to cover an issue in the UI (found before the story is “Done”), but we still want to make sure we catch it if it happens again. We could add a manual test for it, but unless we go through it on every build, it won’t be as successful as an automated test or provide as immediate feedback. Is there any strategy you would suggest for handling UI issues (not “bugs”!) in an Agile context?

    Elisabeth responds:

    If it’s truly a UI thing – like whether a field is enabled/disabled – then as hard as it can be to write that automated test, I advocate doing it. (There was a thread on this very topic on sw-improve recently. My response there applies.)

    And I know it’s hard, honest. But that’s why getting the developers involved is so important. Automating a test on an untestable UI can take hours and hours, and making the fixes to the UI to make it more testable can take mere minutes.

    However, it might not be necessary to test it through the UI. Sometimes it looks like a UI thing because that’s how the issue was found, but it’s something that can be tested below the GUI. And in that case, I advocate automating the test at the lowest level possible. (Making too many of the automated tests go all the way through the GUI leads to all kinds of problems.)

  9. Ravindar March 18, 2009 at 1:02 pm #

    I liked this article. Now to get all the QA people thinking like this.

  10. Joseph Beckenbach March 19, 2009 at 6:41 am #

    Hi, Elizabeth! Two incidents from my own experience might .

    In 2002-2004, I took the lead-tester role in a small biotech startup, bringing protein prediction software out of Bill Goddard’s lab at Caltech (Pasadena, California). For eighteen glorious months of Extreme Programming joy, we turned out products at a rate rivalling the best of the Scrum teams that Jeff Sutherland’s been (rightfully) raving about the past year or so.

    We had two defects escape the team room. Both times, the development team had misunderstood the Product Owner (Joe) on a subtle but (later-discovered) crucial point. This was prime learning on everyone’s part, even the guys who developed the algorithms in the first place. (That indirectly led one of our researchers to find several SARS anti-viral drug candidates one lunch hour, but that’s going off into the weeds ….)

    Our attitude matched yours. If it didn’t meet Joe’s expectation, I didn’t mark it as DONE, and the team wouldn’t have let me if I had tried. I also served as tracker, so this was hard-data feedback, the best type.

    We did accumulate about four dozen nit-picky details by the end, mainly polish stuff like typo corrections and some items which really should have gone into the backlog. Most of it got taken care of at opportune times, like waiting for the other pairs to finish up so we could all go to lunch.

    We moved so fast in part because we chose to resolve all “bugs” rather than track them and lug them forward. Had we not, we’d have closed shop in a few months, and none of these multi-million-dollar products would exist today.

    Think that’s small beer?

    I applied exactly the same attitude and approach at Hewlett Packard, late 1990s, integrating HP/UX. Cleaned out a year-long backlog of bugs and issues, fixed all causes of inflow into that backlog, and got onto similar immediate footing, all within nine months.

    This had been scheduled and budgeted for “twelve months minimum” on the critical path. Crafting and providing HP/UX at the time required several thousand developers and hundreds of non-technical supporters full-time.

    My most pessimistic estimates for the direct cost savings this attitude gave HP runs $6.5m. That year, this represented a half-cent per share of additional earnings — a 0.2% boost to earnings per share of a Fortune 20 company.

    Fiduciary duty should require this attitude, I’d argue.

  11. Siddharta February 25, 2010 at 8:01 am #

    Elizabeth, I’m uneasy with this definition of bug.

    We do a story; it’s accepted and deployed. In deployment, a user does something unanticipated that causes it to crash. The PO decides that it’s a rare combination, so we’ll do it later.

    By your definition this would be a new story, not a bug, but that doesn’t sound right to me.

    Did it violate the PO expectation when it was accepted? No. Does it violate the current expectation? Yes. We need to allow our understanding of the system to evolve and decide against that.

    IMHO, it’s a bug if it violates the current, most up-to-date understanding of the system.

  12. ehendrickson February 25, 2010 at 2:46 pm #

    Seems to me like you agree with my definition of bug: behavior that violates the PO’s expectations. Re-read the last 4 paragraphs of the post and I think you’ll see that we’re saying more or less the same thing: it’s a bug because it violated the PO’s expectations, but it still goes on the backlog.

    The thing I am not sure that we agree on is how often we should expect this to happen. If we’re getting a lot of bug reports in “Done” stories, it suggests to me that something is broken and we should stop to discover why that is.

    Going back to the twitter thread: if we give bugs 0 velocity points, and discover that we’re spending all our time fixing bugs, that will make it excruciatingly visible that we’re spending all our time patching up gaps in the original stories and not making progress toward new capabilities. And that should trigger the discussion about “how come we [as the whole team, including the PO] can’t seem to ship anything without getting a whole bunch of bug reports?”

    Getting velocity “credit” for those bugs will make folks feel better about doing all that work, but it is an illusion. Organizational anesthesia. It masks the harsh truth that we’re not making progress, or our progress is significantly slowed, because of rework. I’d rather have the velocity numbers show the truth about the results so we can deal with it rather than tell us a pleasant fiction that makes everyone feel good about their effort.
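
    To make the arithmetic concrete, here is a small sketch with made-up numbers. The sprint contents, point values, and the `velocity` helper are all hypothetical, chosen only to show how the same sprint reads with and without velocity credit for bug fixes.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """One piece of work completed in a sprint (hypothetical data model)."""
    name: str
    effort_points: int  # estimated size of the work
    is_bug: bool        # True if this is rework on a "Done" story


def velocity(items: list[Item], count_bugs: bool) -> int:
    """Sum the points completed, optionally giving credit for bug fixes."""
    return sum(i.effort_points for i in items if count_bugs or not i.is_bug)


# Invented sprint: one new story and two fixes to previously "Done" stories.
sprint = [
    Item("New report story", 3, is_bug=False),
    Item("Fix crash in Done story", 2, is_bug=True),
    Item("Fix wrong total in Done story", 3, is_bug=True),
]

with_credit = velocity(sprint, count_bugs=True)      # 3 + 2 + 3 = 8
without_credit = velocity(sprint, count_bugs=False)  # only the new story: 3

print(f"velocity with bug credit:    {with_credit}")
print(f"velocity without bug credit: {without_credit}")
```

    With credit, this sprint looks like a healthy 8 points. Without it, the number drops to 3, and the gap of 5 points is the rework that would otherwise stay hidden.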

  13. Siddharta February 25, 2010 at 5:28 pm #

    I can see where you are coming from.

    To me, velocity is simply the amount you are capable of delivering in a sprint. If the velocity is 5, you can do 5 points in a sprint. I use it for sprint planning and that’s it. It doesn’t say whether I am making progress or not. To me it’s like the speedometer in a car: 50km/h is the speed. You could well be driving in circles.

    Story point is just the size. Do bugs have a size? Sure.

    To measure progress, I would stick in a business value amount or something similar. Give bugs a BV of 0. Then we can use that to see if the BV is burning down or not. Figure out the BV velocity to see if you are only fixing bugs or actually making progress.

    The question is: are we mixing up the two and making story points and velocity do double duty in measuring both size and value?

  14. Jaroslav Tulach September 7, 2010 at 9:36 am #

    Half a year ago, when I read this article for the first time, I felt inspired. I started to practice this kind of throw away your bugs lifestyle. However as I am working on an open source project, there are some specifics. For example, do we know who is the project owner?

    In case anyone is interested in open source, here is a link to my current thoughts on the topic: http://wiki.apidesign.org/wiki/Bugzilla
    I’ll be glad to hear some comments.

  15. Peter Lyons September 8, 2010 at 3:08 pm #

    I wrote a somewhat lengthy post in response to this coming from the view of enterprise software development.

    http://www.peterlyons.com/problog/2010/09/agile-bugs/

  16. Johan Paul September 25, 2010 at 12:50 pm #

    Thanks for an interesting read. I agree with you from an Agile theoretical point of view, but I wrote a response to this article based on my experience.

    http://www.johanpaul.com/blog/2010/09/non-sterile-agile-comments-on-handling-bugs-in-an-agile-context/