Elisabeth Hendrickson’s Thoughts on Testing, Agile, and Agile Testing

Unintentionally Hard Tests

July 7th, 2008
Filed under Running the Business, Thinking Like a Tester

I finally decided that too many of the things on my To Do list are things that I probably should not be handling personally, like dealing with getting business cards and routine invoicing. And too many of the things that I should be handling personally, and quickly, are being left undone too long.

So I posted an ad for a part-time assistant on Craigslist. And being test obsessed, I devised a test to give me a good indication of whether a given candidate had enough of the skills I needed to be worth bringing in for an in-person interview.

I asked the candidates who wrote an articulate reply to the job posting (and there were a lot of them!) to do a little research and find out what conference I’m speaking at in August, where the conference is, and what the titles of my presentations are.

It seemed to me like this was a pretty easy test that would weed out anyone who couldn’t do basic research on the web. But I made the test harder than I intended by asking about conferences in August. I forgot that there are two conferences listed on my web site for August: Agile2008 (where I am not speaking), and STANZ (where I am giving two talks).

When I realized my mistake, I thought “No problem. Good candidates will search for my name in the Agile2008 program and realize I am not speaking there.” Then I tried it myself and discovered that searching for a specific speaker is apparently a use case that the Agile conference organizers didn’t think about when they published the program this year. Even I found it difficult to determine absolutely that I am not on the program. There are too many places to look, and even the PDF program is a little difficult to search for a given speaker’s name.

Sometimes what seems like a simple test - whether of software or of a person - turns out to be much more difficult than we intended or imagined. Although I suppose that as long as none of the job candidates crash as readily as software, we’ll all be OK.

And if any of them manage the task without giving up, I will definitely know something about their ability to find information on the web!

LEWT and Test Puzzles

December 17th, 2007
Filed under Ruminations, Thinking Like a Tester

I’ve just arrived home after a whirlwind trip to London for LEWT (James Lyndsay’s London Exploratory Workshop on Testing). Great fun discussing testing with a fabulous set of people! And since I was only in London for 48 hours my body didn’t have a chance to adjust to the time difference. Result: no jetlag when I got home. Yeah!

Anyway…about LEWT. The topic of LEWT was diagnosis. Our discussions included such questions as “How do we do it?” and “Should testers even be doing it, or is it a developer responsibility?”

In the context of testing, I interpret the verb “to diagnose” to mean “characterizing the conditions that lead to a failure.” So I most certainly believe that testers should diagnose. So should developers. Everyone on a software team should have a hand in understanding bugs well enough to fix them and prevent them in the future. But I think the “should we?” questions arose because the word has connotations from a medical context related to identifying diseases and prescribing cures. And I don’t think testers typically ought to be prescribing cures, or pinpointing the line of code that needs fixing, unless those testers are also developers on the team, responsible for writing production code.

However, I digress.

What I REALLY wanted to share from LEWT are James Lyndsay’s marvelous black box testing machines.

For years, James Lyndsay has used little Flash programs in his Exploratory Testing courses. Most of his machines have colorful buttons that you click, and your task is to understand how your actions are related to the machine’s responses. Sometimes the connection is straightforward. In other cases the relationship between the input you give the machine and what it does is downright puzzling. That’s why James refers to his machines as “crosswords for testers.”

Given that it’s the week right before the holidays, and you probably aren’t going to get much done at work anyway, why not hone your testing and diagnosis skills on James’ machines?

How did we miss THAT?

August 22nd, 2007
Filed under Ruminations, Thinking Like a Tester

“Oh goodness. How did I miss THAT bug?”

Over the years, I’ve asked myself that question numerous times.

I asked that question when another tester found a blazingly obvious, critical bug that I completely missed. (The answer: I spent too much time tinkering with an ineffective automated script I’d written, and too little time observing the actual behavior of the system. That’s the project where I learned a lot about how NOT to do test automation.)

I asked myself that question, repeatedly, when I participated in a project some years ago now where we shipped software that crashed left and right. (I’m still sorting out the answers to that one. Catastrophic failures are almost never the result of a single, simple error. And this particular catastrophic failure represented failures at all levels in an organization that had, um, issues. But I digress.)

I asked the question again when I learned that a web site I tested had back-button problems. After all, I was sure I’d tested for that. And I had. But I hadn’t re-tested for it after a particular set of code changes that changed some operations from HTTP GETs to HTTP POSTs. Oops.

And I asked myself that question more recently when I learned that a system I worked on earlier this year failed to save a change, and also failed to report an error, when it encountered data misformatted in a particular way in one specific field. Badly formatted data is one of my specialties, and I couldn’t believe I forgot to test the particular case that resulted in the problem. But it turned out that I did, indeed, fail to test what would happen if you entered “www.testobsessed.com” into a URL field instead of “http://www.testobsessed.com”. In hindsight, it’s an obvious test. Another lesson learned.
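For the curious, here is a minimal sketch of how that missed case might sit in a small table of URL field inputs. It is not the code from the system in question: the `has_web_scheme` check and the `URL_FIELD_CASES` table are hypothetical names I made up for illustration, and the only real library call is `urlsplit` from Python's standard `urllib.parse` module.

```python
from urllib.parse import urlsplit


def has_web_scheme(value: str) -> bool:
    """Hypothetical check: True only for values with an http/https scheme and a host."""
    parts = urlsplit(value.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# Hypothetical table of URL field inputs and whether the check should accept them.
URL_FIELD_CASES = [
    ("http://www.testobsessed.com", True),
    ("www.testobsessed.com", False),  # the case I forgot: no scheme at all
    ("http://", False),               # scheme but no host
    ("", False),                      # empty field
]


def failing_cases():
    """Return the inputs whose actual result differs from the expected one."""
    return [value for value, expected in URL_FIELD_CASES
            if has_web_scheme(value) != expected]
```

Whether the right behavior for a scheme-less value is to reject it or to quietly prepend "http://" is a product decision. The point is only that the case belongs in the table so that the system has to do something visible with it.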

I reflected on those missed bugs when a colleague, Sandeep, a test manager, recently wrote to say that he’s been asking himself “how did my testing team miss that?”

He decided to seek out patterns of testing problems by categorizing escaped bugs according to the hole(s) in testing that allowed the bug to slip through. The idea is to improve the test effort by figuring out the common causes behind escaped bugs.

My initial reaction was, “that makes sense.” If you can identify the top 20% of testing holes that let 80% of the bugs through, you can make some serious improvements to the test effort.

And my next reaction was, “but be careful.” Sandeep’s intent is good: use lessons learned from escaped bugs to improve testing. However, asking “How did we miss that?” is perilously close to heading down the slippery slope to “How did JoeBob miss that?” to “It’s JoeBob’s fault.” Having talked to Sandeep, I know he’s not trying to play “pin the blame on the tester.”

So I suggested a small reframe.

Instead of categorizing escaped bugs by asking the question, “How did testing miss that?”, categorize them by asking the question, “How can we improve the probability that testing will find bugs like that in the future?”

It’s a subtle difference.

But the result of reframing the question is that instead of identifying categories as noun phrases like “insufficient test data,” we end up with imperative statements like, “add test data.” Those two categories may look almost identical, but only one is actionable. I can “add test data.” The statement prompts me to do something different next time. But “insufficient test data” only gives me something to regret. And regret won’t help me ship better software.

So how can you categorize escaped bugs to improve the test effort without falling into the blaming trap? Try an Affinity Exercise with the question, “What could we do differently next time to increase the probability that if we have another bug like this we’ll catch it in test?”

To prepare:

  1. Choose a team to participate in the activity. Affinity exercises can work with any number of people, but for this particular activity, I find a smaller group - say 3 to 5 people - works best. It’s a good idea to include people with diverse roles and skill sets.
  2. Set up a meeting time and place. Plan for the whole activity to take 2 hours. And arrange to meet in a place with plenty of table and/or wall space.
  3. Gather (or shop for) office supplies. You’ll need:

    • Index cards or sticky notes. Bigger is better. I like 5×8 cards or the SuperSticky 5×8 Post Its.
    • Felt-tip markers. I like Sharpies because they make consistently dark, readable marks. (Beware: Sharpies are permanent. Do NOT confuse your Sharpies and your White Board markers in the conference rooms. Facilities people get tetchy about such mix-ups.)
  4. Gather a list of escaped bugs you want to analyze. If you have a lot of escaped bugs, prioritize them and time box the exercise. (You probably won’t be able to analyze more than 50 in an hour, possibly fewer, so don’t print out a list of 500.)

In the Meeting:

  1. Review each bug with the team, asking: “What could we do differently next time to increase the probability that if we have another bug like this we’ll catch it in test?”
  2. Have participants write their suggestions on the cards/stickies, one idea per card, in the form of an actionable statement. The suggestions should complete the sentence, “In the next release/iteration/sprint, we can ______.” Tips:
    • Also ask the participants to make their suggestions as concrete and specific as possible. For example, instead of writing “add test data,” write “add titles with ampersands (&) to the test data.”
    • And ask the participants to stick to test-related actions, and avoid blaming individuals. “Revoke NancySue’s check-in privileges” is not an acceptable suggestion.
  3. When you’ve reviewed all the bugs, or when an hour has passed, stop reviewing bugs. (If you still have lots of bugs to go and want to continue analyzing after an hour, stop anyway. Finish the rest of the exercise - the grouping. When you’ve worked through the whole process, if you still think more analysis would help, you can always do the exercise again.)
  4. Gather all the cards/stickies, and lay them out on a large work surface: a table, the walls, or even the floor can all work well.
  5. Sort through the cards/stickies as a team. The cards are now owned by the team, and everyone should take a hand in organizing them. Encourage participants to move cards that seem alike together so they are stacked together. Continue until the team agrees that it’s satisfied with the stacks of cards/stickies.
  6. Ask the team to give meaningful names to the stacks of cards/stickies. This is the part of the activity where you will generate the more abstract categories like “add test data.”

The result of this exercise is a list of categories for improving the testing effort that you can then use to determine which kinds of improvements will have the biggest bang for the buck. And the best part is that the list emerged from the actual problems your software has had in the field rather than being some arbitrary list of theoretical “improvements” based on someone else’s unrelated experience.

But wait; you’re not done. Now that you’ve created a first draft list of categories, test it. (Did I mention I’m Test Obsessed?) Choose a different set of escaped bugs, and assign each to one or more categories from the list. Notice how easy or hard it seems to find a category for each bug. This will give you a lot of feedback about how well the category list will work in practice. You may find the team needs to spend some additional time iterating on the list.
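If you want one rough number to go with that gut feel, you could count how many bugs in the second set found no home in the draft list. This is only a sketch under my own assumptions: the `category_fit` function is a made-up name, and it assumes you record each bug's assigned categories as a list, leaving the list empty when nothing fits.

```python
def category_fit(assignments):
    """Summarize how well a draft category list covered a new set of bugs.

    assignments maps a bug id to the list of categories the team picked for it;
    an empty list means no category on the draft list seemed to fit.
    Returns (share of bugs that found at least one category, ids left unplaced).
    """
    total = len(assignments)
    unplaced = sorted(bug for bug, cats in assignments.items() if not cats)
    placed_share = (total - len(unplaced)) / total if total else 0.0
    return placed_share, unplaced
```

The unplaced bugs are the interesting ones to talk through as a team, since they point at categories the list is missing. The number alone will not tell you how hard the team had to squint to place the rest.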

Once you’re satisfied with your list of categories, you can run the numbers to see how many bugs are in each category. Then you can create a Pareto diagram of the results to see which 20% of the improvement opportunities on your list will result in an 80% improvement. Now you can truly leverage escaped bug information into concrete actions that will make the test effort more effective.
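Running the numbers can be as simple as the sketch below, which tallies bugs per category and adds the cumulative percentage column that a Pareto diagram plots. The bug ids and the `pareto_table` function are invented for illustration. Because a bug may sit in more than one category, the percentages are shares of category assignments rather than of bugs.

```python
from collections import Counter

# Hypothetical sample: bug id -> categories assigned by the team.
ESCAPED_BUGS = {
    "BUG-101": ["add test data"],
    "BUG-102": ["add test data", "re-test after code changes"],
    "BUG-103": ["add test data"],
    "BUG-104": ["re-test after code changes"],
    "BUG-105": ["observe actual system behavior"],
}


def pareto_table(bug_categories):
    """Return (category, bug count, cumulative percent) rows, largest count first."""
    counts = Counter()
    for categories in bug_categories.values():
        counts.update(set(categories))  # count each bug once per category
    total = sum(counts.values())
    rows = []
    running = 0
    # Sort by descending count, then by name so ties come out in a stable order.
    for category, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        running += count
        rows.append((category, count, round(100 * running / total, 1)))
    return rows


for row in pareto_table(ESCAPED_BUGS):
    print(row)
```

Reading down the cumulative column shows where the curve flattens out, which is usually a sensible place to stop adding improvement work for the next release.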

Over time, as you try to use that original list to categorize reports of new bugs in the field, you will probably find that the list becomes less and less relevant. I hope so, anyway. It indicates that the improvement efforts are working, that the team has improved the test effort.

That’s when you know it’s time to do the process all over again to classify the next generation of escaped bugs.