Dec 03 2009

Not everyone agrees with my definition of Agile.

Dave Nicolette commented that he thinks my definition actually describes Lean. He defines Agile in terms of the Agile Manifesto.

I replied to Dave elsewhere, but wanted to post my response here too since this is a topic that comes up frequently.

I have trouble defining Agile solely in terms of the Agile Manifesto.

Mind you, I believe the Agile Manifesto is a great document. With it, the original signatories gave the industry a focal point, a fulcrum on which to turn. They forever changed our industry by distilling the difference, in terms of their guiding values and principles, between the lightweight approaches they used and the then-generally accepted formal processes & industry “Best Practices.”

And turn we have. “Agile,” at least as a buzzword, is mainstream. Increasingly, organizations are adopting Agile methods to remain competitive.

However, I see the manifesto as a beginning, not a definitive end. The community has learned much in the intervening years.

Part of what I’ve personally learned is that defining Agile in terms of results short-circuits two of the bigger problems I see plaguing the community: 1) religious debates and 2) superficial, ultimately hollow attempts at transitions that result in frAgile processes.

Where people define Agile in terms of practices, I see more instances of Cargo Cult adoption (“We’re Agile because we stand up every morning!”) and religious dogmatism (“You don’t TDD?!? You can’t possibly be Agile!”).

Where people define Agile in terms of values, I see more instances of Agile-as-an-excuse (“Documentation? No. We don’t document anything. We’re Agile!”).

But where people define Agile in terms of results, I see greater focus on the ultimate goal: value to the business.

As it happens, the best way we’ve found to achieve that goal so far—or at least the best way I’ve seen so far—involves embracing the values and principles of the manifesto.

But I believe that remembering why those values and principles are important, remembering the result we’re trying to achieve, is essential. And so I define Agile in terms of results.

Nov 23 2009

I knew I should have updated my WordPress installation. But it seemed like I always had something more urgent on my plate. I’ve been living in Quadrant 1 for too long. Quadrant 2 tasks—those that are important but not (yet) urgent—have languished until, through the natural course of events, they became urgent. Like, oh say, when this blog got hacked because of a security vulnerability in the older version of WordPress.

Whoopsie.

In my rush, I blew away a few things accidentally. Like my blog roll. So I still have some work to do to get the site completely back to normal.

But in the meantime I think I’ve gotten all the content back. And I’ve migrated to a shiny, new Theme that supports threaded comments.

Please let me know if you find bugs.

Oct 29 2009

A few days ago, I tweeted that I was looking for nominations for events for an Agile timeline and am extremely grateful for all the responses I received.

The request was for the keynote talk that I just presented at PNSQC. I’ve had several requests for the timeline that resulted, so I figure the easiest (and therefore fastest) way to share it is to post my slides.

Here they are (pdf, ~1Mb). Enjoy! (As always, comments/questions/critiques welcome.)

Oct 06 2009

If you work in an Agile organization and are using a heavyweight, specialized tool for test management, I have an important message for you:

Stop. Seriously. Just stop. It’s getting in the way.

If you are accustomed to heavyweight test management solutions, you might not realize the extent to which a test management tool is more of an impediment than an aid to agility. But for Agile teams, it is. Always. Without exception.

I don’t make such claims lightly and I don’t expect you to accept my claims at face value. So let me explain.

The Agile Alternative to Test Management

The things you need to manage the test effort in an Agile context are whatever you are already using for the Backlog, the Source Control Management (SCM) System, the Continuous Integration (CI) System, and the Automated Regression Tests.

That’s it. You don’t need any other tools or tracking mechanisms.

Any test-specific repository will increase duplication and add unnecessary overhead to keep the duplicate data in sync across multiple repositories. It will also probably necessitate creating and managing cumbersome metadata, like traceability matrices, to tie all the repositories together.

All that overhead comes at a high cost and adds absolutely no value beyond what SCM, CI, & the Backlog already provide.

But, But, But…

I’ve heard any number of objections to the notion that Agile teams don’t need specialized test management systems. I’ll tackle the objections I hear most often here:

But Where Do the Tests Live?
Persistent test-related artifacts go in one of two places:

  • High-level acceptance criteria, test ideas, and Exploratory Testing charters belong in the Backlog with the associated Story.
  • Technical artifacts including test automation and manual regression test scripts (if any) belong in the Source Control System versioned with the associated code.

And Where Do We Capture the Testing Estimates?
In Agile, we ultimately care about Done Stories. Coded but not Tested means Not Done. Thus the test effort has to be estimated as part of the overall Story implementation effort if we are to have anything even remotely approaching accurate estimates. So we don’t estimate the test effort separately, and that means we don’t need a separate place to put test estimates.

How Do I Prioritize Tests?
Agile teams work from a prioritized backlog. Instead of prioritizing tests, they prioritize Stories. And Stories are either Done or not. Given that context, it does not make sense to talk about prioritizing the tests in isolation.

Hello, I Live in the Real World. There is Never Enough Time to Test. How Do I Prioritize Tests Given Time Pressure?
If the Story is important enough to code, it’s important enough to test. Period. If you’re working in an Agile context it is absolutely critical that everyone on the team understands this.

But Testing is Never Done. Seriously, How Do I Prioritize What To Test?
This isn’t really a test management problem. This is a requirements, quality, and testing problem that test management solutions offer the illusion of addressing.

The answer isn’t to waste time mucking about in a test management tool attempting to manage the effort, control the process, or prioritize tests. Every minute we spend mucking about in a test management tool is a minute we’re not spending on understanding the real state of the emerging system in development.

The answer instead is to invest the time in activities that contribute directly to moving the project forward: understanding the Product Owner’s expectations; capturing those expectations in automated acceptance tests; and using time-boxed Exploratory Testing sessions to reveal risks and vulnerabilities.

What about the Test Reports?
Traditional test management systems provide all kinds of reports: pass/fail statistics, execution time actuals v. estimated, planned v. executed tests, etc. Much of this information is irrelevant in an Agile context.

The CI system provides the information that remains relevant: the automated test execution results. And those results should be 100% Green (passed) most of the time.

What about Historical Test Results Data?
Most teams find that the current CI reports are more interesting than the historic results. If the CI build goes Red for any reason, Agile teams stop and fix it. Thus Agile teams don’t have the same kind of progression of pass/fail ratios that traditional teams see during a synch and stabilize phase. And that means historic trends usually are not all that interesting.

However, if the team really wants to keep historic test execution results (or is compelled to do so as a matter of regulatory compliance), the test results can be stored in the source control system with the code.

Speaking of Regulatory Compliance, How Can We Be in Compliance without a Test Management System?
If your context involves FDA, SOX, ISO, or just internal audit compliance, then you probably live in a world where:

  • If it wasn’t documented, it didn’t happen
  • We say what we do and do what we say
  • Test repeatability is essential

In that context, specialized test management solutions may be the de facto standard, but they’re not the best answer. If I’m working on a system where we have to be clear, concrete, and explicit about requirements, tests, and execution results, then I would much rather do Acceptance Test Driven Development. ATDD provides the added value of executable requirements. Instead of the tests and requirements just saying what the system should do, they can be executed to demonstrate that it does.

Certainly, doing ATDD requires effort. But so does maintaining a separate test management system and all the corresponding traceability matrices and overhead documentation.

Our Management Requires Us to Use a Specialized Test Management System. Now What?
Send them the URL to this post. Ask them to read it. Then ask them what additional value they’re getting out of a test management system that they wouldn’t get from leveraging SCM, CI, the Backlog, and the automated regression tests.

So, have I convinced you? If not, please tell me why in the comments…

Sep 04 2009

This post started out as a quick little entry about a cool parlor trick you can do with RSpec to make it work for auto-generated test data. But in the middle of writing what was supposed to be a simple post, my tests found a subtle bug with bad consequences. (Yay for tests!)

So now this post is about auto-generated tests with RSpec, and what I learned hunting down my bug.

Meet RSpec

In case you haven’t encountered RSpec before, it’s one of the Behavior Driven Development developer test frameworks along with JBehave, EasyB, and others.

Each RSpec test looks something like this:

  it "should be able to greet the world" do
      # == compares values; the equal matcher checks object identity
      greet.should == "Hello, World!"
  end

I used RSpec to TDD a solution to a slider puzzle code challenge posted on the DailyWTF.

Auto-Generating LOTS of Tests with RSpec

So let’s imagine that you’re testing something where it would be really handy to auto-generate a bunch of test cases.

In my particular case, I wanted to test my slider-puzzle example against a wide range of starting puzzle configurations.

My code takes an array representing the starting values in a 3×3 slider puzzle and, following the rules of the slider puzzle, attempts to solve it. I knew that my code would solve the puzzle sometimes, but not always. I wanted to see how often my little algorithm would work. And to test that, I wanted to pump it through a bunch of tests and get pass/fail statistics.

I could write individual solution tests like this:

  it "should be able to solve a board" do
      @puzzle.load([1, 2, 3, 4, 5, 6, 8, 7, nil])
      @puzzle.solve
      @puzzle.solved?.should be_true
  end

But with 362,880 possible permutations of the starting board, I most certainly was NOT going to hand code all those tests. I hand coded a few in my developer tests. But I wanted more tests. Lots more.

I knew that I could generate all the board permutations. But then what? Out of the box, RSpec isn’t designed to do data driven testing.

It occurred to me that I should try putting the “it” into a loop. So I tried a tiny experiment:

  require 'rubygems'
  require 'spec'

  describe "data driven testing with rspec" do

      10.times { | count |
          it "should work on try #{count}" do
              # purposely fail to see test names
              true.should be_false
          end
      }

  end

Lo and behold, it worked!

I was able then to write a little “permute” function that took an array and generated all the permutations of the elements in the array. And then I instantiated a new test for each:

  describe "puzzle solve algorithm" do
      permutations = permute([1,2,3,4,5,6,7,8,nil])

      before(:each) do
        @puzzle = Puzzle.new
      end

      permutations.each{ |board|
          it "should be able to solve [#{board}]" do
              @puzzle.load(board)
              @puzzle.solve
              @puzzle.solved?.should be_true
          end
      }
  end
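The “permute” helper itself is not shown in this post. A minimal sketch follows, assuming it simply returns every ordering of its input as an array of arrays. The recursive body is an illustration, not the code from the puzzle project. On Ruby 1.8.7 and later the built-in Array#permutation does the same job.

```ruby
# Hypothetical stand-in for the permute helper used above.
# Returns every ordering of the elements as an array of arrays.
def permute(items)
  return [items] if items.length <= 1

  result = []
  items.each_index do |i|
    # Take element i as the head, then permute whatever is left.
    rest = items[0...i] + items[(i + 1)..-1]
    permute(rest).each { |tail| result << [items[i]] + tail }
  end
  result
end

small = permute([1, 2, nil])
puts small.length
puts small.first.inspect

# Compare against the built-in: nothing should be left over.
builtin = [1, 2, nil].permutation.to_a
puts((small - builtin).empty?)
```

For the nine-element board, either version yields 9! = 362,880 arrays, which is where the figure quoted earlier comes from.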

Sampling

Coming to my senses, I quickly realized that it would take a long, long time to run through all 362,880 permutations. So I adjusted, changing the loop to just take 1000 of the permutations:

  permutations[0..999].each{ |board|
      it "should be able to solve [#{board}]" do
          @puzzle.load(board)
          @puzzle.solve
          @puzzle.solved?.should be_true
      end
  }

That returned in about 20 seconds. Encouraged, I tried it with 5000 permutations. That took about 90 seconds. I decided to push my luck with 10,000 permutations. That stalled out. I backed it down to 5200 permutations. That returned in a little over 90 seconds. I cranked it up to 6000 permutations. Stalled again.

I thought it might be some kind of limitation with RSpec and I was content to keep my test runs to a sample of about 5000. But I decided that sampling the first 5000 generated boards every time wasn’t that interesting. So I wrote a little more code to randomly pick the sample.
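The sampling code is not shown here. One way to do it, sketched under the assumption that the boards live in a plain array, is Array#sample (Ruby 1.9 and later). The seed handling and the SAMPLE_SEED environment variable are hypothetical additions for this sketch: printing the seed makes a run that hangs or fails repeatable.

```ruby
# A small stand-in list so the sketch runs on its own; in the spec
# this would be the full array of generated boards.
permutations = [1, 2, 3, 4, 5, nil].permutation.to_a

# SAMPLE_SEED is a made-up variable name. Printing the seed means a
# run that hangs or fails can be repeated with the same sample.
seed = (ENV["SAMPLE_SEED"] || Random.new_seed).to_i
puts "sampling with seed #{seed}"

# Array#sample picks distinct positions, so no board is chosen twice.
sample_boards = permutations.sample(100, random: Random.new(seed))

puts sample_boards.length
puts sample_boards.first.inspect
```

In the spec, the loop would then iterate over sample_boards instead of a fixed slice.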

My tests started hanging again.

My Tests Found a Bug! (But I Didn’t Believe It at First.)

Curious about why my tests would be hanging, I decided to pick a sample out of the middle of the generated boards by calling:

  permutations[90000..90999]

The tests hung. I chose a different sample:

  permutations[10000..10999]

No hang.

I experimented with a variety of values and found that there was a correlation: the higher the starting number for my sample, the longer the tests seemed to take.

“That’s just nuts,” I thought. “It makes no sense. But…maybe…”

In desperation, I texted my friend Glen.

I was hoping that Glen would say, “Yeah, that makes sense because [some deep arcane thing].” (Glen knows lots of deep arcane things.) Alas, he gently (but relentlessly) pushed me to try a variety of other experiments to eliminate RSpec as a cause. Sure enough, after a few experiments I figured out that my code was falling into an infinite loop.

Once I recognized that it was my code at fault, it didn’t take long to isolate the bug to a specific condition that I had not previously checked. I added the missing low-level test and discovered the root cause of the infinite loop.

It turns out that my code had two similarly-named variables, and I’d used one when I meant the other. The result was diabolically subtle: in most situations, the puzzle solving code arrived at the same outcome it would have otherwise, just in a more roundabout way. But in a few specific situations the code ended up in an infinite loop. (And in fixing the bug, I eliminated one of the two confusing variables to make sure I wouldn’t make the same mistake again.)

I never would have found that bug if I hadn’t been running my code through its paces with a large sample of the various input permutations. So I think it’s appropriate to have discovered the bug, thus demonstrating the value of high-volume auto-generated tests, while writing about the mechanics of auto-generating tests with RSpec.
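A hang is harder to read than a red test. One option, which is not what my spec did at the time, is to wrap each solve attempt in Ruby’s standard Timeout module so that a runaway loop shows up as an ordinary failure. In the sketch below, LoopingPuzzle is a made-up stand-in that loops forever on purpose, solve_within is a hypothetical helper, and the half-second budget is arbitrary.

```ruby
require 'timeout'

# Made-up stand-in for a solver with a bug: it never returns.
class LoopingPuzzle
  def solve
    loop { }
  end
end

# Hypothetical helper: runs the block and returns :finished, or
# :timed_out if the block exceeds the time budget.
def solve_within(seconds)
  Timeout.timeout(seconds) { yield }
  :finished
rescue Timeout::Error
  :timed_out
end

puts solve_within(0.5) { LoopingPuzzle.new.solve }
puts solve_within(0.5) { :already_solved }
```

Inside a spec, the “it” body would call the helper around the solve step and fail when the result is :timed_out.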

In the meantime, if you would like to play with my slider puzzle sample code and tests, I’ve released it under Creative Commons license and posted it on github. Enjoy! (I’m not planning to do much more with the sample code myself, and can’t promise to provide support on it. But I’ll do my best to answer questions. Oh, and yes, it really could use some refactoring. Seriously. A bazillion methods all on one class. Ick. But I’m publishing it anyway because I think it’s a handy example.)

Sep 01 2009

It sounds like Joe Stump is having a bad time of it right now.

Joe Stump, formerly of Digg, left Digg to co-found a Mobile games company. They released the first of their games, Chess Wars, in late June.

Soon after, new players found serious problems that prevented them from playing the game. In response, the company re-submitted a new binary to Apple in July. As of this writing, the current version of Chess Wars is 1.1.

The trouble started with patch release #2. Apparently, even six weeks after Joe’s company submitted the new binary (release number 3 for those who are counting), Apple still hasn’t approved it.

Eventually Joe got so fed up with waiting, and with seeing an average rating of two-and-a-half out of five stars, that he wrote a vitriolic blog post [WARNING: LANGUAGE NOT SAFE FOR WORK (or for anyone with delicate sensibilities)] blaming Apple for his woes.

That garnered the attention of Business Insider who then published an article about the whole mess.

Predictably, reactions in the comments called out Joe Stump for releasing crappy software.

I should mention here that I don’t know Joe. I don’t know anything about how he develops software. I think that there’s some delightful irony in the name of his company: Crash Corp. But I doubt he actually intended to release software that crashes.

Anyway, Joe submitted a comment to the Business Insider article defending his company’s development practices:

We have about 50 beta testers and exhaustively test the application before pushing the binary. In addition to that the application has around 200 unit tests. The two problems were edge cases that effect [sic] only users who had nobody who were friends with the application installed.

I’m having a great deal of trouble with this defense.

Problem #1: Dismissing the Problems as “Edge Cases”

The problems “only” occur when users do not have any Facebook Friends with the application. But that’s not an aberrant corner case. This is a new application. As of the first release, no one has it yet. That means any given new user has a high probability of being the first user within a circle of friends. So this is the norm for the target audience.

Joe seems to think that it’s perfectly understandable that they didn’t find the bugs during development. But just because you didn’t think of a condition doesn’t make it an “edge case.” It might well mean that you didn’t think hard enough.

Problem #2: Thinking that “50 Beta Testers” and “200 Unit Tests” Constitutes Exhaustive Testing

Having beta testers and unit tests is a good and groovy thing. But it’s not sufficient, as this story shows. What appears to be missing is any kind of rigorous end-to-end testing.

Given an understanding of the application under development, a skilled tester would probably have identified “Number of Friends with Chess Wars Installed” as an interesting thing to vary during testing.

And since it’s a thing we can count, it’s natural to apply the 0-1-Many heuristic (as described on the Test Heuristics Cheat Sheet). So we end up testing 0-friends-with-app, 1-friend-with-app, and Many-friends-with-app.
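To make the heuristic concrete, here is a small sketch. The opponent_list function is made up for illustration and has nothing to do with the actual Chess Wars code; it stands for any code whose input is a countable collection. The point is only that the three cases are cheap to enumerate.

```ruby
# Made-up example function: builds the opponent picker for a new game.
# It must behave sensibly even when no friends have the app yet.
def opponent_list(friends_with_app)
  return ["Invite a friend", "Play a random opponent"] if friends_with_app.empty?

  friends_with_app.sort + ["Play a random opponent"]
end

# 0-1-Many: one case for each interesting size of the collection.
cases = {
  "zero friends" => [],
  "one friend"   => ["Ann"],
  "many friends" => ["Cy", "Ann", "Bo"]
}

cases.each do |label, friends|
  choices = opponent_list(friends)
  status = choices.empty? ? "FAIL" : "ok"
  puts "#{label}: #{status} #{choices.inspect}"
end
```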

So even the most cursory Exploratory Testing by someone with testing skill would have been likely to reveal the problem.

I’m not suggesting that Joe’s company needed to hire a tester. I am saying that someone on the implementation team should have taken a step back from the guts of the code long enough to think about how to test it. Having failed to do that, they experienced sufficiently severe quality problems to warrant not one but two patch releases.

Blaming Apple for being slow to release the second update feels to me like a cheap way of sidestepping responsibility for figuring out how to make software that works as advertised.

In short, Joe’s defense doesn’t hold water.

It’s not that I think Apple is justified in holding up the release. I have no idea what Apple’s side of the story is.

But what I really wanted to hear from Joe, as a highly visible representative of his company, is something less like “Apple sucks” and something much more like “Dang. We screwed up. Here’s what we learned…”

And I’d really like to think that maybe, just maybe, Joe’s company has learned something about testing and risk and about assuming that, just because 50 people haphazardly pound on your app for a while, it’s been “exhaustively” tested.

Jul 29 2009

I’ve been hinting about a new venture on Twitter, and it’s time to explain what’s going on.

I’m in the process of opening a new office. Or rather, my company, Quality Tree Software, Inc. is opening a new space in our current building in Pleasanton, CA.

It’s 1200 square feet of open-layout-Agile-goodness. When it’s done, it will be outfitted in the spirit of the best Agile organizations I’ve seen. It will be one big wide open workspace with lots of natural light. We’ll fill it with modular furniture that will be able to accommodate a variety of uses.

The space is still under construction. So you’ll have to use your imagination to envision the finished space. But trust me. It will be cool. It will look like a well-appointed team room. There will be big whiteboards. There will be a big visible CI monitor. There will be a library. There will be a story wall. There will be big visible charts. There will be desks suitable for pairing. There will be comfy chairs. There will be index cards.

My intent is to create a training space that offers participants an immersive Agile experience. Just as I’ve recommended that people visit Pivotal Labs in San Francisco or Atomic Object in Grand Rapids or Menlo Innovations in Ann Arbor, I hope that others will be inspired to recommend that their friends and colleagues visit our new space to see what an Agile space feels like.

And because having this space means we have our very own dedicated venue, we’ll be able to offer beta-level, not-quite-ready-for-primetime classes at significantly reduced rates. And we’ll be able to experiment freely.

I’m already talking with other training providers about classes they might want to do in the space. Our intent is to host offerings from all sorts of folks. It’s kinda like having a performance venue showcasing awesome trainers and facilitators who are aligned with our values.

In that spirit, the vision is to create far more than “just” a great training space. I also hope that the space can become a kind of community hub. I want it to become the kind of place that people look forward to visiting just, well, because. Because it feels good to be there. Because it reminds them of what a living breathing team space feels like. So we plan to host community events like OSTATLI in the space. And I hope that the space will foster a community of practice for Agile trainers where we can share experiences and material, and collaborate to create better classes.

There’s still a lot more work to be done before we’re ready for visitors. We’re currently targeting an October opening. But construction delays could push that date back. I’ll post updates here, and pictures, as things progress.

In the meantime, I hope you’ll consider visiting us when the space is finished!

May 26 2009

I think it’s important to define “Agile” when I talk about “Agile Testing.”

Agile is one of those capitalized umbrella terms, like Quality, that means many things to many people. And given that Agile Testing involves testing in an Agile context, it’s hard to talk about it if we have not established a shared understanding of the term “Agile.”

I define Agile in terms of results. Specifically, Agile teams:

  • Deliver a continuous stream of potentially shippable product increments
  • At a sustainable pace
  • While adapting to the changing needs and priorities of their organization


(Tip o’ the hat to the various sources that inspired my definition, including the APLN’s Declaration of Interdependence for the phrase “continuous flow of value”, Scrum for the phrase “potentially shippable product increment”, XP for the core practice of “Sustainable Pace”, and Jim Highsmith plus too many other people/sources to mention for the idea of adapting to changing needs.)

Teams that are consistently able to achieve those results typically exhibit the following characteristics:

  • A high degree of Communication and Collaboration.
  • Seeking and receiving fast Feedback.
  • Seeking mechanisms to support Visibility so everyone knows what’s going on at any given time.
  • A high degree of Alignment so everyone is working toward the same goals.
  • A shared Definition of Done that includes Implemented, Tested, and Explored before being Accepted by the Product Owner.
  • A relentless Focus on Value.

And teams that manifest these characteristics typically have adopted a combination of Agile management and engineering practices including:

  • Prioritized Backlog
  • Short Iterations (or Sprints)
  • Daily Stand-ups (or Scrums)
  • Integrated/Cross-Functional Team
  • Continuous Integration
  • Collective Code Ownership
  • Extensive Automated Tests
  • etc.

Too many people equate practices (e.g. Prioritized Backlog) and methods (e.g. Scrum) with Agile. But that’s backwards. Agile practices and methods increase the odds of achieving Agility, but they’re not a guarantee. The practices serve the desired outcome, not the other way around.

May 18 2009

Robert Small wrote me with a question (which he kindly gave me permission to post here, along with my answer):

My GUI developers are driving me nuts! They want to “fully automate” all testing for the GUI. I tried to explain that you cannot automate ease of use (usability) or look and feel and the like. They retort that I can’t give them a clear definition of usability due to the subjective nature of the topic. Advice?

My response:

I understand your frustration.

And I can also see that both you and the developers are right. I suspect you’re talking past each other. The problem is with the word “Test.” I think that you and the developers are both using the same word, but giving it two different meanings.

Let me explain…

First, translate “fully automate all testing for the GUI” as “automatically check that the GUI meets expectations.”

Expectations of a GUI may include: times when controls should be grayed out or invisible; circumstances under which a click should result in one behavior or another; interactions or affordances that should be consistent throughout the UI; or perhaps accessibility guidelines that are part of the acceptance criteria. We can automate tests for these kinds of concrete, explicit expectations.
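As an illustration of a concrete, explicit expectation, consider a rule that a Save button is enabled only when the form has unsaved changes and no validation errors. The FormState class below is invented for this sketch and uses no real GUI toolkit. An actual check would drive the application through whatever automation layer the team already has.

```ruby
# Invented model of a form; stands in for the real GUI under test.
FormState = Struct.new(:dirty, :errors) do
  # Explicit expectation: Save is available only for changed, valid forms.
  def save_enabled?
    dirty && errors.empty?
  end
end

# Each row: a form state plus the expected state of the Save button.
expectations = [
  [FormState.new(false, []),                false],
  [FormState.new(true,  []),                true],
  [FormState.new(true,  ["name is blank"]), false]
]

expectations.each do |form, expected|
  result = form.save_enabled? == expected ? "pass" : "FAIL"
  puts "#{result}: dirty=#{form.dirty} errors=#{form.errors.length}"
end
```

A table like this is a check. It says nothing about whether the form is pleasant to use.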

Then translate “test for ease of use, look & feel, etc.” as “explore the current implementation to discover how it feels when used in practice.”

Until we use the system, we can’t know how it will feel. We can guess. We can speculate. But we can’t know. Exploring the emerging system gives us insight into how well it meets over-arching, subjective, abstract quality goals like “easy to use.”

Checking and Exploring yield different kinds of information. Checking tells us how well an implementation meets explicit expectations. Exploring reveals the unintended consequences of meeting the explicitly defined expectations and gives us a way to uncover implicit expectations. (Systems can work exactly as specified and still represent a catastrophic failure, or PR nightmare. Just ask Facebook.)

As these translations show, you and the developers are talking about two different activities. They’re talking about Checking: verifying explicit, concrete expectations. You’re talking about Exploring: discovering the capabilities, limitations, and risks in the emerging system.

The developers are right: Checking can, and should, be automated.

And you’re right: Exploring is inherently a creative human-centric activity requiring keen observation and good judgment. We can use automation to support exploration, but we cannot automate the whole process of exploring.

Of course there is a relationship between Checking and Exploring: the information we discover when Exploring may yield new things that need Checking in the future.

However, the fact that the industry as a whole still lumps both Checking and Exploring under the more general term Testing results in disagreements like this, where two sets of people end up talking past each other because each is only seeing one side of the testing equation.

The bottom line is that the team as a whole needs the information, the feedback, afforded by both Checking and Exploring. Attempting to argue for one over the other, as though it’s an either-or choice, creates a false dilemma. The question is not which approach is right, but rather how to ensure we consistently do both.

(Oh, and by the way, this discussion around Checking and Exploring is related to the section I wrote on Exploratory Testing in The Art of Agile Development by James Shore and Shane Warden. I admit I’m biased, since I wrote a section, but I recommend the book.)

Apr 17 2009

I am so excited!

Thanks to the efforts of my assistant Melinda, who did an awesome job laying out the design, and our friends at CafePress who do the manufacturing and shipping, we now offer a mug with heuristics from the wildly popular Test Heuristics Cheat Sheet.

Cheat Mugs!

Order one for yourself and for your favorite test-obsessed colleagues, friends, and family members. You might also want to order a few for those people on your projects who you wish would be just a little more test-obsessed: they’ll get test ideas with every sip of their favorite beverage.

We make $1 with every mug ordered. So order yours today!