Author Archive | ehendrickson

Why Test Automation Costs Too Much

“We can’t automate everything. It costs too much.”

I’ve heard that concern—that test automation costs too much—multiple times when talking with folks in the middle of an Agile transition.

They know that automated regression testing is important in an Agile context. But they’re concerned about the costs. And they can’t even imagine getting to a point where all the regression tests are automated.

They’re right to be concerned. Their organizations have typically struggled just to reach a bare minimum of automated test coverage.

So why is that? Why have these organizations historically struggled with test automation?

Usually the organization’s test automation strategy involves a dedicated team creating automated tests after the code is written. At least that was probably their approach before the Agile transition. It’s the strategy promoted by most of the test automation tool vendors back in the 90’s.

(Some of those vendors even persist in promoting that kind of test-last automation strategy today. Unbelievable but true. Don’t even get me started. I’ll just rant. It won’t be pretty.)

But until the organization adopts a new test automation approach, they’re stuck with what they have. And the result of the traditional approach is that:

  1. The test automation is being written in a language or tool that is only accessible (or known) to a small number of specialists. That group of specialists is a bottleneck.
  2. To get anything done, that group of specialists has to reverse engineer the developed software to figure out how to shoehorn automation onto an untestable interface, typically a GUI. (Ow. Pain. Ow.) And they have to go through the GUI: the system quite probably has business logic intertwingled with the GUI layer. Even if it doesn’t, the test automation specialists probably don’t know how to reach into the code base to bypass the GUI.
  3. The specialists may be test automation experts, but they are usually not professional programmers. That means that while they can make test automation tools sing, dance, and jump through hoops, they usually have not studied program design principles (SOLID, patterns). Most test automation code I see that’s written by specialists is kinda smelly. (I don’t blame them. My code was pretty dang smelly too when I was a test automation specialist. Pairing with real professional developers did a world of good for my programming skills.)
  4. The previous generation of commercial specialized test automation tools enforced their own architecture, making it dang near impossible to apply software design principles even if you do understand them.
  5. The specialized commercial test automation tools usually cost an arm and a leg. (Need another license to execute the tests on the CI server? Fork over another 5 figures worth of licensing and maintenance fees.)

Bottom line: the reason test automation costs so much is that it’s done in a silo far removed from the development effort.

Buffered from the consequences of design decisions that decrease testability, the developers continue to create software that’s nigh onto impossible to automate.

And isolated from the technical expertise of how the software was constructed, the test automation specialists are in a situation where they cannot help but be both inefficient and ineffective.

Highly functioning Agile teams break down those silos. With the most effective Agile teams I’ve seen, everyone on the team is involved in automating the tests. And the automated tests go into the same shared source repository where the developers check in the code.

When we integrate the test automation effort with the development effort, we reduce the cost of test automation drastically. In doing so, we fundamentally change the cost/benefit tradeoff equation so that fully automated regression tests start looking like an achievable goal instead of an impossible pipe dream.

It’s time to stop imagining that test automation and programming are two separate activities done by different people with different job titles working in different departments.

That approach didn’t work particularly well with the V-Model, but it’s a flat out fail in an Agile context where programming and automating are part-and-parcel of the overall software development effort.


Look! An Update!

So what have I been up to for the last 7 months? There’s the usual stuff: a fair bit of client work, some conferences, and a whole lot of travel: Finland, Germany, Japan, and various locations in the US.

But the most exciting news is that Agilistry Studio is open! We’ve held a bunch of events there including an Open Source Test Automation Love In (OSTATLI), a week-long immersion dubbed “Agile Up to Here” (see writeups by Alan Cooper and Jon Bach), and a bunch of meetups. We have classes coming up there including an immersive Agile transformation simulation (WordCount). Please sign up!


Why I Define Agile in Terms of Results

Not everyone agrees with my definition of Agile.

Dave Nicolette commented that he thinks my definition actually describes Lean. He defines Agile in terms of the Agile Manifesto.

I replied to Dave elsewhere, but wanted to post my response here too since this is a topic that comes up frequently.

I have trouble defining Agile solely in terms of the Agile Manifesto.

Mind you, I believe the Agile Manifesto is a great document. With it, the original signatories gave the industry a focal point, a fulcrum on which to turn. They forever changed our industry by distilling the difference, in terms of their guiding values and principles, between the lightweight approaches they used and the then-generally accepted formal processes & industry “Best Practices.”

And turn we have. “Agile,” at least as a buzzword, is mainstream. Increasingly organizations are adopting Agile methods to remain competitive.

However, I see the manifesto as a beginning, not a definitive end. The community has learned much in the intervening years.

Part of what I’ve personally learned is that defining Agile in terms of results short-circuits two of the bigger problems I see plaguing the community: 1) religious debates and 2) superficial but ultimately hollow attempts at transitions that result in frAgile processes.

Where people define Agile in terms of practices, I see more instances of Cargo Cult adoption (“We’re Agile because we stand up every morning!”) and religious dogmatism (“You don’t TDD?!? You can’t possibly be Agile!”).

Where people define Agile in terms of values, I see more instances of Agile-as-an-excuse (“Documentation? No. We don’t document anything. We’re Agile!”).

But where people define Agile in terms of results, I see greater focus on the ultimate goal: value to the business.

As it happens, the best way we’ve found to achieve that goal so far—or at least the best way I’ve seen so far—involves embracing the values and principles of the manifesto.

But I believe that remembering why those values and principles are important, remembering the result we’re trying to achieve, is essential. And so I define Agile in terms of results.


New Look & Feel

I knew I should have updated my WordPress installation. But it seemed like I always had something more urgent on my plate. I’ve been living in Quadrant 1 for too long. Quadrant 2 tasks—those that are important but not (yet) urgent—have languished until, through the natural course of events, they became urgent. Like, oh say, when this blog got hacked because of a security vulnerability in the older version of WordPress.


In my rush, I blew away a few things accidentally. Like my blog roll. So I still have some work to do to get the site completely back to normal.

But in the meantime I think I’ve gotten all the content back. And I’ve migrated to a shiny, new Theme that supports threaded comments.

Please let me know if you find bugs.


My PNSQC Keynote with Agile Timeline

A few days ago, I tweeted that I was looking for nominations for events for an Agile timeline and am extremely grateful for all the responses I received.

The request was for the keynote talk that I just presented at PNSQC. I’ve had several requests for the timeline that resulted, so I figure the easiest (and therefore fastest) way to share the resulting timeline would be to share my slides.

Here they are (pdf, ~1Mb). Enjoy! (As always, comments/questions/critiques welcome.)


Specialized Test Management Systems are an Agile Impediment

If you work in an Agile organization and are using a heavyweight specialized tool for test management, I have an important message for you:

Stop. Seriously. Just stop. It’s getting in the way.

If you are accustomed to heavyweight test management solutions, you might not realize the extent to which a test management tool is more of an impediment than an aid to agility. But for Agile teams, it is. Always. Without exception.

I don’t make such claims lightly and I don’t expect you to accept my claims at face value. So let me explain.

The Agile Alternative to Test Management

The things you need to manage the test effort in an Agile context are whatever you are already using for the Backlog, the Source Control Management (SCM) system, the Continuous Integration (CI) system, and the Automated Regression Tests.

That’s it. You don’t need any other tools or tracking mechanisms.

Any test-specific repository will increase duplication and add unnecessary overhead to keep the duplicate data in sync across multiple repositories. It will also probably necessitate creating and managing cumbersome metadata, like traceability matrices, to tie all the repositories together.

All that overhead comes at a high cost and adds absolutely no value beyond what SCM, CI, & the Backlog already provide.

But, But, But…

I’ve heard any number of objections to the notion that Agile teams don’t need specialized test management systems. I’ll tackle the objections I hear most often here:

But Where Do the Tests Live?
Persistent test-related artifacts go in one of two places:

  • High-level acceptance criteria, test ideas, and Exploratory Testing charters belong in the Backlog with the associated Story.
  • Technical artifacts including test automation and manual regression test scripts (if any) belong in the Source Control System versioned with the associated code.

And Where Do We Capture the Testing Estimates?
In Agile, we ultimately care about Done Stories. Coded but not Tested means Not Done. Thus the test effort has to be estimated as part of the overall Story implementation effort if we are to have anything even remotely approaching accurate estimates. So we don’t estimate the test effort separately, and that means we don’t need a separate place to put test estimates.

How Do I Prioritize Tests?
Agile teams work from a prioritized backlog. Instead of prioritizing tests, they prioritize Stories. And Stories are either Done or not. Given that context, it does not make sense to talk about prioritizing the tests in isolation.

Hello, I Live in the Real World. There is Never Enough Time to Test. How Do I Prioritize Tests Given Time Pressure?
If the Story is important enough to code, it’s important enough to test. Period. If you’re working in an Agile context it is absolutely critical that everyone on the team understands this.

But Testing is Never Done. Seriously, How Do I Prioritize What To Test?
This isn’t really a test management problem. This is a requirements, quality, and testing problem that test management solutions offer the illusion of addressing.

The answer isn’t to waste time mucking about in a test management tool attempting to manage the effort, control the process, or prioritize tests. Every minute we spend mucking about in a test management tool is a minute we’re not spending on understanding the real state of the emerging system in development.

The answer instead is to invest the time in activities that contribute directly to moving the project forward: understanding the Product Owner’s expectations; capturing those expectations in automated acceptance tests; and using time-boxed Exploratory Testing sessions to reveal risks and vulnerabilities.

What about the Test Reports?
Traditional test management systems provide all kinds of reports: pass/fail statistics, execution time actuals v. estimated, planned v. executed tests, etc. Much of this information is irrelevant in an Agile context.

The CI system provides the information that remains relevant: the automated test execution results. And those results should be 100% Green (passed) most of the time.

What about Historical Test Results Data?
Most teams find that the current CI reports are more interesting than the historic results. If the CI build goes Red for any reason, Agile teams stop and fix it. Thus Agile teams don’t have the same kind of progression of pass/fail ratios that traditional teams see during a synch and stabilize phase. And that means historic trends usually are not all that interesting.

However, if the team really wants to keep historic test execution results (or is compelled to do so as a matter of regulatory compliance), the test results can be stored in the source control system with the code.

Speaking of Regulatory Compliance, How Can We Be in Compliance without a Test Management System?
If your context involves FDA, SOX, ISO, or just internal audit compliance, then you probably live in a world where:

  • If it wasn’t documented, it didn’t happen
  • We say what we do and do what we say
  • Test repeatability is essential

In that context, specialized test management solutions may be the de facto standard, but they’re not the best answer. If I’m working on a system where we have to be clear, concrete, and explicit about requirements, tests, and execution results, then I would much rather do Acceptance Test Driven Development. ATDD provides the added value of executable requirements. Instead of the tests and requirements just saying what the system should do, they can be executed to demonstrate that it does.

Certainly, doing ATDD requires effort. But so does maintaining a separate test management system and all the corresponding traceability matrices and overhead documentation.

Our Management Requires Us to Use a Specialized Test Management System. Now What?
Send them the URL to this post. Ask them to read it. Then ask them what additional value they’re getting out of a test management system that they wouldn’t get from leveraging SCM, CI, the Backlog, and the automated regression tests.

So, have I convinced you? If not, please tell me why in the comments…


Adventures with Auto-Generated Tests and RSpec

This post started out as a quick little entry about a cool parlor trick you can do with RSpec to make it work for auto-generated test data. But in the middle of writing what was supposed to be a simple post, my tests found a subtle bug with bad consequences. (Yay for tests!)

So now this post is about auto-generated tests with RSpec, and what I learned hunting down my bug.

Meet RSpec

In case you haven’t encountered RSpec before, it’s one of the Behavior Driven Development developer test frameworks along with JBehave, EasyB, and others.

Each RSpec test looks something like this:

  it "should be able to greet the world" do
      greet.should equal("Hello, World!")
  end

I used RSpec to TDD a solution to a slider puzzle code challenge posted on the DailyWTF.

Auto-Generating LOTS of Tests with RSpec

So let’s imagine that you’re testing something where it would be really handy to auto-generate a bunch of test cases.

In my particular case, I wanted to test my slider-puzzle example against a wide range of starting puzzle configurations.

My code takes an array representing the starting values in a 3×3 slider puzzle and, following the rules of the slider puzzle, attempts to solve it. I knew that my code would solve the puzzle sometimes, but not always. I wanted to see how often my little algorithm would work. And to test that, I wanted to pump it through a bunch of tests and give me pass/fail statistics.

I could write individual solution tests like this:

  it "should be able to solve a board" do
      @puzzle.load([1, 2, 3, 4, 5, 6, 8, 7, nil])
      @puzzle.solved?.should be_true
  end

But with 362,880 possible permutations of the starting board, I most certainly was NOT going to hand code all those tests. I hand coded a few in my developer tests. But I wanted more tests. Lots more.

I knew that I could generate all the board permutations. But then what? Out of the box, RSpec isn’t designed to do data driven testing.

It occurred to me that I should try putting the “it” into a loop. So I tried a tiny experiment:

  require 'rubygems'
  require 'spec'

  describe "data driven testing with rspec" do

      10.times { | count |
          it "should work on try #{count}" do
              # purposely fail to see test names
              true.should be_false
          end
      }
  end


Lo and behold, it worked!

I was then able to write a little “permute” function that took an array and generated all the permutations of the elements in the array. And then I instantiated a new test for each:

  describe "puzzle solve algorithm" do
      permutations = permute([1,2,3,4,5,6,7,8,nil])

      before(:each) do
        @puzzle = Puzzle.new
      end

      permutations.each{ |board|
          it "should be able to solve [#{board}]" do
              @puzzle.load(board)
              @puzzle.solved?.should be_true
          end
      }
  end

Coming to my senses, I quickly realized that it would take a long, long time to run through all 362,880 permutations. So I adjusted, changing the loop to just take 1000 of the permutations:

  permutations[0..999].each{ |board|
      it "should be able to solve [#{board}]" do
          @puzzle.load(board)
          @puzzle.solved?.should be_true
      end
  }
That returned in about 20 seconds. Encouraged, I tried it with 5000 permutations. That took about 90 seconds. I decided to push my luck with 10,000 permutations. That stalled out. I backed it down to 5200 permutations. That returned in a little over 90 seconds. I cranked it up to 6000 permutations. Stalled again.

I thought it might be some kind of limitation with RSpec, and I was content to keep my test runs to a sample of about 5000. But I decided that sampling the first 5000 generated boards every time wasn’t that interesting. So I wrote a little more code to randomly pick the sample.
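(The random-sampling code isn’t shown either. A minimal sketch of the idea, assuming modern Ruby’s Array#sample—the original may have shuffled indices by hand:)

```ruby
# Pick a random subset of boards to run (a sketch, not the post's actual
# code). Array#sample(n) returns n distinct elements chosen at random
# from the array.
def random_boards(boards, size)
  boards.sample(size)
end

sample = random_boards((1..100_000).to_a, 5000)
puts sample.length  # => 5000
```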

My tests started hanging again.

My Tests Found a Bug! (But I Didn’t Believe It at First.)

Curious about why my tests would be hanging, I decided to pick a sample out of the middle of the generated boards by calling:


The tests hung. I chose a different sample:


No hang.

I experimented with a variety of values and found that there was a correlation: the higher the starting number for my sample, the longer the tests seemed to take.

“That’s just nuts,” I thought. “It makes no sense. But…maybe…”

In desperation, I texted my friend Glen.

I was hoping that Glen would say, “Yeah, that makes sense because [some deep arcane thing].” (Glen knows lots of deep arcane things.) Alas, he gently (but relentlessly) pushed me to try a variety of other experiments to eliminate RSpec as a cause. Sure enough, after a few experiments I figured out that my code was falling into an infinite loop.

Once I recognized that it was my code at fault, it didn’t take long to isolate the bug to a specific condition that I had not previously checked. I added the missing low-level test and discovered the root cause of the infinite loop.

It turns out that my code had two similarly-named variables, and I’d used one when I meant the other. The result was diabolically subtle: in most situations, the puzzle solving code arrived at the same outcome it would have otherwise, just in a more roundabout way. But in a few specific situations the code ended up in an infinite loop. (And in fixing the bug, I eliminated one of the two confusing variables to make sure I wouldn’t make the same mistake again.)
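To illustrate that class of bug (this is a hypothetical sketch, not the actual puzzle code): with two similar names in scope, it’s easy to end up with a loop whose exit condition tests the wrong variable, one that never changes inside the loop.

```ruby
# Hypothetical illustration of the bug class: two similar names, say
# `tiles` (the whole board) and `tile` (a single value), invite a slip.
def first_blank_index(tiles)
  index = 0
  index += 1 until tiles[index].nil?  # correct: walks the board
  # the buggy variant effectively tested something like `tile.nil?`,
  # a value that never changed inside the loop, so it never terminated
  index
end

puts first_blank_index([1, 2, nil, 3])  # => 2
```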

I never would have found that bug if I hadn’t been running my code through its paces with a large sample of the various input permutations. So it seems fitting that, while writing about the mechanics of auto-generating tests with RSpec, I discovered a bug that demonstrates the value of high-volume auto-generated tests.

In the meantime, if you would like to play with my slider puzzle sample code and tests, I’ve released it under Creative Commons license and posted it on github. Enjoy! (I’m not planning to do much more with the sample code myself, and can’t promise to provide support on it. But I’ll do my best to answer questions. Oh, and yes, it really could use some refactoring. Seriously. A bazillion methods all on one class. Ick. But I’m publishing it anyway because I think it’s a handy example.)


Not Exhaustively Tested

It sounds like Joe Stump is having a bad time of it right now.

Joe Stump, formerly of Digg, left Digg to co-found a Mobile games company. They released the first of their games, Chess Wars, in late June.

Soon after, new players found serious problems that prevented them from playing the game. In response, the company re-submitted a new binary to Apple in July. As of this writing, the current version of Chess Wars is 1.1.

The trouble started with patch release #2. Apparently, even six weeks after Joe’s company submitted the new binary (release number 3 for those who are counting), Apple still hasn’t approved it.

Eventually Joe got so fed up with waiting, and with seeing an average rating of two-and-a-half out of five stars, that he wrote a vitriolic blog post [WARNING: LANGUAGE NOT SAFE FOR WORK (or for anyone with delicate sensibilities)] blaming Apple for his woes.

That garnered the attention of Business Insider who then published an article about the whole mess.

Predictably, reactions in the comments called out Joe Stump for releasing crappy software.

I should mention here that I don’t know Joe. I don’t know anything about how he develops software. I think that there’s some delightful irony in the name of his company: Crash Corp. But I doubt he actually intended to release software that crashes.

Anyway, Joe submitted a comment to the Business Insider article defending his company’s development practices:

We have about 50 beta testers and exhaustively test the application before pushing the binary. In addition to that the application has around 200 unit tests. The two problems were edge cases that effect [sic] only users who had nobody who were friends with the application installed.

I’m having a great deal of trouble with this defense.

Problem #1: Dismissing the Problems as “Edge Cases”

The problems “only” occur when users do not have any Facebook Friends with the application. But that’s not an aberrant corner case. This is a new application. As of the first release, no one has it yet. That means any given new user has a high probability of being the first user within a circle of friends. So this is the norm for the target audience.

Joe seems to think that it’s perfectly understandable that they didn’t find the bugs during development. But just because you didn’t think of a condition doesn’t make it an “edge case.” It might well mean that you didn’t think hard enough.

Problem #2: Thinking that “50 Beta Testers” and “200 Unit Tests” Constitutes Exhaustive Testing

Having beta testers and unit tests is a good and groovy thing. But it’s not sufficient, as this story shows. What appears to be missing is any kind of rigorous end-to-end testing.

Given an understanding of the application under development, a skilled tester would probably have identified “Number of Friends with Chess Wars Installed” as an interesting thing to vary during testing.

And since it’s a thing we can count, it’s natural to apply the 0-1-Many heuristic (as described on the Test Heuristics Cheat Sheet). So we end up testing 0-friends-with-app, 1-friend-with-app, and Many-friends-with-app.
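The heuristic is easy to mechanize. A sketch (the scenario name here is illustrative; Chess Wars’ real code is not public):

```ruby
# The 0-1-Many heuristic as data: for any countable test dimension,
# exercise zero, one, and "many" (any value comfortably greater than one).
def zero_one_many(many = 5)
  [0, 1, many]
end

zero_one_many.each do |friends_with_app|
  # Hypothetical scenario: a new player starts a game while N of their
  # friends already have the app installed.
  puts "test: start a game with #{friends_with_app} friend(s) having the app"
end
```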

So even the most cursory Exploratory Testing by someone with testing skill would have been likely to reveal the problem.

I’m not suggesting that Joe’s company needed to hire a tester. I am saying that someone on the implementation team should have taken a step back from the guts of the code long enough to think about how to test it. Having failed to do that, they experienced sufficiently severe quality problems to warrant not one but two patch releases.

Blaming Apple for being slow to release the second update feels to me like a cheap way of sidestepping responsibility for figuring out how to make software that works as advertised.

In short, Joe’s defense doesn’t hold water.

It’s not that I think Apple is justified in holding up the release. I have no idea what Apple’s side of the story is.

But what I really wanted to hear from Joe, as a highly visible representative of his company, is something less like “Apple sucks” and something much more like “Dang. We screwed up. Here’s what we learned…”

And I’d really like to think that maybe, just maybe, Joe’s company has learned something about testing and risk, and about assuming that just because 50 people haphazardly pound on your app for a while, it’s been “exhaustively” tested.


Creating an Immersive Agile Training Space

I’ve been hinting about a new venture on Twitter, and it’s time to explain what’s going on.

I’m in the process of opening a new office. Or rather, my company, Quality Tree Software, Inc. is opening a new space in our current building in Pleasanton, CA.

It’s 1200 square feet of open-layout-Agile-goodness. When it’s done, it will be outfitted in the spirit of the best Agile organizations I’ve seen. It will be one big wide open workspace with lots of natural light. We’ll fill it with modular furniture that will be able to accommodate a variety of uses.

The space is still under construction. So you’ll have to use your imagination to envision the finished space. But trust me. It will be cool. It will look like a well-appointed team room. There will be big whiteboards. There will be a big visible CI monitor. There will be a library. There will be a story wall. There will be big visible charts. There will be desks suitable for pairing. There will be comfy chairs. There will be index cards.

My intent is to create a training space that offers participants an immersive Agile experience. Just as I’ve recommended that people visit Pivotal Labs in San Francisco or Atomic Object in Grand Rapids or Menlo Innovations in Ann Arbor, I hope that others will be inspired to recommend that their friends and colleagues visit our new space to see what an Agile space feels like.

And because having this space means we have our very own dedicated venue, we’ll be able to offer beta-level, not-quite-ready-for-primetime classes at significantly reduced rates. And we’ll be able to experiment freely.

I’m already talking with other training providers about classes they might want to do in the space. Our intent is to host offerings from all sorts of folks. It’s kinda like having a performance venue showcasing awesome trainers and facilitators who are aligned with our values.

In that spirit, the vision is to create far more than “just” a great training space. I also hope that the space can become a kind of community hub. I want it to become the kind of place that people look forward to visiting just, well, because. Because it feels good to be there. Because it reminds them of what a living breathing team space feels like. So we plan to host community events like OSTATLI in the space. And I hope that the space will foster a community of practice for Agile trainers where we can share experiences and material, and collaborate to create better classes.

There’s still a lot more work to be done before we’re ready for visitors. We’re currently targeting an October opening. But construction delays could push that date back. I’ll post updates here, and pictures, as things progress.

In the meantime, I hope you’ll consider visiting us when the space is finished!


Defining Agile: Results, Characteristics, Practices

I think it’s important to define “Agile” when I talk about “Agile Testing.”

Agile is one of those capitalized umbrella terms, like Quality, that means many things to many people. And given that Agile Testing involves testing in an Agile context, it’s hard to talk about it if we have not established a shared understanding of the term “Agile.”

I define Agile in terms of results. Specifically, Agile teams:

  • Deliver a continuous stream of potentially shippable product increments
  • At a sustainable pace
  • While adapting to the changing needs and priorities of their organization

(Tip o’ the hat due to various sources that inspired my definition, including the APLN’s Declaration of Interdependence for the phrase “continuous flow of value”, Scrum for the phrase “potentially shippable product increment”, XP for the core practice of “Sustainable Pace”, and Jim Highsmith plus too many other people/sources to mention for the idea of adapting to changing needs.)

Teams that are consistently able to achieve those results typically exhibit the following characteristics:

  • A high degree of Communication and Collaboration.
  • Seeking and receiving fast Feedback.
  • Seeking mechanisms to support Visibility so everyone knows what’s going on at any given time.
  • A high degree of Alignment so everyone is working toward the same goals.
  • A shared Definition of Done that includes Implemented, Tested, and Explored before being Accepted by the Product Owner.
  • A relentless Focus on Value.

And teams that manifest these characteristics typically have adopted a combination of Agile management and engineering practices including:

  • Prioritized Backlog
  • Short Iterations (or Sprints)
  • Daily Stand-ups (or Scrums)
  • Integrated/Cross-Functional Team
  • Continuous Integration
  • Collective Code Ownership
  • Extensive Automated Tests
  • etc.

Too many people equate practices (e.g. Prioritized Backlog) and methods (e.g. Scrum) with Agile. But that’s backwards. Agile practices and methods increase the odds of achieving Agility, but they’re not a guarantee. The practices serve the desired outcome, not the other way around.
