Sep 04 2009

This post started out as a quick little entry about a cool parlor trick you can do with RSpec to make it work for auto-generated test data. But in the middle of writing what was supposed to be a simple post, my tests found a subtle bug with bad consequences. (Yay for tests!)

So now this post is about auto-generated tests with RSpec, and what I learned hunting down my bug.

Meet RSpec

In case you haven’t encountered RSpec before, it’s one of the Behavior Driven Development developer test frameworks along with JBehave, EasyB, and others.

Each RSpec test looks something like this:

  it "should be able to greet the world" do
      greet.should == "Hello, World!"
  end

I used RSpec to TDD a solution to a slider puzzle code challenge posted on the DailyWTF.

Auto-Generating LOTS of Tests with RSpec

So let’s imagine that you’re testing something where it would be really handy to auto-generate a bunch of test cases.

In my particular case, I wanted to test my slider-puzzle example against a wide range of starting puzzle configurations.

My code takes an array representing the starting values in a 3×3 slider puzzle and, following the rules of the slider puzzle, attempts to solve it. I knew that my code would solve the puzzle sometimes, but not always. I wanted to see how often my little algorithm would work. And to find out, I wanted to pump a bunch of boards through it and get pass/fail statistics.

I could write individual solution tests like this:

  it "should be able to solve a board" do
      @puzzle.load([1, 2, 3, 4, 5, 6, 8, 7, nil])
      @puzzle.solve
      @puzzle.solved?.should be_true
  end

But with 362,880 possible permutations of the starting board, I most certainly was NOT going to hand code all those tests. I hand coded a few in my developer tests. But I wanted more tests. Lots more.

I knew that I could generate all the board permutations. But then what? Out of the box, RSpec isn’t designed to do data driven testing.

It occurred to me that I should try putting the “it” into a loop. So I tried a tiny experiment:

  require 'rubygems'
  require 'spec'

  describe "data driven testing with rspec" do

      10.times { | count |
          it "should work on try #{count}" do
              # purposely fail to see test names
              true.should be_false
          end
      }

  end

Lo and behold, it worked!

I was able then to write a little “permute” function that took an array and generated all the permutations of the elements in the array. And then I instantiated a new test for each:

  describe "puzzle solve algorithm" do
      permutations = permute([1,2,3,4,5,6,7,8,nil])

      before(:each) do
        @puzzle = Puzzle.new
      end

      permutations.each{ |board|
          it "should be able to solve [#{board}]" do
              @puzzle.load(board)
              @puzzle.solve
              @puzzle.solved?.should be_true
          end
      }
  end
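The "permute" function itself isn't shown here. A minimal sketch of what such a helper could look like, assuming a plain recursive approach (this is a stand-in for illustration, not necessarily the code in the repository):

```ruby
# Hypothetical stand-in for the post's `permute` helper.
# Returns an array containing every ordering of `items`.
def permute(items)
  return [items] if items.size <= 1

  result = []
  items.each_with_index do |item, index|
    # Everything except the element at `index`
    rest = items[0...index] + items[(index + 1)..-1]
    # Put `item` in front of each ordering of the remaining elements
    permute(rest).each { |tail| result << [item] + tail }
  end
  result
end

boards = permute([1, 2, 3, 4, 5, 6, 7, 8, nil])
puts boards.size  # 9! = 362880
```

Ruby 1.8.7 and later also ship a built-in `Array#permutation`, so `[1, 2, 3].permutation.to_a` would do the same job without a hand-written helper.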

Sampling

Coming to my senses, I quickly realized that it would take a long, long time to run through all 362,880 permutations. So I adjusted, changing the loop to just take 1000 of the permutations:

  permutations[0..999].each{ |board|
      it "should be able to solve [#{board}]" do
          @puzzle.load(board)
          @puzzle.solve
          @puzzle.solved?.should be_true
      end
  }

That returned in about 20 seconds. Encouraged, I tried it with 5000 permutations. That took about 90 seconds. I decided to push my luck with 10,000 permutations. That stalled out. I backed it down to 5200 permutations. That returned in a little over 90 seconds. I cranked it up to 6000 permutations. Stalled again.

I thought it might be some kind of limitation in RSpec, and I was content to keep my test runs to a sample of about 5000. But I decided that sampling the first 5000 generated boards every time wasn’t that interesting. So I wrote a little more code to randomly pick the sample.
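The sampling code isn't shown in the post. One way to pick a random sample is sketched below, assuming a newer Ruby (1.9.3 or later) where `Array#sample` accepts a `random:` option. The `random_sample` name, the seed value, and the small stand-in board list are made up for illustration:

```ruby
# Hypothetical helper: pick `count` distinct boards at random.
# A fixed seed makes the same sample come back on every run.
def random_sample(boards, count, seed)
  boards.sample(count, random: Random.new(seed))
end

# Stand-in data; in the post this would be the full list of generated boards.
permutations = [1, 2, 3, 4, 5].permutation.to_a

sample = random_sample(permutations, 10, 20090904)
puts sample.size  # 10
```

Passing a fixed seed keeps the sample repeatable, so a board that fails or hangs can be run again.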

My tests started hanging again.

My Tests Found a Bug! (But I Didn’t Believe It at First.)

Curious about why my tests would be hanging, I decided to pick a sample out of the middle of the generated boards by calling:

  permutations[90000..90999]

The tests hung. I chose a different sample:

  permutations[10000..10999]

No hang.

I experimented with a variety of values and found that there was a correlation: the higher the starting number for my sample, the longer the tests seemed to take.

“That’s just nuts,” I thought. “It makes no sense. But…maybe…”

In desperation, I texted my friend Glen.

I was hoping that Glen would say, “Yeah, that makes sense because [some deep arcane thing].” (Glen knows lots of deep arcane things.) Alas, he gently (but relentlessly) pushed me to try a variety of other experiments to eliminate RSpec as a cause. Sure enough, after a few experiments I figured out that my code was falling into an infinite loop.
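One experiment of that kind is to call the solver directly, with no RSpec in the picture, under a time limit. A minimal sketch, assuming the standard library's `Timeout` module; `LoopingPuzzle` and `solve_with_limit` are made-up stand-ins (the real Puzzle class is not reproduced here), and the stand-in deliberately never finishes:

```ruby
require 'timeout'

# Hypothetical stand-in for a solver with the bug: solve never returns.
class LoopingPuzzle
  def load(board)
    @board = board
  end

  def solve
    loop { sleep 0.01 }
  end
end

# Returns :finished if solve returns within `seconds`, :hung otherwise.
def solve_with_limit(puzzle, board, seconds)
  puzzle.load(board)
  Timeout.timeout(seconds) { puzzle.solve }
  :finished
rescue Timeout::Error
  :hung
end

puts solve_with_limit(LoopingPuzzle.new, [1, 2, 3, 4, 5, 6, 8, 7, nil], 0.2)
```

If a board still comes back as hung when RSpec is not involved at all, the test framework is off the hook and the solver is the suspect.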

Once I recognized that it was my code at fault, it didn’t take long to isolate the bug to a specific condition that I had not previously checked. I added the missing low-level test and discovered the root cause of the infinite loop.

It turns out that my code had two similarly-named variables, and I’d used one when I meant the other. The result was diabolically subtle: in most situations, the puzzle solving code arrived at the same outcome it would have otherwise, just in a more roundabout way. But in a few specific situations the code ended up in an infinite loop. (And in fixing the bug, I eliminated one of the two confusing variables to make sure I wouldn’t make the same mistake again.)

I never would have found that bug if I hadn’t been running my code through its paces with a large sample of the various input permutations. So I think it’s appropriate to have discovered the bug, thus demonstrating the value of high-volume auto-generated tests, while writing about the mechanics of auto-generating tests with RSpec.

In the meantime, if you would like to play with my slider puzzle sample code and tests, I’ve released it under a Creative Commons license and posted it on GitHub. Enjoy! (I’m not planning to do much more with the sample code myself, and can’t promise to provide support on it. But I’ll do my best to answer questions. Oh, and yes, it really could use some refactoring. Seriously. A bazillion methods all on one class. Ick. But I’m publishing it anyway because I think it’s a handy example.)

Feb 15 2007

A quick update on the Bay Area TD-DT Summit I mentioned in a previous post.

The response has been overwhelming. So overwhelming that I discovered today that I hadn’t responded to everyone who sent a request for an invitation. I think I’m caught up now though. So if you already sent a request for an invitation and haven’t heard a peep from either me or Chris McMahon, please send it again. I’ll do the appropriate grovelling for having misplaced your request, and we’ll see if we can make room.

Speaking of making room, I should mention that at this point, I think we’re full.

Actually, we passed “full” a while ago. Our sponsor who is generously donating space outdid themselves by allowing us to expand the gathering. But there comes a point where we have to say, “this is big enough,” and “there will be other gatherings.”

So if you haven’t contacted us yet about the TD/DT summit, but meant to, don’t despair. This will be the first of many, I suspect. There’s a vibrant community of incredibly cool folks interested in how testers who code and developers who test can learn from each other and improve their craft. And I think we’re all just beginning to find each other.

Jan 19 2007

Jason Huggins has very kindly pointed me to two more places where Developer-Testers/Tester-Developers (DT/TD) hang out. Interestingly, both were in London.

  • Google hosted LTAC (the London Test Automation Conference) in September 2006. Antony Marcano mentioned LTAC on his blog. And the LTAC talks are available on Google Videos. And there’s a mail list.
  • ThoughtWorks sponsored CITCON (the Continuous Integration and Testing Conference), in Chicago and also in London. I had trouble finding online content relating to the conference. The citconf.com site points back to the thoughtworks.com main site, and most of the blog entries I found said, “looks like it will be fun, I’m going/I might go/you should go.” But no one I found said, “I went, it was great, and here’s what happened…” I found pictures though. And there’s a mail list.

(What makes London the hotbed of DT/TD activity? Hmmm.)

Anyway…

Chris McMahon and I have been talking about pulling together a small peer conference in the SF Bay Area. And the more we talked, the more excited we got about it. So today we said: “let’s just do it and see what happens.” And we set a date. So…without further ado…I’m pleased to announce…

Bay Area Developer-Tester/Tester-Developer (DT/TD) Summit
Saturday, February 24
A peer-driven gathering of developer-testers and tester-developers to share knowledge and code.

Location: SF Bay Area, exact location TBD. If you have space you’re willing (and authorized) to lend us, we’d like to talk to you.

This is a small, peer-driven, non-commercial, invitation-only conference in the tradition of LAWST, AWTA, and the like. The content comes from the participants, and we expect all participants to take an active role. We’re seeking participants who are testers who code, or developers who test.

Our emphasis will be on good coding practices for testing, and good testing practices for automation. That might include topics like: test code and patterns; refactoring test code; creating abstract layers; programmatically analyzing/verifying large amounts of data; achieving repeatability with random tests; OO model-based tests; and/or automatically generating large amounts of test data.

These are just possible topics we might explore. The actual topics will depend on who comes and what experience reports/code they’re willing to share.

If we can get donated space, the cost to participate will be $0. If we can’t get donated space the cost will be a nominal fee (~$50) intended to help us defray expenses.

Participants will be responsible for their own travel expenses.

Proposed Agenda:

  • Timeboxed group discussions: “Essential attributes of a tester-developer and developer-tester (differences and similarities)” and “What tester-developers want to learn from developers; what developer-testers want to learn from testers.”
  • Code Examples/Experience Reports (we figure we have time for 3 of these)
  • End of day discussion: Raising visibility for the role of a DT/TD, building community among practitioners

If you’re interested in participating, send me an email answering these questions:

  • Which are you: a tester who codes or a developer who tests?
  • How did you come to have that role?
  • What languages do you usually program tests in?
  • What do you hope to contribute to the Bay Area DT/TD summit? Do you have any code or examples that you’d like to share? (Please note that you should not share anything covered by a non-disclosure agreement.)
  • What do you hope to get out of the Bay Area DT/TD summit?

My own goals with doing this are:

  • Learn from others better ways of programming automated tests
  • Meet others in the DT/TD role
  • Build community
Jan 17 2007

I spent this last weekend at AWTA, the Austin Workshop on Test Automation. Our official topic was Open Source Testing Frameworks. Before the meeting, I figured we’d discuss experience reports about how folks used Watir/Selenium/xUnit/CruiseControl/etc. and Ruby/Perl/Python/etc. to cobble together totally automated, lights out, acceptance testing solutions.

We did discuss topics like that. Bob Cotton showed us some very cool code integrating SeleniumRC and rspec. Jeff Fry showed some excellent Ruby code that generated tests on the fly. Some folks got together and spiked an integration of Selenium and Watir. (Brian Marick dubbed it “Mineral Watir.” Oooh. Bubbly.) Paul Rogers and David Crosby both spent time pairing with me to improve my Web 2.0 testing example, digging into JSUnit, Selenium, and the unittest.js library from Prototype. (Yes, I’ll get around to posting code sometime soon.)

And we did talk about how any given Open Source test automation approach inevitably involves integrating multiple pieces.

But what I didn’t expect to happen is that the discussion would turn to how developers are becoming testers and testers are becoming developers. In retrospect, I should have known it would happen. The subject is bound to come up when a bunch of developers who are passionate about testing hang out with a bunch of testers who spend a lot of their time writing code.

And it seems we weren’t the only ones thinking about this topic. Chris McMahon kindly alerted me to Steve Rowe’s excellent post about Test Developers.

So this has me thinking.

Tester-developers and developer-testers aren’t new. But we seem to be a hidden sub-community. We sometimes have titles like “Tool Smith,” but more often we have titles that look like most other tester or developer titles: “Senior Test Engineer” or somesuch.

But the role is growing.

I think that’s partly because large, successful companies are leading the way. Microsoft has a special job class and title: “Software Development Engineer in Test.” And I have heard unsubstantiated rumors that interviewing for a test position with Google involves answering questions about algorithm design.

I also think Agile is contributing to the growth in the role. Agile teams tend to blur the distinction between jobs, particularly developers and testers. Agile developers are test infected, and Agile testers tend to spend time mucking about in code. As Agile grows, so the hyphenated tester-developer/developer-tester role grows.

So, if the number of people doing both test and development is growing, where are we gathering as a community?

One gathering place is online communities. The agile-testing mail list is chock full of folks who both test and develop.

Another natural place for a community to gather is at a conference. AWTA was a gathering place on a small scale. But when I look at the major conferences, they’re pitched to Developers (SDWest), Testers (STAR), or Everyone (Agile, Better Software).

I want more opportunities to share experiences, write code together, and learn from each other.

So, if you’re a tester-developer/developer-tester, where do you go to share test code, talk about test refactoring strategies, or exchange notes on test frameworks and scripting languages?
Oh, and if you have a blog related to ( testing && development ), please post a link in the comments!