This post started out as a quick little entry about a cool parlor trick you can do with RSpec to make it work for auto-generated test data. But in the middle of writing what was supposed to be a simple post, my tests found a subtle bug with bad consequences. (Yay for tests!)
So now this post is about auto-generated tests with RSpec, and what I learned hunting down my bug.
Meet RSpec
In case you haven’t encountered RSpec before, it’s one of the Behavior Driven Development developer test frameworks along with JBehave, EasyB, and others.
Each RSpec test looks something like this:
it "should be able to greet the world" do
  greet.should equal("Hello, World!")
end
I used RSpec to TDD a solution to a slider puzzle code challenge posted on the DailyWTF.
Auto-Generating LOTS of Tests with RSpec
So let’s imagine that you’re testing something where it would be really handy to auto-generate a bunch of test cases.
In my particular case, I wanted to test my slider-puzzle example against a wide range of starting puzzle configurations.
My code takes an array representing the starting values in a 3×3 slider puzzle and, following the rules of the slider puzzle, attempts to solve it. I knew that my code would solve the puzzle sometimes, but not always. I wanted to see how often my little algorithm would work. And to test that, I wanted to pump it through a bunch of tests and get pass/fail statistics.
I could write individual solution tests like this:
it "should be able to solve a board" do
  @puzzle.load([1, 2, 3, 4, 5, 6, 8, 7, nil])
  @puzzle.solve
  @puzzle.solved?.should be_true
end
But with 362,880 possible permutations of the starting board, I most certainly was NOT going to hand code all those tests. I hand coded a few in my developer tests. But I wanted more tests. Lots more.
I knew that I could generate all the board permutations. But then what? Out of the box, RSpec isn’t designed to do data driven testing.
It occurred to me that I should try putting the “it” into a loop. So I tried a tiny experiment:
require 'rubygems'
require 'spec'
describe "data driven testing with rspec" do
  10.times { |count|
    it "should work on try #{count}" do
      # purposely fail to see test names
      true.should be_false
    end
  }
end
Lo and behold, it worked!
I was able then to write a little “permute” function that took an array and generated all the permutations of the elements in the array. And then I instantiated a new test for each:
describe "puzzle solve algorithm" do
  permutations = permute([1,2,3,4,5,6,7,8,nil])

  before(:each) do
    @puzzle = Puzzle.new
  end

  permutations.each { |board|
    it "should be able to solve [#{board}]" do
      @puzzle.load(board)
      @puzzle.solve
      @puzzle.solved?.should be_true
    end
  }
end
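The “permute” function itself isn’t shown above, so here is a minimal sketch of what such a helper might look like. The body is an assumption for illustration, not the code from the slider puzzle project. Ruby 1.8.7 and later also ship a built-in Array#permutation that does the same job.

```ruby
# Hypothetical stand-in for the "permute" helper mentioned above.
# Returns every ordering of the elements in +items+ as an array of arrays.
def permute(items)
  return [items.dup] if items.length <= 1

  result = []
  items.each_index do |i|
    # Fix the element at position i, then permute everything that is left.
    rest = items[0...i] + items[(i + 1)..-1]
    permute(rest).each { |tail| result << [items[i]] + tail }
  end
  result
end

boards = permute([1, 2, 3])
puts boards.length   # 3! = 6 orderings
```

For the full 9-element board the same call produces 9! = 362,880 arrays, which is where the number above comes from.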
Sampling
Coming to my senses, I quickly realized that it would take a long, long time to run through all 362,880 permutations. So I adjusted, changing the loop to just take 1000 of the permutations:
permutations[0..999].each { |board|
  it "should be able to solve [#{board}]" do
    @puzzle.load(board)
    @puzzle.solve
    @puzzle.solved?.should be_true
  end
}
That returned in about 20 seconds. Encouraged, I tried it with 5000 permutations. That took about 90 seconds. I decided to push my luck with 10,000 permutations. That stalled out. I backed it down to 5200 permutations. That returned in a little over 90 seconds. I cranked it up to 6000 permutations. Stalled again.
I thought it might be some kind of limitation with RSpec, and I was content to keep my test runs to a sample of about 5000. But I decided that sampling the first 5000 generated boards every time wasn’t that interesting. So I wrote a little more code to randomly pick the sample.
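The sampling code isn’t shown here either. As a rough sketch, assuming a Ruby new enough to have Array#sample with a random: option (1.9.3 or later), random sampling might look like the following. The sample_boards name and the seed handling are my own illustration; printing the seed makes it possible to re-run the exact same sample when something odd happens.

```ruby
# Pick a random sample of boards that can be reproduced later.
# +seed+ is passed in explicitly so a surprising run can be repeated.
def sample_boards(boards, size, seed)
  boards.sample(size, random: Random.new(seed))
end

all_boards = [1, 2, 3, 4, 5, 6, 7, 8, nil].permutation.to_a
seed = Random.new_seed
sample = sample_boards(all_boards, 1000, seed)
puts "sampled #{sample.length} of #{all_boards.length} boards (seed #{seed})"
```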
My tests started hanging again.
My Tests Found a Bug! (But I Didn’t Believe It at First.)
Curious about why my tests would be hanging, I decided to pick a sample out of the middle of the generated boards by calling:
permutations[90000..90999]
The tests hung. I chose a different sample:
permutations[10000..10999]
No hang.
I experimented with a variety of values and found that there was a correlation: the higher the starting number for my sample, the longer the tests seemed to take.
“That’s just nuts,” I thought. “It makes no sense. But…maybe…”
In desperation, I texted my friend Glen.
I was hoping that Glen would say, “Yeah, that makes sense because [some deep arcane thing].” (Glen knows lots of deep arcane things.) Alas, he gently (but relentlessly) pushed me to try a variety of other experiments to eliminate RSpec as a cause. Sure enough, after a few experiments I figured out that my code was falling into an infinite loop.
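One experiment along these lines, sketched here as an assumption rather than as what I actually ran, is to put a time limit around the call under test so that a runaway loop shows up as a result instead of a hung test run. Ruby’s standard Timeout module can do this. It can interrupt code at awkward moments, so I’d treat it as a diagnostic aid; a step counter inside the solver would be another option.

```ruby
require 'timeout'

# Run the block, but give up after +seconds+ so that a runaway loop
# produces :timed_out instead of hanging the whole test run.
def run_with_limit(seconds)
  Timeout.timeout(seconds) { yield }
rescue Timeout::Error
  :timed_out
end

quick = run_with_limit(1) { 2 + 2 }
stuck = run_with_limit(0.2) { loop { } }   # stands in for a solver that never finishes
puts quick.inspect
puts stuck.inspect
```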
Once I recognized that it was my code at fault, it didn’t take long to isolate the bug to a specific condition that I had not previously checked. I added the missing low-level test and discovered the root cause of the infinite loop.
It turns out that my code had two similarly-named variables, and I’d used one when I meant the other. The result was diabolically subtle: in most situations, the puzzle solving code arrived at the same outcome it would have otherwise, just in a more roundabout way. But in a few specific situations the code ended up in an infinite loop. (And in fixing the bug, I eliminated one of the two confusing variables to make sure I wouldn’t make the same mistake again.)
I never would have found that bug if I hadn’t been running my code through its paces with a large sample of the various input permutations. So I think it’s appropriate to have discovered the bug, thus demonstrating the value of high-volume auto-generated tests, while writing about the mechanics of auto-generating tests with RSpec.
In the meantime, if you would like to play with my slider puzzle sample code and tests, I’ve released it under a Creative Commons license and posted it on GitHub. Enjoy! (I’m not planning to do much more with the sample code myself, and can’t promise to provide support on it. But I’ll do my best to answer questions. Oh, and yes, it really could use some refactoring. Seriously. A bazillion methods all on one class. Ick. But I’m publishing it anyway because I think it’s a handy example.)
The subject of how much to automate, and the related topic of how to calculate the ROI for test automation, comes up on a regular basis. In fact, it popped up recently on a couple of the mailing lists I read.
Usually there’s at least one person arguing that test automation is expensive and that there are situations in which it just doesn’t make sense, so we should automate selectively. We should pick and choose, they say. We should automate wisely. We should automate only those tests where the investment is justified. The rest will stay manual, and that’s only sensible, they say.
I understand their concern.
In some contexts, particularly where there is a legacy code base that was created without automated tests, the cost to create and maintain each automated test is extraordinarily high.
Further, the value of those tests is often less than it could be.
The value in any test is in the information that it provides. But when many of the test failures are because the tests, and not the code, are wrong, the information provided by the whole suite of tests is deemed unreliable and untrustworthy. Information only has value to the extent that we can trust it.
Thus, the automated tests in that kind of context are both insanely expensive and low in value. Some years ago this was the norm. In many organizations, sadly, this is still the norm.
But it doesn’t have to be that way.
Successful Agile teams typically follow at least a subset of the XP development practices like TDD and Continuous Integration. Oh, sure, you can be Agile without doing TDD.
But teams that do practice TDD and ATDD wind up with large suites of automated tests as a side effect. (Yes, it’s a side effect. Contrary to what some people think, TDD is a design technique, not a testing technique. We do TDD because it leads to clean, well-factored, malleable, testable code. The automated tests are just a nice fringe benefit.)
Moreover, the resulting set of automated tests is so much more valuable because the test results are typically reliable and trustworthy. When the tests pass, we know it means that the code still meets our expectations. And when the tests fail, we’re genuinely surprised. We know it means there’s something that broke recently and we need to stop and fix it.
And we get that information frequently. Developers writing code execute the unit tests every few minutes, and are thus able to tell almost immediately when they’ve broken something that used to work. Similarly, the Continuous Integration system executes the unit and acceptance regression tests with every code check in. All those automated tests result in incredibly fast feedback.
Anyone who has taken an economics or business class is probably aware of the time value of money. The idea behind Net Present Value is that money in your hand today is worth more than that same amount of money in your hand tomorrow.
The same is true of information. Information sooner is worth more than the same information later. Automated tests executed through a continuous integration system give us a lot of information fast, and that has enormous value.
So Agile teams that have solid engineering practices in place typically find that each incremental test costs very little to create and maintain, and the value of those tests is huge because the information is reliable and it’s delivered so quickly. In such a context it no longer makes sense to debate the ROI for a single given test. In the time we have the debate, we could just write the test.
People who are accustomed to living in the first context, where automation is hard to create, painful and costly to maintain, and doesn’t offer all that much value, often find it hard to imagine such a context. From their perspective, ROI is so very uncertain because the price is high and the value is low.
But instead of refining our ROI arguments, I suggest changing the equation. Adopt development practices that lower the cost of automation and increase the value so much that we just don’t ever have to argue about whether or not a given test is worth automating.
Are you an open source test automation tool aficionado? I am. That’s why I’m organizing OSTATLI, a small, non-commercial, invitation-only gathering next Thursday in my office in Pleasanton to provide an opportunity for us to express our mutual love of open source test automation tools.
Participants will bring their laptops, loaded with their favorite tools, and we’ll spend a day messing around with various tools to see what each one can do. I’ll provide beverages and wifi.
Want to come play? My office is small, so participation is limited. But getting an invite is easy: just email me.
At the AA-FTT workshop last October, I did this lightning talk titled “A Place to Put Things.”
In it, I propose standardizing on places to put different kinds of information associated with automated functional tests.
It seems to me that one of the key success factors for the xUnit family of unit testing frameworks is that they gave us just 5 places to put code related to unit tests: setup, test, teardown, suite setup, suite teardown. That simple organization has a powerful focusing effect, enabling (or, perhaps, forcing) developers to narrow their attention down to just the code needed to create one little itty bitty unit test at a time.
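To make the five places concrete, here is a toy runner in plain Ruby. It is not a real xUnit library, and the class and method names are made up for illustration. It only shows where each kind of code goes and the order in which a framework would call it.

```ruby
# A toy runner (not a real xUnit library) showing the five slots
# and the order in which a framework would call them.
class TinySuite
  attr_reader :log

  def initialize
    @log = []
  end

  def suite_setup;    @log << :suite_setup;    end  # once, before all tests
  def setup;          @log << :setup;          end  # before each test
  def teardown;       @log << :teardown;       end  # after each test
  def suite_teardown; @log << :suite_teardown; end  # once, after all tests

  def test_addition
    @log << :test_addition
    raise "math is broken" unless 1 + 1 == 2
  end

  def test_greeting
    @log << :test_greeting
    raise "greeting is wrong" unless "Hello, " + "World!" == "Hello, World!"
  end

  def run
    suite_setup
    # Sort the test names so the order is predictable.
    self.class.instance_methods(false).grep(/\Atest_/).sort.each do |name|
      setup
      begin
        send(name)
      ensure
        teardown   # runs even when the test raises
      end
    end
    suite_teardown
    @log
  end
end

puts TinySuite.new.run.inspect
```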
Functional testing frameworks have no such common, standardized structure.
FIT has given us something close with the notion of a test in natural language in a table and fixture code to hook that test to the software under test. If we extend that model a little to include the idea that we may well be testing against an external interface, like a web interface, where a driver, like Watir or SeleniumRC, would be handy, we end up with 3 big categories of things:
- Tests: scenarios describing the actions and expectations, expressed in natural language with keywords
- Fixtures: code that connects the keywords in the test to actions in the software under test
- Drivers: libraries like Watir, SeleniumRC, Perl’s Win32::GUI, etc. that know how to address external interfaces such as Web interfaces, thick client GUIs, command line interfaces, SOAP/XML calls, etc.
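The three categories above can be sketched in a few lines of Ruby. Everything here is hypothetical: the driver is a fake that only records calls (a real one would wrap something like Watir or SeleniumRC), and the keywords, locators, and class names are invented for the example.

```ruby
# Driver: knows how to talk to an external interface.
# This fake only records what it was asked to do.
class FakeWebDriver
  attr_reader :calls

  def initialize
    @calls = []
  end

  def type(locator, text)
    @calls << [:type, locator, text]
  end

  def click(locator)
    @calls << [:click, locator]
  end
end

# Fixture: connects keywords in the test to actions on the driver.
class LoginFixture
  def initialize(driver)
    @driver = driver
  end

  def enter_username(name)
    @driver.type("id=username", name)
  end

  def enter_password(password)
    @driver.type("id=password", password)
  end

  def submit_login
    @driver.click("id=login")
  end
end

# Test: keywords plus data, with no locators or driver calls in sight.
LOGIN_TEST = [
  ["enter username", "alice"],
  ["enter password", "s3cret"],
  ["submit login"]
]

# Turn each keyword into a fixture method name and call it.
def run_test(steps, fixture)
  steps.each do |keyword, *args|
    fixture.public_send(keyword.tr(" ", "_"), *args)
  end
end

driver = FakeWebDriver.new
run_test(LOGIN_TEST, LoginFixture.new(driver))
driver.calls.each { |call| puts call.inspect }
```

Only the fixture knows the locators, and only the driver knows how to reach the interface, so a change in either place leaves the test itself alone.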
That’s the direction I think test automation tools in general are headed, and it’s an important evolution with profound implications. However, I’m still figuring out how to explain how this structure differs from what traditional tools offer, and the significance of those differences.
At the very end of the “A Place to Put Things” video, there’s a little exchange between two of the participants in the workshop. A woman’s voice says, “She just pulled that together in like a minute!” That’s Jennitta Andrea speaking. She was the co-organizer of the AA-FTT workshop.
In response, Ward Cunningham says, “Oh, no. I think she’s been pulling that together for the last 3 years.”
Ward’s right. And I’m still pulling together my ideas and figuring out how to articulate them. So while you don’t see much evidence of it on my blog, I am actually spending a fair amount of time writing, and rewriting. More soon. I hope.
…and it isn’t cooperating?
I sympathize.
I recently fought my way through the process of automating a test to reproduce a bug on a legacy(*) web application that had no IDs on any of the elements that I wanted to address. And I thought it might be helpful to capture some of my lessons learned here in case they help someone else. Also, I’m putting this here so I remember what I did.
Lesson 1: SeleniumRC Rocks!
I’ve been extolling the virtues of SeleniumRC for quite a while. This project gave me the perfect opportunity to refresh my Selenium skills. And I’m delighted to report that SeleniumRC is even better than I remember it being. First, I can write my tests in my programming language of choice (Ruby). Second, it has a wide variety of ways to locate these pesky non-ID’d elements. Third, every time I ran up against a road block, I discovered that the smart folks who make Selenium had already anticipated the problem and found a solution.
Lesson 2: Selenium Server Flags can Solve Common Execution Problems
In my particular case, the app I was testing didn’t play nicely in IFrames. This is a problem: by default Selenium runs the web app in an IFrame in the same browser window where it displays its own status. Fortunately, it turns out that there is a -multiWindow flag to solve exactly this problem. I solved the IFrame problem by running the Selenium Server like so:
java -jar selenium-server.jar -multiWindow
There are a variety of other Selenium Server flags that address other common problems. See the Server Command Line Options documentation for a full list.
Lesson 3: ‘Permission Denied’ Errors Probably Mean the App Violates the ‘Same Origin Policy’
Once I’d gotten to the point where I could launch the app, I started encountering very puzzling Permission Denied errors. I vaguely recalled that such errors probably meant there was some problem with the domain names changing and browser security and cross-site scripting something-or-other.
So I checked the domains. Sure enough, the home page was at “www.example.com.” From there, when you log in, it goes to “app.example.com.” Bingo! The domain was changing in the middle of my test. I experimented a little and discovered there was no way around it: the app was going to redirect to a different domain no matter what.
It turns out I’m not the first person to have this problem. Fortunately, Selenium has a strategy for addressing the issue: experimental browsers. I tried the *chrome experimental browser, which drives Firefox with elevated security privileges, and it worked perfectly.
Lesson 4: Firefox Rocks!
At this point I could launch the app and log in, but now I had another problem. After the login page, all the things I needed to click, check, or otherwise manipulate were buried deep in convoluted HTML. I realized that figuring out how to address these things was going to be non-trivial.
The most basic strategy for discovering the locator for an element is to view the HTML source. You can view the source for the whole page, but Firefox has a great feature that allows you to see just the source for just a selection. To use it: highlight a selection on the page, then right-click. One of the available menu options is View Selection Source. Choose it, and you get a window with just the relevant HTML.
However, if you’re dealing with something complex, viewing the source isn’t enough. You really need to look at the Document Object Model (DOM). The best way I know to do that is with the DOM Viewer included with the Web Developer plugin by Chris Pederick.
Web Developer also includes a feature that lets you see all the attributes for a given element. From the Information menu, choose “Display Element Information.” Now you can get the attributes for any element just by clicking on it. I love that feature.
Finally, the XPath Checker plugin by Brian Slesinsky is a very helpful tool for figuring out how to address those pesky non-ID’d elements. More on XPath in the next section.
Lesson 5: XPath Is Now My Good Friend
One of the hallmarks of legacy web apps is the annoying lack of IDs on important elements. Of course, web apps aren’t the first place where a lack of IDs is problematical. I recall struggling with Windows apps that lacked field IDs back in the 1990s.
The good news is that this is a much more tractable problem in web apps than in Windows apps. XPath to the rescue!
Let’s take just one example. I needed to verify the text associated with a particular image. The image served as a kind of custom bullet in a bulleted list, so there were several identical ones. The text itself did not appear in the DOM immediately next to the graphic. Rather, its parent was a peer to the parent element of the image. (Got that straight? Yeah, me either. Seriously, this one took me a while.) In brief, the HTML around this thing looked kinda like this:
<div>
  <span>
    <img src='/path/to/images/check.gif'>
  </span>
  <font>
    item 1
  </font>
</div>
<div>
  <span>
    <img src='/path/to/images/check.gif'>
  </span>
  <font>
    item 2
  </font>
</div>
Mind you, the HTML didn’t look that clean. There was a lot of other random stuff in there, and every tag had a gazillion attributes, and there were hard-coded styles everywhere. But I digress. And I’m probably whining. I’ll stop that now.
So it turns out the only way I could grab the text associated with the second bullet in the list was with the following Selenium command:
get_text("xpath=(//img[contains(@src,'check.gif')])[2]/../../font")
Let’s all say it as a group: “EWWWWW!”
But let’s also appreciate that doing such a thing is actually possible. Selenium has a wide range of locator styles, and even allows you to add your own locator strategies. (I haven’t needed to do that yet, so I’m not quite sure how to use the feature, but I noticed from the documentation that it’s there.)
(For more on using XPaths with Selenium, I found the Help with XPath article on the openqa.org site useful. It’s hard not to like an article with headings like “How the $%^@$ do I locate an element?”)
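If that locator is hard to read, it may help to walk the same path by hand. The sketch below uses REXML from Ruby’s standard distribution on a cleaned-up version of the markup. It assumes the span around each image closes before the font element, self-closes the img tags, and adds a wrapper element so an XML parser will accept it. The bullet_text helper is a made-up name, and this is not how Selenium evaluates locators internally.

```ruby
require 'rexml/document'

# Simplified, well-formed version of the markup described above.
MARKUP = <<~HTML
  <body>
    <div><span><img src='/path/to/images/check.gif'/></span><font>item 1</font></div>
    <div><span><img src='/path/to/images/check.gif'/></span><font>item 2</font></div>
  </body>
HTML

# Return the text beside the nth check.gif bullet (1-based, as in XPath).
def bullet_text(markup, n)
  doc = REXML::Document.new(markup)
  # (//img[contains(@src,'check.gif')]) -> every matching image, in document order
  images = REXML::XPath.match(doc, "//img").select do |img|
    img.attributes["src"].to_s.include?("check.gif")
  end
  image = images[n - 1]                  # [2]    -> the second match
  container = image.parent.parent        # /../.. -> up to the span, then the div
  container.elements["font"].text.strip  # /font  -> the font element beside the span
end

puts bullet_text(MARKUP, 2)
```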
Mind you, the world might be a better place if we couldn’t write such code. Every time I figure out how to automate tests against an untestable application, I feel a twinge of guilt. By automating tests against an untestable interface, I become an enabler of more untestable interfaces. For the sake of improved collaboration and more testable applications, perhaps it’s better if those of us who automate tests avoid resorting to using our xpath superpowers except in the service of wrapping legacy apps with tests so they can be refactored for testability.
But once again, I digress.
The point I really want to make is that SeleniumRC gives us the power to automate tests even against icky hard-to-test legacy apps, and to do it with real programming languages (pick your favorite: C#, VB.Net, Perl, PHP, Ruby, Java). And that means we can write maintainable automated tests using good programming practices. And *that* means we can automate regression tests for faster feedback. And ultimately, *that* means we can make changes to the legacy app to improve testability and maintainability.
So rock on, SeleniumRC. And huge thanks to everyone who’s ever worked on SeleniumRC or Selenium. Also, huge thanks to the Selenium community as a whole. When I went looking for answers to my questions I found numerous blog posts and forum messages with tips and tricks. This post would not have been possible without such a community that’s so open about sharing knowledge.
* Here I mean “Legacy” as Michael Feathers defines it: code that lacks automated unit tests. Web developers, please take note: if you write good unit tests for your web app, including JSUnit tests for the JavaScript bits, the application will be MUCH more testable and the QA people will stop whining at you so much.
Several people have asked me recently why I’m not a fan of the traditional test automation tools for Agile projects. “Why should I use something like Fit or Fitnesse?” they ask. “We already have <insert Big Vendor Tool name here>. I don’t want to have to learn some other tool.”
Usually the people asking the question, at least in this particular way, are test automation specialists. They have spent much of their career becoming experts in a particular commercial tool. They know how to make their commercial tool of choice jump through hoops, sing, and make toast on command.
Then they find themselves in a newly Agile context struggling to use the same old tool to support a whole new way of working. They’re puzzled when people like me tell them that there are better alternatives for Agile teams.
So if you are trying to make a traditional, heavyweight, record-and-playback test automation solution work in an Agile context, or if you are trying to help those other people understand why their efforts are almost certainly doomed to fail, this post is for you.
Why Traditional, Record-and-Playback, Heavyweight, Commercial Test Automation Solutions Are Not Agile
Three key reasons:
- The test-last workflow encouraged by such tools is all wrong for Agile teams.
- The unmaintainable scripts created with such tools become an impediment to change.
- Such specialized tools create a need for Test Automation Specialists and thus foster silos.
Let’s look at each of these concerns in turn, then look at how Agile-friendly tools address them.
Test-Last Automation
Traditional, heavyweight, record-and-playback tools force teams to wait until after the software is done – or at least the interface is done – before automation can begin. After all, it’s hard to record scripts against an interface that doesn’t exist yet. So the usual workflow for automating tests with a traditional test automation tool looks something like this:
- Test analysts design and document the tests
- Test executors execute the tests and report the bugs
- Developers fix the bugs
- Test executors re-execute the tests and verify the fixes (repeating as needed)
- …time passes…
- Test automation specialists automate the regression tests using the test documents as specifications
Looking at the workflow this way, it’s surprising to me that this particular test automation strategy ever works, even in traditional environments with long release cycles and strict change management practices. By the time we get around to automating the tests, the software is done and ready to ship. So those tests are not going to uncover much information that we don’t already know.
Sure, automated regression tests are theoretically handy for the next release. But usually the changes made for the next release break those automated tests (see concern #2, maintainability, coming up next). The result for most contexts: high cost, limited benefit. In short, such a workflow is a recipe for failure on any project, not just for Agile teams. The teams that have made this workflow work well in their context have had to work very, very hard at it.
However, this workflow is particularly bad in an Agile context where it results in an intolerably high level of waste and too much feedback latency.
- Waste: the same information is duplicated in both the manual and automated regression tests. Actually, it’s duplicated elsewhere too. But for now, let’s just focus on the duplication in the manual and automated tests.
- Feedback Latency: the bulk of the testing in this workflow is manual, and that means it takes days or weeks to discover the effect of a given change. If we’re working in 4 week sprints, waiting 3 – 4 weeks for regression test results just does not work.
Agile teams need the fast feedback that automated system/acceptance tests can provide. Further, test-last tools cannot support Acceptance Test Driven Development (ATDD). Agile teams need tools that support starting the test automation effort immediately, using a test-first approach.
Unmaintainable Piles of Spaghetti Scripts
Automated scripts created with record-and-playback tools usually contain a messy combination of at least three different kinds of information:
- Expectations about the behavior of the software under test given a set of conditions.
- Implementation-specific details about the interface.
- Code to drive the application to the desired state for testing.
So a typical script will have statements to click buttons identified by hard-coded button ids followed by statements that verify the resulting window title followed by statements to verify the calculated value in a field identified by another hard-coded id, like so:
field("item_1").enter_value("12345")
button("lookup_item_1").click
field("price_1").verify_value("$7.00")
field("qty_1").enter_value("6")
button("total_next").click
active_window.verify_title("Checkout")
field("purchase_total").verify_value("$42.00")
The essence of the test was to verify that ordering 6 items at $7 each results in a shopping cart total of $42. But because the script has a mixture of expectations and UI-specific details, we end up with a whole bunch of extraneous implementation details obfuscating the real test.
(If you’re nodding along, thinking to yourself, “Yup, looks like our test scripts,” then you have my sympathies. My deep, deep sympathies. Good, maintainable, automated test scripts do not look like that.)
All that extraneous stuff doesn’t just obscure the essence of the test. It also makes such scripts hard to maintain. Every time a button id changes, or the workflow changes, say with a “Shipping Options” screen inserted before the Checkout screen, the script has to be updated. But that value $42.00? That only changes if the underlying business rules change, say during the “Buy 5, get a 6th free!” sale week.
Of course, there are teams that have poured resources, time, and effort into creating maintainable tests using traditional test automation tools. They use data-driven test strategies to pull the test data into files or databases. They create reusable libraries of functions for common action sequences like logging in. They create an abstract layer (a GUI map) between the GUI elements and the tests. They use good programming practices, have coding standards in place, and know about refactoring techniques to keep code DRY. I know about these approaches. I’ve done them all.
But I had to fight the tools the whole way. The traditional heavyweight test automation tools are optimized for record-and-playback, not for writing maintainable test code. One of the early commercial tools I used even made it impossible to create a separate reusable library of functions: you had to put any general-use functions into a library file that shipped with the tool (making tool upgrades a nightmare). That’s just EVIL.
Agile teams need tools that separate the essence of the test from the implementation details. Such a separation is a hallmark of good design and increases maintainability. Agile teams also need tools that support and encourage good programming practices for the code portion of the test automation. And that means they need to write the test automation code using real, general-purpose languages, with real IDEs, not vendor script languages in hamstrung IDEs.
Silos of Test Automation Specialists
Traditional QA departments working in a traditional waterfall/phased context, and automating tests, usually have a dedicated team of test automation specialists. This traditional structure addresses several forces:
- Many “black-box” testers don’t code, don’t want to code, and don’t have the necessary technical skills to do effective test automation. Yes, they can click the “Record” button in the tool. But most teams I talk to these days have figured out that having non-technical testers record their actions is not a viable test automation strategy.
- The license fees for traditional record-and-playback test automation tools are insanely expensive. Most organizations simply do not have the budget to buy licenses for everyone. Thus only the anointed few are allowed to use the tools.
- Many developers view the specialized QA tools with disdain. They want to write code in real programming languages, not in some wacky vendorscript language using a hamstrung IDE.
Thus, the role of the Test Automation Specialist was born. These specialists usually work in relative isolation. They don’t do day-to-day testing, and they don’t have their hands in the production code. They have limited interactions with the testers and developers. Their job is to turn manual tests into automated tests.
That isolation means that if the production code isn’t testable, these specialists have to find a workaround because testability enhancements are usually low on the priority list for the developers. I’ve been one of these specialists, and I’ve fought untestable code to get automated tests in place. It’s frustrating, but oddly addictive. When I managed to automate tests against an untestable interface, I felt like I’d slain Grendel, Grendel’s mother, all the Grendel cousins, and the horse they rode in on. I felt like a superhero.
But Agile teams increase their effectiveness and efficiency by breaking down silos, not by creating test automation superheroes. That means the test automation effort becomes a collaboration. Business stakeholders, analysts, and black box testers contribute tests expressed in an automatable form (e.g. a Fit table) while the programmers write the code to hook the tests up to the implementation.
Since the programmers write the code to hook the tests to the implementation while implementing the user stories, they naturally end up writing more testable code. They’re not going to spend 3 days trying to find a workaround to address a field that doesn’t have a unique ID when they could spend 5 minutes adding the unique ID. Collaborating means that automating tests becomes a routine part of implementing code instead of an exercise in slaying Grendels. Less fun for test automation superheroes, but much more sensible for teams that actually want to get stuff done.
So that means Agile teams need tools that foster collaboration rather than tools that encourage a whole separate silo of specialists.
Characteristics of Effective Agile Test Automation Tools
Reviewing the problems with traditional test automation tools, we find that Agile teams need test automation tools/frameworks that:
- Support starting the test automation effort immediately, using a test-first approach.
- Separate the essence of the test from the implementation details.
- Support and encourage good programming practices for the code portion of the test automation.
- Support writing test automation code using real languages, with real IDEs.
- Foster collaboration.
Fit, Fitnesse, and related tools (see the list at the end of the post for more) do just that.
Testers or business stakeholders express expectations about the business-facing, externally visible behavior in a table using keywords or a Domain Specific Language (DSL). Programmers encapsulate all the implementation details, the button-pushing or API-calling bits, in a library or fixture.
So our Shopping Cart example from above might be expressed like this:
Choose item by sku 12345
Item price should be $7.00
Set quantity to 6
Shopping cart total should be $42.00
See, no button IDs. No field IDs. Nothing except the essence of the test.
And writing our test in that kind of stripped-down-to-the-essence way makes it no longer just a test. As Brian Marick would point out, it’s an example of how the software should behave in a particular situation. It’s something we can articulate, discuss, and explore while we’re still figuring out the requirements. The team as a whole can collaborate on creating many such examples as part of the effort to gain a shared understanding of the real requirements for a given user story.
Expressing tests this way makes them automatable, not automated. Automating the test happens later, when the user story is implemented. That’s when the programmers write the code to hook the test up to the implementation, and that’s when the test becomes an executable specification.
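To make the split between the essence of the test and the implementation details concrete, here is a minimal sketch in plain Ruby. It does not use Fit or Fitnesse at all. The ShoppingCart class, the ShoppingCartFixture class, the run_rows helper, and the price catalog are all hypothetical names invented for this illustration. The rows are the Shopping Cart example from above, and the fixture is the only code that knows how to talk to the implementation.

```ruby
# Hypothetical in-memory stand-in for the system under test.
class ShoppingCart
  PRICES = { "12345" => 7.00 }.freeze # assumed catalog: sku => unit price

  attr_writer :quantity

  def initialize
    @sku = nil
    @quantity = 1
  end

  def choose_item(sku)
    @sku = sku
  end

  def item_price
    PRICES.fetch(@sku)
  end

  def total
    item_price * @quantity
  end
end

# Hypothetical fixture: the only place that knows how to drive the cart.
# Each public method corresponds to one keyword in the test.
class ShoppingCartFixture
  def initialize(cart = ShoppingCart.new)
    @cart = cart
  end

  def choose_item_by_sku(sku)
    @cart.choose_item(sku)
    true
  end

  def item_price_should_be(expected)
    money(@cart.item_price) == expected
  end

  def set_quantity_to(quantity)
    @cart.quantity = Integer(quantity)
    true
  end

  def shopping_cart_total_should_be(expected)
    money(@cart.total) == expected
  end

  private

  # Formats a number the way the test rows write prices, e.g. "$7.00".
  def money(amount)
    format("$%.2f", amount)
  end
end

# Each row is [keyword, argument]; the keyword is turned into a method name.
def run_rows(fixture, rows)
  rows.map do |keyword, argument|
    method_name = keyword.downcase.tr(" ", "_")
    fixture.public_send(method_name, argument)
  end
end

ROWS = [
  ["Choose item by sku", "12345"],
  ["Item price should be", "$7.00"],
  ["Set quantity to", "6"],
  ["Shopping cart total should be", "$42.00"]
].freeze

results = run_rows(ShoppingCartFixture.new, ROWS)
puts results.all? ? "all rows passed" : "some rows failed"
```

If the cart later sits behind a Web page or a Web service, only the fixture methods would need to change. The rows stay exactly as the testers and stakeholders wrote them.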
Before it is automated, that same artifact can serve as a manual test script. However, unlike the traditional test automation workflow where manual tests are translated into automated tests, here there is no wasteful translation of one artifact into another. Instead, the one artifact is leveraged for multiple purposes.
For that matter, because we’re omitting implementation-specific details from the test, the test can be re-used if the system were ported to a completely different technology. There is nothing specific to a Windows or Web-based interface in the test. The test would be equally valid for a green screen, a Web services interface, a command line interface, or even a punch-card interface. Leverage. It’s all about the leverage.
Traditional Tools Solve Traditional Problems in Traditional Contexts. Agile Is Not Traditional.
Traditional, heavyweight, record-and-playback tools address the challenges faced by teams operating in a traditional context with specialists and silos. They address the challenge of having non-programmers automate tests by having record-and-playback features, a simplified editing environment, and a simplified programming language.
But Agile teams don’t need tools optimized for non-programmers. Agile teams need tools to solve an entirely different set of challenges related to collaborating, communicating, reducing waste, and increasing the speed of feedback. And that’s the bottom line: Traditional test automation tools don’t work for an Agile context because they solve traditional problems, and those are different from the challenges facing Agile teams.
…
Related Links
A bunch of us are discussing the next generation of functional testing tools for Agile teams on the AA-FTT Yahoo! group. It’s a moderated list and membership is required. However, I’m one of the moderators, so I can say with some authority that we’re an open community. We welcome anyone with a personal interest in the next generation of functional tools for Agile teams. We’re also building lists of resources. In the Links section of the AA-FTT Yahoo! group, you’ll find a list of Agile-related test automation tools and frameworks. And the discussion archives are interesting.
Brian Marick wrote a lovely essay on An Alternative to Business-Facing TDD.
I discussed some of the ideas in this article in previous blog posts, most notably:
- Functional Test Tools: the Next Generation (part 1 of 2)
- Functional Test Tools: the Next Generation (part 2 of 2)
- What Problem Would Next-Generation Functional Testing Tools Solve?
- And While I’m at It, I Want a Platypus Too!
A small sampling of Agile-friendly tools and frameworks:
- Ward Cunningham’s original Fit has inspired a whole bunch of related tools/frameworks/libraries including Fitnesse, ZiBreve, Green Pepper, and StoryTestIQ.
- Concordion takes a slightly different approach to creating executable specifications where the test hooks are embedded in attributes in HTML, so the specification is in natural language rather than a table.
- SeleniumRC and Watir tests are expressed in Ruby; Ruby makes good DSLs.
Are you the author or vendor of a tool that you think should be listed here? Drop a note in the comments with a link. Please note however that comment moderation is turned on, and I will only approve the comment if I am convinced that the tool addresses the concerns of Agile teams doing functional/system/acceptance test automation.
“They never give us enough time to automate our tests, and then they complain at us that we don’t test fast enough!” J. shook her head. “And when I want to hire more people to help automate, they tell me I have too many people already! Management blames me because testing takes too long, but they won’t support me in fixing the problem. What’s wrong with them!?!”
J. is a QA manager in an organization that’s adopting Scrum. She’s frustrated, and understandably so. From her point of view, she’s being squeezed in all directions. The developers are producing releasable code every month. But for her team to run a regression test cycle – mostly manual – takes 6 weeks. That’s too long. Just one test cycle exceeds the sprint length by 2 weeks. J. feels tremendous pressure to reduce the time it takes to test the software. Yet at the same time, she feels like she’s not getting any support to do the one thing she can see that will help reduce the test cycle time: automate the regression tests.
I’ve had some visibility into J.’s situation for some time now. J.’s team has been trying – and failing – to automate the regression suite for the last two years. They aren’t making any headway because as soon as they get one script working, another one breaks. The automation is brittle, error-prone, and incredibly expensive to create and maintain. That’s in part because they’ve been using a cumbersome commercial tool that doesn’t support creating maintainable tests. It’s also because the user interface was not designed with test automation in mind. Many UI elements don’t have IDs, and the ones that do use automatically generated IDs that change with each build. In short, the combination of the tool and the software under test equals a test automation nightmare. It’s no wonder J.’s team is not making headway.
Yet J. persists. Doing more of the same kind of test automation that’s already failing doesn’t make much sense to me, but she disagrees. “We just need more time!” she says.
The problem is that J. is still thinking in terms of silos. She thinks all testing tasks must be done by QA people using specialized QA tools. It simply would not occur to her to suggest that development help automate tests. Nor does she suggest that developers and testers collaborate on making the UI more testable. Instead, she says, “QA can’t go that fast. Slow down.”
J. doesn’t want to acknowledge that test automation created by a siloed QA team working in isolation to reverse-engineer existing software and automate tests against an untestable UI using proprietary tools accessible only to a few select team members is guaranteed to be incredibly expensive both to create and to maintain, and also ridiculously fragile. In short, her approach just isn’t going to work.
Unfortunately, J.’s story is likely to have an unhappy ending – at least for J. and her team. Her strategy of trying to get development to slow down, and telling management that they can’t release monthly, is backfiring. The development team is already bypassing QA for small changes and getting good results. But J. is undeterred. My past observations tell me that no matter what the reaction of the people around her, she will keep doing the same thing and expect different results.
But maybe, just maybe, by telling J.’s story here, I can help someone, somewhere.
So allow me to repeat the moral of this story:
When QA works in isolation, creating automated tests after the software is theoretically “done,” using proprietary tools that are available only to a select few team members, the results will be a fragile, unmaintainable mess.
For test automation to work well, it must be created in collaboration with the whole team and the resulting test automation code must be treated as code. That means it should be versioned with the source code, executed with each and every build, and created and maintained as part of the overall development cycle rather than as an afterthought.
And when a Test/QA group insists on keeping within their silo when the rest of the organization adopts Agile practices, they will end up bypassed and irrelevant as the rest of the organization finds ways to move forward without their help.
At the moment, I’m creating a little Acceptance Test Driven Development (ATDD) demo. I’m keen on Ruby these days, so I wanted to do it all in Ruby. And I wanted to use Fitnesse. And as it happens, Fitnesse supports RubyFIT. Or RubyFIT supports Fitnesse. Something like that. So I figured this would be a slam dunk.
I was wrong. It’s taken me the better part of a day. Now it could be because I can’t read directions terribly well. And that may also explain the odd assortment of leftover hardware I have from various Ikea purchases. But I did discover at least one gotcha I didn’t find documented anywhere else, so I thought I’d share.
First, major thanks to Cory Foy for his fabulous little tutorial. And thanks to maosd, whoever you are, for blogging some update notes. They helped a lot. And finally thanks to Ron and Chet for blogging about their RubyFIT/Fitnesse adventures.
Now, for the gotchas I ran into that made a 1 hour project into something more like 6 hours.
- Beware hiding the right side of your browser off-screen. The “Errors Occurred” icon in Fitnesse appears on the right hand side of the browser. And if it’s off screen, all you’ll see is a friendly “Assertions: 0 right, 0 wrong, 0 ignored, 0 exceptions” when something goes drastically wrong. Time wasted: 1 hour. Yes, I am an idiot.
- By default, Mac host names are like host.local. So my iMac, “eshmac”, has a host name of “eshmac.local”. The first problem I ran into was that Fitnesse wanted to refer to the machine as “eshmac,” and my machine maintained “there’s nobody here by that name.” I tried to figure out how to get Fitnesse to let me customize the host name, but to no avail. So I decided to make “eshmac” a legitimate name instead. Now, I’m sure there are better ways to do this, but since I didn’t want to change the host name on the network, I just added a record for “eshmac” and pointed it to “127.0.0.1” in the /etc/hosts file. Time wasted: 0.5 hours.
- As maosd indicates, you need to call the FitServer.rb file from the gem location. Conveniently enough, the path is documented in the README.txt file that comes with the RubyFIT gem. Inconveniently, that file is placed in the very folder I was looking for. The answer is that on Macs, gems get installed to /usr/lib/ruby/gems, and you can find the RubyFIT gem at /usr/lib/ruby/gems/1.8/gems/fit-1.1/. Time wasted: 1 hour.
- In Cory Foy’s tutorial, where it says, “You want to go one directory below the directory your class is in,” it really means one directory below. I had my !path variable set incorrectly, and could not figure out for the life of me why RubyFIT couldn’t find the fixture. Silly me. Time wasted: 2 hours – and waaayyy too many “puts” statements.
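For the host name gotcha above, here is a small sketch of what the /etc/hosts record looks like and how to check that a name is mapped. The hosts_address_for helper and the sample text are hypothetical, written only for this illustration. The code parses a string in hosts-file format rather than touching the real file.

```ruby
# Returns the address that hosts-format text maps `name` to, or nil if unmapped.
def hosts_address_for(hosts_text, name)
  hosts_text.each_line do |line|
    entry = line.sub(/#.*/, "").split # drop comments, split on whitespace
    next if entry.empty?

    address, *names = entry
    return address if names.include?(name)
  end
  nil
end

# Sample content: the second record is the kind of line added for "eshmac".
sample = <<~HOSTS
  127.0.0.1   localhost
  127.0.0.1   eshmac    # short name that Fitnesse wants to use
HOSTS

puts hosts_address_for(sample, "eshmac") # prints 127.0.0.1
```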
So, as a service to those who happen to be as documentationally challenged as I am, allow me to be excruciatingly precise about the naming thing with RubyFIT and Fitnesse. It’s enough of a gotcha that lots of people have already written about it. But here’s my attempt at explaining things.
Let’s say you want to call your test page FooTest. And you want to create a Division fixture. And you want to create a directory hierarchy under “/mine” to hold your fixture files.
Your Fitnesse page will contain:
...other stuff...
!path /mine/
|Footest.Division|
...other stuff...
Your fixture code will look like:
...
module Footest
  class Division < Fit::ColumnFixture
    # code goes here
  end
end
...
And that code will live in the file /mine/footest/division.rb
Once again, note that the !path setting in Fitnesse is “/mine/”. For the record, the mistake I made was setting “!path /mine/footest/”. As I said, I’m documentationally challenged.
Matching capitalization does not seem to be important, but the names of the directories, files, modules, and classes sure are. And they all have to match up. So, your test page name must match your module name, and that must match the name of the directory in which the file lives. The !path variable does not include that directory; it points one level above it.
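The naming rules above can be sketched as a few lines of Ruby. This is only an illustration of the convention as I have described it, not RubyFIT’s actual loading code. The fixture_location helper is hypothetical, and simply downcasing the names is an assumption that fits this one example.

```ruby
# Illustration only: mirrors the naming convention described above,
# not the real RubyFIT loader.
def fixture_location(path_root, table_name)
  module_name, class_name = table_name.split(".")
  {
    module_name: module_name,
    class_name: class_name,
    # The directory comes from the module name, the file from the class name.
    file: File.join(path_root, module_name.downcase, "#{class_name.downcase}.rb")
  }
end

# With the correct !path setting:
puts fixture_location("/mine/", "Footest.Division")[:file]
# prints /mine/footest/division.rb

# With my mistaken !path setting, the directory name is counted twice:
puts fixture_location("/mine/footest/", "Footest.Division")[:file]
# prints /mine/footest/footest/division.rb
```

Read this way, the mistaken !path points the lookup at a directory that does not exist, which would explain why the fixture could not be found.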
I’m sure I’ll run into more gotchas, but I figured I should document these while I’m thinking about them.
Some time ago, I wrote about how Jennitta Andrea (among others) fired my imagination about what could be possible with advances in functional test automation tools.
I’m delighted to announce that Jennitta is heading up the Agile Alliance Functional Testing Tools program. With her boundless energy and contagious enthusiasm Jennitta recruited Ron Jeffries and me to serve on the committee and obtained funding from the Agile Alliance for the program.
The project we’re working on first is a visioning workshop where we hope to bring together folks who have been working on advancing the state of the art of functional testing tools to pool ideas, share experiences, imagine the future, and build community.
Here’s the official call for participation:
Agile Alliance Functional Testing Tools Visioning Workshop
Call for Participation
Dates: October 11 – 12, 2007
Times: 8 AM – 5 PM
Location: Portland, Oregon
Venue: Kennedy School
Description
The primary purpose of this workshop is to discuss cutting-edge advancements in and envision possibilities for the future of automated functional testing tools.
This is a small, peer-driven, invitation-only conference in the tradition of LAWST, AWTA, and the like. The content comes from the participants, and we expect all participants to take an active role. We’re seeking participants who have interest and experience in creating and/or using automated functional testing tools/frameworks on Agile projects.
This workshop is sponsored by the Agile Alliance Functional Testing Tools Program. The mission of this program is to advance the state of the art of automated functional testing tools used by Agile teams to automate customer-facing tests.
There is no cost to participate. Participants will be responsible for their own travel expenses. (However, we do have limited grant money available to be used at the discretion of the organizers to subsidize travel expenses. If you would like to be considered for a travel grant, please include your request, including amount needed, in your Request for Invitation.)
Requesting an Invitation
If you’re interested in being invited to participate in this workshop, please send an email to “testtoolworkshop@agilealliance.org” answering the following questions:
- What is your experience using functional tests as a way to specify functional requirements?
- What is your experience with automated functional testing tools on Agile projects?
- What do you hope to contribute to the workshop? Do you have any code or examples that you’d like to share? (Please note that you should not share anything covered by a non-disclosure agreement.)
- What do you hope to get out of the workshop?
Invitations will be issued by September 1, 2007 so that we can confirm hotel room requirements. Please send in your request as soon as possible, before the workshop fills up.
Pass This Along
If you know of someone that would be a candidate for this workshop, please forward this call for participation on to them.
Additional Background
Automated functional testing is an integral and essential part of Agile development. Many Agile teams use functional tests to codify the system requirements. Some also practice Acceptance Test Driven Development.
Agile teams have particular needs for automated tools that are not well served by traditional record-and-playback GUI drivers. As requirements specifications, functional tests must be readable: clear, succinct, and expressed in the language of the business domain. As an automated safety net, the tests must be maintainable: built with reusable domain specific testing language components, easy to change as the requirements change.
The good news is that tool support for automated functional tests has grown significantly in recent years. There is a large variety of commercial and open source testing tools/frameworks available that support Agile development practices. The FIT framework was a significant boost to the state of the art of automated functional testing in terms of the syntax of the specification (tables), the detailed test execution feedback (cell by cell), and the development/execution environment (desktop tools rather than development or specialized tools).
However, we believe that it’s time for another significant boost to the state of the art.
- We are lacking integrated development environments that facilitate things like: refactoring test elements, command completion, incremental syntax validation (based on the domain specific test language), keyboard navigation into the supporting framework code, debugging, etc.
- We need more expressive test specification languages, possibly integrating executable: text, tables, shapes, and colors together into a single test.
- We need specification languages that can describe user interaction in a readable and maintainable fashion.
- We need to be able to view/navigate the tests in multiple different ways in order to see how the pieces of the puzzle contribute to the bigger picture of the domain/feature: organize tests based on their domain context; search for tests based on user-defined keywords (cross cutting concerns).
- … and things that we haven’t even thought of that will take us out of the current box, and into a new level of effectiveness ….
The Agile Alliance Functional Testing Tools Program seeks to advance the state of the art by creating opportunities for people who are in a position to advance the state of the art to share information and ideas, and explore possibilities.
I’ve been saying for a long time that Open Source testing tools are our future. It seems at least one test tool vendor agrees with me.
I recently had the opportunity to speak with Rafi Benami of RadView. In April this year, RadView, long a vendor in the performance and testing tool industry, announced that they’re joining the open source revolution: they released their WebLOAD product under the GNU General Public License (GPL). You can find it online at SourceForge.
They still offer a commercial enterprise class version of WebLOAD, “WebLOAD Professional.” The professional edition contributes to the company’s revenue stream, and also enables the company to serve a broader set of customers including those who need full support and services, or are still skittish about open source.
Of course, RadView isn’t the first company to adopt an Open Source business model. But they’re the first established, commercial, software testing tool vendor to do so that I am aware of. (By the way, if you know of other established commercial software test tool vendors who have gone Open Source, drop me a line in the comments.)
Interestingly, Rafi reported that the hardest part about open sourcing WebLOAD was making the decision to do so. Once the decision was made, the rest was just a matter of bundling up the product in a way that would work for the open source community.
But the decision? That was hard, he said. They had to figure out how RadView could offer their product for free and still make money.
I can only imagine some of the internal discussions that must have taken place. Rafi, being very professional, didn’t share the details of those confidential internal meetings. But I can still imagine the conversation:
So why open source? As Rafi explained it, RadView chose to open source WebLOAD to:
- Reconnect with the professional testing community.
- Leverage the power of the community to improve the offering. Or, as Rafi put it: “We contribute to the community; and the community contributes to us.” It’s a virtuous cycle.
And in order to foster that community spirit, RadView has created a WebLOAD community site.
As hard as the decision must have been, I think they’re already seeing the benefits. Rafi mentioned that RadView saw a lot of traffic come through their booth at STAREast. And many of the people who stopped by did so to express their admiration for the decision to open source WebLOAD, and also share horror stories of over-priced shelfware from competitors.
In short, RadView wants to build software that practitioners like, that they use, and that they have a stake in. Imagine that. A vendor that would rather sell their tools in the test lab than on the golf course. A vendor listening to the community of practitioners.
Go RadView! Hope your competitors are watching…
