Functional Test Tools: the Next Generation (part 2 of 2)

In Part 1, I discussed how Ward Cunningham, Brian Marick, and Jennitta Andrea’s ideas have inspired me. As I said in that post, their work has suggested possibilities to me. No longer will I be satisfied with a test tool that just lets me drive an application. I now want functional test tools that represent tests graphically, execute tests from a variety of views, connect tests to code, and show me relationships between tests and features.

And being greedy, I want even more.

I want my test tool to understand enough about heuristics to suggest tests. I want it to integrate tightly with source control so my tests can be versioned with the code. I want it to work with continuous integration so my automated tests can be run as part of the continuous build process. I want it to…

Well, here’s my current list:

Support Various Graphical Representations

I want to express tests in whatever format makes the most sense for that type of test.

If I’m testing a stateful system, I want to create a state model in a table or bubble chart, then associate expectations with each state and transition, like Brian’s annotated assertions in his wireframes. If I’m testing business rules I want to create something like an entity-relationship diagram with expectations around conditions, constraints, and relationships between data elements. If I’m testing a process or workflow I want to create something like Ward’s swim lane representation, or an activity diagram with swim lanes, and annotate it with expectations around timing, order of events, error conditions, and responses.

Further, I want a test tool that can shift between code and pictures, the way Ward’s Process Explorer does, and also allow me to create and update tests in both the graphical representation and the code view.

This suggests that the tool must support true round-trip editing: create a test in code, edit it in pictures; create a test in pictures, edit it in code; etc.

This also suggests that the test tool must have some understanding of states, transitions, entities, data elements, etc. so it can provide a mechanism to specify the actions and expectations associated with each.

Supply Available Test Actions

When creating/editing tests, I don’t want to have to guess if I should refer to an action as “login” or “log_in” or “enter_name_and_password.”

If there’s a test fixture that specifies available actions, the test tool should be able to prompt me with that list, intellisense style. And that intellisense should work in both the code view and graphical view.

This implies that IDEs must understand tests as something like code. And it may imply that test developers will end up using the same IDEs as code developers. That would be fine with me. Testers have had to deal with dumbed down pseudo IDEs from commercial tool vendors for far too long.
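As a rough illustration of the prompting idea, here is a minimal Python sketch. Everything in it is hypothetical: the LoginFixture class, its action names, and the suggest_actions helper are invented for this example and do not come from any existing tool. It treats each public method on a fixture as an available action, then offers prefix matches first and close spellings second, so typing “login” still leads to “log_in.”

```python
import difflib


class LoginFixture:
    """Hypothetical fixture: each public method is an available test action."""

    def log_in(self, name, password):
        pass

    def log_out(self):
        pass

    def edit_profile(self, field, value):
        pass


def available_actions(fixture):
    """List the public callables a fixture exposes, sorted by name."""
    return sorted(
        name
        for name in dir(fixture)
        if not name.startswith("_") and callable(getattr(fixture, name))
    )


def suggest_actions(fixture, typed, limit=3):
    """Suggest known actions: prefix matches first, then close spellings."""
    actions = available_actions(fixture)
    prefix_matches = [a for a in actions if a.startswith(typed)]
    close_matches = difflib.get_close_matches(typed, actions, n=limit, cutoff=0.6)
    suggestions = []
    for action in prefix_matches + close_matches:
        if action not in suggestions:  # keep order, drop duplicates
            suggestions.append(action)
    return suggestions[:limit]
```

An editor, whether a code view or a graphical one, could call suggest_actions on every keystroke. The point is only that the list comes from the fixture itself, so the test author never has to guess at a name.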

Give Business Users a Familiar-Feeling Editing Environment

FIT/FITnesse are brilliant in the way they support business users. Developers write the test code; domain experts write the tests. The tool doesn’t impose a one-size-fits-all view on the world. Developers work in their IDE; domain experts work in HTML tables.

Similarly, I want the functional test tool to support business users adding and modifying tests in an environment that doesn’t require them to get their geek on. So while I just finished saying that a plugin for an IDE would be fine, I wasn’t speaking the whole truth. I think an IDE plugin would be fine for much of what I envision, but other editors will have to be allowed too.

Maybe, like FITnesse, the other editors could be web-based. That’s handy because it means nothing to install.

However, the graphical part is important. Engineers love wikis; business users don’t love them quite as much. They don’t love WikiWords, they don’t love hand crafting tables | with | pipes | delineating | cells, and they don’t love having to remember the formatting symbols (e.g. *bold*). Business users use Excel and PowerPoint and Word. Those are the interfaces they’re used to. I’ve met plenty of domain expert testers who are terrified of IDEs or anything resembling code. These people have much to offer the project: they understand testing and they understand the domain. The next generation of testing tools should allow them to collaborate, using an interface to express tests that feels as natural to them as using Word, Excel, or PowerPoint.

Prompt Additional Tests Based on Heuristics

AgitarOne from Agitar does something amazingly cool for unit testing: it dynamically generates tests based on known failure patterns.

We need something similar for functional test tools. After all, functional testers use a wide range of heuristics in designing tests. Some examples include selecting none, some, or all out of a set of available options. Or having 0, 1, or many in a set of things, like search results. Or positioning something (like a cursor) at the beginning, middle, or end. For that matter, James Whittaker’s book How to Break Software is an entire catalog of test heuristics.

Over the years there have been some attempts at this. One was a commercial tool from Rational that, as far as I can tell, didn’t survive the merger with IBM. Another was an application explorer that once shipped free with Segue’s tool back when it was called Quality Works (before the Silk days), and later became a commercial version for which they charged. (I always found it ironic that the free version was better, at least in my opinion. Of course it doesn’t really matter since as far as I know, none of the versions are available these days. But I digress.) Now it’s been a long time since I looked at either of those tools, but my recollection is that each fell short because they did both too little and too much. Too little: the heuristics they understood were extremely limited. I recall a lot of button clicking and boundary testing of fields. Too much: they didn’t give me control. (Actually, that’s not true. Segue’s commercial product gave me a lot of control because it generated tests that I could then edit. However, the generated tests were so verbose and so useless that it wasn’t worth the effort to edit them.)

So, what I want is not yet another test code generator. I don’t want button-happy, mouse-clicking randomness. I want the tool to suggest tests. I want a little button, or a link, or a right-mouse-context menu option, “Suggest Related Tests,” that will apply a wide variety of known heuristics to the current context and come up with a list that includes items like:

“Select none of the items in the Choose Locations to Map select box”
“Select all of the items in the Choose Locations to Map select box”
“Clear optional field Description and save”
“Click browser back button”
“Logout while in Edit Profile state”

It should then be easy to generate one or multiple tests, or at least the stubs for the tests. If it can’t generate tests it should at least suggest tests based on the context, sort of like a “hint” button on a puzzle program.

(To give you some idea of the range of heuristics I’m thinking about, here’s the Test Heuristics Cheat Sheet I distributed in various classes over the last year. And that’s just a beginning.)
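To make the “Suggest Related Tests” idea a little more concrete, here is a small Python sketch. It is only an illustration under assumptions of my own: the Control class, the control kinds, and the heuristic tables are hypothetical, and a real tool would need a far richer catalog. Each heuristic is a plain-text template keyed by the kind of thing it applies to, and the function fills the templates in from the current context.

```python
from dataclasses import dataclass


@dataclass
class Control:
    """Hypothetical description of one control visible in the current state."""
    kind: str   # e.g. "multi_select" or "optional_field"
    label: str


# Heuristics that apply to a kind of control (none/some/all, clear optional data).
CONTROL_HEURISTICS = {
    "multi_select": [
        "Select none of the items in the {label} select box",
        "Select some of the items in the {label} select box",
        "Select all of the items in the {label} select box",
    ],
    "optional_field": ["Clear optional field {label} and save"],
}

# Heuristics that apply to any state, regardless of its controls.
STATE_HEURISTICS = [
    "Click browser back button",
    "Logout while in {state} state",
]


def suggest_related_tests(state, controls):
    """Return plain-text test suggestions for the given state and controls."""
    suggestions = []
    for control in controls:
        for template in CONTROL_HEURISTICS.get(control.kind, []):
            suggestions.append(template.format(label=control.label))
    for template in STATE_HEURISTICS:
        suggestions.append(template.format(state=state))
    return suggestions
```

Note that the output is a list of suggestions, not generated test code. Turning a chosen suggestion into a test stub would be a separate step that stays under the tester’s control.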

Support Model-Based Testing

A natural side effect of understanding graphical tests, understanding heuristics, and allowing me to specify expectations is that the tool should be able to support model-based testing. That means that you should be able to specify a starting place, and the tool should be able to execute automatically and randomly generated tests until the cows come home, or until the application crashes so dramatically it’s unrecoverable, whichever happens first.

(And for those of you who object to “random” tests, arguing that tests should be repeatable, I’ll suggest that test repeatability is overrated.)
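Here is one way the core of such an engine might look, as a hedged Python sketch. The state model, the action names, and the perform and check callbacks are all hypothetical stand-ins for whatever drives and inspects the real application. The walk picks a random available action at each step, performs it, and checks that the application landed in the state the model expects.

```python
import random

# Hypothetical state model: state -> {action: expected next state}.
MODEL = {
    "logged_out": {"log_in": "home"},
    "home": {"edit_profile": "editing", "log_out": "logged_out"},
    "editing": {"save": "home", "cancel": "home", "log_out": "logged_out"},
}


def random_walk(model, start, steps, seed, perform, check):
    """Randomly walk the model, driving the application as we go.

    perform(action) drives the application under test; check(state) returns
    True if the application is in the expected state. Returns the path taken
    and whether every expectation along it was met.
    """
    rng = random.Random(seed)
    state = start
    path = []
    for _ in range(steps):
        action = rng.choice(sorted(model[state]))
        expected = model[state][action]
        perform(action)
        path.append((state, action, expected))
        if not check(expected):
            return path, False  # stop at the first unmet expectation
        state = expected
    return path, True
```

The walk takes a seed only so that a path that turns up something interesting can be replayed on demand. Run it with a fresh seed each time and you get the until-the-cows-come-home behavior.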

Allow Me to Specify the Essential; Vary the Incidental

In any given test, there are things I care about (actions or data that are essential to the thing I want to test) and things I don’t care about (incidental actions or data that don’t really matter except that you have to choose something). For example, if you’re testing filling in addresses in a customer form, you might want to verify that you can enter Canadian postal codes (e.g. “M5H 3N5”). You need the rest of the address, but you don’t particularly care what it is.

In a real life example, I once tested a program that had a hot key combination for every menu option. You would think that the hot key and the corresponding menu option would go through the same path in the code, right? So while we verified that the right screen came up when you pressed the hot key, we didn’t perform all our tests two times, once using the menu and once using the hot keys. Late in the release cycle we discovered that in at least one instance our assumptions were incorrect. The code behaved differently depending on how you navigated to a given screen. The culprit, it turned out, was copied and pasted code.

That experience taught me to vary things, like navigation or data, even when I’m not explicitly testing them. And that’s the kind of variation I want a test tool to support.
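A tiny Python sketch of the idea, using the postal code example from above. The field names and the pool of incidental values are invented for illustration. The test states only the essential data, and the helper fills in everything else with a different choice on each run.

```python
import random

# Hypothetical pool of values for fields the test does not care about.
INCIDENTAL_POOL = {
    "name": ["A. Tester", "B. Example", "C. Sample"],
    "street": ["1 Main St", "22 Elm Ave", "333 Oak Blvd"],
    "city": ["Toronto", "Vancouver", "Halifax"],
}


def build_address(essential, rng):
    """Build a full address: essential fields fixed, incidental ones varied."""
    record = {field: rng.choice(values) for field, values in INCIDENTAL_POOL.items()}
    record.update(essential)  # what the test specifies always wins
    return record


# The test states only what it cares about: the Canadian postal code.
address = build_address({"postal_code": "M5H 3N5"}, random.Random())
```

The same approach could cover incidental actions as well as data, for example choosing at random between the menu and the hot key to reach a screen.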

Integrate with Source Control

The tests and all associated artifacts should be in plain text, not a binary format. This is necessary to support integration with source control, including merging and diffing.

Thus, while graphical representations are important, the emerging vision I see is a test tool that, at its heart, is driven by code even when showing the code in a graphical format. And that code should be in the same code base as the code under test, labeled, branched, and versioned along with the rest of the code.

Execute Tests from Any View at Any Time

One of the incredible powers of FITnesse is that a business stakeholder can execute the automated tests from any computer with access to the FITnesse wiki with just the click of a mouse.

This capability means more than just being able to execute tests at any time. It also means you don’t have to have multiple artifacts to represent the same thing.

Consider FIT. Executing tests defined in tables at any time is cool, but it’s not particularly new. People defined tests in tables for years before FIT came along. So tables weren’t new. Even executable tables weren’t all that new; data driven testing using data from files has been around for quite a while. But FIT was innovative because it allowed you to specify and execute your tests using the same artifact. You didn’t specify your tests, then automate them, and end up with two representations of the same test to maintain. Given an appropriate fixture, your test specification is executable from the minute you write it. In fact, FIT and FITnesse support exploratory automation: automate a little, run the test, get ideas for new tests, specify those, run some more, etc.

That specify-and-execute-tests-anytime-anywhere capability gives stakeholders instant feedback about the state of the project relative to their expectations any time they want it.

Imagine the power of being able to draw a diagram with a set of expectations, then say “Now, tell me which of these expectations are and are not met by the current version of the code.” That’s what I want the next generation of test tools to do for me.

Support Dynamic Assembly of Test Suites

I want to be able to dynamically choose a set of tests to execute based on some criteria. That set becomes a test suite.

Sometimes the criterion is an attribute of the test: it exercises some aspect of the system like database access or error handling. Thus I want to be able to choose a set of tests by specifying an attribute, like “actions include save” or “checks include error.”

Sometimes the criterion is that the test is part of a set I’ve designated as “smoke tests” or “stress tests” or “happy path tests” or somesuch. Thus I want to be able to tag tests, and have those tags become an attribute of the tests: “tags include smoke_test” or “tags include happy_path.”

The end result of this should be that I can set up a continuous integration task to run all tests that should be included in a suite. For example, I might have a task “execute_functional_tests where tags include smoke_test” or “execute_functional_tests where last_result was fail.”
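Criteria like these could be sketched in Python as simple predicates over test metadata. The FunctionalTest record, its fields, and the helper names below are all hypothetical, chosen only to mirror the phrases used above.

```python
from dataclasses import dataclass, field


@dataclass
class FunctionalTest:
    """Hypothetical metadata the tool might keep for each test."""
    name: str
    actions: set = field(default_factory=set)
    checks: set = field(default_factory=set)
    tags: set = field(default_factory=set)
    last_result: str = "not_run"


def includes(attribute, value):
    """Criterion such as 'tags include smoke_test' or 'actions include save'."""
    return lambda test: value in getattr(test, attribute)


def last_result_was(result):
    """Criterion such as 'last_result was fail'."""
    return lambda test: test.last_result == result


def assemble_suite(tests, *criteria):
    """Return the tests that satisfy every criterion; that set is the suite."""
    return [test for test in tests if all(criterion(test) for criterion in criteria)]
```

A continuous integration task could then run something like assemble_suite(all_tests, includes("tags", "smoke_test")) and execute whatever comes back, with no hand-maintained suite file to fall out of date.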

Save Results

Each time I execute a test, the tool should automatically save a record of the results, including timing. If I execute a suite, the tool should save a record of the suite criteria as well as the tests that were included. I don’t want these results to be versioned automatically, but I want the option of checking the results in.

Support Exploration

Ward’s Process Explorer supports exploration by allowing you to see on-the-fly generated and formatted partial pages rendered by the server in response to an AJAX request. I love that, and I want more things like it.

I want to be able to see what pages or dialog boxes look like within the context of a test by hovering a mouse over that point in a test.

Furthermore, I want to be able to drive the application under test to a good starting point for manual explorations with the click of the mouse. So, for example, if I have a full state model, I’d like to be able to drive the application under test to a given state, then begin manual exploration from there. (Watir and Selenium, by the way, are good for this. Because they drive Web applications through the browser, you can begin manually exploring wherever your automated tests leave off.)

Support True End-to-End Tests

Although you can write your FIT and FITnesse fixtures to test end-to-end, in practice developers usually write test fixtures to exercise the logic of the application, bypassing the UI, and using an in-memory database.

And yet, true end-to-end testing reveals information that logic testing alone cannot reveal. It can tell you when the UI is not wired up correctly to the logic, it can tell you when the UI doesn’t provide the expected feedback when a user provides data outside an allowed range, and it can tell you when there’s a problem with the round trip from the UI to the database, then back to the UI again.

I will readily admit that automated end-to-end testing poses numerous challenges related to data and deployment. Testing end-to-end means all the parts of the system are set up and ready for testing, in a known good state, and can be returned to that known good state at any time. Just getting systems ready to test manually can be a challenge. So there will be challenges associated with setting up and cleaning up data and configurations. I don’t expect a tool to do Bippity-Boppity-Boo magic. Rather, I expect the tool to create a structure around deployment that suggests ways to think about the problem to make it easier.

Here’s one example of a way this might work.

Once upon a time I was testing a distributed system using Ruby and Watir. Configuring all the parts of the system for testing was a real pain. However, Ruby DRb provides a way of controlling distributed machines. We used DRb to kick off Ruby setup scripts on the various machines involved in our end-to-end tests.

Our approach took quite a while to set up, in part because we had to think through all the various issues.

What if the functional test tool included small, lightweight, remote cross-platform-compatible listeners you could install on every box that needs to be configured? And what if those listeners understood how to access databases, set up files in file systems, change values in the Windows registry, set UNIX environment variables, etc. You’d provide the necessary information in a centrally controlled configuration file. Better yet, you’d have the option of providing the information graphically in a deployment diagram annotated with configuration details. In short, I want the tool to provide a framework for end-to-end configuration so I only have to provide the details. But that framework has to be extensible since it can’t possibly predict every configuration task.

Connect Tests to Code

Tests should never break because of routine maintenance on the underlying code.

Yet traditional test automation guarantees that tests will break because there’s no connection between the tests and the code. Even refactoring-friendly IDEs don’t automatically update tests unless those tests are considered part of the code in the codebase.

This is also true with traditional, commercial test automation solutions. In fact, test automators using commercial tools typically spend much of their time trying to keep scripts and GUI maps up-to-date with the latest code. What a waste. And even though FIT/FITnesse tests tend to be more maintainable, even they suffer from bit rot if no one maintains the connections between the tables and the fixtures.

I want functional test tools to understand the connection. So if I make a simple change, like changing the name of a method or variable in a test fixture that’s referenced in one or more tests, the test tool should automatically update the name in all affected tests. And if I make a more complex change, like refactoring a UI, the test tool should at the very least flag all the places in the test code that reference the UI.

Support Navigating from Failures to Code

When an automated test fails, the functional test tool should make it easy to fix the problem. To do that, it should make the related code just a click away. So test_foo fails, you click on a “Show Me the Code” link, and *pouf*, your IDE opens to the right line in the right file where the expectation was not met, and with a full stack trace available so you can trace back through to figure out where the problem really occurred.

As a tester working with traditional teams, I became too accustomed to not having the source code available. Having now worked with several XP teams, that limitation seems silly. After all, if I notice a simple misspelling in a dialog box, should we really spend all the time necessary for me to file a formal bug report, have the bug triage committee evaluate, prioritize, and assign the bug, have the developer receive the bug and have to reproduce it and track it down, then mark it fixed, and then have me laboriously verify the bug fix? That’s hours of work to fix a typo. Doesn’t it make more sense to give me a link that says “Show Me the Code” so I can fix a typo?

Furthermore, as a developer, when I get a bug report, I have to figure out how to reproduce it, and then have to figure out where in the code the problem is happening. Wouldn’t it be easier if I could execute an automated failing test, see a stack trace, and go right to the source of the problem?

Support Drill Down and Click Through to Explore Relationships

Just like Ward’s Process Explorer allows me to dynamically navigate views to see the connections between actions, use cases, and results, I want to be able to navigate all these connections on-the-fly. That means I can see at a glance how many tests involve a given action or condition, how many tests check for a given expectation, the pass/fail history of tests related to a given action or condition, pass/fail counts by feature, etc. All dynamically generated on the fly. All clickable. I want to be able to drill down, click through, and report on anything related to the tests and the execution results.

Support Iterating

Imagine the horror if a test tool only allowed you to create tests after you’d set everything up. The process would look something like…

1. Install the test tool and listeners on all machines in your test environment.
2. Create all your Test Actions in fixtures.
3. Set up all your Test Data.
4. Create all your Configuration Specifications.
5…N. Perform a bunch of other setup steps.
N + 1. Create your first test.

In other words, it would take like 4 months before you could write a test.

That’s insane and unacceptable. The test tool has to support iterative development. It has to support bouncing back and forth between specifying tests and specifying actions those tests execute. It has to support building test data and configurations gradually. It can’t barf if some piece of the puzzle is missing: it should do what it can and provide an accurate, precise, and actionable warning or error when a missing piece prevents it from executing a test.

Let’s imagine, for example, that a test refers to some action, like “login” that hasn’t been coded yet. The test execution engine should report “Cannot find action ‘login.’” And the test should be marked as “not automated,” not “fail.” The test didn’t fail, after all. It just didn’t execute.
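That distinction between “not automated” and “fail” could be sketched in Python roughly as follows. The step format, the actions dictionary, and the status strings are assumptions made for this example. The runner looks for missing actions before executing anything, so a test with a missing piece is reported precisely rather than run halfway.

```python
def run_test(steps, actions):
    """Run a test given as (action_name, args) steps.

    actions maps action names to callables. Returns (status, message), where
    status is "pass", "fail", or "not automated".
    """
    # Check for missing pieces first so the test is never left half-run.
    missing = [name for name, _ in steps if name not in actions]
    if missing:
        return "not automated", f"Cannot find action '{missing[0]}'"

    for name, args in steps:
        try:
            actions[name](*args)
        except AssertionError as error:
            # An unmet expectation is a genuine failure.
            return "fail", f"{name}: {error}"
    return "pass", ""
```

With this shape, a test that refers to “login” before anyone has coded it shows up in the results as work still to be done, not as a red failure that someone has to investigate.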

Be Transparent

I want all these mechanisms to be as lightweight as possible. Just as xUnit provides a simple way to specify Setup and Teardown tasks, I want the functional test tool I envision to provide a place to put things, but not have heavyweight, ponderous, cumbersome wizards and screens that get in my way more than assist.

The bottom line is that I want my functional testing tools to support testing in a way that is so transparent, so natural, I don’t even notice I’m using a tool most of the time.

Good IDEs do this. They feel a little awkward at first, as you get used to their quirks, their shortcut keys, and how they present views on the code base. But watch an IntelliJ or Eclipse wizard for a few minutes and you’ll notice something: they don’t have to waste time bending the IDE to their will. They don’t have to waste mental processing time figuring out how to do what they want to do. Once they’ve learned the IDE, configured it to their liking, and become comfortable with the shortcut keys, the IDE just gets out of their way. The vast majority of their mental effort goes into the code, not into fighting their development environment.

I want test tools to do that too.

I’m Not Claiming This Wish List is Easy to Achieve

I’m convinced it’s technically feasible to create tools that will do everything I’ve listed here and more. I imagine the result will be a combination of an IDE plugin, some web-based editors that would allow anyone with server access to create and edit tests, and independently executable programs, like distributed listeners to assist with setup. Perhaps each part could be a separate open source project and the complete solution would involve deploying the full collection, much like an XP team that uses Eclipse+JUnit+JWebUnit+Ant+etc. Or perhaps the result will be something else entirely that I haven’t yet imagined.

But I’m not saying that creating such a tool will be easy. Nor am I saying that if we had such a tool, widespread adoption would happen overnight.

I’m just saying that this is the direction I think we’re headed: graphical, connected, transparent, and integrated. Integrated in multiple senses of the word: tests integrated with code, and the testing process integrated with the development process.

Testing has been isolated from development for too long. But testing cannot afford that isolation. It does not result in better testing, it results in inefficient testing. It results in spinning wheels. It results in frustration. It results in an enormous waste of time.

And I think that any step we can take toward making today’s tools more graphical, more connected, and more integrated brings us closer to the future.

Having written all this, I have some ideas about implementation…but those ideas will wait for another time. And, at this point, they’re just ideas. For that matter, I think I’ll try out my implementation ideas before I get myself too wound up Thinking Big Thoughts and not actually doing anything about them.

In the meantime, I’d like to know your thoughts…what else would you want in Functional Testing Tools: The Next Generation (FTT:TNG)?

6 Responses to Functional Test Tools: the Next Generation (part 2 of 2)

  1. Marc February 19, 2007 at 10:18 pm #

    I love this. I love the idea that testing tools and development is just as sophisticated as product development.

    If you decide to begin this implementation, *please* let me know so that I can help. I’d love to help implement this first rev of the next generation of functional testing tools :)

  2. Zach Fisher February 20, 2007 at 11:06 pm #

    Excellent post. I can’t count how many times I thought, “I wish such-and-such existed” and then *poof* it materialized out of thin Google. I have a feeling this implementation is not far away.

    Have you any experience with the new Microsoft Presentation Foundation? I’ve a feeling that something like this may be able to bend to your whims.

    http://en.wikipedia.org/wiki/Windows_Presentation_Foundation

  3. Eric Hacker March 2, 2007 at 9:56 am #

    This is an excellent post. I am approaching testing from a completely different perspective, that of network security functional testing. I am working on a tool that actually enables some of these capabilities to be implemented because there was nothing to meet my needs. It should be available at http://www.tcli.org within a week or two.

    Now, I’m not saying that my tool can do all of this, or really any of it, but it tries to do some of it. Even if I got it all wrong, a bad example is better than none. :)

    For example, the tool sends requests and responses between the agents. Responses use HTTP status codes, so one can readily differentiate between a 404 Not Found and 403 Forbidden. Ideally, a set of similar response codes could be developed for functional testing similar to how the SIP protocol has its own status codes that are compatible but not exactly the same as HTTP’s.

    TCLI also supports entering requests through a command line interface interactively, or through a test script and an RPC like request/response capability.

    It is modular, of course. I do security testing and need to do bad things sometimes, so it is written in Perl which is a good glue language and lets me misbehave. Thus it uses Perl’s TAP Test::More testing framework on the back end.

    It is supposed to be transport protocol pluggable as well, but currently it only supports Jabber/XMPP.

    Again, it’s probably not the foundation for this wonderful system that you are looking for, but it might be a useful prototype of some of these ideas.

    Now back to my documentation, so that TCLI is somewhat understandable….

    Peace,

  4. Patrick Lightbody April 3, 2007 at 8:59 am #

    I just posted a follow up to this, specifically talking about continuous integration and the demands on infrastructure:

    http://blogs.opensymphony.com/plightbo/2007/04/next_gen_testing_tools_infrast.html

  5. Ben Simo April 20, 2007 at 12:06 am #

    Great list Elizabeth.

    I’ve made some progress towards some of the items with my Model-Based Test Engine (MBTE).

    I’ve got test sets, saving of results, automated exploration (although not as a setup for manual exploration), dynamic assembly based on test sets and oracle severity. The MBTE has graphical representation generation, but no way to dynamically go back and forth between the graphical and textual.

    Simplifying the generation of automated tests and linking of results back to the code under test are great ideas where there is significant potential for improvement.

    I’ll have to remember to come back to this list in a few years to see how automation has progressed.

    Thanks for some new ideas for my next MBTE implementation.

    Cheers,

    Ben Simo
    http://qualityfrog.com

  6. Jeff Brown September 30, 2007 at 12:34 am #

    I’m working to address some of these concerns with MbUnit Gallio. I don’t know about the IDE end of things but I’ve been giving a lot of thought to representation, reporting, management, scheduled execution and distribution of tests.

    A delightfully hard problem…