A Place to Put Things

At the AA-FTT workshop last October, I did this lightning talk titled “A Place to Put Things.”

In it, I propose standardizing on places to put different kinds of information associated with automated functional tests.

It seems to me that one of the key success factors for the xUnit family of unit testing frameworks is that they gave us just 5 places to put code related to unit tests: setup, test, teardown, suite setup, suite teardown. That simple organization has a powerful focusing effect, enabling (or, perhaps, forcing) developers to narrow their attention down to just the code needed to create one little itty bitty unit test at a time.

Functional testing frameworks have no such common, standardized structure.

FIT has given us something close with the notion of a test in natural language in a table and fixture code to hook that test to the software under test. If we extend that model a little to include the idea that we may well be testing against an external interface, like a web interface, where a driver, like Watir or SeleniumRC, would be handy, we end up with 3 big categories of things:

  • Tests: scenarios describing the actions and expectations, expressed in natural language with keywords
  • Fixtures: code that connects the keywords in the test to actions in the software under test
  • Drivers: libraries like Watir, SeleniumRC, Perl’s Win32::GUI, etc. that know how to address external interfaces such as Web interfaces, thick client GUIs, command line interfaces, soap/XML calls, etc.

That’s the direction I think test automation tools in general are headed, and it’s an important evolution with profound implications. However, I’m still figuring out how to explain how this structure differs from what traditional tools offer, and the significance of those differences.

At the very end of the “A Place to Put Things” video, there’s a little exchange between two of the participants in the workshop. A woman’s voice says, “She just pulled that together in like a minute!” That’s Jennitta Andrea speaking. She was the co-organizer of the AA-FTT workshop.

In response, Ward Cunningham says, “Oh, no. I think she’s been pulling that together for the last 3 years.”

Ward’s right. And I’m still pulling together my ideas and figuring out how to articulate them. So while you don’t see much evidence of it on my blog, I am actually spending a fair amount of time writing, and rewriting. More soon. I hope.

So You're Trying to Automate Tests for a Legacy Web Application…

…and it isn’t cooperating?

I sympathize.

I recently fought my way through the process of automating a test to reproduce a bug on a legacy(*) web application that had no IDs on any of the elements that I wanted to address. And I thought it might be helpful to capture some of my lessons learned here in case they help someone else. Also, I’m putting this here so I remember what I did.

Lesson 1: SeleniumRC Rocks!

I’ve been extolling the virtues of SeleniumRC for quite a while. This project gave me the perfect opportunity to refresh my Selenium skills. And I’m delighted to report that SeleniumRC is even better than I remember it being. First, I can write my tests in my programming language of choice (Ruby). Second, it has a wide variety of ways to locate these pesky non-ID’d elements. Third, every time I ran up against a road block, I discovered that the smart folks who make Selenium had already anticipated the problem and found a solution.

Lesson 2: Selenium Server Flags can Solve Common Execution Problems

In my particular case, the app I was testing didn’t play nicely in IFrames. This is a problem: by default Selenium runs the web app in an IFrame in the same browser window where it displays its own status. Fortunately, it turns out that there is a -multiWindow flag to solve exactly this problem. I solved the IFrame problem by running the Selenium Server like so:

java -jar selenium-server.jar -multiWindow

There are a variety of other Selenium Server flags that address other common problems. See the Server Command Line Options documentation for a full list.

Lesson 3: ‘Permission Denied’ Errors Probably Mean the App Violates the ‘Same Origin Policy’

Once I’d gotten to the point where I could launch the app, I started encountering very puzzling Permission Denied errors. I vaguely recalled that such errors probably meant there was some problem with the domain names changing and browser security and cross-site scripting something-or-other.

So I checked the domains. Sure enough, the home page was at “www.example.com.” From there, when you log in, it goes to “app.example.com.” Bingo! The domain was changing in the middle of my test. I experimented a little and discovered there was no way around it: the app was going to redirect to a different domain no matter what.

It turns out I’m not the first person to have this problem. Fortunately, Selenium has a strategy for addressing the issue: experimental browsers. I tried the chrome browser for testing on FireFox and it worked perfectly.

Lesson 4: Firefox Rocks!

At this point I could launch the app and log in, but now I had another problem. After the login page, all the things I needed to click, check, or otherwise manipulate were buried deep in convoluted HTML. I realized that figuring out how to address these things was going to be non-trivial.

The most basic strategy for discovering the locator for an element is to view the HTML source. You can view the source for the whole page, but Firefox has a great feature that allows you to see just the source for just a selection. To use it: highlight a selection on the page, then right-click. One of the available menu options is View Selection Source. Choose it, and you get a window with just the relevant HTML.

However, if you’re dealing with something complex, viewing the source isn’t enough. You really need to look at the Document Object Model (DOM). The best way I know to do that is with the DOM Viewer included with the Web Developer plugin by Chris Pederick.

Web Developer also includes a feature that lets you see all the attributes for a given element. From the Information menu, choose “Display Element Information.” Now you can get the attributes for any element just by clicking on it. I love that feature.

Finally, the XPath Checker plugin by Brian Slesinsky is a very helpful tool for figuring out how to address those pesky non-ID’d elements. More on xpath in the next section.

Lesson 5: xpath Is Now My Good Friend

One of the hallmarks of legacy web apps is the annoying lack of IDs on important elements. Of course, web apps aren’t the first place where a lack of IDs is problematical. I recall struggling with Windows apps that lacked field IDs back in the 1990s.

The good news is that this is a much more tractable problem in web apps than in Windows apps. Xpath to the rescue!

Let’s take just one example. I needed to verify the text associated with a particular image. The image served as a kind of custom bullet in a bulleted list, so there were several identical ones. The text itself did not appear in the DOM immediately next to the graphic. Rather, its parent was a peer to the parent element of the image. (Got that straight? Yeah, me either. Seriously, this one took me a while.) In brief, the HTML around this thing looked kinda like this:

        <img src='/path/to/images/check.gif'>
        item 1
        <img src='/path/to/images/check.gif'>
        item 2

Mind you, the HTML didn’t look that clean. There was a lot of other random stuff in there, and every tag had a gazillion attributes, and there were hard-coded styles everywhere. But I digress. And I’m probably whining. I’ll stop that now.

So it turns out the only way I could grab the text associated with the second bullet in the list was with the following Selenium command:


Let’s all say it as a group: “EWWWWW!”

But let’s also appreciate that doing such a thing is actually possible. Selenium has a wide range of locator styles, and even allows you to add your own locator strategies. (I haven’t needed to do that yet, so I’m not quite sure how to use the feature, but I noticed from the documentation that it’s there.)

(For more on using xpaths with Selenium, I found the Help with XPath article on the openqa.org site useful. It’s hard not to like an article with headings like “How the $%^@$ do I locate an element?”)

Mind you, the world might be a better place if we couldn’t write such code. Every time I figure out how to automate tests against an untestable application, I feel a twinge of guilt. By automating tests against an untestable interface, I become an enabler of more untestable interfaces. For the sake of improved collaboration and more testable applications, perhaps it’s better if those of us who automate tests avoid resorting to using our xpath superpowers except in the service of wrapping legacy apps with tests so they can be refactored for testability.

But once again, I digress.

The point I really want to make is that SeleniumRC gives us the power to automate tests even against icky hard-to-test legacy apps, and to do it with real programming languages (pick your favorite: C#, VB.Net, Perl, PHP, Ruby, Java). And that means we can write maintainable automated tests using good programming practices. And *that* means we can automate regression tests for faster feedback. And ultimately, *that* means we can make changes to the legacy app to improve testability and maintainability.

So rock on, SeleniumRC. And huge thanks to everyone who’s ever worked on SeleniumRC or Selenium. Also, huge thanks to the Selenium community as a whole. When I went looking for answers to my questions I found numerous blog posts and forum messages with tips and tricks. This post would not have been possible without such a community that’s so open about sharing knowledge.





* Here I mean “Legacy” as Michael Feathers defines it: code that lacks automated unit tests. Web developers, please take note: if you write good unit tests for your web app, including JSUnit tests for the JavaScript bits, the application will be MUCH more testable and the QA people will stop whining at you so much. return to footnote reference