It sounds like Joe Stump is having a bad time of it right now.
Soon after, new players found serious problems that prevented them from playing the game. In response, the company submitted a new binary to Apple in July. As of this writing, the current version of Chess Wars is 1.1.
The trouble started with patch release #2. Apparently, even six weeks after Joe’s company submitted the new binary (release number 3 for those who are counting), Apple still hasn’t approved it.
Eventually Joe got so fed up with waiting, and with seeing an average rating of two-and-a-half out of five stars, that he wrote a vitriolic blog post [WARNING: LANGUAGE NOT SAFE FOR WORK (or for anyone with delicate sensibilities)] blaming Apple for his woes.
That garnered the attention of Business Insider, which then published an article about the whole mess.
Predictably, reactions in the comments called out Joe Stump for releasing crappy software.
I should mention here that I don’t know Joe. I don’t know anything about how he develops software. I think that there’s some delightful irony in the name of his company: Crash Corp. But I doubt he actually intended to release software that crashes.
Anyway, Joe submitted a comment to the Business Insider article defending his company’s development practices:
We have about 50 beta testers and exhaustively test the application before pushing the binary. In addition to that the application has around 200 unit tests. The two problems were edge cases that effect [sic] only users who had nobody who were friends with the application installed.
I’m having a great deal of trouble with this defense.
Problem #1: Dismissing the Problems as “Edge Cases”
The problems “only” occur when users do not have any Facebook Friends with the application. But that’s not an aberrant corner case. This is a new application. As of the first release, no one has it yet. That means any given new user has a high probability of being the first user within a circle of friends. So this is the norm for the target audience.
Joe seems to think that it’s perfectly understandable that they didn’t find the bugs during development. But just because you didn’t think of a condition doesn’t make it an “edge case.” It might well mean that you didn’t think hard enough.
Problem #2: Thinking that “50 Beta Testers” and “200 Unit Tests” Constitute Exhaustive Testing
Having beta testers and unit tests is a good and groovy thing. But it’s not sufficient, as this story shows. What appears to be missing is any kind of rigorous end-to-end testing.
Given an understanding of the application under development, a skilled tester would probably have identified “Number of Friends with Chess Wars Installed” as an interesting thing to vary during testing.
And since it’s a thing we can count, it’s natural to apply the 0-1-Many heuristic (as described on the Test Heuristics Cheat Sheet). So we end up testing 0-friends-with-app, 1-friend-with-app, and Many-friends-with-app.
So even the most cursory Exploratory Testing by someone with testing skill would have been likely to reveal the problem.
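To make the 0-1-Many idea concrete, here is a minimal sketch of what such a check might look like in code. I have no knowledge of how Chess Wars is actually built, so everything here is hypothetical: `opponent_choices` is an invented stand-in for whatever logic decides who a new player can challenge, and the friend names are made up. The point is only that the counted variable gets exercised at zero, one, and many.

```python
# Hypothetical stand-in for app logic. This is NOT Chess Wars code;
# the function name and behavior are assumptions for illustration.
def opponent_choices(friends_with_app):
    """Return the opponents to offer a new player."""
    if not friends_with_app:
        # The zero-friends case: fall back to something playable
        # instead of presenting an empty (or crashing) screen.
        return ["Random opponent"]
    return sorted(friends_with_app)


def zero_one_many_cases():
    """Return (label, input) pairs covering 0, 1, and many friends."""
    return [
        ("zero", []),
        ("one", ["alice"]),
        ("many", ["carol", "alice", "bob"]),
    ]


def run_checks():
    """Run the same expectation against each of the three cases."""
    results = {}
    for label, friends in zero_one_many_cases():
        choices = opponent_choices(friends)
        # A new player must always have at least one way to start a game.
        assert len(choices) >= 1, f"{label}: no way to start a game"
        results[label] = choices
    return results


if __name__ == "__main__":
    for label, choices in run_checks().items():
        print(label, choices)
```

The “zero” case is the one that matters most for a brand-new app, and it is also the one that is easiest to forget when every developer and beta tester already has friends playing.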
I’m not suggesting that Joe’s company needed to hire a tester. I am saying that someone on the implementation team should have taken a step back from the guts of the code long enough to think about how to test it. Having failed to do that, they experienced sufficiently severe quality problems to warrant not one but two patch releases.
Blaming Apple for being slow to release the second update feels to me like a cheap way of sidestepping responsibility for figuring out how to make software that works as advertised.
In short, Joe’s defense doesn’t hold water.
It’s not that I think Apple is justified in holding up the release. I have no idea what Apple’s side of the story is.
But what I really wanted to hear from Joe, as a highly visible representative of his company, is something less like “Apple sucks” and something much more like “Dang. We screwed up. Here’s what we learned…”
And I’d really like to think that maybe, just maybe, Joe’s company has learned something about testing and risk, and about assuming that, just because 50 people haphazardly pound on your app for a while, it’s been “exhaustively” tested.