Archive for January, 2007

Creating Methods On the Fly…And Bugs in Google Phonebook

January 27, 2007

I wrote the following Watir script as a demonstration of how to generate new methods on the fly in Ruby.

I wrote it because, at the time of this writing, there’s a bug in Google Phonebook: if you do a search that generates pages of results (e.g. “rphonebook: j smith, ny”) and then quickly click to some of the later pages, you will see:

Google Server Error

I wanted to write a script that would randomly generate search strings – some that would yield many results, like “j smith, ca” and others that would yield few if any, like “z glinkiewicz, ak”.

I wanted the script to be able to easily run an arbitrary user-defined number of iterations…and I wanted each iteration to have its own assertion, so that if one failed the rest would still run. In Ruby’s Test::Unit (which Watir gets its assertions from) a failed assertion ends the test method it is in, so each assertion needs its own method…and that led me to generating methods on the fly.

My inspirations to play with semi-random automated tests were Chris McMahon and Paul Carvalho, and I got help with my syntax through a speedy answer to my question from Bret Pettichord on wtr-general.

Here’s the script I wrote. Feel free to question me about why I did what I did or to propose improvements (I consider myself a beginning automator). In the meantime, without further ado…here’s the code:

#   This script creates any number of randomized Google phonebook searches,
#   then quickly cycles through each page of results, looking for server errors.
#   Written to reproduce and explore a bug that I found in the Google phonebook,
#   and to show a watir script of mine that's completely non-proprietary and can
#   be run against publicly available software.

#$LOAD_PATH.unshift File.join(File.dirname(__FILE__), '..') if $0 == __FILE__
require 'test/unit'
require 'watir'

class TC_GooglePhoneBook < Test::Unit::TestCase
  include Watir
  $count = 15   # set number of iterations here

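  def setup
    # Assumed reconstruction (this part of the listing was lost):
    # start IE and open the Google home page before each test
    $ie = IE.new
    $ie.goto("http://www.google.com")
  end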

  # data arrays
  first_initial = ('a'..'z').to_a
  last_name = ["allen","brown","glinkiewicz","johnson","jones","mason","ross"]
  state = ["ak","az","ca","fl","ma","mi","mt","nv","ny","wa"]

  $count.times do |count|

    fi = first_initial[rand(first_initial.length)]
    ln = last_name[rand(last_name.length)]
    st = state[rand(state.length)]
    method_name = :"test_#{count}_#{fi}_#{ln}_#{st}"  #dynamically create test method
    define_method method_name do
      search_string = fi +" "+ ln +" "+ st
      $ie.form( :name, "f").text_field( :name, "q").set("rphonebook: #{search_string}")
      $ie.button( :name, "btnG").click
      i = 1
      # keep clicking 'Next' until there are no more result pages
      while $ie.link( :text, 'Next').exists? do
        $ie.link( :text, 'Next').click
        i = i + 1
        assert_no_match( /Server Error/, $ie.text, "Page #{i} contains a server error." )
      end #while
    end #method
  end #N.times do count

  def teardown
    # Assumed reconstruction: close the browser after each test
    $ie.close
  end

end #TC_GooglePhoneBook
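To see the method-generation trick on its own, here is a minimal sketch with the browser and Test::Unit stripped away. The class name `GeneratedChecks` and the fixed `SEARCHES` list are hypothetical stand-ins for illustration; only `define_method` and the naming pattern come from the script above.

```ruby
# Hypothetical stand-in for the test case class: no browser, no Test::Unit.
class GeneratedChecks
  # Fixed list instead of random picks, so the result is repeatable.
  SEARCHES = ["j smith ca", "z glinkiewicz ak", "a brown ny"]

  SEARCHES.each_with_index do |search, index|
    # Build a unique method name, e.g. :test_0_j_smith_ca
    method_name = :"test_#{index}_#{search.tr(' ', '_')}"

    # The block is a closure, so each generated method
    # remembers the search string from its own iteration.
    define_method(method_name) do
      "rphonebook: #{search}"
    end
  end
end

checks = GeneratedChecks.new
GeneratedChecks.instance_methods(false).sort.each do |name|
  puts "#{name} -> #{checks.send(name)}"
end
```

Because the loop runs while the class body is being evaluated, the methods already exist by the time a test runner looks for names starting with `test_`, which is what lets each iteration pass or fail independently.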

Incentives for Developers and Testers

January 27, 2007

There’s a new Google Testing Blog. In the comments for their first post, Michael asks innocently:

I have a simple question as a former-programmer, now business guy, between dev and QA.

Why not put pay performance targets on both sides as incentives per testing release?

In other words, each QA person gets $10 for each bug they find (up to 10, or whatever). Each development person gets $10 for the number of bugs not found under a certain target (10, or whatever).

Seems like an easy way to get people more motivated about the whole testing process.

Sounds almost common-sensical…but it’s a disastrous idea.


Be Careful What You Wish For
Incentives often work in perverse ways. Tell Tony Tester that you’ll pay him for every bug he logs, and he’ll clog your bug tracking system with niggling items, and go to pains to turn one bug into ten bug reports. Tell Connie Coder that you’ll pay her for not having bugs logged against her, and she’ll spend half her day arguing why these three issues are features, not bugs, and these four are really in Carry’s code, not hers, and…you get the idea.

Do I think that everyone is really this petty? No, but you are incentivizing pettiness here, and incentives can have a frightening power. Which leads to the next problem with this suggestion:

Replacing Intrinsic Motivation with Extrinsic Motivation Harms Your Team
This may not be true for everyone – I’ve met a few folks who swear that they work best when there is a dollar goal they are shooting for. But a good deal of research shows that:

  • Intrinsic motivation is more valuable to an organization than extrinsic motivation, and that
  • Extrinsic motivation tends to erode intrinsic motivation.

My experience as a worker, team member, and manager has been that the best folks work for some combination of:

  • Personal pride in a job well done,
  • Commitment to a vision,
  • Commitment to the team (or to someone on the team),
  • The enjoyment of the task itself, and
  • A desire to learn and grow.

That’s not to say that salary is irrelevant…but I think its relevance is often misunderstood. If they believe in the project, like the team, and understand that there’s just not much money in the company right now, many folks will happily give their all, knowing full well that they could make more money somewhere else. If you don’t believe me, look at how hard school teachers and non-profit workers tend to work.

On the other hand, if that same person finds out that the person next to them is making 20% more for roughly the same job, their intrinsic motivation may be completely destroyed. Why? It’s not the amount they are being paid per se, because that hasn’t changed. It is that the salary differential is a sign of disrespect for their work, of unfairness in the workplace, or of dishonesty in management.

I would also add that I suspect salary is a strong factor in employee retention, but I doubt that it plays much of a role in employee motivation…other than the negative impact of undercutting intrinsic motivation by making someone feel undervalued or taken advantage of.

Bug Databases Are Tools to Understand Projects, Not to Evaluate Workers
Lastly, I believe that bug tracking databases are at their best when they help to provide insight into the state of a software development project. I believe that if Bug DBs are used to evaluate employee performance, they will necessarily begin to be gamed…and before long the information that they should provide will be obscured and distorted by folks trying to protect their jobs, to save face with management, or to maximize that next bonus.

And one of the last incentives I want to create is for someone to manipulate the data I use to get a handle on how the project is doing.

Another Testing Blog

January 18, 2007

I’ve read and been inspired by many great testing blogs for a while now, but have held off so far on contributing myself. Why start now?

Recently I’ve:

  • Had a very fun job search,
  • Chosen a job as the first test engineer at an Agile startup, and
  • Just returned from a very stimulating AWTA,

…All of which have finally gotten the thoughts bouncing about in my head to boil over into this blog.

I’m looking forward to the regular practice of writing, and especially to hearing your thoughts, comments, and suggestions.


Thoughts About All Pairs Testing

January 18, 2007

Danny Faught started a thread on All Pairs testing on the Software Testing list. In it, Jared Quinert asked:

In what kind of situations have people found all-pairs useful? When can we be comfortable that the theory of error that underlies all-pairs is likely to hold true? I find that it’s not that often for me. Am I just over-exercising my tester sense of doubt?

Here’s an example of how I applied All Pairs at a previous company. This was a C++ app that ran on a handful of OSs and DBs. It’s a fairly small and straightforward matrix, but it illustrates some of my thinking about All Pairs testing. The matrix below is NOT what we really did, but from memory I think it’s not far off:

Sample All Pairs Matrix

A few things to note:

  1. Tools to randomly generate an All Pairs matrix are A Good Thing, but I would generally use the generated matrix as a starting point, modifying it as appropriate. In this case, I had a small enough set of options that I had no reason to auto-generate it. Instead, it made sense to think about my context (what combinations do more of our customers use? What do our highest paying customers use?) rather than fill it randomly.
  2. Many of the combos we didn’t test at all were unsupported or impossible combinations (e.g. Linux+MSSQLServer), but
  3. One glaring exception was Solaris 10 on Oracle — a platform we supported but never tested on.

Not testing on one of my supported platforms? Why would I do such a thing? And then go ahead and admit it in public?

For a long time we were behind the times on Solaris, supporting older versions but not the latest. There wasn’t much clamor for it – not enough to be losing sales – but a few adventurous customers asked about upgrading to it. In reading Sun’s docs, I found the strongest statement I’ve ever seen about backwards compatibility: a guarantee that if any software that ran on 9 didn’t work on 10, Sun would release a patch to fix the incompatibility in 10.

At first we just told a few eager customers, “We don’t support 10 yet, but here’s a link to what Sun says. If you want to experiment, feel free and let us know what happens.” Several of them chose to try it. Six months later we still didn’t have a Solaris 10 server in house, but we had a good deal of feedback from those customers that all was well from their perspective. We did some further research, which turned up more encouraging reports about Solaris 10’s backwards compatibility, and finally we decided we were confident enough to declare public support.

Now, I’m sure that there are folks who would say that that was a foolish risk to take. Looking back I’m still happy with the decision…for our particular product, in our particular market, with our particular time and budget limitations.

It’s important to remember that deciding what (not) to test is always risk management. All Pairs is often a useful technique, but of course it doesn’t guarantee that an application will work on all the other platforms. In contrast to the application I’m describing, a friend of mine tests a tool that works at a very low protocol level on a staggering list of OSs, and he’s learned from experience that he really has to run every test on every platform. In my context All Pairs (and Sun’s track record of backward compatibility) mitigated the risk sufficiently that I chose to put my limited time into other tests – and I believe we caught more significant bugs as a result.
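When a matrix is built by hand like this, it can help to check which pairs it actually leaves uncovered, so that any gap is a deliberate choice and not an accident. The sketch below is one rough way to do that. The parameter values, the `locale` parameter, and the rows in `MATRIX` are all hypothetical (they are not the real matrix from that project); only the Linux+MSSQLServer exclusion comes from the notes above.

```ruby
# Hypothetical parameters; only some of these values appear in the post.
PARAMETERS = {
  os:     ["Windows", "Linux", "Solaris 9"],
  db:     ["Oracle", "MSSQLServer"],
  locale: ["en", "ja"]            # invented third parameter, for illustration
}

# Combinations that are unsupported or impossible, so never required.
UNSUPPORTED = [{ os: "Linux", db: "MSSQLServer" }]

# A hand-picked test matrix (also hypothetical).
MATRIX = [
  { os: "Windows",   db: "Oracle",      locale: "en" },
  { os: "Windows",   db: "MSSQLServer", locale: "ja" },
  { os: "Linux",     db: "Oracle",      locale: "ja" },
  { os: "Solaris 9", db: "Oracle",      locale: "en" }
]

# Every pair of values, taken across two different parameters,
# minus the pairs listed as unsupported.
def required_pairs(parameters, unsupported)
  all_pairs = parameters.keys.combination(2).flat_map do |a, b|
    parameters[a].product(parameters[b]).map { |x, y| { a => x, b => y } }
  end
  all_pairs - unsupported
end

# Required pairs that no row of the matrix exercises.
def uncovered_pairs(parameters, unsupported, matrix)
  required_pairs(parameters, unsupported).reject do |pair|
    matrix.any? { |row| pair.all? { |key, value| row[key] == value } }
  end
end

uncovered_pairs(PARAMETERS, UNSUPPORTED, MATRIX).each do |pair|
  puts "not covered: #{pair.inspect}"
end
```

Each pair it reports is then a risk decision: add a row, mark the pair as unsupported, or accept the gap knowingly, as we did with Solaris 10.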