How to avoid the very dangerous ALWAYS-GREEN test


When a test passes the first time it’s ever run, a developer’s reaction is “Great! Let’s move on!”. Well, this can be a dangerous practice, as I discovered one cold rainy day.

It was a cold rainy day (kind of common in Dublin), and I was happy enough with my test results being all shiny green when I decided to do some exploratory testing. To my surprise I discovered that an element on a web page that had always been there before was gone, departed, vanished!

My first reaction was to ask: where the hell is it? I ran some investigation and saw the cause: no worries, it had been knocked out by the last change, an easy fix. The worst feeling had yet to come: when I went to write a test for that scenario, I saw that there was already an existing one checking for exactly that element’s existence… WHAT? The damn test had passed and was staring at me in its shiny green suit!

When we write automated tests, be it a unit test, an acceptance test or any other type of test, it is extremely important that we make each one FAIL at least ONCE.

In fact, until you make a test FAIL, you will never know if the damn bastard passes because the code under test is correct or because the implementation of the test itself is wrong.

A test that never fails is worse than having no test at all, because it gives false confidence that some code is tested and clean, while it might be completely wrong now, or on any other cold rainy day in Dublin after a refactor or a new push, and we will never know because IT WILL NEVER FAIL.

If you don’t follow what I’m talking about, have a look at this example:

Take a web app, and say I want to verify that a field is visible in the UI at a certain stage.

What I do is build automation that performs a series of actions, at the end of which I verify whether I can see that field or not.

To do this I create a method isFieldVisible() that returns true or false depending on whether the field is visible, so that I can write assertTrue(isFieldVisible(myField));

When this test passes I am only halfway there, because I still need to demonstrate that when the field is not visible, isFieldVisible() does return false; otherwise my test might never fail.

To do this I write a temporary extra step in the automation that hides the field, and then run the same assertion again:

assertTrue(isFieldVisible(myField));

At this point I expect the assertion to fail. If it doesn’t, it means that I just wrote a very dangerous ALWAYS-GREEN test.
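The flow above can be sketched in Python. This is a toy sketch only: the Page class and its method names are invented stand-ins for whatever UI driver the real automation uses.

```python
class Page:
    """Toy stand-in for a web page: tracks which fields are visible."""
    def __init__(self):
        self.visible_fields = {"myField"}

    def hide(self, field):
        self.visible_fields.discard(field)

    def is_field_visible(self, field):
        return field in self.visible_fields


page = Page()

# Original test: the field should be visible.
assert page.is_field_visible("myField")

# Temporary extra step: hide the field, then run the SAME assertion.
page.hide("myField")
try:
    assert page.is_field_visible("myField")
    print("ALWAYS-GREEN: the assertion still passes -- the test is suspect")
except AssertionError:
    print("OK: the test can fail, so a green run actually means something")
```

Once the assertion has been seen to fail, the temporary hiding step is removed again; its only job was to prove the test is capable of going red.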

What if I did write a very dangerous ALWAYS-GREEN test? What do I do now?

I must change the code (the test code, not the code of the app under test) until the test FAILS. When it fails for the first time, while the original test is still green, I can be sure that the test can fail, and will fail in the future after a refactor or any other change that introduces a regression, rainy day or not.
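To see the failure mode this guards against, here is a hypothetical buggy implementation of the visibility check (a classic always-truthy mistake, invented for illustration) next to a fixed one. Both keep the original test green; only the fixed one fails once the field is hidden, which is exactly what the temporary step detects.

```python
def is_field_visible_buggy(field, visible_fields):
    # BUG: this reads as `field or (field in visible_fields)`, and a
    # non-empty field name is always truthy, so this never returns False.
    return field or field in visible_fields

def is_field_visible_fixed(field, visible_fields):
    return field in visible_fields

# The original test is green with BOTH implementations...
assert is_field_visible_buggy("myField", {"myField"})
assert is_field_visible_fixed("myField", {"myField"})

# ...but only the fixed one goes red once the field is hidden
# (removed from the visible set).
print("buggy check after hiding:", bool(is_field_visible_buggy("myField", set())))
print("fixed check after hiding:", bool(is_field_visible_fixed("myField", set())))
```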

At this point you might argue that, rather than simply changing the test to make it fail and then reverting it to the original, we should write the negative test and execute it as part of the automation.

It is an interesting point, and the answer depends on the specific situation. In some cases a negative test can be as important as the original test and is necessary to cover a different path in the code, but this is not always the case, and we will have to make an informed call every time.

Example 1 – When writing a negative test makes sense:

I want to verify that when I hit the “Customer Feedback link” my “Company Search box” can still be seen by the user.

I write a test that clicks the “Customer Feedback link” and then asserts that the “Company Search box” is visible.

To make it fail I add an extra temporary step that hides the search box just before the assertion.

If the original test was green then this test MUST FAIL (otherwise we have written the very dangerous ALWAYS-GREEN test)

At this point I notice that this is a valid scenario and I can write a test for it (if I don’t have it already).

The test clicks the “Hide Search Box link” and asserts that the “Company Search box” is no longer visible.
I have positive and negative scenarios covered. The negative scenario verifies that the “Hide Search Box link” functionality works as expected.
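A minimal Python sketch of both scenarios follows; the page model and its method names are invented for illustration (a real suite would drive the browser instead of a toy flag).

```python
class AppPage:
    """Toy model of the page: one flag for the search box's visibility."""
    def __init__(self):
        self.search_box_visible = True

    def click_customer_feedback_link(self):
        # Navigating to the feedback view should not hide the search box.
        pass

    def click_hide_search_box_link(self):
        self.search_box_visible = False

    def is_search_box_visible(self):
        return self.search_box_visible


# Positive test: the box is still visible after the feedback link.
page = AppPage()
page.click_customer_feedback_link()
assert page.is_search_box_visible()

# Negative test: the "Hide Search Box link" really hides the box.
page = AppPage()
page.click_hide_search_box_link()
assert not page.is_search_box_visible()

print("positive and negative scenarios covered")
```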

Example 2 – When writing a negative test does not make sense:

I want to verify that after performing a search for a company and getting search results back, the value of the latest search is persisted in the search box.

I write a test with four steps: open the search page, type “Jackie Treehorn corp.” into the search box, perform the search and get results back, and finally assert that the search box still contains “Jackie Treehorn corp.”.

To make it fail I remove the second and third steps, so the query is never typed and the search never performed.

If the original test was green then this test MUST FAIL (otherwise we have written the very dangerous ALWAYS-GREEN test)

At this point I look at the tests and realise that there is no point in keeping a negative test like the one above (the one with 2 steps only), because if we don’t actually type “Jackie Treehorn corp.” in the field, it is very unlikely that Jackie Treehorn or Jeff Lebowsky or any other cool character will suddenly and magically appear in the search box, so I decide that a negative test is not required.
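For completeness, the four-step test from this example might look like the following Python sketch (again a toy model with invented names):

```python
class SearchPage:
    """Toy model: a search box whose value persists after a search."""
    def __init__(self):
        self.search_box = ""
        self.results = []

    def type_in_search_box(self, text):
        self.search_box = text

    def search(self):
        # Toy behaviour: any non-empty query "finds" one result.
        if self.search_box:
            self.results = [self.search_box + " - profile page"]


# Step 1: open the search page.
page = SearchPage()
# Steps 2 and 3: type the query and perform the search.
page.type_in_search_box("Jackie Treehorn corp.")
page.search()
assert page.results  # results came back

# Final assertion: the value of the latest search is persisted.
assert page.search_box == "Jackie Treehorn corp."

# With steps 2 and 3 removed, the assertion fails for a trivial reason:
# the box was never filled, so a permanent negative test adds no coverage.
page = SearchPage()
assert page.search_box != "Jackie Treehorn corp."

print("persistence scenario covered")
```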

To recap:

1. When you write a test you MUST be able to make it fail at least once, to demonstrate that its implementation is valid, in particular on a cold rainy day.

2. If, while you make the test fail, you realise that this represents a new valid scenario to be tested, then write the scenario and a separate test with the negative assertion; it might come in useful one hot sunny day.


9 thoughts on “How to avoid the very dangerous ALWAYS-GREEN test”

  1. Great post.
    It makes me think.

    I suggest another way to ensure that the test is red when the feature is broken: write the test before writing the feature itself. In other words, apply BDD/TDD with early testing.

    I agree with you that you “need to demonstrate that when the field is not visible isFieldVisible() does return false”, but I’m not sure that writing a “temporary extra step in the automation that hides the field” is the most economical thing to do.

    I’d rather write the test for isFieldVisible() before the field exists. That “test” will be a declaration of intent: you will be saying “With this red test I’m demonstrating that the software is lacking a feature”. Then you (or anyone else!) can take it as a target to make that test green. That is, to develop that feature.

    I think that by proceeding like this you will force each test to be born red, become green, and eventually become red again whenever the feature breaks.

  2. Arialdo, we actually do write the test first, but the problem still applies; let me try to explain:

    1) I write a test that checks for the visibility of an element on a web page using my brand new method isFieldVisible(), but unfortunately I make a mistake in the method’s implementation and it always returns “true” 😦
    2) I execute the test and it fails as there is no software to test yet
    3) I then implement the code of the application under test and run the test again
    4) The test is green the first time it is run against existing software
    5) Can I say that the test passed?

    NO!

    The test is green the first time I run it, but the reason it is green is that isFieldVisible() always returns true.
    If I don’t hide the field and verify that in that case isFieldVisible() returns “false”, I will never know whether the test passed for real the first time or passed because of a test error.
    If instead I hide the field and run the test that checks that the field is invisible, I will notice that it fails and BINGO, I will understand that the first test never really passed and I can investigate the issue.

    I am a great fan of BDD/TDD and my team practises them with great success, but in this specific case writing the test first does not help 😦

  3. The second part of the citation may also be of some help, and contains a suggestion for solving the problem you described:

    Don’t just test your code, but test your assumptions as well. Don’t guess: actually try it. Write an assertion to test your assumptions (see Assertive Programming, page 122). If your assertion is right, you have improved the documentation in your code. If you discover your assumption is wrong, then count yourself lucky

    Maybe a possible solution is: add an assertion to ensure that the test fails when an element exists but is not visible (after all, this is what you did, isn’t it?)

  4. I use Capybara for testing. Its API is very nice and it hides all the complexity related to checking whether an element is visible, exists, etc. Such helper methods aren’t needed in my code. So why do I need to write tests like the ones in this article?

    • Hi Andrey, I am not familiar with Capybara, but besides the fact that the tool you use might help you write good test code, there is still the possibility of creating a very dangerous always-green test. Being able to make a test fail at least once simply ensures that the test “can fail”, which is more valuable than relying on tests that might never fail and give you false confidence.
      Imagine that in the latest version of Capybara there was a new bug and the method isFieldDoodlable() always returned true no matter what. If you were writing an assertion against a field to check that it was “doodlable” and didn’t try to make the test fail, the feature might be broken but you would never know, because the test would stay green. Now I know this is a strange case, and more than likely Capybara or other tools won’t have these issues in their helpers, but how about our own testing code? That could be buggy, couldn’t it?

      The reason I wrote this article comes from having experienced a few very dangerous always green tests in my life. I saw tests that wouldn’t fail even if you set your computer on fire 🙂

      • I think it doesn’t make sense to write tests for Capybara outside Capybara. If you think Capybara may break, you should add additional tests to it, not write them in the test code for your application.
        What always-green tests have you seen besides that? I’m interested, as I’m not convinced by your idea of making tests fail.

        • That’s fine Andrey, as I said I am not familiar with Capybara and it might not make sense to follow my approach in that context. As for my “very dangerous always-green tests”, I have seen a few and only found out after the bugs were discovered in production. My tests passed but the test code was wrong; this gave me confidence to release, and only after investigation did I see that the culprit was my test code. If I had made them fail at least once this wouldn’t have happened, that’s all, no more, no less 🙂
