When something works, share it!

When I joined PaddyPower in October 2012 I was asked to improve quality without affecting throughput. I studied the teams for a couple of months and came up with this model, based on Gojko Adzic’s Specification By Example and a white paper on ATDD by Elisabeth Hendrickson.

A year later, the bugs are a distant memory and cycle time has almost halved, so I thought sharing the approach might be useful to somebody out there.

Here we go!

Acceptance Test Driven Development

Acceptance Test Driven Development is about people, communication, collaboration and delivering business value.

ATDD is a software development methodology based on enhanced communication among its actors.

ATDD uses high-communication processes and tools to help developers write tests in a business language understandable by all actors. Such tests help developers focus on what to code. The tests can be automated and represent the blueprint of the application being developed. The tests live with the code and never get out of date: any deviation between tests and application is communicated to the team in the form of a failing test. ATDD sits very well with many agile software development approaches, including Scrum and Kanban, and with XP engineering practices.

The Actors

ATDD Actors

Activities, Artifacts and Goal

Acceptance Test Driven Development can be described using four activities (Discuss, Distill, Develop, Demo), four artifacts (User Story, Examples, Tests, Working Software) and one goal (Business Value). Each activity takes an existing artifact as input and produces an evolved version of that artifact as output, until the goal is reached, as shown in the Business Value Evolution Train picture below:

Business Value Evolution Train

The starting artifact is a User Story. During the Discuss activity it is transformed into Examples; in the Distill activity we transform the Examples into Tests; the Tests are transformed into Working Software during the Develop activity; and the Working Software demonstrates Business Value at the Demo activity.

For this process to be successful it is fundamental that none of the activities is performed in isolation (by a single actor). As many actors as possible should participate in each activity; the specific actors required for specific activities are described later in this document.

One Source Of Truth

The first three artifacts describe requirements, the fourth is the representation of those requirements in software, and the goal is the business value delivered. The Tests are the one source of truth about what gets turned into software and business value. Once the Demo is completed, the User Story and the Examples can be discarded and the only source of truth will be the Tests. This is because, during the transformation, the team will have learned a lot more about the deliverable than what is described in the user story or the Examples. The Tests are updated during all activities to quickly adapt to changes and new discoveries; it is not necessary to retrofit such changes into the User Story or the Examples. Tests stay forever and describe the application.

It is imperative that the Tests are visible to all the actors at all times. This means, for example, that burying them in source control is not an option, because business people don’t use source control.

Actors on the Train

The picture below expands on the Business Value Evolution Train by showing which Actors participate in which activities.

How it Works

Discuss

Required Artifact: User Story – We need a business requirement to start from. It doesn’t necessarily need to be in the format of a user story; it can be expressed in any format. What is needed is a business value to be delivered.

Required Actors: At least 3 members of the development team should participate, ideally a mix of testers, developers and business analysts. Either the Product Owner or the Business Analyst needs to participate in this activity.

Format: Meeting with access to a whiteboard

How it works: The Business Analyst has previously developed the user story through his conversations with the Product Owner, so he will be able to explain the user story’s business value to the other actors. He will also be able to explain the conditions of satisfaction. Shared understanding of the goal guarantees that the real goal is attained and is not a consequence of somebody’s assumption.

The conditions of satisfaction will be translated into examples. For example, if the user story is about giving free delivery to customers who buy 5 books or more, the initial examples might be:

5 books free delivery
4 books paid delivery

Questions will be asked and other examples might come out, for example

5 books outside Ireland paid delivery (if the product owner decides to give free delivery only within Ireland)
4 books and 2 CDs paid delivery
5 books and 1 washing machine free book delivery and paid w/m delivery

and so on…

By the end of the meeting the examples will very likely describe more scenarios than we thought of when reading the user story for the first time, and the effort of creating concrete examples will have raised a few questions about the applicability of the rules. If not all questions have been answered and the help of the Product Owner is required, the Business Analyst will get the answers from the Product Owner and have a quick catch-up with the other actors to define the last examples. Having the Product Owner at the Discuss activity can make it more effective, as most of the questions will be answered during the activity.

If the questions have unlocked some large uncovered areas and cannot be addressed by the Product Owner, more analysis will be required before we can proceed with the next activity.

Outcome1: Examples – Notice how the examples cover all the aspects of the user story plus those aspects that were not covered in the user story. If we add a two-liner with the business value to the examples, we have an improved version of the user story. The original user story is now out of date and can be discarded.

Outcome2: The team has a common understanding of the business value of the user story.

Outcome3: The Discuss activity might highlight that the user story is too big to be delivered. In this case the activity will produce a list of user stories, plus the examples for the first one that is taken into development.

Distill

Required Artifact: Examples

Required Actors: Two members of the development team need to participate. At least one Developer needs to participate, ideally the developer who will be designing the code for this item. The second member should ideally be a tester or a Business Analyst; if neither is available, a second developer will do.

Format: Pair programming

How it works: Now that we have the examples written down, we can transform them into tests in a format that works with our test automation framework. There are a variety of test automation frameworks that support defining the tests in advance of the implementation, including JBehave and Cucumber. Tests will be written using the Given-When-Then format. The tests will cover all the examples identified as a result of the Discuss activity. Extra tests can be added based on the improved understanding of the business goal.
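
To make the Distill step a bit more concrete, here is a minimal sketch (not production code; the class, step wording and helper are all invented for illustration) of how one of the free-delivery examples could be glued to Java step definitions with JBehave:

import static org.junit.Assert.assertEquals;

import org.jbehave.core.annotations.Given;
import org.jbehave.core.annotations.Then;
import org.jbehave.core.annotations.When;

// Scenario text kept readable by every actor:
//   Given a basket containing 5 books
//   When the customer checks out
//   Then delivery is free
public class FreeDeliverySteps {

    private int booksInBasket;
    private String deliveryCharge;

    @Given("a basket containing $books books")
    public void givenABasketContaining(int books) {
        booksInBasket = books;
    }

    @When("the customer checks out")
    public void whenTheCustomerChecksOut() {
        deliveryCharge = checkoutAndGetDeliveryCharge(booksInBasket);
    }

    @Then("delivery is $expected")
    public void thenDeliveryIs(String expected) {
        assertEquals(expected, deliveryCharge);
    }

    // Placeholder for the call into the application under test: a real step
    // would invoke the checkout code rather than re-stating the rule here.
    private String checkoutAndGetDeliveryCharge(int books) {
        return books >= 5 ? "free" : "paid";
    }
}

The same Given-When-Then text stays readable by the Product Owner while doubling as an executable test.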

Outcome1: Tests – The Tests cover all the aspects of the Examples, plus any aspects that were missing from the Examples and were uncovered while writing the tests. The tests also contain the two-liner with the business value from the Examples. Again, we have an improved version of the Examples.

Outcome2: The tests are written in plain English so that every actor is able to understand them and give feedback. The Examples are now out of date and can be discarded. The Tests represent the blueprint (documentation) for what we will eventually deliver, and they will be highly visible and easily accessible at any time.

Develop

Required Artifact: Tests

Required Actors: Two Developers need to participate.

Format: Pair programming or Single developer writing code + Code Review

How it works: When implementing the code, the developers follow a test-first approach: they execute the tests and watch them fail, then write the minimum amount of code required to get the acceptance tests green. Once the acceptance tests are green, they manually verify that everything hangs together and call another Developer or a Tester to perform Exploratory Testing. Once exploratory testing is completed and any defects are fixed, the user story is done and working software is ready to be delivered. While coding, the developers might identify scenarios that were not identified earlier and add tests for them. Such tests need to be added to the previous set and shared with the rest of the actors. If the newly identified scenarios represent a large amount of work, a decision might be made to push them to a subsequent user story, or we could decide to deliver them.
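
As an illustration of “minimum amount of code”, here is a hypothetical sketch of just enough production code to turn the first free-delivery tests green; the mixed-basket examples would drive it further, and all names are invented:

// Hypothetical: just enough logic to satisfy the first free-delivery examples,
// with nothing speculative added beyond them.
public class DeliveryChargeCalculator {

    private static final int FREE_DELIVERY_BOOK_THRESHOLD = 5;

    public String chargeFor(int numberOfBooks, boolean deliveryWithinIreland) {
        boolean free = numberOfBooks >= FREE_DELIVERY_BOOK_THRESHOLD && deliveryWithinIreland;
        return free ? "free" : "paid";
    }
}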

Outcome: Working software + more comprehensive tests

Demo

Required Artifact: Working Software

Required Actors: The Product Owner and two members of the development team, ideally the developer who wrote the software and the actor who performed the exploratory testing.

Format: Meeting with large monitor

How it works: Before organising a Demo, the development team needs to be sure the user story adheres to the definition of done. One very good practice is to create a demo script in which the demo facilitator writes down the steps to follow in order to demonstrate the user story’s business value to the Product Owner.
The demo should be an occasion for the development team to be proud of what was delivered.

The Product Owner will be able to use the Tests to validate that all the required functionality has been delivered. At the end of a successful Demo, the Product Owner accepts the original User Story on the strength of the business value demonstrated by running the tests.

Outcome1: Business value

Benefits:

1) Shared understanding
2) Front-loads issue resolution, reducing the defects detected during exploratory testing
3) Tests, by their nature, are specific and unambiguous. Using tests as requirements, developers create their code from specific and unambiguous requirements.
4) Developers can take advantage of the specificity of the requirements and avoid over-engineering
5) The tests are not written in a technical language and can be reviewed/discussed/improved by anybody with some business domain knowledge. Everybody can participate in designing the application by collaborating on defining the tests.
6) The tests are the one source of truth for the application’s behaviours. The acceptance tests live with the code and are executed in CI, so they will always be up to date unless the application diverges. The tests are the application documentation. With new functionality come new tests, or old tests get adapted to fit the new behaviours.
7) Increased communication builds trust and alliances between the actors
8) Increased transparency; the tests’ content and results are shared, easily accessible and highly visible to all actors

Pleasant Side Effects:

1) Acceptance tests are automated and as such they grow into a comprehensive regression suite
2) They can help impact analysis. If we want to assess the impact of a change on our application, we can simulate that change and verify what pre-existing behaviours it affects by reading the acceptance test results.
3) In the not-so-uncommon scenario of a rewrite, if the original application was written using ATDD it is possible to recreate the old functionality that needs to be transferred by reusing the Acceptance Tests

Acceptance Tests add no value and can be counterproductive when…

1) When we write Acceptance Tests for anything but new deliverables
2) When we write Acceptance Tests to create a regression suite of an existing application
3) When we confuse Acceptance Tests with the tool for writing Acceptance Tests
4) When we write Acceptance Tests that do not give fast feedback
5) When we write Acceptance Tests that are brittle and difficult to maintain
6) When we write Acceptance Tests that contain variables and parameters
7) When we write Acceptance Tests that contain unnecessary detail
8) When we write Acceptance Tests that do not focus on what the change is but on how to exercise such change
9) When we write Acceptance Tests that test the application using the UI when the change is not in the UI
10) When we write Acceptance Tests in isolation and we hide them in source control
11) When we write Acceptance Tests as workflows/scripts

Finally, if you want to learn how to write better acceptance tests have a look at https://mysoftwarequality.wordpress.com/2012/12/14/how-to-transform-bad-acceptance-tests-into-awesome-ones/


How I stopped logging bugs and started living happy

Because there is no such thing as a best practice

ATTENTION: This won’t work for everybody, all I claim is “it works in my context” and you might try to use it at your own risk.
WARNING: If you believe in best practices that should be applied to every context, go away now; you might be harmed by this post.

One day, a few years ago, I had a conversation with an inspirational man who was at that time my CTO. He was talking about zero tolerance for bugs and how beneficial it is to remove the annoyances of bug management from the development of software. I was listening, but at the time I didn’t completely grasp the concept, because back then I had never seen nor envisioned a situation where a development team could produce zero, or even close to zero, bugs. A few years have passed since, and I have worked hard on his vision. Today I can say: yes, he was right, and a development team can deliver value with 0 bugs. I even learned that the project team doesn’t have to spend half of its time playing with a bug management tool to file, prioritise and shift waste.

Let me give you some context around my project and current practices:

Two co-located cross-functional agile teams, each with 4 developers, one tester, one business analyst, one product owner, one dev-ops guy and a team lead also known as kanban facilitator (18 people in total)

Kanban Board
Kanban board visualizing our process. (Discuss, Distill, Develop and Demo are stolen from the great work of Elisabeth Hendrickson)

1. We use Kanban to visualize our process and we have the ability to deliver every time we complete a user story if we wanted.

2. User stories are small and we create them so that we can deliver each of them in less than 3 days

3. User stories are vertical, i.e. no big bang integration is required between the teams, each team works on the full codebase that spans across multiple applications

4. We follow a hybrid between ATDD and Specification By Example that I have specifically developed for our context

5. Developers code-review every line of code before every push; we also do some pair programming, mainly to train junior developers

6. Developers maintain a healthy 95% or more unit test code coverage; we also enforce all of the Sonar coding rules

7. Some of the developers practice TDD

8. We use JBehave and Thucydides to automate ~100% of our acceptance tests

Our ATDD approach

9. We have an internally developed “platform on demand style” build and deployment pipeline that runs all our functional tests and performance/load tests that check variations against a known baseline

10. Every automated test runs after every push to trunk in such pipeline

11. Every push to trunk gets deployed automatically and can be manually promoted to exploratory testing stage and beyond

12. After all the acceptance tests pass for a user story, we do exploratory testing on it

13. After exploratory testing is complete, we demo to our product owners, who subsequently do user acceptance testing before accepting the story

14. If a bug is found during exploratory testing or user acceptance testing, the developer(s) that worked on that user story drop everything else they might be doing and fix the bug; the card will NOT progress while a valid bug is unfixed

15. If a build becomes red, the developer who caused the instability drops everything else and fixes the issue straight away. The build must always be green.

Project Wall

16. When a developer fixes a bug, he writes automated tests that cover that path

You noticed I mentioned bugs in 14 and 16. Yes, of course, we are humans and we make mistakes. This doesn’t mean we need to celebrate the mistake and make it visible by logging it in bug tracking tools that we then have to run and maintain, along with the waste stored inside them, when the poor bug has the life span of an unlucky butterfly.

When I find a bug while exploratory testing, I simply go to the developer and tell him: look pal, I think there might be an issue, come to my desk and I’ll show you. If he needs to go home and can’t fix it straight away, no problem, I will stick a red post-it on the user story card on the physical Kanban board with 2 words describing the bug so nobody will forget. BTW, that card is not going anywhere until the bug is fixed and buried. The red post-it goes into the bin after the death of the bug.

The development approach we follow allows us to have a very good understanding of the business value we deliver, because we hold group discussions to derive examples for each user story. Add a bunch of excellent developers who follow good engineering practices, and you will find that the bugs you discover when exploring the software are very few; once you act upon them immediately, no excuses, bugs become something that doesn’t really bother you.

More Project Wall

In this situation, logging and managing bugs is simply waste, and we all live happily with no bug ping-pong between developers and testers, no bug prioritization meetings, no bug triage meetings, no bug statistics, no need for bug trends to identify product release dates, and guess what? We deliver bugless software that delights our customers.

EDIT: I am not alone! Have a look at the great work from Katrina here http://katrinatester.blogspot.co.uk/2014/11/different-ideas-for-defect-management.html

What’s with the tools obsession?

You can’t avoid them; they are in every discussion: tools, tools, tools and more tools.

There is no question about bugs that doesn’t get an answer like “open a bug in Jira and bla bla bla…”, every topic on performance has a LoadRunner or JMeter here and there, and no functional testing discussion seems to be worth its while if somebody doesn’t mention Selenium or QTP.

If you suggest that people talk and discuss the issues they have, somebody will jump in with the tool that will solve the issue straight away.

The CVs I review are littered with tools; some people with 3 years of work experience claim they can use more tools than I have ever used in almost 20 years.

For some strange reason, people believe that if you want to be agile you must use tools, even though somebody clearly said “Individuals and interactions over processes and tools”.

Do not let the tools replace the conversations!

A Test Challenge and a bit of fun

WARNING: This blog post might contain strong language

It all started yesterday evening, when I saw @mheusser Testing Challenge on Twitter.

Test Challenge

As usual my first question started with “why”. I believe that to be able to develop/test any feature (and make sure it is useful), we need to understand the real business value that we are delivering, hence the question “why?”.

reply

I was very disappointed by @mheusser’s answer because he didn’t give me any context to understand the real business value gained by the customer when sorting by PO. So, since @mheusser wasn’t going to give me any relevant information, and since I couldn’t be bothered asking him again, I decided to employ my favorite Product Owner: Justin McDoitnow.

First I found out from Justin that PO actually meant Product Order and not Polymorphic Ostrich like I initially thought.

Second, I asked him what he meant with sorting by PO and he goes “Are you stupid? I mean they are in alphabetical order for f***s sake, and get going that we are late for the release!”

Good old Justin, always charming.

Numbers, letters? “Only letters”

At this point I didn’t ask what alphabet was used or whether a special collation had to be applied, because I didn’t want him to beat me up right there, and I suggested as valid the English alphabet with 26 lowercase characters [a to z], no spaces allowed; Justin seemed happy and also confirmed that sorting works only one way, ascending order [a before b].

I then called the lead developer Zack Hack, and asked him if they had reused any open source library for doing the sorting and he goes “No, Gus, we didn’t because we analysed what was available and they were all bad. But hey! We implemented this brand new algorithm that performs much better than anything that has been written before, it rocks! Let me tell you how we do it! So we take…”. I obviously put Zack on mute and let him reach developer climax on his own. The information was enough, as usual the wheel was reinvented yet again, and it had to be tested.

So I started testing the first scenario.

Scenario1: Let’s make sure Zack didn’t f**k up the alphabet

I populated the database (or whatever persistence layer you like) with the following POs

zzzzzzzzzz
yyyyyyyyyy

……….

aaaaaaaaaa

Logged on to the app, clicked on sort and verified that the order was, guess…

aaaaaaaaaa

bbbbbbbbbb

……….

zzzzzzzzzz

OK, Zack didn’t screw up the English alphabet; that’s a good start.

Now, I only verified sorting based on the first letter, and that’s not enough: Zack and his new “very useful algorithm” could have screwed up the sorting of the second character, or the third, or the fourth, etc.

I need a new Scenario.

Scenario2: Let’s make sure Zack’s algorithm can sort POs with at least one common character in early position

That’s not as easy now. I make a quick calculation and discover that the number of possible combinations of 10 characters from a pool of 26, where the order is not relevant and repetition is allowed, is C(26+10-1, 10) = C(35, 10) = 183,579,396, i.e. roughly 1.8e+8.

I call Justin and tell him that I need 3 light years to complete the testing, and he threatens to fire me, so I decide to try to identify a pattern that might help me reduce the number of test cases to a small number. Pen and paper, try, fail, retry, and eventually after 10 minutes I notice an interesting behaviour for a certain set of characters; have a look:

initial test data

Did you see anything? I knew you would, well done, yes the diagonal has the first 10 letters in order and under the diagonal the letters are repeated starting from the second to the last. Why did I do this?

I did it because it allows me to do something special: First the 10 POs (horizontal lines) are already in “order” according to our rules. Hence pushing the “sort” button would leave them as they are and I can assert that the first PO stays on top.

This first test does nothing more than Scenario 1 already did with all the a’s and the z’s, but watch what happens if I move the first character of the first PO to the last position and push up the first character of the second PO to become the first character of the first PO:

after_first_move

At this point, if I push the imaginary “sort” button, the alphabetical order of the second letter will be compared and I expect the second PO to go to the top (c > b), hence the second PO becomes the first in the order.

What if I now move the second character of the second PO to the last position and I push up the third character of the third PO to be the second character of second PO?

 

Ehm… I think I see a pattern… Every time we do our move and click sort, we compare a character at position n, and the nth PO becomes the first in the order.

This could take me a couple of hours to test manually through the user interface or just about half an hour to automate it.

after_second_move

I have a quick peek into the code and to my horror I discover that the whole multi-tier application is a monster single Java class of 13,256,459 lines of unreadable code. This means that any change, even unrelated to “sort PO” might break it! Definitely automate now!

OK Eclipse up and write some really dirty code that does the job:


public class TestSort {

    public static void main(String[] args) {

        // 10 POs of 10 characters each: the diagonal carries the characters
        // that decide the ordering, as described above.
        char[][] testData = {
                { 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j' },
                { 'b', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j' },
                { 'b', 'c', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j' },
                { 'b', 'c', 'd', 'd', 'e', 'f', 'g', 'h', 'i', 'j' },
                { 'b', 'c', 'd', 'e', 'e', 'f', 'g', 'h', 'i', 'j' },
                { 'b', 'c', 'd', 'e', 'f', 'f', 'g', 'h', 'i', 'j' },
                { 'b', 'c', 'd', 'e', 'f', 'g', 'g', 'h', 'i', 'j' },
                { 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'h', 'i', 'j' },
                { 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'i', 'j' },
                { 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'j' } };

        for (int i = 0; i < 10; i++) {

            // The PO currently at row i is expected to end up at the top
            // after sorting the current data.
            if (pushSortAndReturnPoIndexOfTopRecord(testData, i) == i) {
                System.out.println("PASS");
            } else {
                System.out.println("FAIL: PO at position " + i
                        + " was ordered incorrectly");
            }

            // Prepare the next iteration: swap row i's diagonal character
            // with the character in column 9 - i of the same row.
            char temp = testData[i][i];
            testData[i][i] = testData[i][9 - i];
            testData[i][9 - i] = temp;
        }
    }

    private static int pushSortAndReturnPoIndexOfTopRecord(char[][] myData,
            int rowNumber) {

        int index = 0;

        // Seed the DB with the 10 records
        // Call the sort function
        // Return the index of the row that is now at the top

        return index;
    }
}

I will look at refactoring this piece of sh** ehm, code and adding it to our CI system once I have some time; for the moment it helped me prove what I wanted.

Of course my code verifies the order only by looking at the PO field, and the other fields might have been messed up. I do a quick manual check and see that this doesn’t happen; for the moment this is enough for me, and I will add this extra check to the automation when I refactor the code.

While I was writing the test code I had a couple of thoughts:

  1. Can 2 POs be identical? Justin says “NO!” and calls me a moron because supposedly I was meant to know it.
2. Are we enforcing this unique constraint on the database, or are we just hoping it doesn’t happen?

I’m going to answer question 2 myself: I just pull up a SQL client and run an insert with a PO that already exists, and luckily I get “ORA-00001: unique constraint (XYZ.PK_PurchaseOrder) violated”.

Good!

I had a strange feeling, so I asked Justin what would be the worst thing that could happen if the customer couldn’t sort by PO for some reason or, in the worst scenario, if the sort returned wrong results. Justin’s face turned white and he started rambling about “huge loss of revenue, loss of customers, loss of testers’ limbs etc…” Apparently the company relies on sorting by PO so much that any malfunction could be fatal.

So, I ask, if it is so important: say a customer has just sorted by PO and a new PO comes through soon after, one that should be seen by the customer in the top 20 displayed on screen; what happens if he doesn’t see it? Justin goes “if it happens we’re f****d”.

How long is too long for a PO to be displayed after it has been added?

“5 seconds”.

Ok, new test. I seed the DB with 20 items and sort. All good, they are all sorted in the right order, but we knew that already. Now I add a new PO, seeding it so that it should be the first after sorting. I don’t sort, and I wait for 5 seconds. It appears at the top of the screen! Yes, the guys have thought about this, but I am not sure how the system will support the load with many POs already present and many added every second, let me see…

Hey Justin, how many POs do we have in our DBs in production, and how many new POs do we expect per day? Justin thinks for a while and goes “At the moment about 100,000, but we need to be able to scale up to a billion, and in 3 years we are projecting sales in excess of 10,000 items per day, hence potentially 10,000 POs per day”.

I seed a DB with ~1B POs (LOL), sort them by PO, go for a coffee and a spin on the bike, then return and see if the POs are sorted. When I return, I write some code to add 20 new POs each second while sorting by PO, then I pack my stuff and go on a 3-week holiday; I’ll come back just in time for the test to be completed.

Thanks to @mheusser for the test challenge.

How to transform bad Acceptance tests into Awesome ones

So you want to learn how to write good acceptance tests? There’s only one way, let’s write some.

This is a practical example that is designed to help beginners write clear and easily maintainable acceptance tests.

Our System

We are BOG, the “Bank Of Gus”, and we have a Loan Approval Processing System that takes as input some data about the applying customer and his loan requirements, and returns as output either Accept (the customer will be given the loan) or Reject (the customer will not be given the loan).

The marketing manager wants to start selling a new Holiday Loan and produces the following user story:

As a Customer
I want to borrow money from the bank
So that I can go on Holiday and enjoy myself

Acceptance Criteria:
In order to get the Holiday Loan approved
1) The Customer must be 18 or older
2) The Customer’s salary must be > €20,000.00
3) The Customer Time in employment must be >= 6 months
4) The Loan amount < (Customer salary)/5

The Loan Application Form (UI) already exists; it calls a REST service, and that service is what we are now updating to allow for this new product. The UI is also ready and able to display our outcome: a big green Approved or red Rejected string, based on what our service returns.
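
Just to pin down what the acceptance criteria mean, here is a rough, hypothetical sketch of the rule the service has to implement; the class and method names are invented for illustration:

// Hypothetical sketch of the approval rule described by the four acceptance criteria.
public class HolidayLoanPolicy {

    public boolean approve(int ageInYears, double salary, int monthsInEmployment, double loanAmount) {
        return ageInYears >= 18
                && salary > 20_000.00
                && monthsInEmployment >= 6
                && loanAmount < salary / 5;
    }
}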

The Loan Application Form looks something like this:

I am eager to start writing acceptance tests, so I write down the first one without thinking much (please don’t get bored by the first test, I promise it gets better; this is the worst one you’ll see).

I’m going to use a rich celebrity for my first test, let’s try to make things interesting.

first_test

Ouch… 16 steps for ONLY ONE TEST! By the time I do all the necessary scenarios with boundary analysis I am going to have a document the size of the iTunes licence agreement, and this is only the start!

HINT #1: Focus on “what” you are testing and not on “how”

First of all, do I really need to say that I go to a page, fill each field and push a button? That’s “how” I use the app, it is not necessarily “what” I do with it. The “what” is “a customer is applying for a loan”.

Could I generalize and get to the concept of a customer applying for a loan? YES
Do I really need to fill the web form to exercise the code I am writing/testing? NO
Can I abstract it, use a test double and call my code directly? YES

Focus on what you are testing, you are not testing the UI, you are testing the loan approval logic! You don’t need to exercise it through the UI. You would exercise it through the UI only if you were testing the UI.

Ok, let’s use a test double. I create a mock with the data as per the example above and will use it for testing, but it’s not making writing the test any easier.

I could do something like

second_test

Besides the fact that I abstracted the how (the customer entering strings and clicking buttons) into the what (the customer applying for a loan), I still have a very messy test, with loads of detail, that is quite difficult to read and maintain.

It looks slightly better but not good enough. I couldn’t even fit all the data on one line, so I took the lazy option of adding ellipses, but in the real world ellipses don’t work, they can’t be automated. Imagine repeating this for all the scenarios I need to cover: it’s a disaster. What am I going to do?

HINT #2: Eliminate irrelevant Detail

Do I really need to know the name of the customer to decide if I want to approve his loan? NO
Do I need to know his sex? NO
Shall I continue asking rhetorical questions? NO

The only important variables for designing the logic of my application are the ones described in the acceptance criteria, look back: Age, Salary, Time in employment, Loan amount

OK this looks promising, let me try to write the original test using only those.

third_test

This definitely looks better: it exposes only the parameters that have an impact on the loan approval logic, it is more readable, and while reading it I get some idea of how the system will work. That’s better, isn’t it?

OK let’s write all the scenarios to comply with the acceptance criteria using boundary analysis, equivalence partitioning and other test techniques.

fourth_test

Ouch again… I haven’t even started looking at the cases where the loan will be rejected and I already have 4 very similar tests that will bore the Product Owner to tears, so much that he won’t speak to me for a month. What can I do?

HINT #3: Consolidate similar tests with readable tables

I know of a very useful way of writing tests that are very similar without repeating myself over and over and making the readers fall asleep. It’s called a scenario outline, and I’m not going to explain in words what it does; I’m just going to show it to you, because I know that looking at it you won’t require any explanation.

sixth_test

Wow, this looks much better! One test of 3 lines, plus examples that cover all the possible scenarios! Do you remember when you needed 16 lines of unnecessary detail to describe only the first line in the examples above? This is certainly an improvement: more readable, more maintainable and all around 100 times better than the original one.

Also, look at it closely; it gives the business the amazing power of using this test in the future to make changes! Imagine that we end up in a credit crunch (again) and the banks want to tighten the way they lend money, so they decide, for example, to increase the minimum salary to 30,000 and the minimum time in employment to 12 months.

A quick copy and paste + small refactor and we get:

seventh_test

That’s quite powerful isn’t it?

Now, if I were a tester and I wanted to be picky, I would tell you that there are plenty of scenarios that have not been tested and that a full decision table should be created to give good coverage.

Yes, you guessed it, I am a picky tester; let’s build the decision table for the Credit Crunch scenario.

HINT #4: Use decision tables and boundary analysis to get high coverage (DON’T DO THIS! It is an anti pattern and I am leaving it here as an example of something I learned to avoid along the way)

How do I build a decision table?
First you need to know what your variables are and what “interesting values” need to be considered.

What are “interesting” values? They are all the values a variable can take that might make the logic fail. Generally they are boundary values.

Ok back to the Credit crunch requirements:

2) Customer salary must be > €30,000.00
3) Customer Time in employment must be >= 12 months

The salary variable, for example, has 3 interesting values: 29,999.99, 30,000.00 and 30,000.01 – respectively the left boundary, the boundary and the right boundary (some observers might say that 0 and -1 could be interesting values as well; I agree, but for the purpose of this exercise we won’t consider them).

How about time in employment? The interesting values are 11, 12 and 13.

OK I have 2 variables each with 3 “interesting” values (or dimensions)

I can immediately calculate the number of tests I need to get 100% coverage of all possible combinations of “interesting” values:

NumberOfTests = dim(salary) * dim(time_in_employment) = 3 * 3 = 9

9 test cases will cover all possible paths using all combinations of “interesting” values.

Let’s build the decision table, and guess what? It can be expressed as a test!

eight_test

1 test, 3 steps, 9 examples, 100% boundary analysis coverage, in English, readable, maintainable, clearly expressing the business value delivered. What more do you want?
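
To make the nine rows concrete, here is a rough, hypothetical sketch in plain Java (rather than the Given-When-Then table) that enumerates every combination of the interesting values and derives the expected outcome for each, assuming the other criteria (age and loan amount) are held at passing values; class and variable names are invented:

import java.util.ArrayList;
import java.util.List;

// Sketch only: enumerate every combination of the "interesting" boundary values
// for the two credit-crunch variables, producing the 3 x 3 = 9 decision table rows.
public class DecisionTableSketch {

    public static void main(String[] args) {
        double[] salaries = { 29_999.99, 30_000.00, 30_000.01 };
        int[] monthsInEmployment = { 11, 12, 13 };

        List<String> rows = new ArrayList<>();
        for (double salary : salaries) {
            for (int months : monthsInEmployment) {
                // Approved only when salary > 30,000 AND time in employment >= 12 months.
                String outcome = (salary > 30_000.00 && months >= 12) ? "Approved" : "Rejected";
                rows.add(String.format("salary=%,.2f, months=%d -> %s", salary, months, outcome));
            }
        }
        rows.forEach(System.out::println); // prints the 9 decision table rows
    }
}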

One last thing: you might be in a situation where decision tables with many variables, each with large dimensions, require hundreds or even thousands of test cases. If these tests run at the unit level I wouldn’t worry too much about the run time, but if, for instance, you are testing some Javascript logic and are able to do so only through the UI, this will take a long time to execute and won’t represent a good Return On Investment.

What can you do? There are many ways of reducing the number of tests to run while still maintaining relevant coverage. One technique is called pairwise testing; it is very straightforward and uses tools to quickly identify the highest-risk tests that should be included and eliminate the ones with less risk associated. Pairwise testing is outside the scope of this document; if you are interested in knowing more about it, check this out! http://bit.ly/T8OXjZ

Test Automation, help or hindrance?

On Slow Vs Fast, Co$tly Vs Cheap and stating the not so obvious

Test automation is a must for agile teams that want to continuously deliver business value. But does it actually give value to agile teams? It does, if it satisfies (at least) two important principles:

1) Provide fast feedback to developers (SPEED)

2) Be less expensive than manual regression testing over the application lifetime (CO$T)

SPEED is extremely important. Time is money. People don’t like waiting for something to happen while losing money, and developers are no exception. Knowing that we will soon find out whether our code change worked helps us refactor that old piece of code that was unmanageable. If you only find out tomorrow that you have broken something, it might be very difficult to fix, because maybe 10 other people have pushed their changes after you and who knows who broke what? Imagine if it took you a day to compile your code: would you make that small optimization change? Be honest…

Fast tests give teams great benefit because they tell us straight away “well done!” you’re on the right path, or “hang on you made a mistake, fix it before it’s too late!”.

There are no two ways about it: slow tests are BAD. Developers hate running them because it is a pain; you either wait for them to complete and tell you how you did, or you ignore them and go ahead with other changes. Both approaches are bad: while you wait for feedback you are losing money by not being able to code (time = money), and if you make other changes you risk burying the “thing” you just broke under more broken code and, guess what? You lose money!

CO$T is quite a big issue, isn’t it?

How do you like tests that are brittle and break as soon as something changes in the application user interface? VERY CO$TLY.

How about tests that take ages to run because they are highly coupled and need the full End to End (E2E from now on) test environment to complete? SLOW & CO$TLY!

Slow because they rely on so many systems; as a consequence they keep on breaking, but 80% of the time it’s a false positive, because some System_XYZ that the test uses somewhere to provide some data was down, or Database_ABC was accessed at the same time by another user who messed up the test data. Damn! Rerun the suite again; TOMORROW you will know if it passes, maybe, hopefully, unless something else is broken 😦 SLOW, CO$TLY and, worst of all, they become a hindrance to developers because they not only give no value but are counterproductive, wasting their time.

An automation strategy based on E2E tests run through the User Interface that follow the full application workflow has FAILURE written all over it. Why? Because E2E UI-driven automated tests are SLOW and CO$TLY. Developers hate them, and when they catch the odd bug they might not even investigate and follow up correctly because “…ah sure, it must have been something in the environment, like in the last 35 failures, damn test harness!”. They slowly become noise in the background; after a while nobody cares about them, they are abandoned and the automation effort is deemed a failure.

But, hang on, we can avoid this.

Don’t write slow, highly coupled, UI driven, brittle, costly E2E automated tests, do yourself a favour, just don’t.

Yes, but we need them! How else do we verify the acceptance criteria?

Each individual system in a complex architecture can be built to adhere to acceptance criteria, individually. Let’s focus on each system and target our efforts there first. Let’s also automate integration points, but let’s not forget what we are testing when doing integration: we are testing the interfaces only, not the functionality of the other system we integrate with!


Let me introduce you to Augusto’s 4 golden rules of Fast and Cheap automation testing. (yeah, that’d be me)

First – identify the application under test and focus: write a lot of tests that run against an individual system; they are much faster than coupled tests, and they are also much cheaper because they won’t bother you with false positives. Remember, each system on its own can satisfy business acceptance criteria; it might take some time to formulate the acceptance tests describing business value, but it is indeed possible. The business logic to be tested resides in the individual systems; test it where it is, not through another system.

Second – go under the hood: unless you are specifically testing the user interface, write tests that run against a system through a service layer rather than through the user interface itself. They are much faster and far more maintainable (not affected by UI changes). Go under the hood! Focus on the logic to be tested, not on the steps required to get to a certain state. Use mocks and stubs, and invest in building such support tools tailored to your needs; they pay off, oh yes they do.

Third – when integrating, focus on interfaces: write integration tests between pairs of systems, focusing only on the interfaces between them; do not duplicate the testing you have already done on the individual systems. There is nothing worse than trying to test the business logic in SystemB by using SystemA, so don’t do it! Test SystemB’s business logic in SystemB with fast tests as part of golden rule #1. Integrate SystemA and SystemB to verify that the communication between the two systems is not broken; test only that the communication works, and do not test the functionality of either system at this stage (you have already done that in rule #1).

Fourth – use your brain and do not duplicate slow processes: if you can’t help it and you want to write E2E tests, limit their breadth drastically to cover the happiest paths you can think of, and make sure you run them in a dedicated environment to avoid test data corruption and involuntary resource contention. Automating E2E testing in complex systems is a bad idea; use exploratory testing on new features instead.
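
Here is a small, hypothetical sketch of what rule #2 looks like in practice: the business rule is exercised directly at the service layer, with the neighbouring system replaced by a hand-rolled stub instead of driving the behaviour through the UI or a full E2E environment. All class and method names are invented for illustration:

import static org.junit.Assert.assertTrue;

import org.junit.Test;

public class LoanApprovalServiceTest {

    // Minimal stand-ins, defined inline so the sketch is self-contained.
    interface CreditCheckClient {
        boolean hasCleanHistory(String customerId);
    }

    static class LoanApprovalService {
        private final CreditCheckClient creditCheck;

        LoanApprovalService(CreditCheckClient creditCheck) {
            this.creditCheck = creditCheck;
        }

        boolean approve(String customerId, double salary, double loanAmount) {
            return creditCheck.hasCleanHistory(customerId) && loanAmount < salary / 5;
        }
    }

    @Test
    public void approvesSmallLoanForCustomerWithCleanHistory() {
        // Stub for the upstream system: no network call, no shared environment,
        // so the test is fast and never fails for environmental reasons.
        CreditCheckClient stubbedCreditCheck = customerId -> true;

        LoanApprovalService service = new LoanApprovalService(stubbedCreditCheck);

        assertTrue(service.approve("any-customer", 30_000.00, 4_000.00));
    }
}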

Mike Cohn, a few years back, came up with a test automation pyramid that describes how the automation effort should be distributed: unit tests represent the majority of the tests, immediately above them are tests run through a service layer, and at the top we have only a few E2E tests run through the user interface. I love Mike Cohn’s pyramid, thanks Mike!

Automation Test Pyramid

To recap, I have illustrated my rambling in a “Dummy Test Automation Strategy For A Simple Multi System Architecture”. I am interested in your feedback, good or bad, so please go for it and tell me what you think!