The wrath of the mighty metric

27 Dec

Reasons why some software delivery teams don’t give a damn about their customers

It feels like a century ago, but once upon a time, less than a century ago, I was leading a traditional test team in an organization where 3 separate teams of Business Analysts, Developers and Testers were delivering software in an incremental iterative death march style. Each of the 3 teams had its own leads and managers and each of the 3 teams was measured by specifically tailored metrics. My team’s efficiency was to be measured by the mighty DDI (Defect Detection Index), calculated as DDI = (Number of defects detected during testing / Total number of defects detected, including production defects) * 100. The DDI had to be greater than 90%, otherwise our team would be deemed inefficient, bonuses dropped and the test team itself branded a bunch of losers.
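For the record, here is the arithmetic as a minimal sketch. The function and parameter names are mine, not anything the organization used, and I am assuming the total is simply testing defects plus production defects.

```python
def defect_detection_index(found_in_testing: int, found_in_production: int) -> float:
    """Return the DDI as a percentage (hypothetical helper, names are illustrative)."""
    total = found_in_testing + found_in_production
    if total == 0:
        # No defects at all: the ratio is undefined, so refuse to invent a number.
        raise ValueError("DDI is undefined when no defects have been detected")
    return 100 * found_in_testing / total


# 45 defects caught in testing, 5 escaped to production.
print(defect_detection_index(45, 5))  # 90.0
```

Note that 90.0 is not "greater than 90%", so one more defect raised in testing (however harmless) is all it takes to turn an "inefficient" team into an "efficient" one.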

Yes, you guessed right, the other 2 teams were measured in a similar way: their efficiency was also based on the number of defects, the lower the better.

God, I am glad this is only in the past. Even remembering this makes me sick to my stomach. Sick like every time a production defect was detected, sick like every time a defect that our team detected was rejected, sick like every time I had to go to the triage meetings and inevitably have an argument with either the BA lead or the DEV one, because that defect that we found was not seen as a possible improvement for the product but as a threat to some team’s metric. I’m not even going to describe to you the awful discussions that followed the acceptance of a defect as valid, when a decision had to be made on whether the defect was due to bad requirements or bad code.

The funny thing was that no matter which was the efficient team and which were the inefficient ones, the software delivered was the same, no change whatsoever, the customers were constantly quite unhappy. The real value that the metrics gave to the department was the ability to point fingers based on numbers. They say numbers never lie, maybe numbers don’t lie but how many lies can we tell to fabricate numbers?

Since then many things have changed in my professional life and today I don’t have to fight stupid battles to fabricate numbers in order to define efficiency so I can, funnily enough, use my time more efficiently.

Why doesn’t calculating confrontational metrics work? The problem is that we are humans; if you attach prestige and monetary value to a metric, the metric becomes the goal of the team and the battle can begin. The test team doesn’t care how useful the delivered product is, all they care about is opening as many defects as possible so that the mighty DDI doesn’t go under 90%. If this means opening defects that do no harm at all to the customer but only to the development team and the schedule, it doesn’t matter. The same logic can be applied to the development and BA teams, who will spend their time obfuscating their requirements and defending their code from the stupid defects opened by the test team. All this creates a climate of tension, distrust and hostility. Nobody really cares whether the customers are happy as long as the individual teams’ metrics solemnly declare their efficiency and fingers can be rightly pointed :-( .


The funny thing is that it is very easy to resolve this problem and put the focus back on the customer. Create a cross-functional, self-organising team able to analyse, develop, test and deliver a complex software project, and judge the team on how well they satisfy the customer’s needs. The team lives as one, produces quality as one, delivers customer value as one, succeeds as one or fails as one. The goal of the team matches the goal of the company, and failure or success of the team determines failure or success of the company. It’s called an agile team, try it out!

How to transform bad Acceptance tests into Awesome ones

14 Dec

So you want to learn how to write good acceptance tests? There’s only one way, let’s write some.

This is a practical example that is designed to help beginners write clear and easily maintainable acceptance tests.

Our System

We are BOG, “Bank Of Gus”, and we have a Loan Approval Processing System that takes as input some data about the applying customer and his loan requirements; as output it returns either Accept (the customer will be given the loan) or Reject (the customer will not be given the loan).

The marketing manager wants to start selling a new Holiday Loan and produces the following user story:

As a Customer
I want to borrow money from the bank
So that I can go on Holiday and enjoy myself

Acceptance Criteria:
In order to get the Holiday Loan approved
1) The Customer must be 18 or older
2) The Customer’s salary must be > €20,000.00
3) The Customer Time in employment must be >= 6 months
4) The Loan amount < (Customer salary)/5

The Loan Application Form (UI) exists already, the Loan Application Form calls a REST service that is what we are now updating to allow for this new product. The UI is also ready and able to display our outcome, a big green Approved or red Rejected string based on what our service returns.

The Loan Application Form looks something like this:

I am eager to start writing acceptance tests, and I start writing down the first one without thinking much, (please don’t get bored by the first test, I promise it gets better, this is the worst one you’ll see)

I’m going to use a rich celebrity for my first test, let’s try to make things interesting.

first_test

Ouch… 16 steps for ONLY ONE TEST. By the time I do all the necessary scenarios with boundary analysis I am going to have a document the size of the iTunes licence agreement, and this is only the start!

HINT #1: Focus on “what” you are testing and not on “how”

First of all, do I really need to say that I go to a page and that I fill each field and that I push a button? That’s the “how” I use the app, it is not necessarily “what” I do with it. The “what” is “a customer is applying for a loan”.

Could I generalize and get to a concept of a customer applying for a loan? YES
Do I really need to fill the web form to exercise the code I am writing/testing? NO
Can I abstract it, use a test double and call my code directly? YES

Focus on what you are testing, you are not testing the UI, you are testing the loan approval logic! You don’t need to exercise it through the UI. You would exercise it through the UI only if you were testing the UI.

Ok let’s use a test double. I create a mock with the data as per the example above and will use it for testing, but it’s not making writing the test any easier :-(

I could do something like

second_test

Apart from the fact that I replaced the how (the customer entering strings and clicking buttons) with the what (the customer applying for a loan), I still have a very messy test with loads of detail that is quite difficult to read and maintain.

It looks slightly better but not good enough, I couldn’t even fit all the data on one line and I took the lazy option of adding ellipses, but in the real world ellipses don’t work, they can’t be automated, imagine repeating this for all the scenarios I need to cover, it’s a disaster, what am I going to do?

HINT #2: Eliminate irrelevant Detail

Do I really need to know the name of the customer to decide if I want to approve his loan? NO
Do I need to know his sex? NO
Shall I continue asking rhetorical questions? NO

The only important variables for designing the logic of my application are the ones described in the acceptance criteria, look back: Age, Salary, Time in employment, Loan amount

OK this looks promising, let me try to write the original test using only those.

third_test

This looks definitely better, it exposes only the parameters that have an impact on the loan approval logic, it is more readable and while reading it I have some idea of how the system will work, that’s better isn’t it?

OK let’s write all the scenarios to comply with the acceptance criteria using boundary analysis, equivalence partitioning and other test techniques.

fourth_test

Ouch again… I haven’t even started looking at the cases where the loan will be rejected and I already have 4 very similar tests that will bore the Product Owner to tears, so much that he won’t speak to me for a month. What can I do?

HINT #3: Consolidate similar tests with readable tables

I know of a very useful way of writing tests that are very similar without repeating myself over and over and making the readers fall asleep. It’s called a scenario outline and I’m not going to explain in words what it does, I’m just going to show it to you because I know that looking at it you won’t require any explanation.

sixth_test

Wow, this looks much better! One test of 3 lines and examples that cover all the possible scenarios! Do you remember when you needed 16 lines of unnecessary detail to describe only the first line in the examples above? This is certainly an improvement: more readable, more maintainable and all around 100 times better than the original one.
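If you want to see the same idea without any tooling, here is a minimal sketch in plain Python. Everything in it is my own illustration: the function name holiday_loan_decision is a hypothetical stand-in for the logic behind the REST service, and the example rows are values I picked around the boundaries in the acceptance criteria, not the ones from the test above.

```python
from decimal import Decimal


def holiday_loan_decision(age, salary, months_employed, loan_amount):
    """Hypothetical stand-in for the loan approval logic (acceptance criteria 1-4)."""
    approved = (
        age >= 18
        and salary > Decimal("20000.00")
        and months_employed >= 6
        and loan_amount < salary / 5
    )
    return "Accept" if approved else "Reject"


# One row per example: age, salary, months employed, loan amount, expected outcome.
EXAMPLES = [
    (18, "20000.01", 6, "4000.00", "Accept"),
    (17, "20000.01", 6, "4000.00", "Reject"),  # too young
    (18, "20000.00", 6, "3999.99", "Reject"),  # salary not above the minimum
    (18, "20000.01", 5, "4000.00", "Reject"),  # not employed long enough
    (18, "20000.01", 6, "4000.01", "Reject"),  # loan too large for the salary
]


def run_examples():
    """Return the rows whose actual outcome differs from the expected one."""
    failures = []
    for age, salary, months, loan, expected in EXAMPLES:
        actual = holiday_loan_decision(age, Decimal(salary), months, Decimal(loan))
        if actual != expected:
            failures.append((age, salary, months, loan, expected, actual))
    return failures


print(run_examples())  # an empty list means every example passed
```

The test body stays the same no matter how many rows you add; only the table grows, which is exactly what makes this style cheap to maintain.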

Also, look at it closely; it gives the business an amazing power of using this test in the future to make changes! Imagine that we end up in a credit crunch (again) and the banks want to tighten the way they lend money. So they decide to increase the minimum salary to 30,000 and the minimum time in employment to 12 months for example.

A quick copy and paste + small refactor and we get:

seventh_test

That’s quite powerful isn’t it?

Now if I was a tester and I wanted to be picky I would tell you that there are plenty of scenarios that have not been tested and a full decision table should be created to give good coverage.

Yes you guessed, I am a picky tester, let’s build the decision table for the Credit Crunch scenario.

HINT #4: Use decision tables and boundary analysis to get high coverage

How do I build a decision table?
First you need to know what your variables are and what “interesting values” need to be considered.

What’s an “interesting” value? Interesting values are all the values a variable can take that might make the logic fail. Generally they are boundary values.

Ok back to the Credit crunch requirements:

2) Customer salary must be > €30,000.00
3) Customer Time in employment must be >= 12 months

The salary variable, for example, has 3 interesting values: 29,999.99, 30,000.00, 30,000.01
Respectively, left boundary, boundary and right boundary (some observers might say that 0 and -1 could be interesting values as well; I agree, but for the purpose of this exercise we won’t consider them).

How about time in employment, the interesting values are: 11, 12, 13

OK I have 2 variables each with 3 “interesting” values (or dimensions)

I can immediately calculate the number of tests I need to get 100% coverage with all possible combinations of “interesting” values.

NumberOfTests = dim(salary)*dim(time_in_employment) = 3*3=9

9 test cases will cover all possible paths using all combinations of “interesting” values.

Let’s build the decision table, and guess what? It can be expressed as a test!

eight_test

1 test, 3 steps, 9 examples, 100% boundary analysis coverage, in English, readable, maintainable, clearly expressing the business value delivered. What more do you want?
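If you would rather generate those 9 rows than type them, a few lines of Python will do it. This is only a sketch under my own assumptions: the names are made up, and the expected outcome considers the two Credit Crunch rules only, assuming age and loan amount are valid.

```python
from decimal import Decimal
from itertools import product

# The "interesting" values for each variable: left boundary, boundary, right boundary.
SALARIES = [Decimal("29999.99"), Decimal("30000.00"), Decimal("30000.01")]
MONTHS_EMPLOYED = [11, 12, 13]


def expected_outcome(salary, months):
    """Credit Crunch rules only; age and loan amount are assumed to be valid."""
    return "Accept" if salary > Decimal("30000.00") and months >= 12 else "Reject"


# Every combination of interesting values, with the outcome the rules require.
decision_table = [
    (salary, months, expected_outcome(salary, months))
    for salary, months in product(SALARIES, MONTHS_EMPLOYED)
]

for row in decision_table:
    print(*row)

print(len(decision_table))  # 9, i.e. dim(salary) * dim(time_in_employment)
```

Only the two rows with a salary of 30,000.01 and 12 or 13 months in employment come out as Accept; the other 7 are rejections.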

One last thing; you might be in a situation where decision tables with many variables with large dimensions will require hundreds or even thousands of test cases. If these tests are run at the unit level I wouldn’t worry too much about the run time but if for instance you are testing some Javascript logic and are able to do so only through the UI this will require a long time to execute and not represent a good Return On Investment.

What can you do? There are many ways of reducing the number of tests to run while still maintaining relevant coverage. One technique is called pairwise testing; it is very straightforward and uses tools to quickly identify the highest risk tests that should be included and eliminate the ones with less risk associated. Pairwise testing is outside the scope of this document. If you are interested in knowing more about it, check this out! http://bit.ly/T8OXjZ

Test drive your user story!

7 Dec

A Demo to me is like a test drive. Say I want to buy a second-hand car and I see 3 of them I like in the showroom window, all nice and shiny, but I am not sure which one will have all I need. A good shop assistant asks me what my requirements are and suggests we try the first one.


The car is clean, smells like new, the guy tells me about performance, fuel consumption, shows me how to operate the GPS, how to engage cruise control and so on.

I’m cruising along, I like it, and I’m definitely going to buy it!

On the way back to the showroom the car breaks down.

How come? The guy just told me that the car had been serviced and was perfect! He is saying something about a small issue with the injector that they are going to fix in 10 minutes when I get back to the showroom. Will I buy it? Probably not. But mainly, will I trust this guy and try any of his other cars? You know the answer.

A demo is where we show our customers how we have implemented their vision and we demonstrate the newly added business value. A demo is also a point where IT development members should be proud of what they have achieved. A Demo is an incredibly powerful point for gaining or losing trust and confidence from our customers.

So what do I do to give a good demo? This is my recipe:

Prepare a “Demo Script”. It’s a small document that steps through the paths that I plan to follow when demonstrating my user story to the customers. This helps me focus on the business value I want to express and will be a useful “test drive” when verifying everything is as per the script before I talk to our customers.

It is absolutely fundamental that before the Demo I go through the script and verify everything is as expected. If the car salesman had taken the car for a spin, he would have spotted the injector fault, fixed it before the demo and probably sold me the car.

We MUST do the same with our user stories, in the same test environment that will be used for the Demo. Our customers don’t care whether it is an environmental issue or not; they know as much about complex software configurations as I know about injectors: very little.

You want the business users to trust you? Test drive your user story with a “demo script” and give a proud and confident Demo!

Can we use impact mapping to define a training strategy?

5 Dec

It was a cold December morning in Dublin, I was shivering and wet getting back to my desk after a cigarette break, when I asked myself, “can you use impact mapping to define a training strategy?” No idea I thought. There is only one way to find out…

This was my first attempt to use Impact Mapping, and I had been dying to try it out since I first heard about it at Agile Testing Days in Potsdam less than a month ago, let me tell you how it went.

I had previously scheduled a meeting with a colleague to decide our strategy to coach our development teams to use Acceptance Test Driven Development. The meeting was in the afternoon, I thought, let’s use this to try it out.

Here we are, me at the whiteboard my colleague sitting in front of me.

First question: “Why” do we need to coach the teams on ATDD? This helped us identify the real business value we were chasing, which in our case, after a few tries and some close calls, ended up being “Zero bugs detected in UAT”. Once we identified this, it opened the discussion on how to quantify the money value of the business value. This was an extremely interesting exercise and the discussion gave us a clear view of how this will make the company money and how to address management questions on this specific topic, pretty neat!

We got the biggest value when we asked ourselves what behaviours we needed to change in our customers to achieve success. This first of all sparked discussion on who our customers were, and we expanded our thinking from the development teams as a whole to the different individuals, roles and levels, and we also included management in the mix. This in turn sparked a lot of discussion on which category we needed to influence first.

We asked ourselves “who can help us achieve this and who can hinder us?”. This was a penny-dropping moment, as we quickly came to the conclusion that our customers for the first delivery were not all the development teams but some key influencers, as their behaviour change would influence other people’s behaviour as a consequence. This cut quite a lot of the groundwork, considering we have 7 development teams to coach. By thinking about who could hinder us we also came up with some specific approaches to influencing this group of people.

Now we had a business goal with a money value associated and we had identified the customers and the behavioural change. It was time to test a few options. This was quite a straightforward operation and it came naturally to us to decide that our first deliverable was going to be a Workshop for the influential people we had selected.

The map drawn on the whiteboard wasn’t anything artistic to be honest but here you go.

Training strategy impact map

(In case you’re wondering the sinking boat was our visualization of the impact of lack of team ownership :-) )

Always focusing on the behaviour we needed changing we managed to get into more of the details and the content of the Workshop and in less than an hour we ended up with:

Clear understanding of our business goal, its value and how to measure success (0 bugs in UAT)

Clear view of what the max value min effort first deliverable was going to be (workshop to key influencers)

Clear view of who our customers were and how to approach them (influencers, hinderers + development teams)

A high level plan of the content of the workshop to be delivered

This was one of the most energizing and productive meetings I have had in a long time, we came out of that room with clear plans and a nice feeling we are going to succeed.

Did we get it all wrong?

24 Nov

On Success Measure Vs Bug count and a brand new approach to building Successful products

Back from Potsdam (Germany) where I attended “Agile Testing Days”, I now had 48 hours to reflect on what I saw and heard.

Gojko Adzic presented a concept that I believe could represent a paradigm shift not only in testing but in the whole software delivery approach.

Agile-testing-quadrants

He says that we all got it wrong when applying one of the quadrants of Agile testing because in quadrant Q3 we have been focusing on criticizing the Product based on our internal understanding of how to build a successful product and paying little or no attention to the final customers’ opinion on whether the product is useful and successful or not.

To visualize this, Gojko came up with a model for software quality that mirrors Maslow’s hierarchy of needs, where the highest level in Maslow’s model (Self Actualization) corresponds to Successful in Gojko’s Software Quality Model. In this model the lower levels are a necessity for the upper ones to be relevant, i.e. if a product is not Deployable and Functionally OK, we should not care whether it is performant and secure or if it is useful because obviously if we cannot deploy it, it won’t get the chance to perform and be useful, you get the idea.

Looking at the pyramid we immediately realize that as a software delivery team we can only assure the 3 bottom levels of the pyramid, and to assure our product is Useful and Successful we need feedback from the final customer. We must involve our final customers in the feedback loop on our products; only they will really know if our product is useful and only they can make it successful or not. Gojko goes one step further and says that when measuring the levels we can apply a different level of focus. Maybe the bottom 2 levels should be delivered to be “good enough”, while moving up the pyramid we need to aim for “the more the better” as we get closer to Successful.

The most impressive part is yet to come and it is basically Gojko’s approach to measuring the Successful bit of the pyramid. He introduced a strategic planning technique based on 4 questions that he named Impact Mapping. Gojko says “An impact map is a visualisation of scope and underlying assumptions, created collaboratively by senior technical and business people”. In my opinion, the most revolutionary side of Gojko’s thinking is his focus on behaviour change. In the third question he asks “How should our actors’ behaviour change?”. By focusing on this aspect we are able to visualize the impacts that we want to see as a result of our product/idea.

Using Impact Mapping we are able to visualize and test our assumptions in our path to success. By allowing assumptions testing, Impact Mapping helps find the shortest and cheapest path to product success, not bad at all…

Impact Mapping is a brand new approach and Gojko says he doesn’t know yet if it will apply to every area of software delivery. It is up to the community now to test it, define applicability boundaries if any and improve it. You can count on me Gojko, I am up for it!

BTW, before you ask, yes I live in the real world.

Test Automation, help or hindrance?

12 Nov

On Slow Vs Fast, Co$tly Vs Cheap and stating the not so obvious

Automation testing is a must for agile teams that want to continuously deliver business value. Does test automation give value to agile teams? Automation testing gives value if it satisfies (at least) two important principles:

1)      Provide fast feedback to developers (SPEED)

2)      Be less expensive than manual regression testing over the application lifetime (CO$T)

SPEED is extremely important. Time is money. People don’t like waiting for something to happen while losing money, developers are no exception. Knowing that we will soon know if our code change worked or not, helps us re-factor that old piece of code that was unmanageable. If you will only know tomorrow that you have broken something, it might be very difficult to fix it because maybe 10 other people have pushed their changes after you and who knows who broke what? Imagine if it took you 1 day to compile your code, would you make that small optimization change? Be honest…

Fast tests give teams great benefit because they tell us straight away “well done!” you’re on the right path, or “hang on you made a mistake, fix it before it’s too late!”.

There are no two ways about it, slow tests are BAD. Developers hate running them because it is a pain, you either wait for them to complete and tell you how you did or you ignore them and go ahead with other changes. Both approaches are bad, while you wait for feedback you are losing money by not being able to code (time=money), if you make other changes you risk burying the “thing” you just broke under more broken code and guess what? You lose money!

CO$T is quite a big issue, isn’t it?

How do you like tests that are brittle and break as soon as something changes in the application user interface? VERY CO$TLY.

How about tests that take ages to run because they are highly coupled and need the full End to End (E2E from now on) test environment to complete? SLOW & CO$TLY!

Slow because they rely on so many systems; as a consequence they keep on breaking, but 80% of the time it’s a false positive because some System_XYZ that the test uses somewhere to provide some data was down, or Database_ABC was accessed at the same time by another user who messed up the test data. Damn! Rerun the suite again, TOMORROW you will know if it passes, maybe, hopefully, unless something else is broken :-( SLOW, CO$TLY and, worst of all, they become a hindrance to developers because they not only give no value but are counterproductive, wasting their time.

An automation strategy based on E2E tests run through the User Interface that follow the full application workflow has FAILURE written all over it. Why? Because E2E UI driven automated tests are SLOW and CO$TLY. Developers hate them and when they catch the odd bug they might not even investigate and follow up correctly because “…ah sure, it must have been something in the environment, like in the last 35 failures, damn test harness!”. They slowly become noise in the background and after a while nobody cares about them, they are abandoned and the automation effort is deemed a failure.

But, hang on, we can avoid this.

Don’t write slow, highly coupled, UI driven, brittle, costly E2E automated tests, do yourself a favour, just don’t.

Yes but we need them, how do we show we verify the acceptance criteria?

Each individual system in a complex architecture can be built to adhere to acceptance criteria, individually. Let’s focus on each system and target our efforts there first. Let’s also automate integration points but let’s not forget about what we are testing when doing integration: we are testing the interfaces only, not the functionality of the other system we integrate to!


Let me introduce you to Augusto’s 4 golden rules of Fast and Cheap automation testing. (yeah, that’d be me)

First – identify the application under test and focus: write a lot of tests that run against an individual system, they are much faster than coupled tests, they are also much cheaper because they won’t bother you with false positives. Remember, each system on its own can satisfy business acceptance criteria, it might take some time to formulate the acceptance tests describing business value but it is indeed possible. The business logic to be tested resides in the individual systems; test it where it is, not through another system.

Second – go under the hood: unless you are specifically testing the user interface, write tests that run against a system through a service layer rather than the user interface itself. They are much faster and far more maintainable (not affected by UI changes). Go under the hood! Focus on the logic to be tested, not the steps that are required to get to a certain state. Use mocks and stubs, invest in building such support tools tailored to your needs, they pay off, oh yes they do.
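To make “under the hood” a bit more concrete, here is a minimal sketch of a service-layer test that stubs out the other system. QuoteService and rates_client are hypothetical names I made up for illustration; the only real API used is Mock from Python’s standard unittest.mock module.

```python
from unittest.mock import Mock


class QuoteService:
    """Hypothetical service layer: converts an amount using a rate owned by another system."""

    def __init__(self, rates_client):
        self.rates_client = rates_client  # the remote system, stubbed in the test below

    def convert(self, amount, currency):
        rate = self.rates_client.get_rate(currency)
        return round(amount * rate, 2)


def test_convert_uses_rate_from_rates_system():
    rates_client = Mock()
    rates_client.get_rate.return_value = 1.25  # canned answer instead of a live call

    service = QuoteService(rates_client)

    # The logic under test is the conversion, not the other system.
    assert service.convert(100, "USD") == 125.0
    rates_client.get_rate.assert_called_once_with("USD")


test_convert_uses_rate_from_rates_system()
print("ok")
```

No browser, no shared database and no System_XYZ that happens to be down: the test runs in milliseconds and can only fail when the logic itself is broken.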

Third – when integrating, focus on interfaces, write integration tests between pairs of systems; focus on testing the interfaces between the systems only, do not duplicate the testing you have already done on the individual systems. There is nothing worse than trying to test the business logic in SystemB by using SystemA, don’t do it! Test SystemB business logic in SystemB with fast tests as part of golden rule #1. Integrate SystemA and SystemB to verify that the communication between the 2 systems is not broken, test only that the communication works, do not test the functionality of either of the systems at this stage (you have already done it in rule#1).

Fourth – use your brain and do not duplicate slow processes: if you can’t help it and you want to write E2E tests, drastically limit their breadth to the happiest paths you can think of, and make sure you run them in a dedicated environment to avoid test data corruption and involuntary resource contention. Automating E2E testing in complex systems is a bad idea; use exploratory testing on new features instead.

A few years back Mike Cohn came up with a test automation pyramid that describes how the automation effort should be distributed. Unit tests make up the majority of tests, followed by tests run through a Service layer, and at the top we have only a few E2E tests run through the User Interface. I love Mike Cohn’s pyramid, thanks Mike!

Automation Test Pyramid

To recap, I have illustrated my rambling in a “Dummy Test Automation Strategy For A Simple Multi System Architecture”. I am interested in your feedback, good or bad, please go for it, tell me what you think!

SCRUM, from failure to success

27 Oct

This is a description of what made our previously failing SCRUM team a success within our organization. The lessons learned through failure were as important as the final success.

Before the “Revolution”: Our organization had been going through a transition to SCRUM for around one year. In the process of transitioning we had delivered a couple of software projects. Such projects had been seen as a failure by our Product Owner (PO from now on) for not delivering business value, and by the development team for failing to deliver a quality product.

The goal: The software development department needed a big success to justify continuing the transition to SCRUM and we were all determined to deliver a great product to our customers to demonstrate how much we had learnt from our own previous failures. The PO was extremely sceptical about continuing with SCRUM and didn’t refrain from showing their feelings.

The Plan: One SCRUM team was created to work on “Project Revolution”. Our goal was to deliver, in a short period of time, quality software that would exceed our customers’ expectations and drive them back to embracing agile development.

The Focus: For the very first time we religiously adhered to SCRUM practices, we focused our efforts on Software Quality practices and built a solid relationship between the development team and the Product-Owner.

How we did it: We engaged with the PO from day zero and we tried to infect him with our enthusiasm for software development and quality. Before the start of the project we gave a demonstration of our Quality plans and new software development approach: Acceptance Test Driven Development. The PO showed interest in our approach beyond our expectations; he was blown away by the power of the tests’ ubiquitous language and clearly understood its potential value. He was also reassured by the demonstration of the Software Quality infrastructure we had built to harness the development of our application and took great interest in how acceptance tests were run and results reported. The development team discovered that something that initially seemed only a technical matter was very valuable to our PO.

The SCRUM framework was followed rigorously by everybody, including the PO, who in previous projects had been somewhat cut off from the core of the SCRUM team. This time we really started to discover SCRUM’s real benefits of fast feedback and continuous improvement. The ATDD approach gave great benefits by allowing us to front-load the discussions over incomplete/ambiguous requirements at acceptance test creation (the very start of development). We quickly discovered that front-loading such discussions would on one hand slow down development, but on the other hand would allow us to develop only once and only what was really required, rather than get to the final product by continuously fixing defects. Having a large bed of acceptance and unit tests gave the development team the confidence to refactor freely, and we were able to see the benefit of the fast feedback from our builds.

Transparency: A full transparency policy was adopted, we were all part of one team, there were no secrets among us.

Collaboration brings success: Slowly the PO started gaining confidence in our work, and at the demos he started saying things like “This is a fantastic job guys!” or even “You’ve done in a 2 weeks sprint what in the past we were used to getting in 3 months and at a level of quality that is not even comparable”. It’s easy to understand that when our PO started trusting us, we were able to go even one step further and propose our alternative solutions to him. While in the past such solutions were categorically refused and a command and control approach was used by the PO, we were now at a stage where full collaboration was the norm and feedback was working both ways.

Fun: The product was delivered on time with an excellent level of quality, the product’s business value exceeded the PO’s original expectations and, best of all, we all had great fun developing it.

Project Revolution was an amazing experience.

 

Industry: Credit Information

Project Scope: Credit Information Management System

Technology: Java (Spring), Tomcat, HTML, jQuery, SOAP, Oracle ESB

Tools: JUnit, Cucumber, Selenium, Crucible, Sonar, Jenkins, Maven
