Counterintuitive Facts in Software Development - the Testing Edition
The first edition of “Counterintuitive Facts in Software Development” (you can find it here) generated quite a bit of interest, and I received many valuable ideas for new counterintuitive facts from the community on LinkedIn.
This is why I would like to continue with a new edition, specifically around software testing.
I have added, in brackets, the names of the people who inspired me to include these topics, but note that these are not actual quotes and I am not trying to describe their thoughts - all opinions are entirely mine, so if you disagree, disagree with me.
This new list of facts has a somewhat different quality than the first one: the entries are more specific to the software domain. But again, it is clear to me that the underlying issues are fundamentally related to complexity.
The necessary disclaimer: there are hardly any absolute truths in software development - context is always important. Nevertheless, I think the statements below hold in most situations.
So let’s jump right in:
1) You can’t inspect quality into a system (h/t Roland Eschenburg)
One of the more traditional views is that specific team members are responsible for guaranteeing the quality of a software system. These people - commonly called quality assurance or testers - achieve their goal by analysing the built software more or less from the outside, based on a specification that describes the intended behaviour of the system.
Setting up responsibilities like this is based on the idea that a separation of concerns is the best way to objectively remove faulty behaviour from the system - in other words, the people who implement the functionality of the system should be different from the ones who verify it.
There is some value in this approach: developers are biased and tend to verify functionality according to their own understanding (which may not be the intended one).
Nevertheless, everybody involved in delivering software is responsible for quality. This starts with requirements definition (e.g. slicing functionality into small pieces) and extends to all other roles.
Testers can and should find issues, but they are typically involved late in the delivery process. Fixing issues found at this stage is expensive and requires a lot of effort. In addition, it is often difficult to uncover the complex underlying causes at this point, so quick fixes tend to be of a temporary nature. QA processes also run shortly before scheduled release dates, so there is high pressure to deliver something to production.
Real quality can’t be inspected in - it must be designed in; it is an intrinsic aspect of a system.
2) You can have too much test automation (h/t Marcos B)
Continuous Delivery is one of the essential practices of agile software development. It allows us to deliver quickly and in small batches. To be able to do this, we must automate the steps of our delivery process as well as possible. Automating our testing process is key - how can we deliver quickly if we need lengthy manual testing to prove that we are ready to roll out the latest changes to production?
Having good automated test coverage is thus crucial, but we need to be aware that test automation is software code. With code come complexity and a maintenance burden. Keeping a full set of automated test cases up to date and ensuring the availability of proper test data is potentially a lot of effort.
We are bound by the law of diminishing returns: once we have reached a certain level of automation, the effort required for further improvement grows disproportionately.
If the maintenance effort for our automated tests is too high, teams will eventually neglect them and achieve the opposite of the intended result: unreliable test results and decreasing confidence in the quality of the software.
It takes skill to avoid this: automated testing must be carefully designed around the most important business functionality. These automated tests should ensure that the crucial features of the software are covered well and that the team is confident they work as intended before a production release.
In other, non-critical areas teams may decide to skip automation, reducing the overall amount of code that needs to be maintained.
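To make this concrete, here is a minimal sketch of such a focused test, runnable with pytest. The pricing rule is a hypothetical stand-in for a crucial business feature - the point is to pin down the critical behaviour, not every edge of the implementation:

```python
# A minimal sketch of an automated test focused on a crucial business rule.
# The checkout_total function is a made-up, illustrative example.
def checkout_total(items, discount_cents=0):
    """Sum item prices (in cents) and apply a flat discount, never below zero."""
    total = sum(price for _, price in items)
    return max(total - discount_cents, 0)


def test_discount_is_applied_to_the_order_total():
    items = [("book", 2000), ("mug", 500)]
    assert checkout_total(items, discount_cents=250) == 2250


def test_discount_never_produces_a_negative_total():
    assert checkout_total([("sticker", 100)], discount_cents=500) == 0
```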
Just to avoid any doubt: low-level unit testing should have very high coverage, and writing unit tests should be part of any professional developer’s daily work.
I can’t help noting that in my experience as an engineering coach in bigger enterprise organisations, I have rarely seen cases where too much test automation was a problem. What I typically see is a significant lack of good test automation, severely impacting the teams’ ability to deliver often and quickly.
3) Testing in Production is unavoidable
“Testing in Production” has long been a kind of half-funny joke describing the practice of releasing insufficiently tested software to customers and thus putting the burden of finding problems on them. This approach is obviously unacceptable, and in most cases users will no longer tolerate it.
But looked at from a different angle, it reflects an uncomfortable truth: we will never be able to find all defects and potential issues before a release.
There are a couple of reasons for this:
Fully replicating production systems in our test environments is hardly possible. Even if we painstakingly replicate our tech stacks, we usually can’t replicate the infrastructure completely. It is also impossible to cover all the possible ways users interact with our systems - user behaviour is unpredictable.
These issues get even more complex if we consider distributed systems like microservice-based solutions, where we can run into all kinds of communication problems among the services. Failing connections between services may lead to cascading problems in downstream systems, asynchronous communication may cause race conditions, and overall system behaviour becomes hard to predict. It is even more difficult, if not impossible, to test these cases in a controlled environment.
What we need is an application architecture built around concepts of resilience and fault tolerance, and we need good insight into the runtime behaviour and health of our systems. Observability and monitoring are key to ensuring a satisfying user experience.
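As an illustration of one such resilience building block, here is a minimal sketch of a retry with exponential backoff in plain Python; the fetch_profile call is a hypothetical stand-in for a flaky downstream service, not a real API:

```python
import time

def call_with_retries(func, attempts=3, base_delay=0.1):
    """Retry `func` on connection failure, backing off exponentially."""
    for attempt in range(attempts):
        try:
            return func()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries - let the caller degrade gracefully
            time.sleep(base_delay * 2 ** attempt)

# Hypothetical downstream call that fails on the first two attempts.
_calls = {"count": 0}

def fetch_profile():
    _calls["count"] += 1
    if _calls["count"] < 3:
        raise ConnectionError("service unavailable")
    return {"name": "Ada"}

print(call_with_retries(fetch_profile))  # succeeds on the third try
```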
“Testing in Production” is often the only way to really figure out whether new functionality provides customer value. Patterns like A/B testing are widely used to collect hard usage data.
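As a sketch of how such an A/B split might be implemented: a common pattern is to bucket each user deterministically via a hash of their id, so that a stable, small share of real traffic sees the change. The experiment name and the 10% rollout share below are purely illustrative:

```python
import hashlib

def variant_for(user_id: str, experiment: str, rollout_percent: int = 10) -> str:
    """Deterministically assign a user to the 'new' or 'control' variant."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in [0, 100)
    return "new" if bucket < rollout_percent else "control"

# Route the request by variant, then record outcomes per variant so the
# hard usage data can be compared later.
print(variant_for("user-42", "checkout-redesign"))
```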
To sum it up: “Testing in Production” means that we accept that our test systems and procedures can’t model production behaviour completely. Based on a solid foundation of automated test coverage, we spend effort on instrumenting our production systems so that we can verify the “real life” behaviour of our changes with only minimal impact on the majority of our customers.
I have written a bit more about this here.
4) TDD is not (mainly) about testing (h/t Craig Statham)
The last one for today is mainly about a common misunderstanding of the concept of Test Driven Development (TDD). It is still worth including because it is something I have experienced many times.
A lot of people think that TDD is about writing the test before the implementation. That is true, but it does not capture the essence of TDD. The whole point of TDD is not the tests but designing the software for testability from the start - it is a design and implementation method, not a testing practice.
TDD forces the developer to think really hard about slicing the problem into the smallest testable parts, completely decoupled from the actual implementation, which is then added to the code base in multiple refactoring iterations.
It is a natural choice as a way of working in the world of continuous integration, as it focuses on the smallest possible changes and forces the developer to think hard about problems of coupling and cohesion.
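To make the cycle concrete, here is a minimal sketch of one red-green-refactor iteration, runnable with pytest. The slugify function is a made-up example chosen only to show the rhythm of the practice:

```python
# Step 1 (red): write the smallest failing test first - it pins down the
# desired behaviour before any implementation exists.
def test_slugify_lowercases_and_joins_words():
    assert slugify("Hello World") == "hello-world"


# Step 2 (green): add the simplest implementation that makes the test pass.
def slugify(text: str) -> str:
    return "-".join(text.lower().split())

# Step 3 (refactor): clean up with the passing test as a safety net, then
# repeat the cycle with the next small, testable slice of the problem.
```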
TDD is a great way to work - but doing it well requires a lot of practice.
Brevity is a virtue and I have tried to be brief - a lot more could be said about each of these topics.
Testing software - or better, assuring the quality of our systems so that they provide value and a pleasant experience to users - is an important part of our delivery process, and in many cases also a bottleneck.
Agility without a strong focus on software quality is not possible, and it is clearly the responsibility of the whole team to assure it.
If you have any additional ideas about counterintuitive facts around testing, don’t hesitate to add them to the comments!
Note: I have received a lot of inspiration for additional counterintuitive facts in the area of delivery processes and software architecture and will cover them in a follow-up edition.