Rescue your end-to-end tests with service virtualization

John Davenport

Continuous Delivery is generally considered one of the most effective ways of reducing delivery times and improving quality. However, adapting to its disciplines can be costly and difficult, particularly for large projects. These problems can escalate into blockers when enterprises wish to adopt the methodology, yet to reap the benefits of Continuous Delivery you cannot compromise on its principles. One common issue is that traditional end-to-end tests introduce non-determinism (tests that pass one day and fail another). In the context of continuous delivery at least, the majority view is that non-determinism in tests must not be tolerated, but in practice the solutions are not easy to implement. In this article I shall examine some of the issues raised by experts and the solutions offered, and then propose a simpler solution, namely ‘Service Virtualization’.

The problem with End-to-End tests

Automating end-to-end tests is a significant challenge. Steve Smith, a respected Agile and continuous delivery consultant, takes a provocative stance on the subject in End-To-End Testing Considered Harmful. In Steve’s opinion:

End-To-End Testing is an uncomprehensive, high cost testing strategy. An end-to-end test will not check behaviours, will take time to execute, and will intermittently fail, so a test suite largely composed of end-to-end tests will result in poor test coverage, slow execution times, and non-deterministic results.

For those wishing to bring continuous delivery into the enterprise, opinions such as Steve’s suggest we are failing to find solutions to real problems. If end-to-end tests are considered harmful then I fear many developers will avoid them and wilfully take short-cuts. We all ought to be aware of the importance of getting this right. RBS clearly did not when it suffered high-profile retail banking IT outages in 2012. Apart from all the other costs, it was fined £42m by the FCA (the UK Financial Conduct Authority).

For the organisations I have been involved with, end-to-end tests were considered essential to ensure confidence that all existing functionality still works and no regressions have been introduced. The high business value placed on delivering a fully-functional product, combined with the difficulty of automating end-to-end tests in a reliable and repeatable fashion, means that when presented with this tradeoff, the majority of teams choose to sacrifice automation and give up on the notion of Continuous Delivery.

Manage your dependencies

On re-reading Steve’s blog, I saw this:

End-to-end tests check implementation against requirements by verifying a functional slice of the system, including unowned dependent services

That last phrase made me sit up. Do end-to-end tests really have to have unowned dependencies? Is that the source of the problem?

First, let’s consider other testing stages. In unit tests, developers generally use test runners that precisely control the environment in which their code is tested (if you want to read further, see Martin Fowler’s amusing blog on the origins of xUnit test frameworks). In performance testing, if the performance of a particular component is a concern, developers exclude other components to ensure relevant parameters, such as latency, CPU, network, database or memory, can be easily measured. Similarly, integration tests will have specific purposes, such as ensuring that two or more components can communicate correctly, but the test will be performed in a controlled environment. The same applies to integration-level performance tests, as it is usually important to exclude components that would interfere with, or obscure, the measurements being made (for example, excluding a call to a third-party test environment that is known to be slower than the production system, or stubbing out a database when the test is intended to measure CPU requirements under peak load).
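To make this contrast concrete, here is a minimal sketch of the kind of control a unit test exerts over its environment. The names (`OrderCalculator`, `get_price`) are hypothetical; the point is only that the dependency is replaced by a stub whose behaviour is entirely determined by the test, so the result is the same on every run.

```python
# A hypothetical component that depends on a remote pricing service.
from unittest.mock import Mock


class OrderCalculator:
    """Computes an order total using an injected pricing service."""

    def __init__(self, price_service):
        self.price_service = price_service

    def total(self, items):
        # Ask the pricing service for each item's price and sum them.
        return sum(self.price_service.get_price(item) for item in items)


# In the test, the real (possibly slow, possibly flaky) service is
# replaced by a Mock whose behaviour is fully under our control.
stub = Mock()
stub.get_price.side_effect = lambda item: {"apple": 2, "pear": 3}[item]

calculator = OrderCalculator(stub)
assert calculator.total(["apple", "pear", "apple"]) == 7
```

Because the stub is deterministic, this test cannot fail intermittently; any failure points directly at the code under test.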

In contrast, in the case of end-to-end tests, control of what is included seems to be abandoned rather than selectively managed. Everything is included, whether it is understood or not. Components are included over which no control can be exercised. The result seems inevitable: when problems occur there is a lack of understanding, and diagnosis is either long-winded or not done at all. There is a danger that the project gets bogged down and attempts at automation grind to a halt.

Do not tolerate non-deterministic tests

The same issue of including everything in end-to-end tests also emerged in Just Say No to More End-to-End Tests, a Google testing blog post. The Google post has been reviewed in detail, and in my view successfully debunked, by Bryan Pendleton in On testing strategies, and end-to-end testing, so I won’t repeat that here; read Bryan’s post instead. However, one additional issue for me is that Google got around the problem of end-to-end tests not passing by lowering the acceptance threshold, so that a test with a 90% success rate is considered ‘passing’. In fairness, this has been Google’s approach for many years, but I believe it leads to a tacit acceptance of random failures and, in turn, an inherent instability in the test suite. The fundamental issue, however, is that Google had abandoned the approach they adopted throughout the rest of their testing and included all dependencies in their end-to-end tests. You can see why they might do it with their massive resources, but if the result is flaky tests then it is a waste of those resources in my opinion.

The Google test blog was also referred to by Adrian Sutton in Making End-to-End Tests Work. Adrian is a Senior Developer at the continuous delivery pioneers LMAX Exchange. He reports that they have 11,500 end-to-end tests running in under 20 minutes and that they are:

invaluable in the way they free us to try daring things and make sweeping changes, confident that if anything is broken it will be caught

It is also very clear that LMAX does not rely completely on end-to-end tests but has a comprehensive set of other tests:

huge numbers of unit tests, integration tests, performance tests, static analysis and various other forms of tests.

LMAX is a mature Java development, but it does not have as much legacy to deal with as most corporates do. They have put in a lot of effort to isolate tests from unstable or intermittent elements. Adrian says:

We’ve been fighting back hard against intermittency and making excellent progress – we’ve recently added the requirement that releases have no failures and green builds are the norm and if there are intermittent failures it’s usually only one or two per run.

Unlike LMAX, however, the reality of most enterprises is that there are many separate applications built over many years, which leads to organisational control issues and makes control of the test process far more difficult. Test processes then become subject to delays, as cross-functional teams require considerable management co-ordination.

Ultimately we may all agree on the need for end-to-end tests, and perhaps only disagree on how to achieve them. Martin Fowler, in his blog Eradicating Non-Determinism in Tests, describes the main issues and offers solutions. Many of the solutions proposed, and more besides, appear to have been successfully applied by LMAX, but some are complex and require specialist knowledge. Having to maintain additional code and test harnesses is something to be avoided, because every line of code requires maintenance and adds to the engineering burden that drags a development closer to becoming legacy.

A services architecture suggests a simpler solution

In contrast to small projects, and to larger projects where continuous delivery is a success (LMAX, Netflix and Etsy, for example), many large enterprises may have several hundred legacy applications. Given how poorly understood these systems are, and the costs involved, it is unlikely that their test systems can be maintained to the requirements of a continuous delivery programme that needs to turn around thousands of end-to-end tests in under an hour. Those systems present the greatest barrier of all to the adoption of continuous delivery, as they are a no-go zone for such methods. Furthermore, attempting to maintain test systems to the standards demanded by continuous delivery carries a significant cost in both system resources and human resources.

However, it is fortunate that most enterprises actively developing ecommerce systems have adopted the Service Oriented Architecture (SOA) pattern to ensure that new systems are not closely coupled to the legacy systems. The fundamental properties of SOA can be exploited if we look a little deeper. First of all, whether a SOA is based on SOAP or REST, or, if you prefer, on APIs or microservices, they all conform approximately to the 10 Principles of SOA listed by Stefan Tilkov (the post is a little old, but I like it because Stefan is clear and concise, unlike a lot of other writers on the subject of SOA!). My summary is briefer still, as I want to pick out the key properties that can be exploited in testing:

  • Explicit boundaries, meaning shared context or state is also avoided
  • Shared contracts or schemas, which are explicit and well-defined (and generally very slow moving)
  • Autonomous, defined as “Services can be changed and deployed, versioned and managed independently of each other”. The service provider can introduce new features, and consumers can choose when to upgrade.
  • Loosely coupled. What this means varies, but in general properties of the service such as location, number of consumers, etc. are irrelevant to the consumer.

These principles of SOA mean that the service provider and consumer can be considered decoupled in almost all practical respects. SOA allows service providers to be switched without impact, as long as they conform to the service interface. There may be some residual technical coupling to the transport mechanism (e.g. to HTTP), but it is then a small step to simulate all the essentials of a service using a service virtualization approach.

In a Service Architecture, my experience is that many of the problems of non-determinism can be resolved more simply by simulating service behaviour with service virtualization and thus removing test dependencies.
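As an illustration of the idea (not of any particular tool’s API), the sketch below stands up a tiny in-process HTTP server that honours the same contract as a hypothetical stock service. The consumer depends only on that contract, so it cannot tell the virtual service apart from the real provider; dedicated tools such as Hoverfly do this generically, and the `/stock/42` path and response shape here are invented for the example.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class VirtualStockService(BaseHTTPRequestHandler):
    """Impersonates a legacy stock service at its HTTP boundary."""

    def do_GET(self):
        # Return a canned, deterministic response (this toy server
        # answers every GET as if it were a request for SKU 42).
        body = json.dumps({"sku": "42", "in_stock": True}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep test output quiet


# Port 0 asks the OS for any free port; the socket is listening as
# soon as the server is constructed.
server = HTTPServer(("127.0.0.1", 0), VirtualStockService)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The system under test talks to the virtual service exactly as it
# would talk to the real one, over the same transport and contract.
url = f"http://127.0.0.1:{server.server_port}/stock/42"
response = json.loads(urllib.request.urlopen(url).read())
assert response["in_stock"] is True
server.shutdown()
```

Because the virtual service’s responses are canned, the end-to-end test that exercises this boundary becomes deterministic, without any change to the consumer’s code.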


So, in conclusion, end-to-end tests are too valuable to be discarded. However, non-determinism in your end-to-end tests, and indeed in any tests, is a serious problem. Do not be persuaded that lowering the success ratio at which a test is considered a ‘pass’, in order to tolerate intermittent errors, solves the problem: it undermines the incentive to deal with the underlying causes. Non-deterministic tests can be dealt with using a variety of techniques, but it is a complex issue.

Effective testing strategies must be factored into your architecture from the start. If your project has a Service Oriented Architecture or a microservice-based architecture, then you should consider using service virtualization to simplify your tests. The result should be that end-to-end tests can be automated with less effort and, perhaps more importantly, that they will be considerably more reliable. This should help you achieve the benefits of continuous delivery at a reduced cost.

My colleagues and I at SpectoLabs are building up a body of advice within this website on what development processes and techniques are needed in order to successfully adopt service virtualization.

We at SpectoLabs are also developing an open-source tool, Hoverfly, to make it easier for developers to employ service virtualization throughout the delivery lifecycle. We are in the process of adding additional enterprise and scalability features, drawing on experience from our other open-source tool, Mirage, which was developed for British Airways by OpenCredo.

Further reading