Anthony Steele

Bloggy

Decouple your unit tests

Most unit testing in C#, .NET and ASP.NET code could be done better: less coupled. That would produce better outcomes, better code, and make your life as a developer easier.

Below I present a better way that you might already be aware of but probably would benefit from using more.

I was sold on the better, less coupled way not by a good logical argument but by working with these approaches. This is all from experience: I have worked with these testing styles for years on end.

Preamble

Interfaces are overused, mocks are overused but the test pyramid is still relevant.

Most tests that we see in practice are too closely coupled to the code under test.

There is demo code in a GitHub repository for this blog post, so that the techniques can be worked through. But bear in mind that this code does nothing; it is merely a reworking of the sample ASP.NET “weather forecast controller”. In itself, it is far too simple to need the testing done here. But it must be small to be a readable demo stand-in for a much larger application, one that would have many more classes with multiple methods arranged in more layers.

The demo code is a .NET 8 ASP.NET Core Web API. It has a controller that calls a service, which calls a repository that presumably does the data retrieval. So far, so familiar.

In a real application there would be sufficient complexity to make multiple layers a good idea, and multiple repositories that would have e.g. a concrete dependency on a database, along with other ways to get data from and send data to HTTP services, message queues etc.

The one repository stands in for all of those. We cannot unit test this repository, so we must give it an interface, and then swap in a different implementation for tests. Then we can test the rest of the application without the real repository.
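A minimal sketch of what that interface might look like (the interface name appears in the demo’s test setup later in this post; the method name here is an assumption for illustration):

```csharp
// The repository is the one class that needs an interface: it exists
// so that the real data store, which talks to a database, can be
// swapped for a fake implementation in tests.
public interface IWeatherForecastDataStore
{
    // method name assumed for illustration
    IEnumerable<WeatherForecast> GetForecasts();
}
```
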

The same app is tested in multiple ways.

Isolated style

The first example has isolated, “Mock the interface” tests. Sadly, this is the default and by far the most common style.

I have seen a co-worker get very good at this style. While I was saying “100% coverage isn’t necessary, important, or even always possible”, they turned in 100% coverage, even of the exception handling. And yet the app was still a terrible codebase to work on. Tech debt was the main impediment to progress, but refactorings just didn’t happen.

A large body of tests in this style can be as much a tedious liability as it is a useful safety net.

The Isolated tests

    [Fact]
    public void Controller_Should_CallService()
    {
        var service = Substitute.For<IWeatherForecastService>();
        service.Get().Returns(new List<WeatherForecast>
        {
            new WeatherForecast
            {
                Date = DateOnly.FromDateTime(DateTime.Today),
                Summary = "Testy",
                TemperatureC = 20
            }
        });

        var controller = new WeatherForecastController(
            new NullLogger<WeatherForecastController>(), 
            service);

        controller.Get();

        service.Received().Get();
    }

    // etc

In WeatherForecastControllerIsolatedTests.cs, each class is tested in isolation with mocks. The test coverage is high. Mocking code is everywhere; it is verbose and repetitive.

A mock repository is injected into the service, and then separately, a mock service is injected into the controller. Both service and repository must have interfaces for this. We verify that the correct method was called.

The code smells that we typically see include tests with dozens of mocks; verbose and repetitive mock setup to test very little logic; and, for extra insanity, key business logic in AutoMapper mappers that are called in the middle of the code but are not part of the test, being mocked out instead.

The one good feature of the code above is that instead of declaring a mock logger, it makes use of the pre-defined NullLogger that exists for cases like this.

Sociable style

    [Fact]
    public void Controller_Should_ReturnExpectedData()
    {
        var forecastDataStore = CreateMockWeatherForecastDataStore();
        var controller = new WeatherForecastController(
            new NullLogger<WeatherForecastController>(), 
            new WeatherForecastService(forecastDataStore));

        var response = controller.Get();

        Assert.NotEmpty(response);
        Assert.Equal("Testy", response.First().Summary);
    }

    // etc

This style is called “sociable unit tests” where an assembled subsystem is tested.

The Sociable tests

In WeatherForecastControllerSociableTests.cs we test an assembled stack of real controller, real service and mock repository.

The mocking code is extracted for re-use, which makes it less verbose.
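For example, the CreateMockWeatherForecastDataStore helper used above might look something like this (a sketch assuming NSubstitute, as in the isolated tests; the data store’s method name is an assumption):

```csharp
private static IWeatherForecastDataStore CreateMockWeatherForecastDataStore()
{
    // one shared mock setup, written once and reused by many tests
    var dataStore = Substitute.For<IWeatherForecastDataStore>();
    dataStore.GetForecasts().Returns(new List<WeatherForecast>
    {
        new WeatherForecast
        {
            Date = DateOnly.FromDateTime(DateTime.Today),
            Summary = "Testy",
            TemperatureC = 20
        }
    });
    return dataStore;
}
```
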

This is better in some ways. We focus a bit more on testing outcome not implementation. It’s more about “do we return the expected value?” and less “did we call the expected method?”.

If this approach is taken, then we can refactor the service at will. The service does not even need an interface at all, and that extra declaration should be deleted. At this point the foolish dogma of “always put an interface on every class” is revealed to be useless.

You should also be able to “extract class” as a refactoring in some cases without changing tests, and still verify that nothing is broken.

Decoupled style

The decoupled style was the last one that I encountered, and it took me a while to get used to it as the main engine of unit testing.

I started by thinking “this is a lot of indirection to accomplish the same outcome”. And then a few months later I noticed that I had got into the habit of starting a feature task by first making a failing test. Without touching a single line of app code I was able to add a test that “when the new thing happens, then the outcome should result”. It felt like starting building work by putting up a scaffold that would support the construction, safely. The tricky part was already done once I had a failing test.

Test First

This was a breakthrough. “Test first” is good, or so I had always heard. Yet it wasn’t prevalent in most places where I have worked. They didn’t do it much, and I didn’t do it much. Why? Was it purely because I lacked the self-discipline or the innate skill? Instead, it seems that a major factor was that the test styles I had used until now did not support test-first coding.

Consider this: if “refactoring” is changing code under test, then how well can you refactor if any change breaks tests? Tests should support change, not hinder it.

A large body of tests written in a coupled, mocked, “test after” style seems to make it much harder to even begin to test first, or to refactor.

But with this decoupled approach, we have the freedom to refactor the code under test liberally. A service-layer class is not doing anything except forwarding calls, and could be deleted? Go ahead; the tests should still compile and pass. Split one large controller into two? No problem! Do away with controllers entirely and use minimal web API routes instead? Fine, it’s under test!

But is it Unit tests?

My view is yes: these outside-in tests are unit tests. Simple as that. However, this is mostly a semantic distinction; you can get the same value from them if you think otherwise. But you should understand what they do and don’t have.

The idea that they must be “integration” tests because they test multiple classes is, IMHO, a complete misunderstanding of what is being “integrated”: the term refers to external dependencies, not to classes.

What is a “unit” in “unit tests” anyway? The views on what counts as a “unit” that I have sympathy for are:

Consider the definition of unit tests where “A test is not a unit test if: It talks to the database; It communicates across the network; It touches the file system” (from “A Set of Unit Testing Rules”, Michael Feathers, 2005). These decoupled tests meet all of those criteria, by configuring the test host to replace such dependencies with fakes.

Consider the other definition of unit tests, where they are small, fast, cheap, numerous and reliable, and can be run frequently and concurrently: these decoupled tests meet all of those criteria too.

By reasonable and pragmatic definitions, these are unit tests.

Kent Beck:

“Unit tests are completely isolated from each other, creating their test fixtures from scratch each time.” In this view, the word “unit” in unit testing refers to the test itself: unit tests are isolated from other tests. Beck argues that “tests should be coupled to the behaviour of code and decoupled from the structure of code.”
(The Unit in Unit Testing)

Ultimately, the only views that are wrong and harmful are that “a unit test always tests a method on a class, in isolation” and that “every class has a matching test class, no more and no less is needed”.

So what are Integration tests?

Again, these are hard to define, and you will need to forge your own definition.

But, pragmatically, they are “tests that come after unit tests”, i.e. slower but less numerous, a bit higher up the test pyramid.

So if a test meets the standard for a unit test, it lives there and isn’t an integration test. In my terms, integration tests do the things that unit tests don’t: “A test is an integration test if it queries an actual database, communicates over HTTP across the network, etc.” In other words, it is an integration test if and only if it integrates one or more real dependencies on an external service, be that as simple as file storage or as complex as a SQL database or message-queueing system. They are not “I/O-free”.

I like this definition since, as with unit tests, it’s about things that matter to testing, and has absolutely nothing to do with the number of classes or other units of code organisation under test.

In the end, names aren’t universal, and this matters less than the ability to safely deploy good code. So your project or organisation might need to supply local definitions of names, i.e. what you do and don’t expect to see in each test layer. I present that as a choice that I think leads to good outcomes.

The Decoupled tests

    [Fact]
    public async Task GetForecast_Should_ReturnData()
    {
        var response = await _testContext.GetForecastTyped();

        Assert.NotNull(response);
        Assert.NotEmpty(response);
        Assert.Equal("Testy", response.First().Summary);
    }

    // etc

We use the Test Host. Contrary to what that page says, this host is not just for “integration tests” that “include the app’s supporting infrastructure, such as the database, file system”. Those can be mocked here.

    public class TestApplicationFactory : WebApplicationFactory<Program>
    {
        protected override void ConfigureWebHost(IWebHostBuilder builder)
        {
            builder.ConfigureServices(services =>
            {
                // this is where services that have concrete dependencies
                // are replaced by fakes/mocks

                RemoveService<IWeatherForecastDataStore>(services);
                services.AddSingleton<IWeatherForecastDataStore>(new FakeWeatherForecastDataStore());
            });
        }

        // a typical implementation of the RemoveService helper:
        // drop the real registration so that the fake can replace it
        private static void RemoveService<T>(IServiceCollection services)
        {
            var descriptor = services.SingleOrDefault(d => d.ServiceType == typeof(T));
            if (descriptor != null)
            {
                services.Remove(descriptor);
            }
        }
    }

The demo app has a TestApplicationFactory that replaces repositories with mocks - there is only one that needs to be replaced in this simple app, but it can be as many as needed.

I have never found this technique to be “too slow”.

We can create the TestApplicationFactory anew for each test like this, or abstract it into an injectable TestContext. In this demo it is shared, for even better performance, in cases where sharing doesn’t affect the tests.
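With xUnit, one way to share the factory across the tests in a class is a class fixture. A sketch, assuming xUnit; the TestContext shape here is illustrative, not the demo’s exact code:

```csharp
// Created once per test class and shared by all of its tests,
// so the in-memory app host is only built one time.
public class TestContext : IDisposable
{
    private readonly TestApplicationFactory _factory = new();

    public HttpClient Client => _factory.CreateClient();

    public void Dispose() => _factory.Dispose();
}

public class WeatherForecastDecoupledTests : IClassFixture<TestContext>
{
    private readonly TestContext _testContext;

    // xUnit constructs the test class with the shared fixture injected
    public WeatherForecastDecoupledTests(TestContext testContext)
        => _testContext = testContext;

    // tests use _testContext here
}
```
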

There’s a bit of overhead to get it all set up, but this is a one-off cost, so it matters less on larger apps. With a few helper classes it’s fairly transparent, and as your test suite grows, this layer becomes more worthwhile: the tests stay concise and readable, and do not have to specify everything at the HTTP request level.

You can do it all in the test method, including the WebApplicationFactory code, but that technique doesn’t scale at all to many tests.

A technique that scales a bit better is doing it all in a test base class, but soon that class becomes far too large and lacks coherence, i.e. “favour composition over inheritance”.

You would be better off with separate helper classes for common code: the wrappers and abstractions that you might find in a larger app’s test suite.
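For instance, the GetForecastTyped helper used in the decoupled test above could live in such a helper class. A sketch: the route is the ASP.NET template default, assumed here, and `_factory` is the shared TestApplicationFactory:

```csharp
// A typed wrapper over the raw HTTP request, so that tests read
// at the domain level rather than the HTTP level.
public async Task<List<WeatherForecast>?> GetForecastTyped()
{
    var client = _factory.CreateClient();
    return await client.GetFromJsonAsync<List<WeatherForecast>>("/weatherforecast");
}
```
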

The only classes in the app that need interfaces in this design are the ones that need to be mocked for unit testing, because they have dependencies such as databases that can’t be unit tested. Other interfaces are noise, and can simply be deleted.

It’s clear that this style tests more of the application as well: if you mess up your HTTP configuration so that the route is wrong, or forget to register a necessary service, you’ll know it, whereas the other styles don’t catch that before the actual “deploy and run tests on the deployed app” integration tests.

We also demo a FakeWeatherForecastDataStore. This is a fake implementation, instead of a mock made with a mocking tool. It tracks calls with a CallCount property. This technique is also underused: in many cases it is simpler, clearer code than the equivalent using a mocking framework.

What I typically find is that this kind of fake is simpler but a bit more verbose (more lines of code, but simpler lines of code) than the mocking-framework equivalent. But then the mocking equivalent gets repeated multiple times in the codebase, adding up to far more lines of code than declaring a “fake data store” or “in-memory repository” once.
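A sketch of what the FakeWeatherForecastDataStore might look like (the CallCount property is described above; the interface’s method name is an assumption):

```csharp
public class FakeWeatherForecastDataStore : IWeatherForecastDataStore
{
    // call tracking without a mocking framework
    public int CallCount { get; private set; }

    public IEnumerable<WeatherForecast> GetForecasts()
    {
        CallCount++;
        return new List<WeatherForecast>
        {
            new WeatherForecast
            {
                Date = DateOnly.FromDateTime(DateTime.Today),
                Summary = "Testy",
                TemperatureC = 20
            }
        };
    }
}
```
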

I made another example of an in-memory data store here, with more complete Create, Read and Update operations.

public class FakeCustomerRepository : ICustomerRepository
{
    private readonly List<Customer> _data = new();

    public void Add(Customer customer)
        => _data.Add(customer);
    
    public Customer? Get(int id) 
        => _data.SingleOrDefault(c => c.Id == id);
        
    // etc
}

Typically these are backed by a List or Dictionary but it’s code, it can do whatever you code it to do.

End note

Decouple your unit tests. If you find yourself unable to do a simple “extract class” refactoring because it would both break existing tests and require new tests for the new class, then something is wrong: the app code is too coupled to the test code.

Use the Test Host for more of your tests. Push mocks to the edges of the app. Use them for the sinks where state leaves or enters the app.

Favour state-based testing over interaction testing: favour tests that test outcomes, such as returning a result, enqueueing a message, or correctly modifying state in a data store, over tests that verify that the code calls the expected method.

State-based testing is always preferred. The primary reason is that it is less coupled to your code, so you can more easily change your code without changing the tests.
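To make the contrast concrete, a state-based test against the FakeCustomerRepository shown above (a sketch; the Customer’s Name property is an assumption):

```csharp
[Fact]
public void Add_Should_StoreCustomer()
{
    var repository = new FakeCustomerRepository();

    repository.Add(new Customer { Id = 1, Name = "Testy" });

    // assert on the resulting state, not on which methods were called
    var stored = repository.Get(1);
    Assert.Equal("Testy", stored?.Name);
}
```

If the repository were later restructured internally, this test would keep passing, which is exactly the decoupling the text argues for.
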

Interaction vs State-Based Testing

I don’t advocate for these “decoupled” tests to be the only kind of test, just the default kind. About 80% of the test coverage can be decoupled, depending on the specifics of the app. They are a “Should” recommendation.

There will be business logic cases where you are better off dropping down to a class-level test and pumping many test cases into that subsystem instead. Even then, these might be “sociable tests” that cover multiple classes, since the current exact subdivision of the code into classes is not the test’s concern: you’re testing the behaviour of the code. You should aim to minimise the inevitable coupling to the structure of the code.

I didn’t come to this position through theoretical reasoning: it was given to me by a team already using it. And it worked for me, even better than I thought it would. It could work for you too.

But if you set out testing afterwards with mocks, you will lock in that pattern.

Define your terms: document what you do and don’t expect to see in each test layer. Reach a local understanding, even when one is not possible across the industry.

The second-order conclusion is that, as with Continuous Delivery, it’s the downstream effects that deliver the big benefits over time. And they support each other: if you can make that refactoring with confidence due to tests, then the next step is to deploy it continuously, and see it working through to production. Then you can incrementally maintain and increase quality.

And quality and feature productivity are not a trade-off; they correlate. I.e. you choose to be good at both, or you succeed at neither. See “Accelerate” for more on this.

Semantic drift happens too, such that over time a practice such as “unit testing” ends up substantially different and easier, and often worse, delivering few of the original benefits.

Also, you can work for many years in “good practice” employers without ever seeing first-hand what good really looks like, or even knowing that better exists. After all, “we do testing” ticks the box.