Interfaces are overused

Interfaces are overused in a lot of C# code, and in some cases are not merely extra baggage, but are actively the wrong thing to do.

These interfaces tend to be created upfront, with the object, in advance of any need of them; and are wide, mirroring all the public parts of the object.

Apologies if you think this is obvious. However, I see this misuse of interfaces in everyday C# code in multiple companies; and mentioning it on twitter generated a lot of replies. In fact, the most engagement in months. So clearly there’s something to say here.

When do you want to have an interface

Over-use of interfaces comes about from rote over-adherence to an often useful pattern: the idea that a class has an interface and then gets mocked in unit tests. This is inseparable from questions on “how best to unit test”.

It is said that a class must have an interface “for testability”. This statement might be misleading: adding an interface to a class doesn’t make that class more testable, it makes other classes that use that class more testable, when they use the interface instead. This is a form of the Dependency Inversion Principle - AKA the D in SOLID.

To use the most common example, a CustomerStore class contains code that talks to a data store. It has methods to GetById, Save etc. The data store could be a SQL database, it could be AWS DynamoDB, or other, it doesn’t matter for this example. Can you unit test this database code? No. A test that involves a database is not a unit test. Although you can integration test it instead.

Decorating the CustomerStore class with an ICustomerStore interface will not change that class’s testability. What it will do is allow code that uses a CustomerStore to instead use the ICustomerStore and be tested with a substitute implementation. You can separate out code that is coupled to the database from the rest, which can now be tested without that database dependency.

// this is very simplified
interface ICustomerStore
{
  Customer GetById(int id);
  void Save(Customer customer);
}

You don’t add an interface in order to test the code, you add an interface in order to test with the code. To maximise the percentage of the code that can be unit tested by isolating the parts that cannot.

Such interfaces typically have “one and a half” implementations. There is the real implementation, in the example that is class CustomerStore: ICustomerStore { ... }. And for test, there might be anything from an in-memory version that e.g. stores data for later retrieval, to something fragmentary generated by a mocking framework, which only has a single canned response to one of the method calls.

This usage of interfaces is not actually how they were first thought of. When interfaces were first conceived, they were intended to replace abstract base classes, and capture cross-cutting concerns. e.g. The IDisposable interface shows the concern that the object needs a cleanup notification when it goes out of scope, nothing else, and is implemented in many different ways. IComparable allows sorting. And they each have only one method.

Allowing an object to implement multiple interfaces allows those concerns to be expressed as types. This is the Interface Segregation Principle - AKA the I in SOLID. Although IMHO, this is not an independent principle, it derives from the “Single Responsibility Principle” expressed in the language of type contracts: allow a class to have multiple (interface) types, each with a Single (contract) Responsibility.

By contrast to IDisposable, the ICustomerStore is not a cross-cutting concern at all, and it is rare for the CustomerStore to implement several interfaces. It is understood that ICustomerStore is tightly coupled to CustomerStore - it is merely the front door to that house.

However, for enabling unit tests, this use of interfaces is a very useful tool. And it is overused when applied by rote to everything.

When is it a bad idea?

Data Transfer Objects should never have interfaces to the data

A Data Transfer Object typically contains just data from some source: It is a row from the database. It is a response from a http service. The field names and data shapes match the requirements of serialisation. It is nothing but state, it usually has few or no methods.

e.g.

class CustomerResponse
{
  public string FirstName { get; init; }
  public string FamilyName { get; init; }
  public DateTime? DateOfBirth { get; init; }
  public Address Address { get; init; }
}

The data interface might look like:

// NB: this is bad code
interface ICustomerResponse
{
  string FirstName { get; init; }
  string FamilyName { get; init; }
  DateTime? DateOfBirth { get; init; }
  Address Address { get; init; }
}

There is no benefit to this kind of interface on these objects. An interface allows for another seperate implementation, but the entire implementation of a DTO is the data in it. You will never need different implementation, just different data; the data values are the whole thing, there’s nothing else to implement. You can test with this data already: Anything that you want to do in test by mocking the interface can be done more easily by simply newing up a DTO instance with the required data in it.

Value objects should not have interfaces to the data

A value object is a small strongly typed value, used to avoid “primitive obsession”.

A value object could contain values like e.g. an x,y,z co-ordinate (e.g. three integers), or keys such as an AccountId e.g. a string, but with validation rules: all account ids are strings, but only some strings are account ids. The characteristic is that that two different objects containing the same value should be considered equal. This is true for AccountId’s and 3-d points.

A DateTime is a good example of a value object that we might define if it wasn’t already built-in. Note that the System.DateTime struct actually implements five or six interfaces, but all of them are small and focused around cross-cutting capabilities: e.g. be a Dictionary key, or be sorted in a list. Most of these interfaces are implemented by a large number of unrelated types.

But none of these interfaces are going to be interfaces to the Date and Time data, e.g. there is no interface with a int Hour { get; } property! as with a DTO, it would serve no purpose.

A value object can be implemented as a struct, and are a good case for the C# 9 record type syntax. It’s hard to do more with less code than e.g.

 record ThreeDCoordinate(int X, int Y, int Z);

This gives you a value object with correct equality semantics, in very little code. Yes, it implements object overrides such as Equals and interfaces such as IEquatable. It is also immutable. Being immutable works very well with value objects: change the value, and it’s no longer the same object.

The value object pattern is IMHO very much underused in C#. I hope that this changes with record types. But this might be partly because of the assumption about how to implement it: I have seen more than one implementation of a value, floundering as a class with an interface to the data.

// NB: this is bad code
public interface IAccountId
{
  string value { get; }
}

This is pointless, this will get in your way. A value object can be unit tested just fine as is. e.g:

  var id1 = new AccountId("validvalue123");
  var id2 = new AccountId("validvalue123");
  Assert.True(id1.Equals(id2));

  var id3 = new AccountId("validother456");
  Assert.False(id1.Equals(id3));

  Assert.Throws<InvalidOperationException>(() => new AccountId("not valid"));

But the issue is not testing them, it is how to test with them. I have never seen a case where the interface adds a useful testability seam. Would you say “We need to pass in an ITimeSpan instead of a TimeSpan so that we can mock the result returned we add an hour? That is dysfunctional.

Value objects work much more like an int than like a CustomerStore. You just pass in the value that you want to test with. Sometimes you make an interface for how to get these objects, e.g. a ITimeSource (also called an IClock ) so that you can test with a clock that returns a specific time. This is an example of when a factory is a useful pattern.

Pure static functions are useful

“Just a function” is useful, where it’s a small amount of code, no or few dependencies, and “pure”, i.e. no state or side effects. e.g. If the code is

static bool IsBankWithExtraComplianceRules(string bankName)
{
    return 
      string.Equals(bankName, "MegaGloboBank", StringComparison.OrdinalIgnoreCase) ||
      string.Equals(bankName, "ConglomoFinCorp", StringComparison.OrdinalIgnoreCase);
}

Does that look simplistic? I have seen code much like that, but only after we have extracted common code and made it use consistent casing rules. It can itself be unit tested just fine - call it with various values, and check the result:

  Assert.True(BankHelpers.IsBankWithExtraComplianceRules("megaglobobank"));
  Assert.False(BankHelpers.IsBankWithExtraComplianceRules("highstreetbank"));

But the issue is how to test with this function, how to test code that calls it. You want to cover both true and false cases. Does this need to be put on a “service” with an interface for that? Almost certainly not. Where you can test all the outcomes by controlling the inputs to the method under test, you won’t need to mock the function in order to control the return value from it. Sometimes this is as simple as

  var complianceData = new ComplianceData
  {
    Bank = "megaglobobank",
    ...
  };

  var result = RunComplianceRules(complianceData);

  // assert expected outcome with extra compliance rules

Do you know that the IsBankWithExtraComplianceRules function was called? The fact that this function was called or not is not an interesting business outcome. The extra rules being run (or not) is. Test that this happens.

A test is not always scoped to a method or a class. You should be able to refactor by extracting the method, and the tests still pass and add value. Fine-grained, close-coupled unit tests are not always best.

When to extract static functions

When should you extract a function? Don’t Repeat Yourself is a good rule, but bear in mind that it is a guideline and not an absolute imperative, it can have costs as well as benefits. So it needs to be weighed against other concerns. It is most worthwhile when:

The function is used three or more times. Once is happenstance, twice can be co-incidence, but three times is a pattern that is likely to continue.
It is really the same thing, and is not accidentally identical. DRY refers to “Every piece of knowledge” not every block of code, and there are cases where one block of code could serve different roles. Or to put it differently, in SRP language, an extracted function should not have more than one “reason to change”.
It does not introduce much coupling. e.g. coupling between classes, between layers, between projects in a solution, between services and between binary packages. The further-afield the coupling, the worse. Do not extract a 2-line function into a NuGet package and share it across services “because DRY”. Coupling is a worse problem than repetition.

Be careful to collect like functions with like. A single static class called “Helpers” that does all kinds of different things is a code smell. You can usually tell that it’s an issue, because different parts have different dependencies, different concerns and different reasons to change.

In one company that I worked at, there was a library called “GeneralFunctions” that did everything from SQL databases to logging to string slicing. And it became a pain point. e.g. How did you update the ADO database driver? Go through GeneralFunctions, and be sure not to break any of the dozens of other projects using it, maybe just for string slicing but who knows, really. Or the other way around: not being able to change string utilities because it would break the database driver version was a real and self-inflicted problem.

In time, the motto became “Just say no to General Functions”, it being personified as an difficult old military man who really should have retired already.

If this needs to be NuGet packages, then it should be several of those: e.g. LoggingFunctions, DatabaseFunctions etc. This should then divide up the code according the the packages that it depends on. e.g. only DatabaseFunctions would depend on the ADO database driver.

Builders

Would you mock a StringBuilder in order to get the rest test value returned from the final concatenated string? Or would you just feed the correct strings in. Builders are designed to be used like this, there is no need to mock them to get this result.

Services

“Service” is one of those words that is often so general and overused as to not mean much. On classes that are not a data store or otherwise wrapping external code, and also contain business logic or other “doing stuff” code, consider not implementing interfaces if you can get away with it. Add an interface if it becomes necessary, not to pass code review or some inflexible rule.

Avoid testing implementation details, test behaviors. Often the number of classes involved is an “implementation detail” that you want tests not to change even while it is be refactored.

Consider if you can test with e.g.

  var mockStore = Substitute.For<ICustomerStore>();
  var controllerUnderTest = new CustomerController(new CustomerService(mockStore));

The question then is: does this give you better tests or not? is it easier to refactor under test? If not, then introduce the ICustomerService interface to solve a problem that you have.

Unit testing from outside

This suggests that e.g. testing using the ASP test host is a good place for a lot of tests. I think that this is as far from the code as you can go and still do unit tests without a lot of friction:

It’s still fast, but only if you set it up for that, not if you do full host setup and teardown on each test.
You test from the outside, so there is some work to do to get to your code- e.g. serialise requests going in and de-serialise responses coming out. This can be minimised by using helper and wrapper methods.
It’s possible to use mocks, but it takes some work and therefor is typically done for only a few key services (that contain external dependencies) rather than the usual explosion of mocks.

Tests that use dozens of mocks are an anti-pattern

Tests that over-use mocks are fragile, as they are tightly coupled to the code and break when any method signature or implementation detail changes. Therefor they discourage refactoring and make the code a lot less malleable. They are often hard to read, as the mock syntax is often complex and verbose. Exactly what is being tested is often hard to discern.

It propagates the idea (as seen on a CV) that “unit testing with NSubstitute” is an actual and desirable skill. A mocking framework is not a unit testing framework, and a particular unit testing framework is not the skill of unit testing.

Even when testing with interfaces, mocks are overused. It is actually easy to make a “fake implementation” of the interface that does some of the expected behaviour, e.g.

class FakeCustomerStore: ICustomerStore
{
    public List<Customer> Customers { get; } = new List<Customer>();

    public Customer GetById(int id) => Customers.FirstOrDefault(c => c.Id == id);
    
    public void Save(Customer customer)
    {
        Customers.Add(customer);
    }
}

The Customers list is public so that a test with a reference to the class type can inspect the contents, to verify e.g. if a customer was saved to the list.

This fake be as simple or as complex as you like, it’s just your code it doesn’t need a framework to set up a specific behaviour. It has the benefit over the usual mocking-framework style that once defined, FakeCustomerStore can be used any number of times with fewer total lines of code than specifying mock setup each time.

Sometimes you don’t need to mock a dependency when the fake or null object is built in. e.g. for Microsoft.Extensions.Logging.ILogger there is NullLogger.Instance.

Factory pattern

There’s whole lot of jokes about Java that it over uses the “Factory” pattern. e.g. “I had a problem, so I thought to use Java, now I have a ProblemFactory”. Well, the Factory pattern is a useful one, sometimes. e.g. the IClock mentioned above is a DateTime factory. The joke is that the common style is to do it always, without thinking, regardless of if it solves a problem or not - rote over-adherence to an often useful pattern. Much like interfaces in C#.

Fin

It’s good and appropriate to interface a class in some cases - especially isolating classes that encapsulate external dependencies, and can’t be unit tested in this definition so that their consumers can be unit tested.

However this idea gets overused, and should not be an inflexible rule. it becomes a rote exercise of “I’ve made a class, now I have to make the class’s interface”. Because of tradition, maybe? Adding interfaces to all “service” classes containing business logic is unnecessary, and adding interfaces to classes that contain merely data, just to expose that data, is going to make it less usable.

Anthony Steele

Bloggy