Our journey to F#: The effect of F# on our (unit) tests

On our journey from C# to F#, we reached a point where we changed our (unit) testing strategy. Our approach just didn’t feel right anymore. We struggled to do TDD because we couldn’t get the tests to fail first. We wrote tests that didn’t increase our confidence and started thinking of them as waste.

This post is part of the F# Advent Calendar 2021.
Thanks to Sergey Tihon for organising the Advent Calendar.


Our software (TimeRocket) has an Angular/TypeScript front-end and a backend with a mix of C# and F# – we started with C# and have built new sub-systems in F# since May 2020. The backend is layered, with the WebApi calling the business logic (it’s a bit more complicated than that, but this will do for this blog post).

When starting with F#, we copied our (unit) testing strategy from our existing C# code:

We have two kinds of unit tests.

The first kind is written around algorithms (algorithm tests). The algorithms are pure, so they are easy to test: call the algorithm with some inputs and check whether the outputs are as expected. Most algorithms are just a couple of lines of code, but some more complex ones span several hundred lines.

The second kind is written against the business logic layer (so-called operation and query specs). These specs simulate a single call from the WebApi – in response to a call by the client – to the business logic and check whether the expected result is returned and all expected side effects were executed (data persisted in the database, messages sent to the service bus, calls to other services). We write most of these tests using Test Driven Development.

We have had great success with this unit testing strategy in our C# code base: few defects, and code that is easy to refactor because the tests are decoupled from the implementation details and don’t get in the way. In addition, our tests serve as documentation: they are easier to read and understand than the production code.

A critical aspect of our unit testing strategy is that we don’t optimize for zero defects. We optimize for few defects and a system that is easy to fix when a defect gets into production. Our guideline is that we add tests as long as we are uncertain that the code is correct.

Struggling to get a failing test first

We started to question our unit testing approach in the context of F# when we struggled to get the tests to fail first. In TDD, a test should always fail first to make sure that you test something meaningful. But the F# compiler sometimes makes it hard not to implement things fully on the first try – otherwise the code just won’t compile. Of course, we could get any test to fail, but taking an extra detour only to see a failing test doesn’t make sense. So we asked ourselves, why do we write these tests again? The answer is that we write tests to have enough confidence to put the code into production.

The F# compiler gives us more confidence

The F# compiler removes a couple of uncertainties that the C# compiler can’t.

Discriminated unions – closed type hierarchies – make it easy to use pattern matching with the confidence that all cases are covered. This is not possible in C#, where we only have open type hierarchies using interfaces. Additionally, we found modelling our domain with discriminated unions to be much simpler than with the possibilities given by the C# type system. Many possible error cases in our C# models are simply not representable in our F# models.
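
To make the point about closed hierarchies concrete, here is a minimal sketch. Only the case name SelectedEmployees appears in the samples of this post; the type shapes, the other two cases and the function appliesTo are assumptions made up for illustration and not our real model.

```fsharp
open System

// Hypothetical model for illustration only. SelectedEmployees is taken from
// the samples in this post; AllEmployees and NoEmployees are assumptions.
type EmployeeId = EmployeeId of Guid

type LatitudeData =
    | AllEmployees
    | SelectedEmployees of EmployeeId list
    | NoEmployees

// Every case is handled. If a case is added to LatitudeData later, the
// compiler reports an incomplete pattern match (warning FS0025) right here.
let appliesTo (employeeId: EmployeeId) (data: LatitudeData) : bool =
    match data with
    | AllEmployees -> true
    | SelectedEmployees employees -> List.contains employeeId employees
    | NoEmployees -> false
```

With warnings treated as errors, a forgotten case stops the build, so no test is needed just to prove that every case is handled.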

C# has statements and expressions. In F# there are no statements, just expressions. So the compiler makes you do something with every return value (even if it is just unit); a value that is silently dropped gets flagged. That makes it much harder to forget things by accident while coding. In C# we had quite a few tests that ensured that we did not forget to handle a return value (the danger comes from refactorings). In F#, these tests are unnecessary because we would have to actively |> ignore the result value.
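
A small sketch of what this looks like in practice. The function save is a hypothetical stand-in for any operation that returns a value; it is not part of our code base.

```fsharp
// Hypothetical function standing in for an operation that can fail.
let save (name: string) : Result<int, string> =
    if name = "" then Error "name must not be empty" else Ok name.Length

let run () : string =
    // Writing just `save "Rocket"` on its own line here would produce
    // warning FS0020: the result is implicitly ignored.
    // Discarding the value has to be spelled out:
    save "Rocket" |> ignore

    // More typically, the value is handled:
    match save "" with
    | Ok length -> sprintf "saved %d characters" length
    | Error message -> sprintf "failed: %s" message
```

The explicit |> ignore shows up in a code review, which is why we no longer write tests for a forgotten return value.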

But the most significant difference for us is computation expressions. They allow us to write business logic code that hides the wiring of asynchronicity (async/task) and error handling (result) – no if cascades or exception throwing as in C#. There is often so little noise that the business logic code is very obvious. And in obvious code, bugs can’t hide. So it made no sense for us to repeat the flow of some business logic in a test and check for every detail of how data is passed from function to function. Another consequence is that we don’t need the tests and specs as documentation anymore. The F# production code is now easier to read than the tests and specs!
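
As an illustration, here is a minimal sketch of a result computation expression. FSharp.Core does not ship a result builder, so the sketch defines a tiny one; the builder and the functions parseHours, checkRange and bookHours are assumptions for illustration, not our production code. The sketch covers error handling only; combining it with async works along the same lines.

```fsharp
// Minimal builder for Result, defined here because FSharp.Core has none.
type ResultBuilder() =
    member _.Bind(m: Result<'a, 'e>, f: 'a -> Result<'b, 'e>) : Result<'b, 'e> =
        Result.bind f m
    member _.Return(x: 'a) : Result<'a, 'e> = Ok x
    member _.ReturnFrom(m: Result<'a, 'e>) : Result<'a, 'e> = m

let result = ResultBuilder()

// Hypothetical validation steps.
let parseHours (text: string) : Result<int, string> =
    match System.Int32.TryParse(text) with
    | true, hours -> Ok hours
    | false, _ -> Error (sprintf "'%s' is not a number" text)

let checkRange (hours: int) : Result<int, string> =
    if hours >= 0 && hours <= 24 then Ok hours
    else Error "hours must be between 0 and 24"

// The happy path reads top to bottom; the first Error short-circuits the rest.
let bookHours (text: string) : Result<string, string> =
    result {
        let! parsed = parseHours text
        let! valid = checkRange parsed
        return sprintf "booked %d hours" valid
    }
```

There is no if cascade and no exception in bookHours, so a test that only re-checks how the value flows from one step to the next would add little.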

In the end, we can reduce the number of tests because the F# compiler gives us more confidence than the C# compiler. And as I wrote earlier, we write as many tests as needed to provide us with a comfortable level of confidence.

But we still write tests!

We still write algorithm tests and operation/query specs, but fewer of them.

Algorithm tests help us to build an algorithm step by step. That didn’t change with the switch from C# to F#. The steps are a bit bigger because F# gives us the confidence to take bigger steps. We only test for real domain-specific aspects of the algorithm, not for “did I write the correct syntax and did I pass the arguments correctly” – as we often did in our C# code because of statement-based code.

And we still write our operation/query specs in an outside-in TDD style, but fewer of them compared to our C# code. We write them less to make sure that the code is correct and more to keep our focus on the behaviour we want to achieve. It just matches our thinking from the user’s perspective. Additionally, the operation/query specs give us the confidence that the composition and wiring of our system are working correctly. In our situation, this is complicated by the mix of C# and F#, and the modular architecture with several sub-systems that interact through loosely coupled interfaces. Modularization is excellent for having a flexible system, but it lowers confidence that it just works. The operation/query specs provide us with the needed level of confidence.

Finally, we can run our specs against an in-memory simulation of the database or against the real database. In-memory for a fast TDD cycle and with a real database for maximum confidence – SQL queries are still rather error-prone.
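
One way to get such a switch is to put the storage behind a small record of functions. The following is a sketch under assumptions: LatitudeStore, createInMemoryStore and saveAndLoadSpec are made-up names and do not show how our Bootstrapper really works.

```fsharp
open System.Collections.Generic

// Hypothetical storage port: the business logic only sees these two functions.
type LatitudeStore =
    { Save: string -> string list -> Async<unit>
      Load: string -> Async<string list option> }

// In-memory simulation for the fast TDD cycle.
let createInMemoryStore () : LatitudeStore =
    let data = Dictionary<string, string list>()

    let save (coordinate: string) (employees: string list) : Async<unit> =
        async { data.[coordinate] <- employees }

    let load (coordinate: string) : Async<string list option> =
        async {
            match data.TryGetValue coordinate with
            | true, employees -> return Some employees
            | false, _ -> return None
        }

    { Save = save; Load = load }

// The same spec body can run against any implementation of the port,
// for example one that talks to a real database.
let saveAndLoadSpec (store: LatitudeStore) : Async<bool> =
    async {
        do! store.Save "B/4" [ "E"; "F" ]
        let! loaded = store.Load "B/4"
        return loaded = Some [ "E"; "F" ]
    }
```

Running saveAndLoadSpec (createInMemoryStore ()) gives the fast feedback loop; passing a store backed by the real database would exercise the SQL as well.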


F# helps us to maintain the same level of quality with fewer tests. Fewer tests mean that we gain development speed because we don’t have to write these tests and, more importantly, we don’t have to maintain them.


A sample algorithm test:

let ``coordinate of latitudes with data = SelectedEmployees and employee is contained are returned`` () =
    let employeeId = "E" |> Guid.generate
    let coordinate = Example.create ()
    let latitudes =
        { Latitudes.Coordinate = coordinate ; Data = SelectedEmployees [ employeeId ] }

    let result = Latitudes.getCoordinatesForEmployee employeeId latitudes

    result =! [ coordinate ]

The test checks whether getCoordinatesForEmployee works correctly by returning all entries in latitudes (in the sense of permissions) that are relevant for the specified employeeId. The code shown covers the case of explicitly specified employees. There are other tests for the other cases.

Guid.generate is a helper function that generates an identifier of the correct type (here EmployeeId) consisting of a Guid with all Es (it repeats the last character of the passed string).

Example.create () is a helper function that creates sample data. As long as type inference can handle it, we don’t have to specify a type.

Latitudes.getCoordinatesForEmployee is the call to the function to be tested.

result =! [ coordinate ] uses Unquote to check the result.

A sample operation spec:

let ``set latitudes`` () =
    async {
        let bootstrapper = Bootstrapper.create ()

        let dimensions, values, activityStructure =
            ActivityStructure.parse "A[1=B[3;4=C[5;6]];2]"

        let! applicant = Setup.applicant' bootstrapper
        let employees =
            [ "E" ; "F" ] |> List.map Guid.generate<EmployeeGuid>

        // authorize
        do! Setup.permissionForApplicant' bootstrapper OperationNames.SetActivityStructureLatitudes

        // set latitudes for a coordinate
        let operationId = Example.create ()
        let setInstant = "24.11.2021 08:00" |> Application.parse
        let operationData =
                    ActivityStructureId = activityStructure.ActivityStructureId
                    Coordinate = (dimensions.["B"], values.["4"])
                    Data = LatitudeData.SelectedEmployees employees
                { Example.create () with
                    Application = setInstant
                    Applicant = applicant.UserId
            |> AsyncResult.assertSuccess

        // query latitudes for a coordinate
        let! result =
                (dimensions.["B"], values.["4"])

        // the latitudes are set
        result =! LatitudeData.SelectedEmployees employees

                (OperationName OperationNames.SetActivityStructureLatitudes)
            |> Check.operationLog
            |> Check.operationPresentation Check.Undoable.No [] []
            |> Check.run

This spec checks whether the system can save the latitudes for a coordinate in the tree of activities. When tracking activities in our system, you can link them to a coordinate in the tree of activities. Think of the tree as something like this:

  • Customers
    • Space Inc.
      • Projects
        • Space Rocket
        • Moon Lander
    • Sea Inc.
      • Projects
        • Deep Diver

Bootstrapper.create (): the Bootstrapper is the factory that can build the entire backend with simulated databases, service bus, SignalR etc.

ActivityStructure.parse: we create a tree (activity structure) with the help of a parser.

Setup.applicant': create an employee that acts as the person invoking the operation later on (applicant).

Setup.permissionForApplicant': set the permission that allows the applicant to execute the operation.

The part under “set latitudes for a coordinate”: execute the operation.

AsyncResult.assertSuccess: assert that the operation succeeded (throws an exception otherwise).

The parts under “query latitudes for a coordinate” and “the latitudes are set”: check that the data in the database is updated.

The Check pipeline at the end: we use the Check module to check things that all operations have to support: logging, whether the operation supports “undo” and how the operation is presented in the list of the last operations.

Find all blog posts about our journey to F# here.

This blog post is made possible with the support of Time Rocket, the product this journey is all about. Take a look (German only).

About the author

Urs Enzler

