What is that all about the repository anti pattern?

There are a lot of discussions going on about the repository pattern. I also started one at the last usergroup meeting together with Julie Lerman. Generally the generic repository is considered an anti pattern as stated by Greg Young, he offers a solution by applying a composition pattern to maximize code reuse. Mike from SapiensWork even calls it a complete anti pattern. Ayende calls it since a long time the new singleton. Ayende also suggests in his older articles to either use directly NHibernate’s ISession or RavenDB’s IDocumentSession. Complex queries should be placed into query objects according to his article. So do we really need a repository? This article tries to answer this question.


Let us look into how a generic repository would look like.

interface IRepository<T> {
IEnumerable<T> FindAllBy(IQuery query);
void Add(T item);
void Delete(T item);
...
}

This approach has the following drawbacks:

  • We need to instantiate for each entity a new repository of T
  • Like Greg states we are losing expressivness when we are using FindAllBy(new CustomerByNameQuery(“Daniel”))

What about when we apply Gregs composition approach?

Public class CustomerRepository {
    private Repository<Customer> internalGenericRepository;
    Public IEnumerable<Customer> GetCustomersWithFirstNameOf(string name) {
         internalGenericRepository.FindAllBy(new CustomerFirstNameOfQuery(name));
    }
}

This would be a bit more expressiv because we have meaningful methods which describe what actually is done instead of FindAllBy(new QueryObject(…)). But still there is one major drawback:

  • We need to instantiate a concrete class for each entity type we want to fetch data from.

Let us try even another approach. What about that repository:

interface IRepository {
IEnumerable<T> FindAllBy(IQuery query)
 where T : Entity;
…
}

We suddenly don’t need to instantiate a repository for each entity. We can fetch data from multiple entities with the same repository instance. But still it has a drawback:

  • It is missing expressivness

How could we overcome this issue? There is actually a solution with extension methods. We can use extension methods to cover the query objects behind the scenes and give this approach more expressivness.

public static class CustomerRepositoryExtensions {
   public static IEnumerable<Customer> GetCustomersWithFirstNameOf(this IRepository repository, string name) {
      return repository.FindAllBy(new CustomerFirstNameOfQuery(name))
   }
}

Now we can have the following:

public class Foo {
   public Foo(IRepository repository) {
      this.repository = repository;
   }
   public void Bar() {
      var customers = this.repository.GetCustomersWithFirstNameOf("daniel");
   }
}

That looks kinda nice. It is easy to test. We don’t need a real database for the Foo class. But stop! What about transaction management and where is the session in case of NHibernate or something similar? In case of NHibernate we can directly use ISessionFactory.GetCurrentSession() to get the current session and then start a transaction.

public class Foo {
   public Foo(ISessionFactory sessionFactory, IRepository repository) {
      this.sessionFactory = sessionFactory;
      this.repository = repository;
   }
   private ISession Session {
      get { return this.sessionFactory.GetCurrentSession(); }
   }
   public void Bar() {
      using(var tx = this.Session.BeginTransaction()) {
         var customers = this.repository.GetCustomersWithFirstNameOf("daniel");
         // modify customers
         tx.Commit();
      }
   }
}

Suddenly we have NHibernate not only in the query objects and the repository implementation (someone needs to pass the current session of the call scope into the query) but also on the business logic. And also we have two dependencies which need to be injected in every class which needs database access. We can solve the first problem with an easy adapter for the session and the transaction which even allows nested scopes:

    public class UnitOfWork : IUnitOfWork
    {
        private readonly Func rootScopeFactory;
        private readonly Func dependentScopeFactory;
        public UnitOfWork(ISessionFactory sessionFactory, Func rootScopeFactory, Func dependentScopeFactory)
        {
			this.sessionFactory = sessionFactory;
            this.rootScopeFactory = rootScopeFactory;
            this.dependentScopeFactory = dependentScopeFactory;
        }
        public IUnitOfWorkScope Start()
        {
            IUnitOfWorkScope scope = !CurrentSessionContext.HasBind(this.sessionFactory)
                                         ? this.rootScopeFactory()
                                         : this.dependentScopeFactory();
            scope.Begin();
            return scope;
        }
    }
	public class UnitOfWorkScope : IUnitOfWorkScope
    {
        public UnitOfWorkRootScope(ISessionFactory sessionFactory)
        {
			this.SessionFactory = sessionFactory;
        }
        protected ISessionFactory SessionFactory { get; private set; }
        protected ISession Session
        {
            get { return this.SessionFactory.GetCurrentSession(); }
        }
        protected ITransaction Transaction
        {
            get { return this.Session.Transaction; }
        }
        public virtual void Begin()
        {
            if (!this.Session.Transaction.IsActive)
            {
                this.isTransactionCreated = true;
                this.Session.BeginTransaction();
            }
        }
        public virtual void Commit()
        {
            this.isScopeCompleted = true;
            if (this.isTransactionCreated)
            {
                this.Transaction.Commit();
            }
        }
        public virtual void Rollback()
        {
            this.isScopeCompleted = true;
            this.Transaction.Rollback();
        }
        public void Dispose()
        {
            this.Dispose(true);
            GC.SuppressFinalize(this);
        }
        protected virtual void Dispose(bool disposing)
        {
            if (disposing)
            {
                if (!this.isScopeCompleted && this.Transaction.IsActive)
                {
                    this.Rollback();
                }
                if (this.isTransactionCreated)
                {
                    this.Transaction.Dispose();
                }
            }
        }
    }
	public class UnitOfWorkRootScope : UnitOfWorkScope
    {
        public UnitOfWorkRootScope(ISessionFactory sessionFactory)
            : base(sessionFactory)
        {
        }
        public override void Begin()
        {
            ISession session = this.SessionFactory.OpenSession();
            CurrentSessionContext.Bind(session);
            base.Begin();
        }
        protected override void Dispose(bool disposing)
        {
            base.Dispose(disposing);
            if (disposing)
            {
                ISession session = CurrentSessionContext.Unbind(this.SessionFactory);
                session.Close();
                session.Dispose();
            }
        }
    }

Let’s revisit our Foo class, shall we?

public class Foo {
   public Foo(IUnitOfWork uow, IRepository repository) {
      this.uow = uow;
      this.repository = repository;
   }
   public void Bar() {
      using(IUnitOfWorkScope scope = this.uow.Start()) {
         var customers = this.repository.GetCustomersWithFirstNameOf("daniel");
         // modify customers, other class could att
         scope.Commit();
      }
   }
}

Nice we gained a little bit of abstraction in our business code regarding the persistency layer but we still have two dependencies in our Foo class. But do we really need the repository abstraction? NHibernate’s ISession is itself an unit of work. So why don’t we just move these few members onto the scope?

    public interface IUnitOfWorkScope : IDisposable
    {
        void Begin();
        void Commit();
        void Rollback();
	IEnumerable FindAllBy(IQuery query)
          where T : Entity;
        ...
    }

We can even move away the generic restriction which would then allow to use projections in the query too. This simplifies Foo to the following implemention (still using the extension method approach but this time on IUnitOfWorkScope).

public class Foo {
   public Foo(IUnitOfWork uow) {
      this.uow = uow;
   }
   public void Bar() {
      using(IUnitOfWorkScope scope = this.uow.Start()) {
         var customers = scope.GetCustomersWithFirstNameOf("daniel");
         // modify customers, other class could att
         scope.Commit();
      }
   }
}

Isn’t that nice? No need for repositories. Unit testable. Simple and small abstraction. All persistency related stuff in either in the query or in the unit of work implementation and only there we need tests which either go to the database or use InMemory databases. But there is still one caveat regarding testability. Can you spot it?

About the author

Daniel Marbach

15 comments

  • Hi Daniel

    Looks nice so far. But didn’t you mix up and simplify things a little bit too much here? For me the Repository is the separation of Domain Objects and Persistent Entities. It’s also a separation of the client from the used persistency technology (either NHibernate, EF, SharePoint, ADO.NET, WebService or whatever). As client side calling code, I really don’t care, where and how my data is stored.

    Of course the interface of the repository should be expressive enough and not generic as this was mentioned many times by you and the different contributors.

    But why do you mix together the Repository with the UnitOfWork and the transaction handling? Is it something NHibernate specific? Did I miss your point?

    Cheers
    Thomas

  • Daniel, I am a big supporter of the Repository pattern, however I am against the Generic Repository. About NHIbernate or any ORM, I think they are an implementation detail of a repository. For UoW, when dealing with aggregate roots the AR itself is a unit of work implementation. When you save an AR, all changes are saved as one unit by the repository.

    I wouldn’t mix the UoW with the Repository, in fact very rarely I explicitely use UoW since I let the AR handle that functionality. What I do use explicitely is Transaction, inside and outside of a repository. I let the DI Container handle the scope of it.

  • Hy Mike

    Thanks for providing your opinion. I’m also a huge fan of UoW handling by using DI container or aspect oriented style approaches. But not every application can use explicit business call context. Usually when you have web applications it is really easy because you have an explicit call context and this can easily be handled by good DI containers. In my last project we had a more complex setup. The application contained a lot of autonomous business components which needed to fetch and modify data but could also be part of a business transaction started from a client interaction. This requires a bit more sophisticated business scoping.

    I aggree that any ORM should be an implementation detail of a repository.

    Daniel

  • Hi Daniel

    Really interesting post. But I’m struggeling with the UnitOfWork class here:
    – Why it is internal?
    – I’m missing the sessionFactory field/property
    – Why are there two scopes? (Is it because of nested scopes?)

    Greets

    Patrick

  • Hy Patrick
    – Changed it to public.
    – Intenionally left out 😉 Stripped away a bit of the code so that adaptors need to think and not copy paste 😀
    – Because of nested scopes

    Daniel

  • Everything is a pattern or an anti pattern based on context. In DDD terms, a repository should be used only to fetch AggregateRoots and that too, by their Ids. As such, FindBy* is something that misses the point completely. Furthermore, an Aggregate is the atomic consistency boundary. As such, transactions spanning Aggregates are pointless – anything that’s outside the Aggregate boundary is not expected to be immediately consistent. That eventual consistency needs to be accepted and modelled properly. From another angle, Aggregates exposing their state to the outside (for FindBy* queries) are a point of coupling. Such state is usually used for querying. A way out of this is to separate the query data from the data that makes complicated decisions – CQRS. And Event Sourcing can be used to avoid exposing state for Aggregates. These approaches help in solving complicated problems with lots of ifs, then, whens and temporal aspects.

    In such systems, the read data is typically stored according to queries needed. Data as duplicated – i.e. viewmodels are persisted. This significantly reduces the complexity of the data and as such, reduces the complexity of the data access mechanism. Instead of complicated operations, things boil down to sorting and filtering. With this, the complexity required from the data access tech is also reduced.

    In a CRUD application, these patterns can backfire – resulting in complexity itself (although there may be some non-functional benefits). In such systems, an abstraction for data access can be useful. However, as Ayende points out, ISession is already an abstraction. For these scenarios, ISession or even a repository on top of it may be a good solution. It’s only when these are used in logical, behavioural, temporal models that complexities arise and they seem inadequate.

    Finally, the approach you’ve outlined may work nicely for CRUD applications where the “model” is basically an ERD diagram over the database. This can be fine for many applications. But then, DDD is probably not appropriate for those anyway.

  • Hi Daniel

    You presented an interesting idea here. Thanks for sharing it with us. However, when I played with your idea I stumbled upon two issues that bother me a bit.

    The first one is the visibility of the generic FindAllBy(IQuery) method in the IUnitOfWorkScope interface. Users of IUnitOfWorkScope are supposed to call the specific query methods made available through extension methods. But as a user of IUnitOfWorkScope it’s very easy to circumvent extension methods and directly call the generic FindAllBy() method instead. Then we are back to the situation which Greg Young in the cited blog post described as follows:
    “It could literally be running any query that could be expressed within the QueryObject (read: any possible query). In order to now figure out what the contract actually entails one would have to go look through the domain (and possibly the UI (ugh) depending on where the query objects originate).”

    His internalGenericRepository approach on the other hand allows hiding this unwanted possibility for shortcuts.

    My other issue concerns the claimed testability. I see no problems testing the extension methods. But I cannot see a way to unit test methods like the Bar() method as easily as I wished to. The problem, of course, is that some mocking frameworks like my preferred Moq don’t support faking static methods (including extension methods like GetCustomersWithFirstNameOf()). I currently see the following workarounds:

    1) Faking the generic FindAllBy() method and hence assuming a certain implementation of GetCustomersWithFirstNameOf().
    2) Using a more capable mocking framework which can fake static methods, e.g. Typemock Isolator or Microsoft Fakes (of Visual Studio 2012; formerly called Moles)
    3) Using static Func variables in order to control functionality from unit tests as described by Moq’s Daniel Cazzulino (http://blogs.clariusconsulting.net/kzu/how-to-mock-extension-methods/)

    What is your approach for these situations?

    Cheers,
    Paul

  • Hy Paul,

    Thanks. I will try to answer your questions.

    0. (Visibility of generic FindAllBy method)
    I just can say regarding that point. Don’t try to protect developers from themselves by over complicating the infrastructure. If they want to sabotage your infrastructure they always find a way around it. Just let them read the blog post or your guidelines which you assembled out of the blog post. Hold a architecture workshop in which you present the new way of programming, show them how they can apply the new style or even refactor old ways of querying. As soon as you have two or more people on your side let them do the code review and your done.
    1. (Faking the generic FindAllBy)
    There are two scenarios here which are relevant for testing:
    a) Extension methods:
    Simply don’t test them. There is no value in testing a method like:

    public static class CustomerRepositoryExtensions {
       public static IEnumerable<Customer> GetCustomersWithFirstNameOf(this IUnitOfWork uow, string name) {
          return uow.FindAllBy(new CustomerFirstNameOfQuery(name))
       }
    }
    

    Simply code review that stuff and you are fine.
    b) Business logic:
    Can easily be tested with FakeItEasy or Moq.

    public class Foo {
       public Foo(IUnitOfWork uow) {
          this.uow = uow;
       }
       public void Bar() {
          using(IUnitOfWorkScope scope = this.uow.Start()) {
             var customers = scope.GetCustomersWithFirstNameOf("daniel");
             // modify customers, other class could att
             scope.Commit();
          }
       }
    }
    
    public class FooTest {
       public Foo() {
          this.uow = A.Fake<IUnitOfWork>(); // can be set up to automatically return IUnitOfWorkScope fake
          this.scope = ...;
       }
    
       public void Bar_ShouldModifyFetchedCustomer() {
          A.CallTo(() => this.scope.FindAllBy<CustomerFirstNameOfQuery>(A<string>.Ignored)).Returns(new ...);
          
          this.testee.Bar();
     
          // should...
       }
    }
    

    I see that this leeks out the CustomerFirstNameOfQuery class. But I can live with that drawback.

    There is actually another drawback in my proposed solution. It is not possible to inject dependencies into queries. But this is actually not needed because if you find the need to do that the query is violating SRP.

    No need for Typemock Isolator or Microsoft Fakes or even static func approach.

    Hope that helps,
    Daniel

  • @Daniel Marbach

    Hi Daniel

    Thanks for your reply.

    => 0. (Visibility of generic FindAllBy method)
    => I just can say regarding that point. Don’t try to protect developers from
    => themselves by over complicating the infrastructure.

    Fair enough and certainly a pragmatic approach (and if you still happen to find a direct call to the generic FindAllBy method it should be easy to fix). However, I don’t feel that Greg Young’s internalGenericRepository solution is over complicating the infrastructure. Besides, it has the added benefits of expressing that design intent more clearly. But I realize that your solution shines when it comes to wrapping all data access including queries in business transactions aka UnitOfWorkScope (as you nicely described in your follow-up post “The repository anti pattern clarified”). Oh well, I suppose that I have to think about it a bit more. I think we can’t have the cake and eat it too with either of these two approaches.

    => 1. (Faking the generic FindAllBy)
    => There are two scenarios here which are relevant for testing:
    => a) Extension methods:
    => Simply don’t test them. There is no value in testing a method like:
    => public static class CustomerRepositoryExtensions {
    => …
    => return uow.FindAllBy(new CustomerFirstNameOfQuery(name))

    I agree with you in this case as there is hardly anything in it that can go wrong. But don’t you have more complex queries in real life? I have not used Query objects in the past myself but if I did then I would have expected to see more complex examples like composing queries by building them from simpler ones and tying them together with the help of something like AndQuery or OrQuery classes. I would then be inclined to do some integration testing on these extension methods.

    => b) Business logic:
    => Can easily be tested with FakeItEasy or Moq.

    Thanks for clarifying. I expected you to do it that way.

    See you,
    Paul

  • The discussed approach still does not address two issues that Ayende has indicated:
    – In most of the cases Repository becomes a unneeded level of abstraction. I see in comments about persistence ignorance (this is how I understand ” a separation of the client from the used persistency technology”) but don’t we have a framework for framework itself, nothing more? How many times we really used different persistence layers with the same repository? So YAGNI is the good word here most of the time
    – With methods like “GetCustomerByName” we get repositories growing in size uncontrollably. We get “GetCustomerByFirstName”, “GetCustomerByLastName” and so on BUT we will never be able to combine them with AND or OR and we’ll need one more repository method to achieve this. And what’s most frustrating all these methods will have one like of ORM query only. So basically instead of injecting a session and writing one simple LINQ query we get a whole bunch of classes and lots of methods doing the same queries all over again.

    I was very skeptical about Ayende saying don’t use repositories but I am into this idea now after creating a 25th method in my next CustomerReporitory.GetByThisAndThatOrThisAndProbablySomethingElse…..

    The only issue I face now is how to create a rich domain model without injecting session into entities.

  • First I’d like to appreciate your great article I have learned a lot from it and Second I have a question to ask : Why we need a UnitOfWork and Scope ? As far as I know Nhibernate ISession is a unit of work itself and besides we can use a Context object to get the current session so that we don’t have to pass it in our ctor.I know that by introducing UnitOfWork we are using NHibernate in a loosely coupled way so that we can change it with say EF later but I think that deciding which ORM we are going to use is so strategic and critical that affects many other aspects of our architecture(such as caching ,data modelling ,etc) and once decided it’s not easily changeable .So I am asking why bother to introduce another layer of abstraction to ISession ?

  • Hy Nima
    You are right. Session is already a UnitOfWork. So in theory you wouldn’t need my abstraction layer. I always try to have only the abstractions necessary to get the job done. Abstracting too many things away is a design smile. In that case we had the abstraction to properly hide the SessionFactory and the CurrentSession context stuff and also to hook another kind of UnitOfWork into it. We use the appccelerate event broker to asynchronously fire events over process boundaries. We needed to have a mechanism to only fire those events when the underlying session transaction was committed without failure. Our abstraction hided this away from the user.
    Daniel

Recent Posts