Recently I was thinking about the differences between the patterns that abstract away work with a data store. I had read descriptions and various implementations of DAO and Repository many times, but only superficially, and had even used them in my projects, apparently without fully understanding how they differ conceptually. I decided to figure it out, dug through Google, and found an article that explained everything to me. I thought it would be nice to translate it into Russian. Those who read English can find the original here. Everyone else is welcome to read on.
Data Access Object (DAO) is a widely used pattern for persisting domain objects in a database. In its broadest sense, a DAO is a class that contains the CRUD methods for a particular entity.
Suppose that we have an Account entity represented by the following class:
package com.thinkinginobjects.domainobject;

public class Account {

    private String userName;
    private String firstName;
    private String lastName;
    private String email;
    private int age;

    public boolean hasUseName(String desiredUserName) {
        return this.userName.equals(desiredUserName);
    }

    public boolean ageBetween(int minAge, int maxAge) {
        return age >= minAge && age <= maxAge;
    }
}
Create a DAO interface for this entity:
package com.thinkinginobjects.dao;

import com.thinkinginobjects.domainobject.Account;

public interface AccountDAO {

    Account get(String userName);
    void create(Account account);
    void update(Account account);
    void delete(String userName);
}
The AccountDAO interface can have many implementations that can use different ORM frameworks or direct SQL queries to the database.
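As an illustration of "many implementations", here is a minimal sketch of one that keeps accounts in a HashMap instead of a database. It is not taken from the original article: the wrapper class, the simplified Account with its constructor and getters, and the choice to throw on duplicate or missing accounts are all assumptions made to keep the example self-contained.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper class so the sketch compiles as a single file.
class InMemoryAccountDaoSketch {

    // Simplified stand-in for Account; the constructor and getters are assumptions.
    static class Account {
        private final String userName;
        private final String email;

        Account(String userName, String email) {
            this.userName = userName;
            this.email = email;
        }

        String getUserName() { return userName; }
        String getEmail() { return email; }
    }

    // The same four CRUD methods as the AccountDAO interface.
    interface AccountDAO {
        Account get(String userName);
        void create(Account account);
        void update(Account account);
        void delete(String userName);
    }

    // One possible implementation: a map keyed by userName.
    static class InMemoryAccountDAO implements AccountDAO {
        private final Map<String, Account> accounts = new HashMap<>();

        @Override
        public Account get(String userName) {
            return accounts.get(userName); // null when no such account exists
        }

        @Override
        public void create(Account account) {
            // putIfAbsent returns the existing value if the key is already taken.
            if (accounts.putIfAbsent(account.getUserName(), account) != null) {
                throw new IllegalStateException("Account already exists: " + account.getUserName());
            }
        }

        @Override
        public void update(Account account) {
            // replace returns null when there was nothing to replace.
            if (accounts.replace(account.getUserName(), account) == null) {
                throw new IllegalStateException("No such account: " + account.getUserName());
            }
        }

        @Override
        public void delete(String userName) {
            accounts.remove(userName);
        }
    }

    public static void main(String[] args) {
        AccountDAO dao = new InMemoryAccountDAO();
        dao.create(new Account("jdoe", "jdoe@example.com"));
        dao.update(new Account("jdoe", "john@example.com"));
        System.out.println(dao.get("jdoe").getEmail()); // prints john@example.com
        dao.delete("jdoe");
        System.out.println(dao.get("jdoe"));            // prints null
    }
}
```

Code written against AccountDAO would not need to change if this class were swapped for an ORM-based or SQL-based implementation.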
The pattern has the following advantages:
- It decouples the business logic that uses the DAO from the storage mechanism and its API;
- The signatures of the interface methods are independent of the contents of the Account class. If you add the telephoneNumber field to the Account class, there will be no need to make changes to AccountDAO or the classes that use it.
However, the pattern leaves many unanswered questions. What if we need to get a list of accounts with a certain lastName? Can I add a method that only updates the email field for an account? What if we want to use long id instead of userName as an identifier? What exactly is the responsibility of the DAO?
The problem is that the responsibilities of a DAO are not clearly defined. Most people think of a DAO as a kind of gateway to the database and add a method to it whenever they find a new way in which they would like to talk to the database. As a result, you often see a bloated DAO like the one in the following example:
package com.thinkinginobjects.dao;

import java.util.List;

import com.thinkinginobjects.domainobject.Account;

public interface BloatAccountDAO {

    Account get(String userName);
    void create(Account account);
    void update(Account account);
    void delete(String userName);

    List<Account> getAccountByLastName(String lastName);
    List<Account> getAccountByAgeRange(int minAge, int maxAge);

    void updateEmailAddress(String userName, String newEmailAddress);
    void updateFullName(String userName, String firstName, String lastName);
}
In BloatAccountDAO we added methods for finding accounts by various parameters. If the Account class had more fields and more ways to build queries, the DAO would become even more bloated, with the following consequences:
- It becomes harder to create mocks of the DAO interface for unit tests: more DAO methods have to be implemented even in test scenarios that never call them;
- The DAO interface becomes increasingly coupled to the fields of the Account class. Changing the type of a field in Account forces changes to the interface and to all of its implementations.
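The first consequence can be made concrete with a hand-written test stub. The sketch below is only an illustration under assumptions: GreetingService is a hypothetical class under test that calls nothing but get, and Account is reduced to a single field. Even so, the stub has to spell out all eight methods of the interface.

```java
import java.util.List;

// Hypothetical wrapper class so the sketch compiles as a single file.
class BloatedDaoStubSketch {

    // Simplified stand-in for Account; the constructor and getter are assumptions.
    static class Account {
        private final String userName;
        Account(String userName) { this.userName = userName; }
        String getUserName() { return userName; }
    }

    // Same method list as BloatAccountDAO.
    interface BloatAccountDAO {
        Account get(String userName);
        void create(Account account);
        void update(Account account);
        void delete(String userName);
        List<Account> getAccountByLastName(String lastName);
        List<Account> getAccountByAgeRange(int minAge, int maxAge);
        void updateEmailAddress(String userName, String newEmailAddress);
        void updateFullName(String userName, String firstName, String lastName);
    }

    // Hypothetical class under test: it only ever calls get().
    static class GreetingService {
        private final BloatAccountDAO dao;
        GreetingService(BloatAccountDAO dao) { this.dao = dao; }

        String greet(String userName) {
            Account account = dao.get(userName);
            return account == null ? "Hello, stranger" : "Hello, " + account.getUserName();
        }
    }

    // The stub needs one useful method and seven placeholders.
    static class StubAccountDAO implements BloatAccountDAO {
        @Override public Account get(String userName) {
            return "jdoe".equals(userName) ? new Account("jdoe") : null;
        }
        @Override public void create(Account account) { throw unused(); }
        @Override public void update(Account account) { throw unused(); }
        @Override public void delete(String userName) { throw unused(); }
        @Override public List<Account> getAccountByLastName(String lastName) { throw unused(); }
        @Override public List<Account> getAccountByAgeRange(int minAge, int maxAge) { throw unused(); }
        @Override public void updateEmailAddress(String userName, String newEmailAddress) { throw unused(); }
        @Override public void updateFullName(String userName, String firstName, String lastName) { throw unused(); }

        private static UnsupportedOperationException unused() {
            return new UnsupportedOperationException("not needed by this test");
        }
    }

    public static void main(String[] args) {
        GreetingService service = new GreetingService(new StubAccountDAO());
        System.out.println(service.greet("jdoe"));   // prints Hello, jdoe
        System.out.println(service.greet("nobody")); // prints Hello, stranger
    }
}
```

Every method added to the interface later adds another placeholder to every stub of this kind.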
To make matters worse, we also added extra update methods to the DAO. They are the direct result of two new use cases that update different subsets of an account's fields. They look like an innocent optimization and fit naturally into the concept of AccountDAO if we think of the interface as a gateway to the data store. The DAO pattern and the class name AccountDAO are too loosely defined to stop us from taking this step.
As a result, we have a bloated DAO interface, and I am sure my colleagues will add even more methods in the future. A year from now we will have a class with more than 20 methods and will curse ourselves for choosing this pattern.
Repository pattern
A better solution is to use the Repository pattern. Eric Evans gave a precise description of it in his book Domain-Driven Design: “A repository represents all objects of a certain type as a conceptual set. It acts like a collection, except with more elaborate querying capability.”
Let's go back and design an AccountRepository according to this definition:
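package com.thinkinginobjects.repository;

import java.util.List;

import com.thinkinginobjects.domainobject.Account;

public interface AccountRepository {

    void addAccount(Account account);
    void removeAccount(Account account);
    void updateAccount(Account account);

    // The query method discussed in the text: it takes a search criterion
    // (see the AccountSpecification interface) and returns the matching accounts.
    List<Account> query(AccountSpecification specification);
}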
The add and update methods look identical to those of AccountDAO. The remove method differs from the DAO's delete method in that it takes an Account as its parameter instead of a userName (the account identifier). Thinking of the repository as a collection changes how you perceive it: you avoid exposing the type of the account identifier to the repository's clients. This will make your life easier if you later want to identify accounts by a long.
If you are wondering about the contracts of the add / remove / update methods, just think about the collection abstraction. If you are considering adding another update method to the repository, ask yourself whether it would make sense to add another update method to a collection.
The query method, however, is special. I would not expect to see such a method in a collection class. What does it do?
A repository differs from a collection in its querying capabilities. With a collection of objects in memory, it is easy to iterate over all of its elements and find the instance you are interested in. A repository works with a large set of objects, most of which are usually not in memory at the time of the query. Loading every account into memory is impractical when we need one specific user. Instead, we pass the repository a criterion with which it can find one or more objects. The repository can generate a SQL query if it uses a database as its backend, or it can find the required objects by brute-force iteration if the collection is held in memory.
One commonly used implementation of the criterion is the Specification pattern (hereafter, specification). A specification is a simple predicate that takes a domain object and returns a boolean:
package com.thinkinginobjects.repository;

import com.thinkinginobjects.domainobject.Account;

public interface AccountSpecification {

    boolean specified(Account account);
}
We can then create a specification implementation for each kind of query we want to run against the AccountRepository.
A plain specification works well for an in-memory repository, but it cannot be used with a database because evaluating it would require loading every object first, which is inefficient.
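To show how a plain specification drives an in-memory repository, here is a minimal sketch. It is an illustration rather than the article's own implementation: the Account constructor and getter, the list-backed storage, matching accounts by userName, and the query(AccountSpecification) signature are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper class so the sketch compiles as a single file.
class InMemoryRepositorySketch {

    // Simplified stand-in for Account; the constructor and getter are assumptions.
    static class Account {
        private final String userName;
        private final int age;

        Account(String userName, int age) {
            this.userName = userName;
            this.age = age;
        }

        String getUserName() { return userName; }

        boolean ageBetween(int minAge, int maxAge) {
            return age >= minAge && age <= maxAge;
        }
    }

    interface AccountSpecification {
        boolean specified(Account account);
    }

    interface AccountRepository {
        void addAccount(Account account);
        void removeAccount(Account account);
        void updateAccount(Account account);
        List<Account> query(AccountSpecification specification);
    }

    static class InMemoryAccountRepository implements AccountRepository {
        private final List<Account> accounts = new ArrayList<>();

        @Override
        public void addAccount(Account account) {
            accounts.add(account);
        }

        @Override
        public void removeAccount(Account account) {
            accounts.removeIf(a -> a.getUserName().equals(account.getUserName()));
        }

        @Override
        public void updateAccount(Account account) {
            // Replace the stored account that has the same userName.
            accounts.replaceAll(a -> a.getUserName().equals(account.getUserName()) ? account : a);
        }

        @Override
        public List<Account> query(AccountSpecification specification) {
            // Brute force: test every stored account against the predicate.
            List<Account> result = new ArrayList<>();
            for (Account account : accounts) {
                if (specification.specified(account)) {
                    result.add(account);
                }
            }
            return result;
        }
    }

    public static void main(String[] args) {
        AccountRepository repository = new InMemoryAccountRepository();
        repository.addAccount(new Account("alice", 25));
        repository.addAccount(new Account("bob", 40));

        // AccountSpecification has a single method, so a lambda can implement it.
        List<Account> young = repository.query(a -> a.ageBetween(18, 30));
        System.out.println(young.size());                 // prints 1
        System.out.println(young.get(0).getUserName());   // prints alice
    }
}
```

The repository itself knows nothing about ages or names; each new kind of query is a new specification, not a new repository method.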
For an AccountRepository that works with a SQL database, the specification needs to implement the SqlSpecification interface:
package com.thinkinginobjects.repository;

public interface SqlSpecification {

    String toSqlClauses();
}
A repository that uses a database as its backend can use this interface to obtain the conditions for its SQL query. If Hibernate were used as the backend, we would instead use a HibernateSpecification interface that produces Hibernate criteria.
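How a SQL-backed repository might consume such a specification can be sketched as follows. This is only an illustration: the buildQuery helper and the account table name are hypothetical, and a real repository would go on to execute the statement and map rows to Account objects.

```java
// Hypothetical wrapper class so the sketch compiles as a single file.
class SqlQuerySketch {

    interface SqlSpecification {
        String toSqlClauses();
    }

    // SQL side of an age-range specification, reduced to what this sketch needs.
    static class AccountSpecificationByAgeRange implements SqlSpecification {
        private final int minAge;
        private final int maxAge;

        AccountSpecificationByAgeRange(int minAge, int maxAge) {
            this.minAge = minAge;
            this.maxAge = maxAge;
        }

        @Override
        public String toSqlClauses() {
            return "age between " + minAge + " and " + maxAge;
        }
    }

    // Hypothetical helper: builds the SELECT statement a SQL-backed repository
    // might run. The table name "account" is an assumption.
    static String buildQuery(SqlSpecification specification) {
        return "SELECT * FROM account WHERE " + specification.toSqlClauses();
    }

    public static void main(String[] args) {
        String sql = buildQuery(new AccountSpecificationByAgeRange(18, 30));
        System.out.println(sql); // prints SELECT * FROM account WHERE age between 18 and 30
    }
}
```

Concatenating values into the clause is tolerable here only because they are ints. For string values such as a user name, the clause should carry placeholders and the values should be bound through a PreparedStatement.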
The SQL and Hibernate repositories do not call the specified method. Nevertheless, we consider it an advantage that every specification class implements it: the same specifications can then be used with a stub AccountRepository in tests, and in a caching repository implementation that tries to answer a query before it is sent to the backend.
We can even go one step further and combine specifications using ConjunctionSpecification and DisjunctionSpecification to perform more complex queries. We consider this topic beyond the scope of the article; the interested reader can find details and examples in Evans's book.
package com.thinkinginobjects.specification;

import org.hibernate.criterion.Criterion;
import org.hibernate.criterion.Restrictions;

import com.thinkinginobjects.domainobject.Account;
import com.thinkinginobjects.repository.AccountSpecification;
import com.thinkinginobjects.repository.HibernateSpecification;

public class AccountSpecificationByUserName implements AccountSpecification, HibernateSpecification {

    private String desiredUserName;

    public AccountSpecificationByUserName(String desiredUserName) {
        super();
        this.desiredUserName = desiredUserName;
    }

    @Override
    public boolean specified(Account account) {
        return account.hasUseName(desiredUserName);
    }

    @Override
    public Criterion toCriteria() {
        return Restrictions.eq("userName", desiredUserName);
    }
}
package com.thinkinginobjects.specification;

import com.thinkinginobjects.domainobject.Account;
import com.thinkinginobjects.repository.AccountSpecification;
import com.thinkinginobjects.repository.SqlSpecification;

public class AccountSpecificationByAgeRange implements AccountSpecification, SqlSpecification {

    private int minAge;
    private int maxAge;

    public AccountSpecificationByAgeRange(int minAge, int maxAge) {
        super();
        this.minAge = minAge;
        this.maxAge = maxAge;
    }

    @Override
    public boolean specified(Account account) {
        return account.ageBetween(minAge, maxAge);
    }

    @Override
    public String toSqlClauses() {
        return String.format("age between %s and %s", minAge, maxAge);
    }
}
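The ConjunctionSpecification and DisjunctionSpecification mentioned earlier are not shown in the article. A minimal sketch of how they might look for the in-memory (predicate) side is given below; the two-argument constructors, the simplified Account, and its hasLastName method are assumptions made for this example, and the SQL or Hibernate side is left out.

```java
// Hypothetical wrapper class so the sketch compiles as a single file.
class CompositeSpecificationSketch {

    // Simplified stand-in for Account; constructor and hasLastName are assumptions.
    static class Account {
        private final String lastName;
        private final int age;

        Account(String lastName, int age) {
            this.lastName = lastName;
            this.age = age;
        }

        boolean hasLastName(String desiredLastName) {
            return lastName.equals(desiredLastName);
        }

        boolean ageBetween(int minAge, int maxAge) {
            return age >= minAge && age <= maxAge;
        }
    }

    interface AccountSpecification {
        boolean specified(Account account);
    }

    // Satisfied only when both wrapped specifications are satisfied (logical AND).
    static class ConjunctionSpecification implements AccountSpecification {
        private final AccountSpecification left;
        private final AccountSpecification right;

        ConjunctionSpecification(AccountSpecification left, AccountSpecification right) {
            this.left = left;
            this.right = right;
        }

        @Override
        public boolean specified(Account account) {
            return left.specified(account) && right.specified(account);
        }
    }

    // Satisfied when at least one wrapped specification is satisfied (logical OR).
    static class DisjunctionSpecification implements AccountSpecification {
        private final AccountSpecification left;
        private final AccountSpecification right;

        DisjunctionSpecification(AccountSpecification left, AccountSpecification right) {
            this.left = left;
            this.right = right;
        }

        @Override
        public boolean specified(Account account) {
            return left.specified(account) || right.specified(account);
        }
    }

    public static void main(String[] args) {
        AccountSpecification isSmith = a -> a.hasLastName("Smith");
        AccountSpecification isYoung = a -> a.ageBetween(18, 30);

        AccountSpecification both = new ConjunctionSpecification(isSmith, isYoung);
        AccountSpecification either = new DisjunctionSpecification(isSmith, isYoung);

        Account youngSmith = new Account("Smith", 25);
        Account olderSmith = new Account("Smith", 50);
        Account olderJones = new Account("Jones", 50);

        System.out.println(both.specified(youngSmith));   // prints true
        System.out.println(both.specified(olderSmith));   // prints false
        System.out.println(either.specified(olderSmith)); // prints true
        System.out.println(either.specified(olderJones)); // prints false
    }
}
```

Because a composite is itself an AccountSpecification, composites can be nested to express arbitrarily complex conditions without adding methods to the repository.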
Conclusion
The DAO pattern offers only a vague description of its contract. If you use it, you risk ending up with bloated classes that are easy to misuse. The Repository pattern uses the collection metaphor, which gives us a well-defined contract and makes the code easier to understand.