Should a repository always return the same reference in memory when querying for the same ID?

In many blogs and articles, one reads the following statement about repositories:

You should think of a repository as a collection of domain objects in memory

Now I am asking myself what should happen when I query the repository for the same Id twice.

Entity a = theRepo.GetById(1);
Entity b = theRepo.GetById(1);

assertTrue( a == b ); // Do they share the same reference?
assertTrue( a.equals( b ) ); // This should always be true

  • Should the repository always return the same reference in memory?
  • Should the repository return a new instance of the entity, but with equal state?

Given an entity type, a repository is a collection of instances of that entity type.

A repository doesn't create instances of the entity. It is just a collection (in the sense of a "set") of instances that you created before. Then you add those instances to the repository (set of instances). And you can retrieve those instances.

A set doesn't duplicate elements. Given an id (e.g. id=1), the set will have just one instance with id=1, and that instance is the one you retrieve when you call "theRepo.GetById(1)".
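As a minimal sketch of the collection-like behaviour described above, here is a purely in-memory repository. The names Entity, InMemoryEntityRepository, add and rename are hypothetical and chosen only for illustration; nothing here involves a real persistence mechanism.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical entity, used only for illustration.
final class Entity {
    private final long id;
    private String name;

    Entity(long id, String name) {
        this.id = id;
        this.name = name;
    }

    long getId() { return id; }
    String getName() { return name; }
    void rename(String newName) { this.name = newName; }
}

// Mimics a set keyed by id: it stores the instances it is given
// and never creates instances itself.
final class InMemoryEntityRepository {
    private final Map<Long, Entity> byId = new HashMap<>();

    // Keeps the first instance added for a given id, like a set would.
    void add(Entity entity) {
        byId.putIfAbsent(entity.getId(), entity);
    }

    // Returns the stored instance itself, not a copy.
    Optional<Entity> getById(long id) {
        return Optional.ofNullable(byId.get(id));
    }
}
```

With this sketch, two calls to getById(1) hand back the very instance that was added, so a change made through one reference is visible through the other. Whether a database-backed repository behaves the same way is a separate question.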

So:

Entity a = theRepo.GetById(1);

Entity b = theRepo.GetById(1);

Should the repository always return the same reference in memory?

See UPDATE 3.

Should the repository return a new instance of the entity but with equal state?

No. The repository should return the instance that you added before. The repository shouldn't create new instances.

Anyway, to check whether two instances represent the same entity, you shouldn't compare references; you should compare their ids.
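One way to express id-based comparison in Java is to define equals() and hashCode() on the id alone. The sketch below assumes a hypothetical IdentifiedEntity class with a numeric id that is assigned at construction and never changes.

```java
// Hypothetical entity whose equality is defined by its id,
// not by object reference and not by mutable state.
final class IdentifiedEntity {
    private final long id;
    private String name;

    IdentifiedEntity(long id, String name) {
        this.id = id;
        this.name = name;
    }

    long getId() { return id; }
    String getName() { return name; }
    void rename(String newName) { this.name = newName; }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof IdentifiedEntity)) return false;
        return id == ((IdentifiedEntity) other).id;
    }

    @Override
    public int hashCode() {
        // Must agree with equals(): derived from the id only.
        return Long.hashCode(id);
    }
}
```

With this definition, a.equals(b) holds for any two instances carrying the same id, whether or not a == b, and even if their other fields have diverged.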

You are mixing concepts. A repository is just a collection (set) of instances. Instances are created by factories (or by constructor methods of the entity).

See IDDD book by Vaughn Vernon ("Collection-Oriented Repositories" section in Chapter 12).

Hope it helps.

UPDATE:

When I say "...Repository is a set of instances..." I mean that it mimics a set. My fault for not expressing it well. As for updating an instance in the repository, such an operation doesn't exist: when you retrieve an instance and modify it, the changes are made to the instance held by the repository, so you don't have to re-save it. The persistence mechanism implementing the repository must be capable of ensuring this behaviour. See Chapter 12 of the Implementing DDD book by Vaughn Vernon.

UPDATE 2:

I want to clarify that what I say here is my understanding after reading Vaughn Vernon's book IDDD, and also another book (Domain Driven Design in PHP, by Carlos Buenosvinos). I'm not trying to be misleading at all.

UPDATE 3:

I asked Vaughn Vernon the following question:

Regarding collection-oriented repository, I have a question: If I do Foo aFoo=fooRepository.getById(1); Foo anotherFoo=fooRepository.getById(1); then is it guaranteed that both references are the same (aFoo==anotherFoo)?

And he answered the following:

That depends on the backing persistence mechanism, such as Hibernate/JPA. It seems it should if you are using the same session in both reads and both reads have the same transactional scope, but check with your ORM.
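To illustrate why the answer depends on the session, here is a hand-rolled sketch of a session-scoped identity map. It is not Hibernate or JPA code: FooTable, Session and the fields of Foo are hypothetical stand-ins, and a real ORM's behaviour should be checked in its own documentation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical domain object.
final class Foo {
    final long id;
    String name;

    Foo(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Stand-in for a database table: id -> stored name.
final class FooTable {
    private final Map<Long, String> rows = new HashMap<>();

    void insert(long id, String name) { rows.put(id, name); }
    String selectName(long id) { return rows.get(id); }
}

// Hypothetical session that remembers every object it has loaded.
final class Session {
    private final FooTable table;
    private final Map<Long, Foo> identityMap = new HashMap<>();

    Session(FooTable table) { this.table = table; }

    Foo getById(long id) {
        Foo cached = identityMap.get(id);
        if (cached != null) {
            return cached; // same session: same reference
        }
        String name = table.selectName(id);
        if (name == null) {
            return null; // no such row
        }
        Foo loaded = new Foo(id, name); // first read in this session
        identityMap.put(id, loaded);
        return loaded;
    }
}
```

In this sketch, two reads through the same Session give aFoo == anotherFoo, while a read through a second Session materializes a separate object for the same row.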

I don't think you can assume that a == b.

Consider the situation where you got instance a, and started to modify it, not yet saving it back to your database. If another thread requests the same entity and puts it in variable b, it should get a new one reflecting the data in the database, not a dirty one that another thread is modifying and hasn't yet (and possibly never will) save.

On the other hand, assuming that a or b has not been subsequently modified after it has been retrieved from the same repository, it should be safe to assume that a.equals(b), also assuming that the equals() method has been implemented correctly for the entity.
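The scenario above can be sketched with a repository that hands out a fresh copy on every read and only publishes changes on an explicit save. Item, CopyingItemRepository, copy and save are hypothetical names for illustration; an in-memory map stands in for the database.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical entity with a copy operation.
final class Item {
    private final long id;
    private String name;

    Item(long id, String name) {
        this.id = id;
        this.name = name;
    }

    long getId() { return id; }
    String getName() { return name; }
    void rename(String newName) { this.name = newName; }
    Item copy() { return new Item(id, name); }
}

// Every read returns a new instance reflecting the last saved state,
// so callers never share mutable objects with each other.
final class CopyingItemRepository {
    // Stand-in for the database; stored copies are never mutated.
    private final Map<Long, Item> stored = new ConcurrentHashMap<>();

    void save(Item item) {
        stored.put(item.getId(), item.copy());
    }

    Item getById(long id) {
        Item current = stored.get(id);
        return current == null ? null : current.copy();
    }
}
```

Here a caller that modifies a without saving does not affect what another caller receives in b; the change becomes visible to later reads only after save(a).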

In my opinion, your problem boils down to the lifespan of the repository. Repositories are transient (ideally) and also, sometimes, they live inside another class called "Unit of Work" which is transient as well.

I don't think this is a DDD issue, but more of an Infrastructure issue.

Comments
  • It depends if you're a functional programmer or a side effect oriented programmer ;)
  • "If the repository is implemented correctly, it should return always the same instance." You are wrong about this. If you modify entity a on one Thread but haven't yet saved it, you shouldn't get entity b on another Thread dirtied by a modification which is still in progress, and potentially never persisted. It is only safe to return the same reference if the Entity is immutable. Otherwise you are introducing shared state without the Threads even knowing!
  • Also, a repository is not a collection either! It provides an interface that mimics that of a collection, it "mediates between the domain and data mapping layers using a collection-like interface for accessing domain objects" (Martin Fowler). It does not mean that it will return the collection itself as a whole, or that in some way it maintains the same references. It just provides methods to search and update elements in a similar way that you do to a traditional collection.
  • Thanks for the clarifications. While in theory a Repository could persist an object instantly upon modification, this is almost never the case. If you look at Spring's repository classes, they all have a save() method. Other frameworks do the same. Furthermore, requirements around transactions and rollback make this impractical. Deciding when to persist is typically not considered the repository's responsibility; it is managed by the Unit of Work boundary, whatever that is.
  • @jbx Vaughn Vernon talks about 2 kinds of repositories: collection-oriented and persistence-oriented. I was referring to collection-oriented, and you are talking about persistence-oriented. Not all frameworks are suitable for implementing collection-oriented repositories.
  • Fact is, you have no guarantee that 2 calls to theRepo.getById(1) will return the same instance. Your answer is misleading. If it happens to be the case (I know of no framework that does this, unless you consider plain collections as repositories), you will have to be very careful with managing shared state between 2 threads serving 2 client requests which access the same entity.
  • But that would mean that whenever I modify an entity and save it back to the database, all other obtained instances would be out of sync. Would I need to get a fresh instance that reflects the changes?
  • @LogitekDev Yes. In fact you have to be very careful if you think this situation might indeed happen. If the underlying repository is implemented using a JPA EntityManager, it provides a mechanism to refresh() an Entity if it has been long lived enough that there is a risk it was changed. In practice this doesn't happen much, because the entity is usually queried and updated within the same transaction, but if you have such a risk in your application you need to think about it.
  • Usually (at least in the Spring Framework) repositories are singleton objects shared between all services that need them, so their lifespan is pretty much that of the application. They are thread safe because they maintain practically no state (apart from the references to the underlying datastore infrastructure, caching framework, etc.).