Allow to disable the identity map or override UnitOfWork #4969

Closed
opened 2026-01-22 14:54:34 +01:00 by admin · 9 comments

Originally created by @lzref on GitHub (Jan 12, 2016).

Originally assigned to: @Ocramius on GitHub.

Hello. I'm surprised nobody asked about this before.

When migrating an existing codebase to Doctrine, it's hard to do everything at once, in one commit. A more realistic approach is to start using Doctrine for new classes/tables, and gradually move the rest of the classes and queries later.

The problem with that approach is that Doctrine's identity map assumes that all database writes happen through it. Consider the following code:

$user = getUserWithDoctrine($user_id);
// ...
updateUserWithPdo($user_id, $new_user_fields);
// ...
$user2 = getUserWithDoctrine($user_id);

Doctrine's identity map won't know anything about the update query that was made directly using PDO, so the second call will return the same object (without the latest changes). This is very counter-intuitive, especially if these reads and writes are situated in different parts of the codebase.
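
The stale read described above can be reproduced without Doctrine in a minimal, self-contained sketch. Everything here is a hypothetical stand-in (assuming PHP 8): `FakeDatabase` plays the table, `TinyEntityManager` mimics only the identity-map lookup of an entity manager, and the direct array write plays the role of `updateUserWithPdo()`. It is an illustration of the mechanism, not Doctrine's implementation.

```php
<?php
declare(strict_types=1);

// Toy "table": user id => name. Not a Doctrine or PDO class.
final class FakeDatabase
{
    /** @var array<int, string> */
    public array $userNames = [];
}

final class User
{
    public function __construct(public int $id, public string $name) {}
}

// Hypothetical manager that keeps one object per primary key.
final class TinyEntityManager
{
    /** @var array<int, User> objects already handed out, keyed by id */
    private array $identityMap = [];

    public function __construct(private FakeDatabase $db) {}

    public function find(int $id): ?User
    {
        // A known id is served from the map; the "database" is not re-read.
        if (isset($this->identityMap[$id])) {
            return $this->identityMap[$id];
        }
        if (!isset($this->db->userNames[$id])) {
            return null;
        }
        return $this->identityMap[$id] = new User($id, $this->db->userNames[$id]);
    }

    public function clear(): void
    {
        // Forget every object handed out so far.
        $this->identityMap = [];
    }
}

$db = new FakeDatabase();
$db->userNames[1234] = 'Alice';
$em = new TinyEntityManager($db);

$user = $em->find(1234);
$db->userNames[1234] = 'Bob';   // write that bypasses the manager
$user2 = $em->find(1234);       // same object as $user, name is still 'Alice'

$em->clear();
$user3 = $em->find(1234);       // new object, name is 'Bob'
```

Only the read after `clear()` sees the out-of-band write, which is the behaviour the rest of this thread discusses.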

One workaround is to always clear the entity manager before executing read queries (methods like find(), etc.). However, that makes the code more bloated, as well as a bit slower.

Other concerns that have to do with identity mapping are increased memory usage and using stale data when rows are updated in the database by other scripts.

Hopefully by now I've made my case that there are at least some situations where disabling the identity map is beneficial. Even without it, Doctrine has plenty of features that make it extremely useful in projects.

Surprisingly, Doctrine currently provides no way of disabling the identity map or overriding UnitOfWork. The $unitOfWork field in the EntityManager class is private and has no setter. It is set in EntityManager's constructor, always to a new instance of \Doctrine\ORM\UnitOfWork (hardcoded).

I'd appreciate your feedback!

admin added the Improvement and Duplicate labels 2026-01-22 14:54:34 +01:00
admin closed this issue 2026-01-22 14:54:35 +01:00

@DHager commented on GitHub (Jan 12, 2016):

> One workaround is to always clear the entity manager before executing read queries (methods like find(), etc.). However, that makes the code more bloated, as well as a bit slower.

I'm not sure that's a "workaround" as much as "the right thing". Unless you tell Doctrine what you're doing, it thinks you have one unit of work, but you actually want two: the unit before PDO, and the unit after. Clearing state is what lets you indicate that; the UnitOfWork instance is just being reused for efficiency, like a one-object pool.

> Doctrine's identity map won't know anything about the update query that was made directly using PDO, so the second call will return the same object (without the latest changes). This is very counter-intuitive, especially if these reads and writes are situated in different parts of the codebase.

I, uhm, don't see it as counter-intuitive at all. I mean, Doctrine is not some sort of binary plugin for your database which can "monitor" all changes happening on any table in real-time. Even if you hacked PHP's internals so that PDO somehow "reported" what it was doing, you'd still get the same problem if a change was made by another PHP-process, another web-host, or even just some guy running a SQL console.

So your code-sample can never be done 100% safely, and therefore Doctrine was never designed to even try. Instead, it offers support for optimistic or pessimistic locking.

> Hopefully by now I've made my case that there are at least some situations where disabling the identity map is beneficial. [...] Surprisingly, Doctrine currently provides no way of disabling the identity map or overriding UnitOfWork.

Could you elaborate on how you see this working if there were such an option? How--if at all--does Doctrine save changes if it doesn't keep track of the entities it is managing? If it allows multiple copies retrieved at different times, how does it know which copy actually represents the value that you've modified and that needs to be saved to the database?
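
The "which copy wins" question can be made concrete with a small sketch. It assumes a hypothetical manager (`NoMapManager`, not a Doctrine class, PHP 8) that hands out a new object on every `find()` and writes every object it has handed out on `flush()`; `FakeDatabase` is again a toy stand-in for the table.

```php
<?php
declare(strict_types=1);

// Toy "table": user id => name. Not a Doctrine or PDO class.
final class FakeDatabase
{
    /** @var array<int, string> */
    public array $userNames = [];
}

final class User
{
    public function __construct(public int $id, public string $name) {}
}

// Hypothetical manager WITHOUT an identity map: every find() builds a copy.
final class NoMapManager
{
    /** @var list<User> every copy handed out, in load order */
    private array $managed = [];

    public function __construct(private FakeDatabase $db) {}

    public function find(int $id): User
    {
        $user = new User($id, $this->db->userNames[$id]);
        $this->managed[] = $user;
        return $user;
    }

    public function flush(): void
    {
        // Copies are written in load order, so a later copy silently
        // overwrites an earlier one that refers to the same row.
        foreach ($this->managed as $user) {
            $this->db->userNames[$user->id] = $user->name;
        }
    }
}

$db = new FakeDatabase();
$db->userNames[1234] = 'Alice';
$em = new NoMapManager($db);

$a = $em->find(1234);
$b = $em->find(1234);   // a second, independent copy of the same row

$a->name = 'Alicia';
$b->name = 'Alice B.';
$em->flush();           // the row ends up as 'Alice B.'; the change on $a is lost
```

In this toy the outcome depends only on load order, which is the ambiguity the question above points at; with one object per row there is a single copy to save.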


@lzref commented on GitHub (Jan 12, 2016):

Hi, DHager, thanks for responding.

> I'm not sure that's a "workaround" as much as "the right thing".

Is this what you're suggesting?

$entityManager->clear();
$user = $entityManager->find('User', 1234);

I can add a call to clear() before all find() calls in my code, but that means more typing and less clean, less readable code. And if we're always clearing the identity map before making find() calls, why use the identity map in the first place? We can save a bunch of memory manipulations by not using it.
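
One way to keep the extra typing in a single place is a small helper. This is only a sketch: `findFresh()` is a hypothetical name, `$em` is typed as `object` so the snippet runs without Doctrine installed (it only needs `clear()` and `find($className, $id)` methods, which Doctrine's entity manager provides), and `RecordingManager` is a stub that exists purely to show the call order.

```php
<?php
declare(strict_types=1);

/**
 * Clears the manager, then loads the entity, so the row is read again
 * instead of being served from the identity map. Hypothetical helper.
 */
function findFresh(object $em, string $className, int|string $id): ?object
{
    $em->clear();   // forget everything the manager currently tracks
    return $em->find($className, $id);
}

// Stub used only for demonstration; not a Doctrine class.
final class RecordingManager
{
    /** @var list<string> */
    public array $calls = [];

    public function clear(): void
    {
        $this->calls[] = 'clear';
    }

    public function find(string $className, int|string $id): ?object
    {
        $this->calls[] = "find($className, $id)";
        return new stdClass();
    }
}

$em = new RecordingManager();
$user = findFresh($em, 'User', 1234);
// $em->calls is now ['clear', 'find(User, 1234)']
```

Note that in Doctrine clear() detaches every managed entity, not just the one being reloaded, so objects loaded earlier are no longer tracked after such a call.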

> Doctrine is not some sort of binary plugin for your database which can "monitor" all changes happening on any table in real-time.

That's not what I was suggesting. Precisely because this is impossible, Doctrine should allow bypassing the identity map, i.e. if I'm trying to get the same row from the database for the second time, it shouldn't assume the row hasn't changed since the last time (as explained in item 10.1 here: http://doctrine-orm.readthedocs.org/projects/doctrine-orm/en/latest/reference/unitofwork.html).


@Ocramius commented on GitHub (Jan 12, 2016):

This is a duplicate of #5550.


@ekonoval commented on GitHub (Mar 28, 2018):

@lzref what was the solution in this case for you? I also have a legacy project where I'm trying to use Doctrine alongside legacy code (direct mysqli queries), and I'm experiencing the same problem.


@lzref commented on GitHub (Mar 29, 2018):

@ekonoval it's really ingrained into Doctrine's code and Doctrine provides no easy way to override stuff. Doctrine uses hardcoded class names in many places so even if you extend some classes it's hard to make Doctrine use them.

So what we ended up doing is just clearing the entity manager before important reads or after writes. This situation (reading an entity, then modifying the DB row outside of Doctrine, then reading it again within the same PHP request) doesn't happen that often in real-life code. It does happen quite often in our unit tests though (because each test creates the DB records it needs), but we added an automatic clear() after we save stuff there.


@ekonoval commented on GitHub (Mar 29, 2018):

@lzref thanks for your reply! From your experience, would you recommend just using

$entityManager->clear();
$user = $entityManager->find('User', 1234);

or

$entityManager->clear('User');
$user = $entityManager->find('User', 1234);

@Ocramius commented on GitHub (Mar 29, 2018):

> @ekonoval it's really ingrained into Doctrine's code and Doctrine provides no easy way to override stuff. Doctrine uses hardcoded class names in many places so even if you extend some classes it's hard to make Doctrine use them.

Not just that: the codebase is not designed with swappable components in mind, and that is on purpose. As I've already highlighted, #5550 may provide what you are looking for.

@ekonoval please don't use $entityManager->clear($entityName): we deprecated that for good reasons (see https://github.com/doctrine/doctrine2/issues/5855)


@ekonoval commented on GitHub (Mar 29, 2018):

@Ocramius thanks for the explanation, though I haven't seen the deprecation anywhere. And what about detach($entity), is it cool to use it?


@Ocramius commented on GitHub (Mar 29, 2018):

> is it cool to use it?

No: it brings in many other problems, as described in the linked https://github.com/doctrine/doctrine2/issues/5855

Reference: doctrine/archived-orm#4969