mirror of
https://github.com/doctrine/orm.git
synced 2026-03-23 22:42:18 +01:00
WeakReference to free up memory automatically #6391
Reference in New Issue
Block a user
Delete Branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Originally created by @bOmBeLq on GitHub (Jan 21, 2020).
Feature Request
Summary
So php 7.4 introduces WeakReference. AFAIK UnitOfWork keeps references to all fetched objects and that's why if you want to free up memory you have to explicitly detach them or clear the entity manager. I understand that this is to increase performance and not query for objects if they ware already fetched which totally makes sense.
I'm wondering if it would make any sense to use WeakReference to keep reference to fetched objects. As long as object is used anywhere outside of ORM it will be kept in memory and re-used from the ref. But if app context does not use entities anymore they would be automatically removed from memory.
Do you see any potential problems with such approach?
Auto-detaching will probably lead to performance drop in some specific cases - should this be considered BC break in that case? This feature could be configurable (on/off) as well.
@jvasseur commented on GitHub (Jan 27, 2020):
I thought of that to, but freeing references a long as an object isn't used anymore would work because the unit of work needs to keep a reference of all objects to generate the changeset on flush.
For example
If the unit of work "forgot" about
$entitywhen we callunseton it, the flush won't save any changes.@mvorisek commented on GitHub (Feb 2, 2020):
@jvasseur This can be handled by
__destructof some wrapper class. If object is dereferenced (by refcount = 0), wrapper destoy method will be called and it will check if flush is needed (i.e. needs to stay in the memory) - if yes, the wrapper object will be recreated and the reference to the wrapper instance will be saved (otherwise it will be destroyed immediatelly), if no, it will be simply freed.Related with: https://github.com/php/php-src/pull/4882#issuecomment-561787685 probably a RFC for
register_object_destroy_callbackwill be much better choise which will allow any managers likeEntityManagerto register a destroy callback on any object, allow to attach (i.e. add reference to) the object and cancel so processing of the destruct if needed.@jvasseur commented on GitHub (Feb 2, 2020):
@mvorisek that's a possible solution but it has some performances implication since it would means changes will be tracked in modified objects even if we never flush them.
@bOmBeLq commented on GitHub (Feb 3, 2020):
I would say that this can be expected behavior. A tradeoff for auto-detach and a BC break for sure in that case.
If flush is called on objects which are not used in app context anymore then I suppouse this usually is some app design mistake.
In above example service itself should flush the changes.
Eventually this may be turned off explictly if needed
That being said I'm still not sure if that is good idea. Just considering.
@mvorisek commented on GitHub (Feb 3, 2020):
@bOmBeLq I do not think that auto GC any changed entities which should / can be flushed is a good idea. Also GC not changed entities should be turned off by default as if these entities are loaded again, they can contain different values (if they are not loaded in same transaction and repeatable read or better transaction isolation).
@jvasseur commented on GitHub (Feb 3, 2020):
I'm using this pattern in a lot of cases, basically services are responsible for business logic but never commit (flush) the changes that I consider the responsibility of the controller that can then batch multiple change in the same flush to ensure they are done at the same time (or can decide to not flush them if needed).
Same here, I'm not sure if either doing it or not is a good idea, I'm just listing problems I see in going in this direction so that everyone can make the best decision possible.
@beberlei commented on GitHub (Feb 12, 2020):
We have a started discussion internally about a similar approach using WeakMap and WeakReference for different parts of the UnitOfWork and also documented the downsides with the lost update in case a changed update goes out of scope (for example during a batch update loop).
Nothing has been decided yet, but its something we have on our radar.
@mpdude commented on GitHub (Feb 29, 2024):
Currently debugging a batch processing script with excessive memory usage, so I came across this.
Challenges: We cannot use WeakMap for the identity map directly. It may happen that an entity is updated, but no other references kept in userland (outside the ORM). In that case we must not simply drop the reference, but keep it until the
flush()operation unless instructed to do otherwise. From the identity map/UoW point of view, we also cannot easily see whether an entity has been changed, and we also do not notice when it changes (to put some kind of lock in place or similar).Next challenge: What if we have one-to-many associations? If the identity map used a WeakMap, but both the collection on the "one" side as well as all the owning side backreferences use normal references, would this suffice of make sure the entities are destructed? Maybe the GC could handle this case (weakmap reference to a cyclic "normal" reference situation), but I'd need to check.
Update: If we have cyclic references between to objects, and this group is referred to only through a
WeakReference, thengc_collect_cycles()can collect it.