removeElement from EXTRA_LAZY collection is loading all elements #6741

Open
opened 2026-01-22 15:37:51 +01:00 by admin · 19 comments
Owner

Originally created by @joanna-bak-sourceability on GitHub (Jun 2, 2021).

Hello,

Recently we upgraded our doctrine/orm from 2.6.6 to 2.8.4.

This caused our workers to run out of memory, because suddenly after we called removeElement(), the collection got initialised with all elements from database.

This is currently breaking our system. Is there a fix for it or a way to go around it?

code:

 /**
 * @var Collection<int, Offer>
 *
 * @ORM\OneToMany(targetEntity=Offer::class, mappedBy="product", fetch="EXTRA_LAZY")
 */
private $offers;
if ($this->hasOffer($offer)) {
    $this->offers->removeElement($offer);
 }

In this example before we call removeElement only 2 objects are included in collection, collection is also not initialised.
After RemoveElement collection changes state to initialised and has further objects from database

We discovered the problem via blackfire debugging, and later manual tests.
I think that the message on https://www.doctrine-project.org/projects/doctrine-orm/en/2.9/tutorials/extra-lazy-associations.html is connected to this new behaviour:

removeElement directly issued DELETE queries to the database from version 2.4.0 to 2.7.0. This circumvents the flush operation and might run outside a transactional boundary if you don't create one yourself. We consider this a critical bug in the assumptio of how the ORM works and reverted&nbsp;removeElement&nbsp;EXTRA_LAZY behavior in 2.7.1.
Originally created by @joanna-bak-sourceability on GitHub (Jun 2, 2021). Hello, Recently we upgraded our doctrine/orm from 2.6.6 to 2.8.4. This caused our workers to run out of memory, because suddenly after we called `removeElement()`, the collection got initialised with all elements from database. This is currently breaking our system. Is there a fix for it or a way to go around it? code: ``` /** * @var Collection<int, Offer> * * @ORM\OneToMany(targetEntity=Offer::class, mappedBy="product", fetch="EXTRA_LAZY") */ private $offers; ``` ``` if ($this->hasOffer($offer)) { $this->offers->removeElement($offer); } ``` In this example before we call removeElement only 2 objects are included in collection, collection is also not initialised. After RemoveElement collection changes state to initialised and has further objects from database We discovered the problem via blackfire debugging, and later manual tests. I think that the message on https://www.doctrine-project.org/projects/doctrine-orm/en/2.9/tutorials/extra-lazy-associations.html is connected to this new behaviour: ```` removeElement directly issued DELETE queries to the database from version 2.4.0 to 2.7.0. This circumvents the flush operation and might run outside a transactional boundary if you don't create one yourself. We consider this a critical bug in the assumptio of how the ORM works and reverted&nbsp;removeElement&nbsp;EXTRA_LAZY behavior in 2.7.1. ````
admin added the BugBC Break labels 2026-01-22 15:37:51 +01:00
Author
Owner

@beberlei commented on GitHub (Jun 2, 2021):

This sounds like a critical bug. You could help us resolve this quicker by doing a little audit of how the removeElement method of Doctrine\ORM\PersistentCollection class changed between 2.6 and 2.8.4, otherwise we will look into this as time permits as well.

@beberlei commented on GitHub (Jun 2, 2021): This sounds like a critical bug. You could help us resolve this quicker by doing a little audit of how the `removeElement` method of `Doctrine\ORM\PersistentCollection` class changed between 2.6 and 2.8.4, otherwise we will look into this as time permits as well.
Author
Owner

@Fedik commented on GitHub (Jun 2, 2021):

on quick look, in 2.6 version was some special check for extra_lazy,
2.6 version:
https://github.com/doctrine/orm/blob/4fae126459d7deb6cb2877fc0c0f8ecfa4658516/lib/Doctrine/ORM/PersistentCollection.php#L368-L385

2.9:
https://github.com/doctrine/orm/blob/75b4b88c5b7cebc24ed7251a20c2a5aa027300e1/lib/Doctrine/ORM/PersistentCollection.php#L375-L383

So now it always call initialize() before removing, that will load all elements.

@Fedik commented on GitHub (Jun 2, 2021): on quick look, in 2.6 version was some special check for extra_lazy, 2.6 version: https://github.com/doctrine/orm/blob/4fae126459d7deb6cb2877fc0c0f8ecfa4658516/lib/Doctrine/ORM/PersistentCollection.php#L368-L385 2.9: https://github.com/doctrine/orm/blob/75b4b88c5b7cebc24ed7251a20c2a5aa027300e1/lib/Doctrine/ORM/PersistentCollection.php#L375-L383 So now it always call `initialize()` before removing, that will load all elements.
Author
Owner

@Fedik commented on GitHub (Jun 3, 2021):

it happens after #7940 that fixing #7864

@Fedik commented on GitHub (Jun 3, 2021): it happens after #7940 that fixing #7864
Author
Owner

@beberlei commented on GitHub (Jun 3, 2021):

@Fedik thank you for looking into this, I have not seen that joanna also referenced this changelog entry.

I am afraid, while this is a BC break it had to be made due to the inconsistent behavior, issuing delete statements outside the flush operation.

As a workaround I you require the old behavior, I recommend to issue the delete statement manually:

class EfficientCollectionDelete
{
     private EntityManager $entityManager;

     public function __construct(EntityManager $entityManager)
     {
         $this->entityManager = $entityManager;
     }

     public function removeElement(Collection $collection, object $element)
     {
           if (!($collection instanceof PersistentCollection)) {
                return $collection->removeElement($element);
           }

           if ($collection->contains($element)) {
                return $collection->removeElement($element);
           }

           $persister = $this->em->getUnitOfWork()->getCollectionPersister($collection->getMapping());
           return $persister->removeElement($collection, $element);
     }
}

To get this behavior back into ORM, we would need removeElement to schedule a collection element deletion instead of performing it, and then having a new operation during flush that iterates all scheduled collection element deletions and only then calls CollectionPersister::removeElement that was removed in #7940.

@beberlei commented on GitHub (Jun 3, 2021): @Fedik thank you for looking into this, I have not seen that joanna also referenced this changelog entry. I am afraid, while this is a BC break it had to be made due to the inconsistent behavior, issuing delete statements outside the flush operation. As a workaround I you require the old behavior, I recommend to issue the delete statement manually: ```php class EfficientCollectionDelete { private EntityManager $entityManager; public function __construct(EntityManager $entityManager) { $this->entityManager = $entityManager; } public function removeElement(Collection $collection, object $element) { if (!($collection instanceof PersistentCollection)) { return $collection->removeElement($element); } if ($collection->contains($element)) { return $collection->removeElement($element); } $persister = $this->em->getUnitOfWork()->getCollectionPersister($collection->getMapping()); return $persister->removeElement($collection, $element); } } ``` To get this behavior back into ORM, we would need `removeElement` to schedule a collection element deletion instead of performing it, and then having a new operation during flush that iterates all scheduled collection element deletions and only then calls `CollectionPersister::removeElement` that was removed in #7940.
Author
Owner

@BenoitDuffez commented on GitHub (Aug 2, 2022):

I feel like the initial post talked about two issues: OOM when removeElement and direct delete without unit of work.
I feel like only the latter has been fixed, but the former hasn't been addressed.

I'm using Doctrine ORM 2.12 and I have an OOM any time I try to remove an element from a large collection (several thousands). Is there a way to do this without loading the full collection?

@BenoitDuffez commented on GitHub (Aug 2, 2022): I feel like the initial post talked about two issues: OOM when `removeElement` and direct delete without unit of work. I feel like only the latter has been fixed, but the former hasn't been addressed. I'm using Doctrine ORM 2.12 and I have an OOM any time I try to remove an element from a large collection (several thousands). Is there a way to do this without loading the full collection?
Author
Owner

@joanna-bak-sourceability commented on GitHub (Aug 2, 2022):

We fixed this issue by removing OneToMany definitions on bigger collections. We are loading those entities separately when needed.

@joanna-bak-sourceability commented on GitHub (Aug 2, 2022): We fixed this issue by removing OneToMany definitions on bigger collections. We are loading those entities separately when needed.
Author
Owner

@twisted1919 commented on GitHub (Sep 13, 2023):

Why has this been closed?
The issue is not solved, I am experiencing it right now, calling removeElement will load the entire collection from the database, hundred of thousand of rows. How is this acceptable?

@twisted1919 commented on GitHub (Sep 13, 2023): Why has this been closed? The issue is not solved, I am experiencing it right now, calling removeElement will load the entire collection from the database, hundred of thousand of rows. How is this acceptable?
Author
Owner

@twisted1919 commented on GitHub (Oct 2, 2023):

@beberlei - is there any way you fix this? I had to return to this project today and absolutely no workaround works, which keeps us stuck in a place because of something that should work and doesn't.

@twisted1919 commented on GitHub (Oct 2, 2023): @beberlei - is there any way you fix this? I had to return to this project today and absolutely no workaround works, which keeps us stuck in a place because of something that should work and doesn't.
Author
Owner

@beberlei commented on GitHub (Oct 3, 2023):

@beberlei - is there any way you fix this? I had to return to this project today and absolutely no workaround works, which keeps us stuck in a place because of something that should work and doesn't.

My workaround should? https://github.com/doctrine/orm/issues/8739#issuecomment-853783135

@beberlei commented on GitHub (Oct 3, 2023): > @beberlei - is there any way you fix this? I had to return to this project today and absolutely no workaround works, which keeps us stuck in a place because of something that should work and doesn't. > My workaround should? https://github.com/doctrine/orm/issues/8739#issuecomment-853783135
Author
Owner

@Wedrix commented on GitHub (Oct 3, 2023):

In my humble opinion, I think it should revert to the original behaviour. It's reasonable to leave the burden of consistency on the developer if they are willing to go out of their way to change the default behaviour with 'EXTRA_LAZY'.

@Wedrix commented on GitHub (Oct 3, 2023): In my humble opinion, I think it should revert to the original behaviour. It's reasonable to leave the burden of consistency on the developer if they are willing to go out of their way to change the default behaviour with 'EXTRA_LAZY'.
Author
Owner

@twisted1919 commented on GitHub (Oct 4, 2023):

My workaround should? https://github.com/doctrine/orm/issues/8739#issuecomment-853783135

It doesn't, I don't remember exactly why, I think because the persister doesn't contain the method anymore.
Also think, Doctrine is used in Symfony, which is the base for API-Platform, where we got this issue. Any workaround you apply to plain Doctrine is going to be much difficult to apply when using another framework on top of it.

Regardless, how is this acceptable to you? a library so popular with such a huge problem, which is not visible at first but bytes you with first occasion. This is messed up on so many levels, I don't even know from where to start. I don't say this to be mean, it is disappointing how this is treated, I worked with dozen ORM's so far and none of them had such problems.

Somebody suggested I fix it myself. I would, but if I do this for each library that I work with, I wouldn't be able to complete my projects. I also don't have the Doctrine expertise to do so. Nor weeks to learn it.

Do you need any sponsorship to tackle this issue?

@twisted1919 commented on GitHub (Oct 4, 2023): > My workaround should? https://github.com/doctrine/orm/issues/8739#issuecomment-853783135 It doesn't, I don't remember exactly why, I think because the persister doesn't contain the method anymore. Also think, Doctrine is used in Symfony, which is the base for API-Platform, where we got this issue. Any workaround you apply to plain Doctrine is going to be much difficult to apply when using another framework on top of it. Regardless, how is this acceptable to you? a library so popular with such a huge problem, which is not visible at first but bytes you with first occasion. This is messed up on so many levels, I don't even know from where to start. I don't say this to be mean, it is disappointing how this is treated, I worked with dozen ORM's so far and none of them had such problems. Somebody suggested I fix it myself. I would, but if I do this for each library that I work with, I wouldn't be able to complete my projects. I also don't have the Doctrine expertise to do so. Nor weeks to learn it. Do you need any sponsorship to tackle this issue?
Author
Owner

@JeroenBoesten commented on GitHub (Feb 8, 2024):

@Fedik thank you for looking into this, I have not seen that joanna also referenced this changelog entry.

I am afraid, while this is a BC break it had to be made due to the inconsistent behavior, issuing delete statements outside the flush operation.

As a workaround I you require the old behavior, I recommend to issue the delete statement manually:

class EfficientCollectionDelete
{
     private EntityManager $entityManager;

     public function __construct(EntityManager $entityManager)
     {
         $this->entityManager = $entityManager;
     }

     public function removeElement(Collection $collection, object $element)
     {
           if (!($collection instanceof PersistentCollection)) {
                return $collection->removeElement($element);
           }

           if ($collection->contains($element)) {
                return $collection->removeElement($element);
           }

           $persister = $this->em->getUnitOfWork()->getCollectionPersister($collection->getMapping());
           return $persister->removeElement($collection, $element);
     }
}

To get this behavior back into ORM, we would need removeElement to schedule a collection element deletion instead of performing it, and then having a new operation during flush that iterates all scheduled collection element deletions and only then calls CollectionPersister::removeElement that was removed in #7940.

As far as I can see the CollectionPersister interface does not contain the removeElement part anymore (removed at https://github.com/doctrine/orm/commit/041404e8b351fde94b749738d85a932a4730e16a) so the last part of your workaround could never work?

@JeroenBoesten commented on GitHub (Feb 8, 2024): > @Fedik thank you for looking into this, I have not seen that joanna also referenced this changelog entry. > > I am afraid, while this is a BC break it had to be made due to the inconsistent behavior, issuing delete statements outside the flush operation. > > As a workaround I you require the old behavior, I recommend to issue the delete statement manually: > > ``` > class EfficientCollectionDelete > { > private EntityManager $entityManager; > > public function __construct(EntityManager $entityManager) > { > $this->entityManager = $entityManager; > } > > public function removeElement(Collection $collection, object $element) > { > if (!($collection instanceof PersistentCollection)) { > return $collection->removeElement($element); > } > > if ($collection->contains($element)) { > return $collection->removeElement($element); > } > > $persister = $this->em->getUnitOfWork()->getCollectionPersister($collection->getMapping()); > return $persister->removeElement($collection, $element); > } > } > ``` > > To get this behavior back into ORM, we would need `removeElement` to schedule a collection element deletion instead of performing it, and then having a new operation during flush that iterates all scheduled collection element deletions and only then calls `CollectionPersister::removeElement` that was removed in #7940. As far as I can see the CollectionPersister interface does not contain the removeElement part anymore (removed at https://github.com/doctrine/orm/commit/041404e8b351fde94b749738d85a932a4730e16a) so the last part of your workaround could never work?
Author
Owner

@bOmBeLq commented on GitHub (Mar 6, 2024):

I guess what @beberlei proposed seems like best solution:

To get this behavior back into ORM, we would need removeElement to schedule a collection element deletion instead of performing it, and then having a new operation during flush that iterates all scheduled collection element deletions and only then calls CollectionPersister::removeElement that was removed in https://github.com/doctrine/orm/pull/7940.

My understanding is that you want to keep list of removed elements somewhere and then on flush iterate it and perform DELETE for these deleted entries without ever initializing full collection.
I guess another edge case will be that we also need to remove element from collection if client actually triggers collection loading (with foreach eg.) before flush (i.e. merge scheduled deletions into fetched collection).
I can try to work on that as my first contribution.

@bOmBeLq commented on GitHub (Mar 6, 2024): I guess what @beberlei proposed seems like best solution: > To get this behavior back into ORM, we would need removeElement to schedule a collection element deletion instead of performing it, and then having a new operation during flush that iterates all scheduled collection element deletions and only then calls CollectionPersister::removeElement that was removed in https://github.com/doctrine/orm/pull/7940. My understanding is that you want to keep list of removed elements somewhere and then on flush iterate it and perform DELETE for these deleted entries without ever initializing full collection. I guess another edge case will be that we also need to remove element from collection if client actually triggers collection loading (with foreach eg.) before flush (i.e. merge scheduled deletions into fetched collection). I can try to work on that as my first contribution.
Author
Owner

@twisted1919 commented on GitHub (Mar 6, 2024):

@bOmBeLq - whatever @beberlei proposed, does not work, so that's that.

Secondly, the current behavior is wrong, no matter how you put it, this is not a sane approach, you cannot just load hundred of thousand of objects in memory only to remove a few, it makes no sense to me how they could even come up and accept such behavior. Its also not documented and since the extra_lazy piece works as expected when you load data, you don't expect such behavior when you delete data, again, it makes no sense.

Also, the whole attitude of @beberlei towards this issue is just disappointing, they created havoc and now they run from it leaving us running for solutions in a platform we have no expertise in nor any time to invest to fix.

If you have enough expertise and time here, and you can work on it, as I said, I am willing to sponsor the feature fix.

@twisted1919 commented on GitHub (Mar 6, 2024): @bOmBeLq - whatever @beberlei proposed, does not work, so that's that. Secondly, the current behavior is wrong, no matter how you put it, this is not a sane approach, you cannot just load hundred of thousand of objects in memory only to remove a few, it makes no sense to me how they could even come up and accept such behavior. Its also not documented and since the extra_lazy piece works as expected when you load data, you don't expect such behavior when you delete data, again, it makes no sense. Also, the whole attitude of @beberlei towards this issue is just disappointing, they created havoc and now they run from it leaving us running for solutions in a platform we have no expertise in nor any time to invest to fix. If you have enough expertise and time here, and you can work on it, as I said, I am willing to sponsor the feature fix.
Author
Owner

@bOmBeLq commented on GitHub (Mar 6, 2024):

@bOmBeLq - whatever @beberlei proposed, does not work, so that's that.

I only meant the approach for implementing it in ORM itself to work out of box.
Not the tmp fix he proposed.

I will try to come up with something but cannot guarantee any time frame since it's my first time working on this repo + it will simply be chill after regular working hours.

@bOmBeLq commented on GitHub (Mar 6, 2024): > @bOmBeLq - whatever @beberlei proposed, does not work, so that's that. I only meant the approach for implementing it in ORM itself to work out of box. Not the tmp fix he proposed. I will try to come up with something but cannot guarantee any time frame since it's my first time working on this repo + it will simply be chill after regular working hours.
Author
Owner

@bOmBeLq commented on GitHub (Mar 12, 2024):

So there is an issue. I wanted to schedule element for deletion if it's lazy loaded list:

public function removeElement($element)
{
  if (!$this->initialized && $this->association['fetch'] === ClassMetadata::FETCH_EXTRA_LAZY) {
    $this->lazyDeleteElements[] = $element;
     return ?;
  }
 [...]

But we cannot determine if the element is in list therefore cannot return true/false.
I see 3 solutions:

  1. Perform query to database to check if this item exists and basing on that return true/false (if doesn't exist also don't schedule the deletion).
  2. Make method result nullable and return null in such case.
  3. Add dedicated method "eventuallyRemoveElement" which would work like point 2. Or add extra parameter to removeElement ($skipExistanceCheckOnExtralazyCollection=true)

For me:

  1. This is off table because we are fixing one performance issue and adding another.
  2. Seems like may result in some false positives if someone assumes it's just true/false but let's be honest who checks result of this call anyway?
  3. If you don't like 2 this seems like safest option.

Any 1 toughs?

@bOmBeLq commented on GitHub (Mar 12, 2024): So there is an issue. I wanted to schedule element for deletion if it's lazy loaded list: ``` public function removeElement($element) { if (!$this->initialized && $this->association['fetch'] === ClassMetadata::FETCH_EXTRA_LAZY) { $this->lazyDeleteElements[] = $element; return ?; } [...] ``` But we cannot determine if the element is in list therefore cannot return `true/false`. I see 3 solutions: 1. Perform query to database to check if this item exists and basing on that return true/false (if doesn't exist also don't schedule the deletion). 2. Make method result nullable and return null in such case. 3. Add dedicated method "eventuallyRemoveElement" which would work like point 2. Or add extra parameter to removeElement ($skipExistanceCheckOnExtralazyCollection=true) For me: 1. This is off table because we are fixing one performance issue and adding another. 2. Seems like may result in some false positives if someone assumes it's just true/false but let's be honest who checks result of this call anyway? 3. If you don't like 2 this seems like safest option. 4. Any 1 toughs?
Author
Owner

@twisted1919 commented on GitHub (Mar 19, 2024):

@bOmBeLq - I think no. 2 makes most sense. One could argue the result in this case should be true since the item has been queued successfully. The docs could then say, in case of extra lazy, the results are queued for removal. But you see, this is why for such questions, best fit to answer would be somebody from the core team.

@twisted1919 commented on GitHub (Mar 19, 2024): @bOmBeLq - I think no. 2 makes most sense. One could argue the result in this case should be `true` since the item has been queued successfully. The docs could then say, in case of extra lazy, the results are queued for removal. But you see, this is why for such questions, best fit to answer would be somebody from the core team.
Author
Owner

@l-you commented on GitHub (Jan 11, 2025):

Any update on this? It is a very annoying bug.

@l-you commented on GitHub (Jan 11, 2025): Any update on this? It is a very annoying bug.
Author
Owner

@zhil commented on GitHub (Jul 3, 2025):

This is a strange situation. I believe it's a bug because after upgrading Doctrine, we encountered major bottlenecks on our large database - and we only caught this in production.

For anyone looking for a quick solution: we ended up using raw SQL queries instead of arrayCollection->removeElement().

@zhil commented on GitHub (Jul 3, 2025): This is a strange situation. I believe it's a bug because after upgrading Doctrine, we encountered major bottlenecks on our large database - and we only caught this in production. For anyone looking for a quick solution: we ended up using raw SQL queries instead of arrayCollection->removeElement().
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: doctrine/archived-orm#6741