Batch Processing using toIterable() #6598

Closed
opened 2026-01-22 15:35:30 +01:00 by admin · 2 comments

Originally created by @stlrnz on GitHub (Jan 5, 2021).

Hi all!

I updated my application to Doctrine ORM 2.8 a few days ago.

Before that, my repository code for iterating over a large result set looked like the following and worked perfectly, even on more than 500,000 rows, without leaking any memory.

    public function iterateAll(): Generator
    {
        $iterator = $this->createQueryBuilder('e')->getQuery()->iterate(null, Query::HYDRATE_SIMPLEOBJECT);

        foreach ($iterator as $equipment) {
            // index 0 is always the object
            yield $equipment[0];
        }
    }

Since Query::iterate() is now deprecated, I tried to use Query::toIterable() as suggested.

    public function iterateAll(): Generator
    {
        $iterator = $this->createQueryBuilder('e')->getQuery()->toIterable([], Query::HYDRATE_SIMPLEOBJECT);

        foreach ($iterator as $equipment) {
            yield $equipment;
        }
    }

With this implementation I ran into a massive memory leak in my application. I debugged it and found that the AbstractHydrator never releases the objects in AbstractHydrator::toIterable(). The method it calls, AbstractHydrator::hydrateRowData() (or in my case SimpleObjectHydrator::hydrateRowData()), just adds new entries to $result.
https://github.com/doctrine/orm/blob/378944dd27c1953440c1797fc02ff4620ca956ea/lib/Doctrine/ORM/Internal/Hydration/AbstractHydrator.php#L177
https://github.com/doctrine/orm/blob/378944dd27c1953440c1797fc02ff4620ca956ea/lib/Doctrine/ORM/Internal/Hydration/SimpleObjectHydrator.php#L157

Is that the intended behaviour? Wouldn't it be better to clear $result in every loop cycle to free memory?
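
To make the question concrete, here is a minimal, self-contained sketch of the two patterns. It is not Doctrine's code: `accumulatingIterable()`, `resettingIterable()`, `rows()` and `memoryGrowth()` are hypothetical stand-ins, and plain `stdClass` objects take the place of hydrated entities. It only illustrates why appending to a shared `$result` inside a generator retains every yielded object, while resetting it on each cycle does not.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for a hydrator loop that appends every row to
// $result and never releases it (NOT Doctrine's actual implementation).
function accumulatingIterable(iterable $rows): Generator
{
    $result = [];
    foreach ($rows as $row) {
        $result[] = (object) $row; // grows by one entry per row
        yield end($result);
    }
}

// Same loop, but $result is cleared on every cycle, so only the
// current object stays referenced by the generator.
function resettingIterable(iterable $rows): Generator
{
    foreach ($rows as $row) {
        $result = [];              // drop the previous row's object
        $result[] = (object) $row;
        yield end($result);
    }
}

// Fake result set: each row carries a 1 KiB payload (assumed data).
function rows(int $count): Generator
{
    for ($i = 0; $i < $count; $i++) {
        yield ['id' => $i, 'payload' => str_repeat('x', 1024)];
    }
}

// Memory growth between the start of iteration and the last item,
// sampled while the generator is still alive.
function memoryGrowth(Generator $items): int
{
    $before = memory_get_usage();
    $after = $before;
    foreach ($items as $item) {
        $after = memory_get_usage();
    }
    return $after - $before;
}

$leaky = memoryGrowth(accumulatingIterable(rows(2000)));
$flat  = memoryGrowth(resettingIterable(rows(2000)));
printf("accumulating: %d bytes, resetting: %d bytes\n", $leaky, $flat);
```

In this toy version the accumulating variant grows with the number of rows, which matches what I see with toIterable(). Whether the real hydrators can safely reset $result the same way is exactly what I am asking.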

If that's intended, is there a better way to loop through large result sets?

admin added the Bug label 2026-01-22 15:35:30 +01:00
admin closed this issue 2026-01-22 15:35:31 +01:00

@beberlei commented on GitHub (Jan 6, 2021):

This is a bug.


@simPod commented on GitHub (Jan 6, 2021):

@beberlei it looks like, in order to fix https://github.com/doctrine/orm/issues/3238, we need to keep hydrated objects in memory so that ObjectHydrator can look them up. Any ideas? It seems to me that we can't have both 🤔

Reference: doctrine/archived-orm#6598