[PR #10035] Optimize Object Hydration performance on large result sets #12048

Closed
opened 2026-01-22 16:12:45 +01:00 by admin · 0 comments
Owner

Original Pull Request: https://github.com/doctrine/orm/pull/10035

State: closed
Merged: No


Hi folks!

While assessing a client's project performance, I found some optimizations to the ORM hydration process that improve speed when working with large result sets and collections.

Some context on how we reached this: the project involves online carts on a B2B e-commerce platform with a high number of items and complex personalization logic. We reached the point in the optimization process where most of the time was spent in the ORM (after serialization) when fetching a cart.

This makes sense because of the volume and the logic at stake.
Denormalization and caching do not make sense for us because of the low read/write ratio and the business logic involved, so we first went with the 2-step hydration technique.

While analyzing profiling results, we more or less came to the same conclusion as #8390: the hydration process converts data over and over again, even if the entity is already hydrated.

But after trying to work around the cost of specific DBAL types that were expensive to convert (namely Symfony's UUID and Dunglas's JSON document) and digging deeper into the profiles, I concluded that something could be done directly in the Hydrator for the benefit of everyone, instead of writing our own optimized hydrator:
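![Screenshot 2022-09-08 at 17 51 38](https://user-images.githubusercontent.com/870118/189378440-36c194aa-38d3-43c4-88b5-f40f67b70294.png)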

This PR is built on one observation: most parts of the hydration process are not slow per se, but they are repeated for every row and column. By applying some small optimizations, moving some code before the first iteration, and deferring other code until the very last moment, we can improve the overall process.
The more rows and columns there are, the greater the speed-up.
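
To make the "move code before the first iteration" idea concrete, here is a minimal, language-neutral sketch in Python. It is not the PR's PHP code and not Doctrine's API: `MAPPING`, `resolve_column`, `hydrate_naive`, and `hydrate_prepared` are hypothetical names chosen for illustration. It only shows how resolving per-column metadata once, instead of once per row and column, reduces the number of lookups.

```python
from typing import Any, Callable, Dict, List, Tuple

ColumnInfo = Tuple[str, Callable[[Any], Any]]

# Hypothetical mapping: result-set alias -> (field name, converter).
MAPPING: Dict[str, ColumnInfo] = {
    "id_0": ("id", int),
    "name_1": ("name", str),
    "active_2": ("active", lambda raw: raw == "1"),
}

# Counts how often column metadata is resolved.
stats = {"lookups": 0}


def resolve_column(alias: str) -> ColumnInfo:
    """Stand-in for a per-column metadata lookup (hypothetical)."""
    stats["lookups"] += 1
    return MAPPING[alias]


def hydrate_naive(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Resolve the metadata again for every column of every row."""
    result = []
    for row in rows:
        entity = {}
        for alias, raw in row.items():
            field, convert = resolve_column(alias)
            entity[field] = convert(raw)
        result.append(entity)
    return result


def hydrate_prepared(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Resolve the metadata once, before the first iteration."""
    if not rows:
        return []
    # Build the plan from the first row's aliases; all rows of one
    # result set are assumed to share the same columns.
    plan = [(alias, *resolve_column(alias)) for alias in rows[0]]
    return [
        {field: convert(row[alias]) for alias, field, convert in plan}
        for row in rows
    ]
```

With 100 rows of three columns, the naive version in this sketch performs 300 metadata lookups and the prepared version performs 3, while both return the same entities.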

For example:

  • type conversion can be costly, but only identifiers are required to determine whether the entity is already hydrated: the actual data can be converted later in the process, only if needed, using only the columns required for the current entity (not the data for other entities in relations);
  • we can prepare the mapping information in a way that avoids repeating some parts of the process thousands of times;
  • and we can avoid using reflection to retrieve relation properties if they are available in another mapping.
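
The first point above can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the PR's implementation: the column names, the `counted` helper, and `hydrate_lazy` are all hypothetical. Only the identifier is converted for every row; the remaining columns of the entity are converted only when the identifier has not been seen yet, and columns belonging to other joined entities are left untouched.

```python
from datetime import datetime
from typing import Any, Callable, Dict, List

# Counts how many raw values are converted.
conversion_stats = {"conversions": 0}


def counted(convert: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Wrap a converter so that every call is counted (illustration only)."""
    def wrapper(raw: Any) -> Any:
        conversion_stats["conversions"] += 1
        return convert(raw)
    return wrapper


# Hypothetical column layout for one entity in a joined result set.
ID_COLUMN = "cart_id"
ID_CONVERT = counted(int)
FIELD_COLUMNS = {
    "cart_label": ("label", counted(str)),
    "cart_created": ("created", counted(datetime.fromisoformat)),
}


def hydrate_lazy(rows: List[Dict[str, Any]]) -> Dict[int, Dict[str, Any]]:
    """Convert the identifier first; convert other columns only for new entities."""
    identity_map: Dict[int, Dict[str, Any]] = {}
    for row in rows:
        entity_id = ID_CONVERT(row[ID_COLUMN])
        if entity_id in identity_map:
            continue  # already hydrated: skip the costly conversions
        entity: Dict[str, Any] = {"id": entity_id}
        # Only this entity's own columns are converted here; columns of
        # other joined entities (e.g. "item_id") are not read at all.
        for column, (field, convert) in FIELD_COLUMNS.items():
            entity[field] = convert(row[column])
        identity_map[entity_id] = entity
    return identity_map
```

For one cart joined with three items (three rows repeating the same cart columns), this sketch performs five conversions: three identifier checks plus two field conversions for the single new entity. Converting every cart column of every row would take nine.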

With this patch, I managed to reach the following results:

| Metric | Before | After | Notes |
|--------|--------|-------|-------|
| `->getQuery()->getResult()` | 726.39 ms | 358.67 ms | -51%, best of 20 iterations, query result cache enabled, 11 826 entities created |
| `Doctrine\ORM\PersistentCollection::*` | 164 634 | 77 345 | |
| `TypedNoDefaultReflectionProperty::getValue` | 162 770 | 125 669 | |
| `Doctrine\ORM\Internal\Hydration\AbstractHydrator::resultSetMapping` | 441 742 | 19 745 | |
| `Doctrine\ORM\Internal\Hydration\AbstractHydrator::hydrateColumnInfo` | 946 970 | 337 | |
| `Doctrine\DBAL\Types\IntegerType::convertToPHPValue` | 406 982 | 279 366 | |
| `Doctrine\DBAL\Types\StringType::convertToPHPValue` | 181 207 | 63 473 | |
| `Doctrine\DBAL\Types\BooleanType::convertToPHPValue` | 75 163 | 7 865 | |
| `Doctrine\DBAL\Types\DateTimeImmutableType::convertToPHPValue` | 25 118 | 789 | |
| `App\DBAL\Types\UuidType::convertToPHPValue` | 13 056 | 413 | |

I hope we can get this to work so that everyone enjoys better performance 🤗
If you are willing to move in this direction, we can probably adapt ArrayHydrator the same way.

admin added the pull-request label 2026-01-22 16:12:45 +01:00
admin closed this issue 2026-01-22 16:12:45 +01:00

Reference: doctrine/archived-orm#12048