[PR #10917] Got rid of redundant JOINs/WHEREs of WhereInWalker #12696

Open
opened 2026-01-22 16:14:53 +01:00 by admin · 0 comments
Owner

Original Pull Request: https://github.com/doctrine/orm/pull/10917

State: closed
Merged: No


Doctrine\ORM\Tools\Pagination\Paginator is an amazing tool that lets you fetch entities in chunks. And it sends two separate SQL queries:

  1. The first query to get unique entity IDs
  2. The second query is fetching the actual entity data. This query uses the result of the previous one as an additional filter.
    For simplicity let's call them id and data queries.

To demonstrate it, lets use CmsUser and CmsAddress entities:

...
$query = new Query($this->em);
$query->setDQL('SELECT u
    FROM Doctrine\\Tests\\Models\\CMS\\CmsUser u
    JOIN Doctrine\\Tests\\Models\\CMS\\CmsAddress a
    WHERE a.city = :filterCity');
$query->setParameters(['filterCity' => 'London']);
$query->setMaxResults(1);
$paginator = (new Paginator($query, true))->setUseOutputWalkers(false);
$paginator->getIterator();

The code above will trigger two SQL queries:
id

SELECT DISTINCT c0_.id AS id_0
FROM cms_users c0_
INNER JOIN cms_addresses c1_
WHERE c1_.city = ? LIMIT 1

data

SELECT
    c0_.id AS id_0,
    c0_.status AS status_1,
    c0_.username AS username_2,
    c0_.name AS name_3,
    c0_.email_id AS email_id_4
FROM cms_users c0_
INNER JOIN cms_addresses c1_
WHERE 
    c1_.city = ?
    AND c0_.id IN (?)

The problem is that the data query has:

  • unused join: INNER JOIN cms_addresses c1_
  • redundant where condition c1_.city = ?
    This PR tries to optimize the data query.

Basically, this PR does the following:

  1. Removes all WHERE conditions for data query, except the condition that comes from the id query results. Please note, that this change is not applied to the queries with subselect. Because in this case subselect part might use some params, so we can not skip the original params in the data query. It should be possible to optimize subselect cases in the future, but it would require a bit more sophisticated solution: for data query remove only original params that are not used in the subquery (instead of skipping all).
  2. All JOINs are skipped for the data query if not any join is used in SELECT, GROUP BY, or ORDER BY clauses. It is done only if the HAVING clause is not specified (simplicity is the main reason, and it can be optimized in the future).

I want to clarify that the EasyAdmin createIndexQueryBuilder triggered this PR, and all the index actions in EasyAdmin should benefit from it.

**Original Pull Request:** https://github.com/doctrine/orm/pull/10917 **State:** closed **Merged:** No --- `Doctrine\ORM\Tools\Pagination\Paginator` is an amazing tool that lets you fetch entities in chunks. And it sends two separate SQL queries: 1. The first query to get unique entity IDs 2. The second query is fetching the actual entity data. This query uses the result of the previous one as an additional filter. For simplicity let's call them id and data queries. To demonstrate it, lets use [CmsUser](https://github.com/doctrine/orm/blob/bb5524099cd088f539b87b34cbfc17bfc930d6fc/tests/Doctrine/Tests/Models/CMS/CmsUser.php#L142) and [CmsAddress](https://github.com/doctrine/orm/blob/bb5524099cd088f539b87b34cbfc17bfc930d6fc/tests/Doctrine/Tests/Models/CMS/CmsAddress.php) entities: ```php ... $query = new Query($this->em); $query->setDQL('SELECT u FROM Doctrine\\Tests\\Models\\CMS\\CmsUser u JOIN Doctrine\\Tests\\Models\\CMS\\CmsAddress a WHERE a.city = :filterCity'); $query->setParameters(['filterCity' => 'London']); $query->setMaxResults(1); $paginator = (new Paginator($query, true))->setUseOutputWalkers(false); $paginator->getIterator(); ```` The code above will trigger two SQL queries: **id** ```sql SELECT DISTINCT c0_.id AS id_0 FROM cms_users c0_ INNER JOIN cms_addresses c1_ WHERE c1_.city = ? LIMIT 1 ``` **data** ```sql SELECT c0_.id AS id_0, c0_.status AS status_1, c0_.username AS username_2, c0_.name AS name_3, c0_.email_id AS email_id_4 FROM cms_users c0_ INNER JOIN cms_addresses c1_ WHERE c1_.city = ? AND c0_.id IN (?) ``` The problem is that the data query has: - unused join: `INNER JOIN cms_addresses c1_` - redundant where condition `c1_.city = ?` This PR tries to optimize the data query. Basically, this PR does the following: 1. Removes all WHERE conditions for **data** query, except the condition that comes from the **id** query results. Please note, that _this change is not applied to the queries with subselect_. Because in this case subselect part might use some params, so we can not skip the original params in the **data** query. It should be possible to optimize subselect cases in the future, but it would require a bit more sophisticated solution: for **data** query remove only original params that are not used in the subquery (instead of skipping all). 2. All `JOIN`s are skipped for the **data** query if not any join is used in `SELECT`, `GROUP BY`, or `ORDER BY` clauses. It is done only if the `HAVING` clause is not specified (simplicity is the main reason, and it can be optimized in the future). I want to clarify that the EasyAdmin `createIndexQueryBuilder` triggered this PR, and all the index actions in EasyAdmin should benefit from it.
admin added the pull-request label 2026-01-22 16:14:53 +01:00
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: doctrine/archived-orm#12696