Paginator does not replace WHERE clause and returned data may be incomplete #6840

Open
opened 2026-01-22 15:39:49 +01:00 by admin · 5 comments
Owner

Originally created by @LuigiCardamone on GitHub (Sep 29, 2021).

Bug Report

Q A
BC Break no
Version 2.7.3

Summary

If I have a query like this (using pseudo code for brevity):
SELECT * FROM Author a JOIN Book b WHERE b.price = 10

the Paginator creates a first query to get the list of id:

SELECT DISTINCT id_0
FROM (SELECT DISTINCT id_0
      FROM (
          SELECT * FROM Author a JOIN Book b WHERE b.price = 10 
       ) dctrn_result_inner
      ORDER BY id_0 ASC) dctrn_result
LIMIT 10

Then the Paginator creates a second query passing the list of id fetched from first query:

SELECT * FROM Author a JOIN Book b 
WHERE b.price = 10 
AND a.id IN (1,2,3,4,5,6,7,8,9,10)

The question is, why is the entire WHERE clause not replaced with the "IN condition"?
Which are the reason of keeping the original WHERE clause?

Keeping the original WHERE clause has the following problem:
the Author entity will not have the complete list of his books but only the books with price equal to 10.

I think that this may be a bug since the comment on the class WhereInWalker say that
"Replaces the whereClause of the AST with a WHERE id IN (:foo_1, :foo_2) equivalent."
but the code do not replace but "append".

Current behavior

IN clause is appended to the original WHERE clause

How to reproduce

  • Create any query with a WHERE condition
  • Create the Paginator: $paginator = new Paginator($query, true);
  • Use $paginator->getIterator() to get data
  • Inspect the executed queries

Expected behavior

IN clause should replace the original WHERE clause

Originally created by @LuigiCardamone on GitHub (Sep 29, 2021). ### Bug Report | Q | A |------------ | ------ | BC Break | no | Version | 2.7.3 #### Summary If I have a query like this (using pseudo code for brevity): `SELECT * FROM Author a JOIN Book b WHERE b.price = 10 ` the Paginator creates a first query to get the list of id: ``` SELECT DISTINCT id_0 FROM (SELECT DISTINCT id_0 FROM ( SELECT * FROM Author a JOIN Book b WHERE b.price = 10 ) dctrn_result_inner ORDER BY id_0 ASC) dctrn_result LIMIT 10 ``` Then the Paginator creates a second query passing the list of id fetched from first query: ``` SELECT * FROM Author a JOIN Book b WHERE b.price = 10 AND a.id IN (1,2,3,4,5,6,7,8,9,10) ``` The question is, why is the entire WHERE clause not replaced with the "IN condition"? Which are the reason of keeping the original WHERE clause? Keeping the original WHERE clause has the following problem: the Author entity will not have the complete list of his books but only the books with price equal to 10. I think that this may be a bug since the comment on the class WhereInWalker say that "Replaces the whereClause of the AST with a WHERE id IN (:foo_1, :foo_2) equivalent." but the code do not replace but "append". #### Current behavior IN clause is appended to the original WHERE clause #### How to reproduce - Create any query with a WHERE condition - Create the Paginator: `$paginator = new Paginator($query, true);` - Use `$paginator->getIterator()` to get data - Inspect the executed queries #### Expected behavior IN clause should replace the original WHERE clause
Author
Owner

@stof commented on GitHub (Dec 16, 2021):

Keeping the original WHERE clause has the following problem:
the Author entity will not have the complete list of his books but only the books with price equal to 10.

And this is exactly what happens as well when fetching all results without paginating, because that's what your query is asking for.

So what you suggest here is introducing a bug in the paginator, not fixing one.

@stof commented on GitHub (Dec 16, 2021): > Keeping the original WHERE clause has the following problem: > the Author entity will not have the complete list of his books but only the books with price equal to 10. And this is exactly what happens as well when fetching all results without paginating, because that's what your query is asking for. So what you suggest here is **introducing** a bug in the paginator, not fixing one.
Author
Owner

@LuigiCardamone commented on GitHub (Dec 16, 2021):

Thank you for your feedback.
If you reason with tabular results in mind you are right. If you reason with object results in mind I would expect to get objects that are always complete (an author with all his books, as when I query the author for Id).
Does my point of view make sense for you?

@LuigiCardamone commented on GitHub (Dec 16, 2021): Thank you for your feedback. If you reason with tabular results in mind you are right. If you reason with object results in mind I would expect to get objects that are always complete (an author with all his books, as when I query the author for Id). Does my point of view make sense for you?
Author
Owner

@stof commented on GitHub (Dec 16, 2021):

@LuigiCardamone what does not make sense is expecting a different behavior when you run the same DQL query in a paginated way or an unpaginated way.

Writing a DQL query that applies a filter on the related collection and hydrates that collection will indeed hydrate it with partial data. And this is not a bug in the paginator as using $query->getResult() also does it.
If you want to select all books of the author for authors having at least one book with a price equal to 10, you need to rewrite your query so that it actually does that.

@stof commented on GitHub (Dec 16, 2021): @LuigiCardamone what does not make sense is expecting a different behavior when you run the **same** DQL query in a paginated way or an unpaginated way. Writing a DQL query that applies a filter on the related collection and hydrates that collection will indeed hydrate it with partial data. And this is **not** a bug in the paginator as using `$query->getResult()` also does it. If you want to select all books of the author for authors having at least one book with a price equal to 10, you need to rewrite your query so that it actually does that.
Author
Owner

@LuigiCardamone commented on GitHub (Dec 16, 2021):

@stof thank you for your explanation. Now I have a clear idea of the situation and calling it a bug was completely wrong.
I understand that you are keeping the same behavior of a non paginated DQL query (even if we could question if the standard DQL behavior is ok since hydrating partial data is problematic in any case).
Anyway, we are using this customized version of the Paginator for several months and we found it very useful for our use cases: do you think that it could be useful to add this behavior to the Paginator as an optional feature?

@LuigiCardamone commented on GitHub (Dec 16, 2021): @stof thank you for your explanation. Now I have a clear idea of the situation and calling it a bug was completely wrong. I understand that you are keeping the same behavior of a non paginated DQL query (even if we could question if the standard DQL behavior is ok since hydrating partial data is problematic in any case). Anyway, we are using this customized version of the Paginator for several months and we found it very useful for our use cases: do you think that it could be useful to add this behavior to the Paginator as an optional feature?
Author
Owner

@stof commented on GitHub (Dec 16, 2021):

No it is not something that will be accepted by the ORM maintainers, as it breaks things when using DQL in the expected way.

@stof commented on GitHub (Dec 16, 2021): No it is not something that will be accepted by the ORM maintainers, as it breaks things when using DQL in the expected way.
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: doctrine/archived-orm#6840