DDC-1958: pager produces wrong results on postgresql #2470

Open
opened 2026-01-22 13:54:24 +01:00 by admin · 0 comments
Owner

Originally created by @doctrinebot on GitHub (Jul 30, 2012).

Originally assigned to: @beberlei on GitHub.

Jira issue originally created by user mvrhov:

The query built by the pager to get the subset of PKs to fetch produces wrong results on PostgreSQL (and probably on any database that conforms to the SQL standard). The standard says that if you want the results in a specific order, you have to specify it with an ORDER BY clause. If no such clause is present, the database can return the results in whatever order it sees fit.

Testcase fixtures:

CREATE TABLE test (
    id integer,
    name text
);

INSERT INTO test VALUES (1, 'c');
INSERT INTO test VALUES (2, 'a');
INSERT INTO test VALUES (3, 'e');
INSERT INTO test VALUES (4, 'b');
INSERT INTO test VALUES (5, 'd');
INSERT INTO test VALUES (6, 'a');
INSERT INTO test VALUES (7, 'g');
INSERT INTO test VALUES (8, 'h');
INSERT INTO test VALUES (9, 'e');
INSERT INTO test VALUES (10, 'j');

Passing, for example,

$qb = $this->repository
    ->createQueryBuilder('t')
    ->select('t')
    ->setFirstResult(0)
    ->setMaxResults(5)
    ->addOrderBy('t.name', 'ASC');

to the pager produces SQL like this (modified for readability):

SELECT DISTINCT id FROM (
    SELECT id, name FROM test ORDER BY name
  ) dctrn_result
  LIMIT 5 OFFSET 0

There is nothing wrong with this query per se, but the outer query has no ORDER BY clause, so according to the standard the DB can choose whatever order it sees fit. MySQL happens to keep the order of the inner query, but PostgreSQL does not, and it's probably not the only DB doing so.

If you are interested in the results, this is the output I'm seeing:

  • postgresql: 8,4,1,5,3
  • mysql : 2,6,4,1,5

My coworker and I came up with a standard-compliant solution. It was tested on the dataset above on both PostgreSQL and MySQL and produced equal results. We have found only one corner case where this won't work, and IMHO it can't be fixed: sorting on a field from a table that is in a 1:n relation to the main table, e.g. tables posts and tags, where one post can have multiple tags and you want your results sorted by tag.
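
To make the corner case concrete, here is a minimal sketch that applies the wrapped DISTINCT query to a posts/tags pair. The schema, column names, and sample rows are assumptions made up for illustration; only the table names posts and tags come from the report.

```sql
-- Hypothetical schema and data, for illustration only.
CREATE TABLE posts (
    id integer PRIMARY KEY,
    title text
);

CREATE TABLE tags (
    post_id integer,
    name text
);

INSERT INTO posts VALUES (1, 'first');
INSERT INTO posts VALUES (2, 'second');

-- Post 1 has two tags, post 2 has one.
INSERT INTO tags VALUES (1, 'a');
INSERT INTO tags VALUES (1, 'b');
INSERT INTO tags VALUES (2, 'c');

-- DISTINCT now applies to (id, tag_name) pairs, not to ids alone,
-- so the two rows for post 1 both survive and the LIMIT counts
-- rows rather than distinct posts.
SELECT id FROM (
  SELECT DISTINCT id, tag_name FROM (
    SELECT p.id AS id, t.name AS tag_name
    FROM posts p
    INNER JOIN tags t ON t.post_id = p.id
  ) dctrn_result_inner
  ORDER BY tag_name, id LIMIT 2 OFFSET 0
) dctrn_result;
```

With this data the two rows kept by the LIMIT are (1, 'a') and (1, 'b'), so a page of size 2 contains post 1 twice and never reaches post 2.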

Recipe for a correct query is:

  • remember the ORDER BY fields from the original query and then remove them from it
  • wrap the original query in a DISTINCT query, add the ORDER BY fields to its SELECT list, append the whole ORDER BY clause followed by the PK, and add the LIMIT clause
  • wrap the resulting query in another query that selects just the id.

So, taking the example from above, the SQL should look like this:

SELECT id FROM (
  SELECT DISTINCT id, name FROM (
    SELECT id, name FROM test
  ) dctrn_result_inner
  ORDER BY name, id LIMIT 5 OFFSET 0
) dctrn_result
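
As a quick sanity check against the fixtures above, the middle layer of the proposed query can be run on its own to see which rows make up the first page. This is only a sketch for inspection, not part of the pager's generated SQL; the expected rows in the comments were worked out by hand by sorting the ten fixture rows on name and then id.

```sql
-- Middle layer of the proposed query, run standalone.
SELECT DISTINCT id, name FROM (
  SELECT id, name FROM test
) dctrn_result_inner
ORDER BY name, id LIMIT 5 OFFSET 0;

-- Expected rows for the fixtures above:
--   2 | a
--   6 | a
--   4 | b
--   1 | c
--   5 | d
```

The ids 2, 6, 4, 1, 5 match the MySQL output quoted earlier, and the added id tiebreaker is what fixes the relative order of the two 'a' rows.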
admin added the Bug label 2026-01-22 13:54:24 +01:00
Reference: doctrine/archived-orm#2470