[Perf] useless discriminator column filtering for top level classes #6185

Closed
opened 2026-01-22 15:28:25 +01:00 by admin · 3 comments
Owner

Originally created by @rbs-asimon on GitHub (Feb 22, 2019).

Originally assigned to: @Ocramius on GitHub.

Given the following Address mapping :

/**
 * @Entity
 * @Table(name="`quote-address`")
 * @InheritanceType("SINGLE_TABLE")
 * @DiscriminatorColumn(name="type", type="string")
 * @DiscriminatorMap({"simple" = Address::class, "full" = FullAddress::class})
 */
class Address

It seems to me that the SQL translation of the DQL SELECT a, FROM Address should be the simple SELECT q0_... FROM "quote-address" q0_ instead of the actual SELECT q0_... FROM "quote-address" q0_ WHERE q0_.type IN ('simple', 'full').

It is certainly not wrong but sometimes not so optimal : the condition is only needed when you ask for a subclass.

Originally created by @rbs-asimon on GitHub (Feb 22, 2019). Originally assigned to: @Ocramius on GitHub. Given the following Address mapping : ``` /** * @Entity * @Table(name="`quote-address`") * @InheritanceType("SINGLE_TABLE") * @DiscriminatorColumn(name="type", type="string") * @DiscriminatorMap({"simple" = Address::class, "full" = FullAddress::class}) */ class Address ``` It seems to me that the SQL translation of the DQL `SELECT a, FROM Address` should be the simple `SELECT q0_... FROM "quote-address" q0_` instead of the actual `SELECT q0_... FROM "quote-address" q0_ WHERE q0_.type IN ('simple', 'full')`. It is certainly not wrong but sometimes not so optimal : the condition is only needed when you ask for a subclass.
admin added the BugInvalid labels 2026-01-22 15:28:25 +01:00
admin closed this issue 2026-01-22 15:28:27 +01:00
Author
Owner

@Ocramius commented on GitHub (Feb 23, 2019):

The discriminator is to be covered by an index anyway: closing, since this is not an issue.

@Ocramius commented on GitHub (Feb 23, 2019): The discriminator is to be covered by an index anyway: closing, since this is not an issue.
Author
Owner

@rbs-asimon commented on GitHub (Feb 23, 2019):

@Ocramius Thank you for the feedback.

I have to disagree : it can sometimes be an issue. Please take a look at these MySQL results :

mysql> select version();
+-----------+
| version() |
+-----------+
| 5.7.22    |
+-----------+
1 row in set (0.00 sec)

Given the table :

CREATE TABLE `address` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `line1` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `line2` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `line3` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `post_code` varchar(15) COLLATE utf8mb4_unicode_ci NOT NULL,
  `city` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
  `phone` varchar(30) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `fax` varchar(30) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `company_name` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `updated_at` datetime NOT NULL,
  `created_at` datetime NOT NULL,
  `deleted_at` datetime DEFAULT NULL,
  `entity_type` int(11) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `entity_type` (`entity_type`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

populated with some rows :

mysql> select count(*), entity_type from address group by entity_type;
+----------+-------------+
| count(*) | entity_type |
+----------+-------------+
|   132794 |           1 |
|  1225832 |           2 |
+----------+-------------+
2 rows in set (0.23 sec)

We can compare select count(*) from address where entity_type in(1,2) and select count(*) from address :

mysql> explain select count(*) from address where entity_type in(1,2);                                                                                                                                      
| id | select_type | table   | partitions | type  | possible_keys | key         | key_len | ref  | rows   | filtered | Extra                    |
+----+-------------+---------+------------+-------+---------------+-------------+---------+------+--------+----------+--------------------------+
|  1 | SIMPLE      | address | NULL       | range | entity_type   | entity_type | 4       | NULL | 931175 |   100.00 | Using where; Using index |
+----+-------------+---------+------------+-------+---------------+-------------+---------+------+--------+----------+--------------------------+
mysql> explain select count(*) from address;
| id | select_type | table   | partitions | type  | possible_keys | key         | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+---------+------------+-------+---------------+-------------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | address | NULL       | index | NULL          | entity_type | 4       | NULL | 1361146 |   100.00 | Using index |
+----+-------------+---------+------------+-------+---------------+-------------+---------+------+---------+----------+-------------+
mysql> select SQL_NO_CACHE count(*) from address;                                                                                                                                                           +----------+
| count(*) |
+----------+
|  1358626 |
+----------+
1 row in set, 1 warning (0.16 sec)
mysql> select SQL_NO_CACHE count(*) from address where entity_type in(1,2);                                                                                                                                 +----------+
| count(*) |
+----------+
|  1358626 |
+----------+
1 row in set, 1 warning (0.24 sec)
@rbs-asimon commented on GitHub (Feb 23, 2019): @Ocramius Thank you for the feedback. I have to disagree : it can sometimes be an issue. Please take a look at these MySQL results : ``` mysql> select version(); +-----------+ | version() | +-----------+ | 5.7.22 | +-----------+ 1 row in set (0.00 sec) ``` Given the table : ``` CREATE TABLE `address` ( `id` int(11) NOT NULL AUTO_INCREMENT, `name` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL, `line1` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL, `line2` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL, `line3` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL, `post_code` varchar(15) COLLATE utf8mb4_unicode_ci NOT NULL, `city` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL, `phone` varchar(30) COLLATE utf8mb4_unicode_ci DEFAULT NULL, `fax` varchar(30) COLLATE utf8mb4_unicode_ci DEFAULT NULL, `company_name` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL, `updated_at` datetime NOT NULL, `created_at` datetime NOT NULL, `deleted_at` datetime DEFAULT NULL, `entity_type` int(11) NOT NULL, PRIMARY KEY (`id`), KEY `entity_type` (`entity_type`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci ``` populated with some rows : ``` mysql> select count(*), entity_type from address group by entity_type; +----------+-------------+ | count(*) | entity_type | +----------+-------------+ | 132794 | 1 | | 1225832 | 2 | +----------+-------------+ 2 rows in set (0.23 sec) ``` We can compare `select count(*) from address where entity_type in(1,2)` and `select count(*) from address` : ``` mysql> explain select count(*) from address where entity_type in(1,2); | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+-------+---------------+-------------+---------+------+--------+----------+--------------------------+ | 1 | SIMPLE | address | NULL | range | entity_type | entity_type | 4 | NULL | 931175 | 100.00 | Using where; Using index | +----+-------------+---------+------------+-------+---------------+-------------+---------+------+--------+----------+--------------------------+ ``` ``` mysql> explain select count(*) from address; | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+-------+---------------+-------------+---------+------+---------+----------+-------------+ | 1 | SIMPLE | address | NULL | index | NULL | entity_type | 4 | NULL | 1361146 | 100.00 | Using index | +----+-------------+---------+------------+-------+---------------+-------------+---------+------+---------+----------+-------------+ ``` ``` mysql> select SQL_NO_CACHE count(*) from address; +----------+ | count(*) | +----------+ | 1358626 | +----------+ 1 row in set, 1 warning (0.16 sec) ``` ``` mysql> select SQL_NO_CACHE count(*) from address where entity_type in(1,2); +----------+ | count(*) | +----------+ | 1358626 | +----------+ 1 row in set, 1 warning (0.24 sec) ```
Author
Owner

@Messere commented on GitHub (Feb 17, 2021):

Just a note: indexes on discriminator were suggested to be bad practice in https://github.com/doctrine/orm/issues/2037
and I can see how having an index on discriminator needs to be considered as case by case optimization.

Omitting any condition that is know to always evaluate to true seems like a good idea, especially that db engine has no way to know it and needs to evaluate the condition with each query.

@Messere commented on GitHub (Feb 17, 2021): Just a note: indexes on discriminator were suggested to be bad practice in https://github.com/doctrine/orm/issues/2037 and I can see how having an index on discriminator needs to be considered as case by case optimization. Omitting any condition that is know to always evaluate to true seems like a good idea, especially that db engine has no way to know it and needs to evaluate the condition with each query.
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: doctrine/archived-orm#6185