Support unique yet stable identifier generation in "Specification"-style designs #6483

Open
opened 2026-01-22 15:33:54 +01:00 by admin · 6 comments

Originally created by @mpdude on GitHub (Jun 9, 2020).

Feature Request

| Q | A |
|------------ | ------ |
| New Feature | yes |
| RFC | yes |
| BC Break | no |

Summary

I've seen quite a few applications of the "Specification" pattern. In essence, a Specification is a "business" or "domain logic" class that encapsulates a particular business condition, filter expression or the like. There are various interpretations of and approaches to this pattern, but [here](https://martinfowler.com/apsupp/spec.pdf), [this](https://beberlei.de/2013/03/04/doctrine_repositories.html) or [that](https://github.com/Happyr/Doctrine-Specification) might be good starting points if you don't know about it.

Among other things, a Specification needs to be able to "apply" itself to a Doctrine QueryBuilder. While doing so, it might need to add conditions and parameters to a query or perform joins. In both of these cases, the Specification needs to come up with unique names for parameters or join aliases. The challenge with these unique names is that the Specification itself cannot know which other Specifications are applied to the same QueryBuilder, and even several instances of the same Specification class may be applied at once.

In some cases I've seen, Specifications even need to make these aliases publicly available and have to do so even before they got hold of the QueryBuilder that they'll be applied to. (For example, a Specification might express "eager load a particular entity", and another Specification might build on top of it and need the alias.) This defeats an approach where you could use a central \SplObjectStorage to keep track of alias names per QueryBuilder instance, because you don't even have that yet.

One likely solution is to start using uniqid() for the parameter and alias names. But this has a serious drawback: it defeats the Query Cache, since repeated requests for even the very same page will result in different DQL every time. Depending on the cache implementation, cache entries will either be evicted again and again (APCu behavior, to my knowledge) or gradually fill up disk space and ultimately use up all available OPcache memory for cached but never reused PHP files (e.g. default Symfony Cache pools).
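
The effect can be reproduced without Doctrine. The following is a minimal sketch, not code from any of the linked libraries: `buildDql()` and the `App\Entity\User` entity are hypothetical stand-ins for whatever a Specification contributes to a query. It assumes the Query Cache key depends on the DQL text (among other things), so two different strings cannot share a cache entry.

```php
<?php

// Hypothetical helper: builds the DQL a Specification might contribute,
// using a uniqid()-based parameter name.
function buildDql(): string
{
    $param = uniqid('p_');

    return "SELECT u FROM App\\Entity\\User u WHERE u.status = :$param";
}

// Two "requests" that run exactly the same code path.
$first  = buildDql();
$second = buildDql();

// The strings differ only in the parameter name, but that is enough to
// make them two distinct queries as far as a DQL-keyed cache is concerned.
var_dump($first === $second); // bool(false)
```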

Another solution would be to use something like static variables to generate identifiers based on a counter that re-starts from 0 upon every request. That way, chances are that "the very same request" will generate identical queries again; but of course, conditional code paths that execute additional Specifications - even in completely unrelated queries - break things again.
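
A small sketch of why the counter approach is fragile. The class `CounterNames` and the function `handleRequest()` are made up for illustration; the `$extraSpec` flag stands for any conditional code path that applies one more Specification to some other query earlier in the request.

```php
<?php

// Hypothetical per-request name generator based on a static counter.
final class CounterNames
{
    private static int $next = 0;

    public static function create(): string
    {
        return 'p' . self::$next++;
    }

    // Simulates the start of a new request.
    public static function reset(): void
    {
        self::$next = 0;
    }
}

// Simulates one request and returns the DQL of the query we care about.
function handleRequest(bool $extraSpec): string
{
    CounterNames::reset();

    if ($extraSpec) {
        CounterNames::create(); // consumed by a completely unrelated query
    }

    return 'SELECT u FROM User u WHERE u.status = :' . CounterNames::create();
}

echo handleRequest(false), "\n"; // SELECT u FROM User u WHERE u.status = :p0
echo handleRequest(true), "\n";  // SELECT u FROM User u WHERE u.status = :p1
```

The query itself is unchanged between the two calls, yet its DQL differs because an unrelated consumer shifted the counter.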

So, what I would like to discuss here is if and how we could support this from the ORM side.

One approach I had in mind was if I were able to run a "filter" somewhere around Query::_parse(): If at least all uniqid()-based identifiers followed a particular naming convention, those identifiers could be replaced with more stable names that make caching possible again. Yes, this could only be text based, since we want to avoid parsing the DQL in the first place, and so it has the risk of wrong matches; and yes, it would also have to keep a substitution map around to rename parameters, in case the same query is executed multiple times.
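
To make the idea concrete, here is a rough, string-level sketch that works on plain DQL text and does not touch the ORM. The `_tmp_` prefix, the `id_N` target names and both function names are assumptions chosen for this example only.

```php
<?php

// Assumed naming convention: every temporary identifier is created with
// uniqid('_tmp_'), i.e. "_tmp_" followed by 13 hexadecimal characters.
function stabilize(string $dql): array
{
    preg_match_all('/\b_tmp_[a-f0-9]{13}\b/', $dql, $matches);

    // Number the identifiers in order of first appearance.
    $map = [];
    foreach (array_values(array_unique($matches[0])) as $i => $temporary) {
        $map[$temporary] = 'id_' . $i;
    }

    // Return the map as well, so that parameters bound under the temporary
    // names can later be re-bound under the stable ones.
    return [str_replace(array_keys($map), array_values($map), $dql), $map];
}

// Hypothetical query construction with one join alias and one parameter.
function buildDql(): string
{
    $alias = uniqid('_tmp_');
    $param = uniqid('_tmp_');

    return "SELECT u FROM User u JOIN u.groups $alias WHERE $alias.name = :$param";
}

[$first]  = stabilize(buildDql());
[$second] = stabilize(buildDql());

var_dump($first === $second); // bool(true)
echo $first, "\n"; // SELECT u FROM User u JOIN u.groups id_0 WHERE id_0.name = :id_1
```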

This is just a first idea. If you have other ideas, or completely different approaches to the problem, please share them.


@beberlei commented on GitHub (Jun 9, 2020):

Why not a number to increment? "p1", "p2", "p3", ... That's the way we do it for column aliases in the persisters, for example.


@mpdude commented on GitHub (Jun 9, 2020):

Who could issue/give out these numbers? Not the individual Specifications (requires coordination between all Specifications for a particular query, possibly even before the common QueryBuilder is created).


@mpdude commented on GitHub (Jun 9, 2020):

Query::useQueryCache(false) could be used to turn off the Query Cache. Performance-wise it does not make a difference when uniqid()-based identifiers/names are used, but at least we'd not fill up the cache with junk.

Problem: Specifications operate on the QueryBuilder, and there is currently no way to set this for the Query through the QueryBuilder.


@mpdude commented on GitHub (Jun 9, 2020):

Maybe a query hint, possibly even a default one set on the EntityManager, could be used to trigger something like a generic "DQL pre-processor" before the Parser is activated; or even control the actual Parser class used?

Care must be taken when renaming things in the DQL, because the parameter names have to be renamed as well. Parameter processing is independent of the Parser, although it at least happens based on the ParserResult.


@mpdude commented on GitHub (Jun 22, 2020):

Here's a wacky helper class that... well... solves it? The main argument against it is that the replacement of temporary, uniqid()-based identifiers happens with a naïve string replacement. This is because we cannot afford real DQL parsing at this stage.

Usage:

Whenever you want to join something or add a parameter to the QueryBuilder, call SpecificationIdProvider::createAlias() to create a unique identifier.

When you're finished building the DQL query (or done with the QueryBuilder), call SpecificationIdProvider::cleanup($yourQueryOrQueryBuilder) to obtain a "clean" query where all identifiers will be stable across requests.

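```php
<?php

use Doctrine\ORM\Query;
use Doctrine\ORM\QueryBuilder;

class SpecificationIdProvider
{
    /** Creates a temporary identifier that is unique, but different on every call. */
    public static function createAlias(): string
    {
        return uniqid('_unique_alias_');
    }

    /**
     * Returns a copy of the query in which all temporary identifiers have
     * been replaced with names that are stable across requests.
     *
     * @param Query|QueryBuilder $query
     */
    public static function cleanup($query): Query
    {
        if ($query instanceof QueryBuilder) {
            $query = $query->getQuery();
        }
        $dql = $query->getDQL();

        // uniqid() appends 13 hexadecimal characters to the given prefix.
        preg_match_all('/\b_unique_alias_[a-f0-9]{13}\b/', $dql, $matches);

        // Map every temporary identifier to a name derived from its position in the DQL.
        $map = [];
        foreach ($matches[0] as $key => $alias) {
            $map[$alias] = 'alias_' . $key;
        }

        $cloneQuery = clone $query; // resets hints and parameters, @see \Doctrine\ORM\AbstractQuery::__clone()
        $cloneQuery->setDQL(str_replace(array_keys($map), $map, $dql));

        // Restore the hints that were dropped by cloning.
        foreach ($query->getHints() as $name => $value) {
            $cloneQuery->setHint($name, $value);
        }

        // Re-bind the parameters, using the stable name where one was assigned.
        foreach ($query->getParameters() as $parameter) {
            $name = $parameter->getName();
            $newName = $map[$name] ?? $name;
            $cloneQuery->setParameter($newName, $parameter->getValue(), $parameter->getType());
        }

        return $cloneQuery;
    }
}
```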
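
The downside of the naïve string replacement can be shown in isolation. The sketch below reuses the pattern from the helper class above on a made-up DQL string (the `Note` entity and the literal are invented for illustration) in which a string literal merely looks like a temporary identifier:

```php
<?php

// Same pattern as in the helper class above.
$pattern = '/\b_unique_alias_[a-f0-9]{13}\b/';

// The literal is user data, not an alias, but the pattern cannot tell.
$dql = "SELECT n FROM Note n WHERE n.body = 'see _unique_alias_5ee0a1b2c3d4e'";

$rewritten = preg_replace($pattern, 'alias_0', $dql);

echo $rewritten, "\n";
// SELECT n FROM Note n WHERE n.body = 'see alias_0'
```

A sufficiently unusual prefix makes such a collision unlikely in practice, but a text-based approach cannot rule it out.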

@mpdude commented on GitHub (Mar 26, 2025):

Just came across the fact that one can add Criteria to a QueryBuilder instance through QueryBuilder::addCriteria().

The values to compare against are placed in parameters, and the parameter names are derived based on the alias and field name being referred to by the expression:

https://github.com/doctrine/orm/blob/4baa7bd25218f363dc18286a9e1d32e2148b7fee/src/Query/QueryExpressionVisitor.php#L97-L104

Not sure if that would help in this case here, but might be worth a more detailed look.
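
As a rough illustration of that naming idea, the sketch below derives a parameter name from the field expression and adds a numeric suffix when the name is already taken. It is a simplified, hypothetical re-statement and not the ORM's code; `parameterNameFor()` and the exact suffix scheme are assumptions.

```php
<?php

// Simplified, hypothetical re-statement of the naming idea; not the ORM's code.
function parameterNameFor(string $field, array $usedNames): string
{
    $name = str_replace('.', '_', $field); // "u.status" becomes "u_status"

    if (in_array($name, $usedNames, true)) {
        $name .= '_' . count($usedNames); // disambiguate repeated fields
    }

    return $name;
}

$used = [];
foreach (['u.status', 'u.email', 'u.status'] as $field) {
    $used[] = parameterNameFor($field, $used);
}

print_r($used); // u_status, u_email, u_status_2
```

Names built this way depend only on the expressions added to the query, not on time or on a request-wide counter, so the same query yields the same DQL on every request.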

Reference: doctrine/archived-orm#6483