DDC-2332: [UnitOfWork::doPersist()] The spl_object_hash() generate not unique hash! #2929

Closed
opened 2026-01-22 14:07:32 +01:00 by admin · 70 comments
Owner

Originally created by @doctrinebot on GitHub (Mar 5, 2013).

Originally assigned to: @beberlei on GitHub.

Jira issue originally created by user fchris82:

I created fixtures and some data was inserted many times without calling the Task entity PrePersist event listener.

I printed the used and generated hash and I saw a Proxies\*_CG_*\Asitly\ProjectManagementBundle\Entity\User hash equal a Task entity hash!

Originally created by @doctrinebot on GitHub (Mar 5, 2013). Originally assigned to: @beberlei on GitHub. Jira issue originally created by user fchris82: I created fixtures and some data was inserted many times without calling the `Task` entity PrePersist event listener. I printed the used and generated hash and I saw a `Proxies\*_CG_*\Asitly\ProjectManagementBundle\Entity\User` hash equal a `Task` entity hash!
admin added the BugMissing Tests labels 2026-01-22 14:07:32 +01:00
admin closed this issue 2026-01-22 14:07:32 +01:00
Author
Owner

@doctrinebot commented on GitHub (Mar 5, 2013):

Comment created by @ocramius:

Please provide either a code example or a test case. As it stands, this issue is incomplete

@doctrinebot commented on GitHub (Mar 5, 2013): Comment created by @ocramius: Please provide either a code example or a test case. As it stands, this issue is incomplete
Author
Owner

@doctrinebot commented on GitHub (Mar 5, 2013):

Comment created by @beberlei:

Are you calling EntityManager#clear() inbetween? Because PHP reuses the hashes. The ORM accounts for this.

@doctrinebot commented on GitHub (Mar 5, 2013): Comment created by @beberlei: Are you calling EntityManager#clear() inbetween? Because PHP reuses the hashes. The ORM accounts for this.
Author
Owner

@doctrinebot commented on GitHub (Mar 5, 2013):

Comment created by fchris82:

Hi!

http://diginin.hu/splobject_hashbug.zip

  • unzip
  • Edit the spl_object_hash_bug\app\config\parameters.yml (database connection parameters)
  • open command line
  • go to spl_object_hash_bug directory
  • php app\console doctrine:generate:database
  • php app\console doctrine:schema:create
  • php app\console doctrine:fixtures:load

Chris

@doctrinebot commented on GitHub (Mar 5, 2013): Comment created by fchris82: Hi! http://diginin.hu/spl*object_hash*bug.zip - unzip - Edit the spl_object_hash_bug\app\config\parameters.yml (database connection parameters) - open command line - go to spl_object_hash_bug directory - php app\console doctrine:generate:database - php app\console doctrine:schema:create - php app\console doctrine:fixtures:load Chris
Author
Owner

@doctrinebot commented on GitHub (Mar 5, 2013):

Comment created by @beberlei:

This is not a reproduce case, i don't want to execute your whole project.

I want to know, what is the actual bug that you see? Can you just print a list of all the hashes? Because the hashes dont differ at the end, bu tjust somewhere in the middle.

@doctrinebot commented on GitHub (Mar 5, 2013): Comment created by @beberlei: This is not a reproduce case, i don't want to execute your whole project. I want to know, what is the actual bug that you see? Can you just print a list of all the hashes? Because the hashes dont differ at the end, bu tjust somewhere in the middle.
Author
Owner

@doctrinebot commented on GitHub (Mar 5, 2013):

Comment created by fchris82:

I attached a hashlogs.txt file. The last Task class hash is 0000000050ab4aba0000000058e1cb12 ( line 3 129 )

This is not unique, view the line 2 760 . The Task is not being saved and the program don't call the prePersist listener. The "UnitOfWork" believe the entity has been saved because the isset($this->entityStates[$oid]) is true. But it is an other entity.

@doctrinebot commented on GitHub (Mar 5, 2013): Comment created by fchris82: I attached a hashlogs.txt file. The last Task class hash is 0000000050ab4aba0000000058e1cb12 ( line 3 129 ) This is not unique, view the line 2 760 . The Task is not being saved and the program don't call the prePersist listener. The "UnitOfWork" believe the entity has been saved because the `isset($this->entityStates[$oid])` is true. But it is an other entity.
Author
Owner

@doctrinebot commented on GitHub (Mar 6, 2013):

Comment created by fchris82:

The EntityManager::clear() fix the problem, but this is not "good" and "beautiful" solution. Shows no sign of that conflicts were and this is causing the problem. I was looking for the problem 7 hours.

@doctrinebot commented on GitHub (Mar 6, 2013): Comment created by fchris82: The `EntityManager::clear()` fix the problem, but this is not "good" and "beautiful" solution. Shows no sign of that conflicts were and this is causing the problem. I was looking for the problem 7 hours.
Author
Owner

@doctrinebot commented on GitHub (Jun 26, 2014):

Comment created by @ocramius:

One possible issue here is that a listener registers an entity as managed while a proxy is being loaded.

The given data is still insufficient to actually verify the problem.

@doctrinebot commented on GitHub (Jun 26, 2014): Comment created by @ocramius: One possible issue here is that a listener registers an entity as managed while a proxy is being loaded. The given data is still insufficient to actually verify the problem.
Author
Owner

@tomaszmadeyski commented on GitHub (Feb 4, 2016):

I can confirm having this issue. My case is working on quite big collection of entities, and code (simplified) looks like this:

//$idsArr is about 2k elements
foreach ($idsArr as $id) {
    $entityA = new A($id);
    $em->persist($entityA);
}
$em->flush();

//some code happening in between but everything is happening during single request

//$anotherIdsArr is about 2k items
foreach ($anotherIdsArr as $anotherId) {
    $entityB = new B($someConstValue, $anotherId);
    $em->persis($entityB);
}
$em->flush();

Class A definition:

/**
 * @ORM\Entity()
 * @ORM\Table(name="a")
 */
class A
{
    /**
     * @var integer
     *
     * @ORM\Id
     * @ORM\Column(type="integer")
     */
    protected $l1;

    /**
     * @param integer $l1
     */
    public function __construct($l1)
    {
        $this->l1 = $l1;
    }

} 

Class B definition

/**
 * @ORM\Entity()
 * @ORM\Table(name="b")
 */
class B
{
    /**
     * @var integer
     *
     * @ORM\Id
     * @ORM\Column(type="integer")
     */
    protected $l1;

    /**
     * @var integer
     *
     * @ORM\Id
     * @ORM\Column(type="integer")
     */
    protected $l2;

    /**
     * @param integer $l1
     * @param integer $l2
     */
    public function __construct($l1, $l2)
    {
        $this->l1 = $l1;
        $this->l2 = $l2;
    }

}

While trying to persist B instance
$entityB = new B(12, 8995);
I get
$oid == '0000000013204a0200007fc5ffc03559'
and such element already exists in $this->entityIdentifiers so getEntityState method returns STATE_MANAGED so $entityB is not persisted.

$this->entityIdentifiers['0000000013204a0200007fc5ffc03559'] stores instance of A entity: $a = new A(6148)

I can also confirm that calling $em->clear(A::class) before second loop fixes the issue but it's rather a workaround.
I think the issue here is that according to doc spl_object_hash can return same hash if object is destroyed and this is what is happening in my case

@tomaszmadeyski commented on GitHub (Feb 4, 2016): I can confirm having this issue. My case is working on quite big collection of entities, and code (simplified) looks like this: ``` //$idsArr is about 2k elements foreach ($idsArr as $id) { $entityA = new A($id); $em->persist($entityA); } $em->flush(); //some code happening in between but everything is happening during single request //$anotherIdsArr is about 2k items foreach ($anotherIdsArr as $anotherId) { $entityB = new B($someConstValue, $anotherId); $em->persis($entityB); } $em->flush(); ``` Class A definition: ``` /** * @ORM\Entity() * @ORM\Table(name="a") */ class A { /** * @var integer * * @ORM\Id * @ORM\Column(type="integer") */ protected $l1; /** * @param integer $l1 */ public function __construct($l1) { $this->l1 = $l1; } } ``` Class B definition ``` /** * @ORM\Entity() * @ORM\Table(name="b") */ class B { /** * @var integer * * @ORM\Id * @ORM\Column(type="integer") */ protected $l1; /** * @var integer * * @ORM\Id * @ORM\Column(type="integer") */ protected $l2; /** * @param integer $l1 * @param integer $l2 */ public function __construct($l1, $l2) { $this->l1 = $l1; $this->l2 = $l2; } } ``` While trying to persist B instance `$entityB = new B(12, 8995);` I get `$oid == '0000000013204a0200007fc5ffc03559'` and such element already exists in `$this->entityIdentifiers` so `getEntityState` method returns `STATE_MANAGED` so `$entityB` is not persisted. `$this->entityIdentifiers['0000000013204a0200007fc5ffc03559']` stores instance of A entity: `$a = new A(6148)` I can also confirm that calling `$em->clear(A::class)` before second loop fixes the issue but it's rather a workaround. I think the issue here is that according to [doc](http://php.net/manual/en/function.spl-object-hash.php) `spl_object_hash` can return same hash if object is destroyed and this is what is happening in my case
Author
Owner

@Ocramius commented on GitHub (Feb 4, 2016):

entityIdentifiers keeps a reference of type A, therefore that A wasn't garbage-collected. This means that the issue comes from somewhere else.

As I already stated above, this issue needs a reproducible test case.

@Ocramius commented on GitHub (Feb 4, 2016): `entityIdentifiers` keeps a reference of type `A`, therefore that `A` wasn't garbage-collected. This means that the issue comes from somewhere else. As I already stated above, this issue needs a reproducible test case.
Author
Owner

@tomaszmadeyski commented on GitHub (Feb 4, 2016):

the thing is it looks like A was garbage collected - as I stated in comment there's quite a lot happening between these two loops so it might have been garbage collected

@tomaszmadeyski commented on GitHub (Feb 4, 2016): the thing is it looks like `A` was garbage collected - as I stated in comment there's quite a lot happening between these two loops so it might have been garbage collected
Author
Owner

@thebuccaneersden commented on GitHub (Jun 3, 2016):

@Ocramius I believe I have a reproducible test case here ~~> https://github.com/thebuccaneersden/doctrine-bug

@thebuccaneersden commented on GitHub (Jun 3, 2016): @Ocramius I believe I have a reproducible test case here ~~> https://github.com/thebuccaneersden/doctrine-bug
Author
Owner

@Ocramius commented on GitHub (Jun 3, 2016):

@thebuccaneersden no, that is not a test case, that is an application that I'd have to set up and run, and that's not going to happen, sorry :-\

Please refer to 1c2b7c9685/tests/Doctrine/Tests/ORM/Functional/Ticket for a list of existing functional test cases.

@Ocramius commented on GitHub (Jun 3, 2016): @thebuccaneersden no, that is not a test case, that is an application that I'd have to set up and run, and that's not going to happen, sorry :-\ Please refer to https://github.com/doctrine/doctrine2/tree/1c2b7c968561f2e319117d7860c88e8c0cad3c7a/tests/Doctrine/Tests/ORM/Functional/Ticket for a list of existing functional test cases.
Author
Owner

@sroze commented on GitHub (Feb 1, 2018):

I just had the same issue: the spl_object_hash of an entity was the same than a previous one (completely different class), the $uow->getEntityIdentifier() was returning a wrong ID (the one of the previous one) and therefore the BasicEntityPersister::prepareUpdateData method was failing.

Calling $em->clear() worked around the bug.

@sroze commented on GitHub (Feb 1, 2018): I just had the same issue: the `spl_object_hash` of an entity was the same than a previous one (completely different class), the `$uow->getEntityIdentifier()` was returning a wrong ID (the one of the previous one) and therefore the `BasicEntityPersister::prepareUpdateData` method was failing. Calling `$em->clear()` worked around the bug.
Author
Owner

@petervf commented on GitHub (Feb 28, 2018):

This test triggers the problem consistently for me.

namespace Doctrine\Tests\ORM\Functional;

use Doctrine\Tests\Models\CMS\CmsArticle;
use Doctrine\Tests\Models\CMS\CmsUser;
use Doctrine\Tests\OrmFunctionalTestCase;

class HashCollisionTest extends OrmFunctionalTestCase
{
    protected function setUp()
    {
        $this->useModelSet('cms');
        parent::setUp();
    }

    public function testHashCollision() {
        $user = new CmsUser();
        $user->username = "Test";
        $user->name     = "Test";
        $this->_em->persist($user);
        $this->_em->flush();

        $articles = [];
        for($i=0;$i<100;$i++) {
            $article = new CmsArticle();
            $article->topic = "Test";
            $article->text  = "Test";
            $article->setAuthor($this->_em->merge($user));
            $this->_em->persist($article);
            $this->_em->flush();
            $this->_em->clear();
            $articles [] = $article;
        }

        $user = $this->_em->merge($user);
        foreach($articles as $article) {
            $article = $this->_em->merge($article);
            $article->setAuthor($user);
        }
        unset($article);

        gc_collect_cycles();

        $keep = [];
        for($x=0;$x<1000;$x++) {
            $keep[] = $article = new CmsArticle();

            $article->topic = "Test";
            $article->text  = "Test";
            $article->setAuthor($this->_em->merge($user));

            $this->_em->persist($article);
            $this->_em->flush();
            $this->assertNotNull($article->id, "Article wasn't persisted on iteration $x");
        }
    }
}

The problem is in the loop that merges the articles. Every article->user is its own seperate instance of a proxy object. The spl_object_hash of these objects get added to entityStates/entityIdentifiers in UnitOfWork->registerManaged(). However, no reference to the object is kept here. RegisterManaged calls UnitOfWork->addToIdentityMap(). The identityMap does keep a reference to the object, but only for the first object with the same id. This means that for most of the CmsUser Proxy objects, no reference is kept from within doctrine, so if I break the last reference in my code by calling $article->setAuthor($user), the proxy object may be garbage collected, but UnitOfWork->entityStates and UnitOfWork->entityIdentifiers still contain the spl_object_hash of the now non existent object.

If I now try to create a new CmsArticle, php may use the same memory location, generating the same object hash. In this case that causes doctrine to think it's already managing the entity, and not persisting it.

@petervf commented on GitHub (Feb 28, 2018): This test triggers the problem consistently for me. ```php namespace Doctrine\Tests\ORM\Functional; use Doctrine\Tests\Models\CMS\CmsArticle; use Doctrine\Tests\Models\CMS\CmsUser; use Doctrine\Tests\OrmFunctionalTestCase; class HashCollisionTest extends OrmFunctionalTestCase { protected function setUp() { $this->useModelSet('cms'); parent::setUp(); } public function testHashCollision() { $user = new CmsUser(); $user->username = "Test"; $user->name = "Test"; $this->_em->persist($user); $this->_em->flush(); $articles = []; for($i=0;$i<100;$i++) { $article = new CmsArticle(); $article->topic = "Test"; $article->text = "Test"; $article->setAuthor($this->_em->merge($user)); $this->_em->persist($article); $this->_em->flush(); $this->_em->clear(); $articles [] = $article; } $user = $this->_em->merge($user); foreach($articles as $article) { $article = $this->_em->merge($article); $article->setAuthor($user); } unset($article); gc_collect_cycles(); $keep = []; for($x=0;$x<1000;$x++) { $keep[] = $article = new CmsArticle(); $article->topic = "Test"; $article->text = "Test"; $article->setAuthor($this->_em->merge($user)); $this->_em->persist($article); $this->_em->flush(); $this->assertNotNull($article->id, "Article wasn't persisted on iteration $x"); } } } ``` The problem is in the loop that merges the articles. Every article->user is its own seperate instance of a proxy object. The spl_object_hash of these objects get added to entityStates/entityIdentifiers in UnitOfWork->registerManaged(). However, no reference to the object is kept here. RegisterManaged calls UnitOfWork->addToIdentityMap(). The identityMap does keep a reference to the object, but only for the first object with the same id. This means that for most of the CmsUser Proxy objects, no reference is kept from within doctrine, so if I break the last reference in my code by calling $article->setAuthor($user), the proxy object may be garbage collected, but UnitOfWork->entityStates and UnitOfWork->entityIdentifiers still contain the spl_object_hash of the now non existent object. If I now try to create a new CmsArticle, php may use the same memory location, generating the same object hash. In this case that causes doctrine to think it's already managing the entity, and not persisting it.
Author
Owner

@sandvige commented on GitHub (Sep 6, 2018):

We're also reproducing the issue. From our point of view, it appears to be because entities are kept into $identityMap using a strategy that is different than others maps ($entityIdentifiers, $originalEntityData, etc. which are oid based). When an entity is created using the same ids as a previous one, the new entity is tracked by maps oid-based, but it's NOT tracked by the $identityMap (because the check is based on keys, not on oid) . If the previous one is garbage collected (this is eventually happening anytime), the $identityMap will be confused when it comes to find the associated entity, resulting into a problematic skip. We're about to provide a fix, if someone can validate the way we're understanding the issue, this could save us time before writing code. We're about to create something like an IdentityMapManager that will track ALL entities:

something like that in addToIdentityMap(): $this->identityMap[$className][$idHash][$oid] = $entity;

@sandvige commented on GitHub (Sep 6, 2018): We're also reproducing the issue. From our point of view, it appears to be because entities are kept into $identityMap using a strategy that is different than others maps ($entityIdentifiers, $originalEntityData, etc. which are oid based). When an entity is created using the same ids as a previous one, the new entity is tracked by maps oid-based, but it's NOT tracked by the $identityMap (because the check is based on keys, not on oid) . If the previous one is garbage collected (this is eventually happening anytime), the $identityMap will be confused when it comes to find the associated entity, resulting into a problematic skip. We're about to provide a fix, if someone can validate the way we're understanding the issue, this could save us time before writing code. We're about to create something like an IdentityMapManager that will track ALL entities: something like that in addToIdentityMap(): $this->identityMap[$className][$idHash][$oid] = $entity;
Author
Owner

@petervf commented on GitHub (Sep 6, 2018):

@sandvige We worked around it by implementing an EntityMangerDecorator that keeps a reference to every entity that passes through it; only releasing it when clear() is called. I'd guess that ends up being functionally the same as your suggestion.

Keeping a reference to all entities in this way is far from ideal but it's the best we could do.

An anonimized version of our implementation is:

class OurEntityManager extends EntityManagerDecorator {

    /** @var array */
    private $references = [];

    public function __construct(EntityManagerInterface $wrapped) {
        parent::__construct($wrapped);
    }

    public function clear($objectName = null) {
        if($objectName === null) {
            $this->references = [];
        }

        parent::clear($objectName);
    }

    public function detach($object){
        unset($this->references[spl_object_hash($object)]);

        parent::detach($object);
    }

    public function remove($object) {
        $this->references[spl_object_hash($object)] = $object;
        parent::remove($object);
    }

    public function persist($object) {
        $this->references[spl_object_hash($object)] = $object;
        parent::persist($object);
    }

    public function merge($object) {
        $newObject = parent::merge($object);
        $this->references[spl_object_hash($newObject)] = $newObject;

        foreach($this->getClassProperties($newObject) as $property) {
            $value = $property->getValue($newObject);
            if(is_object($value) && !($value instanceof \DateTime)) {
                $this->references[spl_object_hash($value)] = $value;
            }
        }

        return $newObject;
    }

    private $classPropertiesCache = [];
    private function getClassProperties($entity){
        $class = get_class($entity);

        unset($this->classPropertiesCache[$class]);
        if(!isset($this->classPropertiesCache[$class])) {
            $arr = [];
            $this->__getClassproperties(new \ReflectionClass($entity), $arr);
            $this->classPropertiesCache[$class] = $arr;
        }

        return $this->classPropertiesCache[$class];
    }

    private function __getClassproperties(\ReflectionClass $reflectionClass, &$results) {
        $metaData = null;
        try {
            $metaData = $this->wrapped->getMetadataFactory()->getMetadataFor($reflectionClass->getName());
        }catch(MappingException $mex) {
            // Skip when a subclass is not a valid entity
        }

        if($metaData !== null) {
            foreach($reflectionClass->getProperties() as $property) {
                $propertyName = $property->getName();

                if(isset($metaData->associationMappings[$propertyName])) {
                    if(isset($metaData->associationMappings[$propertyName])) {
                        $property->setAccessible(true);
                        $results[] = $property;
                    }
                }
            }
        }

        if($reflectionClass->getParentClass() !== false) {
            $this->__getClassProperties($reflectionClass->getParentClass(), $results);
        }

        return $results;
    }
}
@petervf commented on GitHub (Sep 6, 2018): @sandvige We worked around it by implementing an EntityMangerDecorator that keeps a reference to every entity that passes through it; only releasing it when clear() is called. I'd guess that ends up being functionally the same as your suggestion. Keeping a reference to all entities in this way is far from ideal but it's the best we could do. An anonimized version of our implementation is: ```php class OurEntityManager extends EntityManagerDecorator { /** @var array */ private $references = []; public function __construct(EntityManagerInterface $wrapped) { parent::__construct($wrapped); } public function clear($objectName = null) { if($objectName === null) { $this->references = []; } parent::clear($objectName); } public function detach($object){ unset($this->references[spl_object_hash($object)]); parent::detach($object); } public function remove($object) { $this->references[spl_object_hash($object)] = $object; parent::remove($object); } public function persist($object) { $this->references[spl_object_hash($object)] = $object; parent::persist($object); } public function merge($object) { $newObject = parent::merge($object); $this->references[spl_object_hash($newObject)] = $newObject; foreach($this->getClassProperties($newObject) as $property) { $value = $property->getValue($newObject); if(is_object($value) && !($value instanceof \DateTime)) { $this->references[spl_object_hash($value)] = $value; } } return $newObject; } private $classPropertiesCache = []; private function getClassProperties($entity){ $class = get_class($entity); unset($this->classPropertiesCache[$class]); if(!isset($this->classPropertiesCache[$class])) { $arr = []; $this->__getClassproperties(new \ReflectionClass($entity), $arr); $this->classPropertiesCache[$class] = $arr; } return $this->classPropertiesCache[$class]; } private function __getClassproperties(\ReflectionClass $reflectionClass, &$results) { $metaData = null; try { $metaData = $this->wrapped->getMetadataFactory()->getMetadataFor($reflectionClass->getName()); }catch(MappingException $mex) { // Skip when a subclass is not a valid entity } if($metaData !== null) { foreach($reflectionClass->getProperties() as $property) { $propertyName = $property->getName(); if(isset($metaData->associationMappings[$propertyName])) { if(isset($metaData->associationMappings[$propertyName])) { $property->setAccessible(true); $results[] = $property; } } } } if($reflectionClass->getParentClass() !== false) { $this->__getClassProperties($reflectionClass->getParentClass(), $results); } return $results; } } ```
Author
Owner

@Ocramius commented on GitHub (Sep 6, 2018):

If I understand the problem correctly, we are getting an entity garbage collected while we still have a reference to its OID: does the problem present itself also on master?

@Ocramius commented on GitHub (Sep 6, 2018): If I understand the problem correctly, we are getting an entity garbage collected while we still have a reference to its OID: does the problem present itself also on `master`?
Author
Owner

@sandvige commented on GitHub (Sep 6, 2018):

@petervf Thanks for the code! Indeed, this is also a valid approach. @Ocramius As long as we're only keeping one entity here: https://github.com/doctrine/doctrine2/blob/master/lib/Doctrine/ORM/UnitOfWork.php#L1355

By the way, the function returns false when this is happening, but every piece of code using this method is giving approximately 0 sh*ts :D

@sandvige commented on GitHub (Sep 6, 2018): @petervf Thanks for the code! Indeed, this is also a valid approach. @Ocramius As long as we're only keeping one entity here: https://github.com/doctrine/doctrine2/blob/master/lib/Doctrine/ORM/UnitOfWork.php#L1355 By the way, the function returns false when this is happening, but every piece of code using this method is giving approximately 0 sh*ts :D
Author
Owner

@sandvige commented on GitHub (Sep 24, 2018):

@Ocramius I added code using the provided unit tests (this is failing without a fix), and I also added a dumb fix showing unit tests going green. I will work on it in the next day, I just wanted to bump this issue as I'll provide a way to handle the issue by keeping an array that maps entities with oids (the missing part I was explaining a few comments above)

@sandvige commented on GitHub (Sep 24, 2018): @Ocramius I added code using the provided unit tests (this is failing without a fix), and I also added a dumb fix showing unit tests going green. I will work on it in the next day, I just wanted to bump this issue as I'll provide a way to handle the issue by keeping an array that maps entities with oids (the missing part I was explaining a few comments above)
Author
Owner

@carlospauluk commented on GitHub (Oct 11, 2018):

Really bad choice to use spl_object_hash(), since it is not reliable when dealing with too much garbage collection. It just doesnt provide an unique representation of all object, forcing us to have to call $entityManager->clear() all the time.

@carlospauluk commented on GitHub (Oct 11, 2018): Really bad choice to use spl_object_hash(), since it is not reliable when dealing with too much garbage collection. It just doesnt provide an unique representation of all object, forcing us to have to call $entityManager->clear() all the time.
Author
Owner

@Ocramius commented on GitHub (Oct 15, 2018):

@carlospauluk there is no other concept of object identity in php-src. As long as objects are not GC'd, the identifiers are also not reused, hence the bug must be in a retained identifier in the UoW.

From the tests in https://github.com/doctrine/doctrine2/pull/7407, it seems like the bug is when merging entities, where an object is GC'd, while the identity map is not correctly cleaned.

If you want to help out, please try and debug the scenario to find the exact location of the retained identifier. Regardless if spl_object_hash() or spl_object_id() are used, that's our only way to retain object identity in a map in this programming language.

@Ocramius commented on GitHub (Oct 15, 2018): @carlospauluk there is no other concept of object identity in php-src. As long as objects are not GC'd, the identifiers are also not reused, hence the bug must be in a retained identifier in the UoW. From the tests in https://github.com/doctrine/doctrine2/pull/7407, it seems like the bug is when merging entities, where an object is GC'd, while the identity map is not correctly cleaned. If you want to help out, please try and debug the scenario to find the exact location of the retained identifier. Regardless if `spl_object_hash()` or `spl_object_id()` are used, that's our only way to retain object identity in a map in this programming language.
Author
Owner

@sandvige commented on GitHub (Dec 24, 2018):

@Ocramius & @carlospauluk spl_object_hash() and spl_object_id() are ok if we manage to keep a reference to the associated objects (which is exactly what we do in the provided fix, with the oidMap attribute). If the bug is correctly understood, why don't we merge it? Because of the tests? We cannot provide an exact test, because the GC is running across ALL the running tests, making it impossible to predict its behavior. This is why the provided unit test is trying to "saturate" the memory to maximize our chance for the GC to run.

@sandvige commented on GitHub (Dec 24, 2018): @Ocramius & @carlospauluk spl_object_hash() and spl_object_id() are ok if we manage to keep a reference to the associated objects (which is exactly what we do in the provided fix, with the oidMap attribute). If the bug is correctly understood, why don't we merge it? Because of the tests? We cannot provide an exact test, because the GC is running across ALL the running tests, making it impossible to predict its behavior. This is why the provided unit test is trying to "saturate" the memory to maximize our chance for the GC to run.
Author
Owner

@Ocramius commented on GitHub (Dec 27, 2018):

why don't we merge it? Because of the tests?

Precisely. No test => no regression verification => no way to check if we don't break this accidentally further down the line.

Let me be very clear that i do not merge nor approve patches that do not have automated verification. There are exclusions to this, but this isn't the case.

the GC is running across ALL the running tests

Then let's use @runInSeparateProcess, or a separate CLI script that is only there to verify this specific scenario, but it is required to run in CI (with low -dmemory_limit=).

No reliable automation => no merge.

@Ocramius commented on GitHub (Dec 27, 2018): > why don't we merge it? Because of the tests? Precisely. No test => no regression verification => no way to check if we don't break this accidentally further down the line. Let me be very clear that **i do not merge nor approve patches that do not have automated verification**. There are exclusions to this, but this isn't the case. > the GC is running across ALL the running tests Then let's use `@runInSeparateProcess`, or a separate CLI script that is only there to verify this specific scenario, but it is required to run in CI (with low `-dmemory_limit=`). No reliable automation => no merge.
Author
Owner

@carlospauluk commented on GitHub (Jan 2, 2019):

If I'm getting it right, we can see this divergence when we look at the functions isInIdentityMap () and getEntityState () in the UnitOfWork.php.
I'm really beginning with the doctrine internals code, but I see that getEntityState () relies only on the oid = spl_object_hash ( entity) to check the state of an entity, while
isInIdentityMap () relies on the spl_object_hash plus the "$ classMetadata-> rootEntityName".
This is exactly the problem that I'm getting here with my code.
Indeed @Ocramius, after I made a bunch of merges with detached entities, someway the GC seems to rip off the references just after the finish of my "mergeAll" function, and then when I try to persist a fresh new entity (a few functions later in the same thread), the getEntityState () gets "confused" and returns that this new entity is already persisted and does not schedule the INSERT.

Why do not use the double check (rootEntityName + spl_object_hash) to ensure that we're talking about the same object instead of only using the weak spl_object_hash () logic?

@carlospauluk commented on GitHub (Jan 2, 2019): If I'm getting it right, we can see this divergence when we look at the functions isInIdentityMap () and getEntityState () in the UnitOfWork.php. I'm really beginning with the doctrine internals code, but I see that getEntityState () relies only on the $ oid = spl_object_hash ($ entity) to check the state of an entity, while isInIdentityMap () relies on the spl_object_hash plus the "$ classMetadata-> rootEntityName". This is exactly the problem that I'm getting here with my code. Indeed @Ocramius, after I made a bunch of merges with detached entities, someway the GC seems to rip off the references just after the finish of my "mergeAll" function, and then when I try to persist a fresh new entity (a few functions later in the same thread), the getEntityState () gets "confused" and returns that this new entity is already persisted and does not schedule the INSERT. Why do not use the double check (rootEntityName + spl_object_hash) to ensure that we're talking about the same object instead of only using the weak spl_object_hash () logic?
Author
Owner

@sandvige commented on GitHub (Feb 8, 2019):

@Ocramius You're right, you're the man in charge of quality, you're doing it right. But I'm providing tests. What is wrong with the tests I'm providing except the resources its using? Finally, your last comment is just saying "everything is good". I was just arguing why we need to use 1000+ entities to reproduce, and why we cannot reproduce with only 2. What is wrong?

@sandvige commented on GitHub (Feb 8, 2019): @Ocramius You're right, you're the man in charge of quality, you're doing it right. But I'm providing tests. What is wrong with the tests I'm providing except the resources its using? Finally, your last comment is just saying "everything is good". I was just arguing why we need to use 1000+ entities to reproduce, and why we cannot reproduce with only 2. What is wrong?
Author
Owner

@Ocramius commented on GitHub (Feb 8, 2019):

@sandvige https://github.com/doctrine/orm/pull/7407/files#discussion_r220486572

@Ocramius commented on GitHub (Feb 8, 2019): @sandvige https://github.com/doctrine/orm/pull/7407/files#discussion_r220486572
Author
Owner

@sandvige commented on GitHub (Feb 8, 2019):

@Ocramius Oh! Ok, I will remove this fix from this PR, and add it to another with the "Missing Tests" flag. Thanks for your input

@sandvige commented on GitHub (Feb 8, 2019): @Ocramius Oh! Ok, I will remove this fix from this PR, and add it to another with the "Missing Tests" flag. Thanks for your input
Author
Owner

@sandvige commented on GitHub (Feb 11, 2019):

@Ocramius I updated the PR :)

@sandvige commented on GitHub (Feb 11, 2019): @Ocramius I updated the PR :)
Author
Owner

@codler commented on GitHub (Mar 4, 2020):

I hope this will be fixed soon!

@codler commented on GitHub (Mar 4, 2020): I hope this will be fixed soon!
Author
Owner

@NicklasWallgren commented on GitHub (Mar 5, 2020):

My temporary fix until this issue has been resolved.

interface UniqueIdentifier
{

    /**
     * Returns a unique entity id.
     *
     * @return string
     */
    public function uniqueId(): string;

}
    /**
     * Calculates a unique id.
     *
     * @param mixed $entity
     * @return string
     */
    public static function calculateUniqueId($entity)
    {
        if ($entity instanceof UniqueIdentifier) {
            return $entity->uniqueId();
        }
        
        // Risk of collisions
        return \spl_object_hash($entity);
    }

abstract class Entity implements UniqueIdentifier {

    /**
     * @var string
     */
    private $uniqueId;

    /**
     * @inheritDoc
     */
    public function uniqueId(): string
    {
        if ($this->uniqueId === null) {
            $this->uniqueId = uniqid();
        }

        return $this->uniqueId;
    }

}

@NicklasWallgren commented on GitHub (Mar 5, 2020): My temporary fix until this issue has been resolved. ``` interface UniqueIdentifier { /** * Returns a unique entity id. * * @return string */ public function uniqueId(): string; } ``` ``` /** * Calculates a unique id. * * @param mixed $entity * @return string */ public static function calculateUniqueId($entity) { if ($entity instanceof UniqueIdentifier) { return $entity->uniqueId(); } // Risk of collisions return \spl_object_hash($entity); } ```` ``` abstract class Entity implements UniqueIdentifier { /** * @var string */ private $uniqueId; /** * @inheritDoc */ public function uniqueId(): string { if ($this->uniqueId === null) { $this->uniqueId = uniqid(); } return $this->uniqueId; } }
Author
Owner

@Matt-PMCT commented on GitHub (Oct 26, 2020):

Got bit by this today, couldn't figure out why an entity wasn't persisting/flushing, hash collision...

@Matt-PMCT commented on GitHub (Oct 26, 2020): Got bit by this today, couldn't figure out why an entity wasn't persisting/flushing, hash collision...
Author
Owner

@Hikariii commented on GitHub (Nov 12, 2020):

As a solution a weakmap can be used (php 8).
See https://wiki.php.net/rfc/weak_maps for more details and how it relates to spl_object_hash.

@Hikariii commented on GitHub (Nov 12, 2020): As a solution a weakmap can be used (php 8). See https://wiki.php.net/rfc/weak_maps for more details and how it relates to spl_object_hash.
Author
Owner

@bigfoot90 commented on GitHub (Aug 27, 2021):

I'm also getting this bug. Are there any progress on this?

@bigfoot90 commented on GitHub (Aug 27, 2021): I'm also getting this bug. Are there any progress on this?
Author
Owner

@greg0ire commented on GitHub (Aug 27, 2021):

Maybe… in https://github.com/doctrine/orm/pull/8837, I migrated Doctrine to the new spl_object_id(), which might not have any collisions. It's not released yet, but maybe you can try?

@greg0ire commented on GitHub (Aug 27, 2021): Maybe… in https://github.com/doctrine/orm/pull/8837, I migrated Doctrine to the new `spl_object_id()`, which might not have any collisions. It's not released yet, but maybe you can try?
Author
Owner

@NicklasWallgren commented on GitHub (Aug 27, 2021):

@greg0ire The same issue applies to spl_object_id() as far as I understand.

"The object id is unique for the lifetime of the object. Once the object is destroyed, its id may be reused for other objects".
https://www.php.net/manual/en/function.spl-object-id.php

@NicklasWallgren commented on GitHub (Aug 27, 2021): @greg0ire The same issue applies to `spl_object_id()` as far as I understand. "The object id is unique for the lifetime of the object. Once the object is destroyed, its id may be **reused** for other objects". https://www.php.net/manual/en/function.spl-object-id.php
Author
Owner

@bigfoot90 commented on GitHub (Aug 27, 2021):

I've tried branch 2.10.x too and it's not working

@bigfoot90 commented on GitHub (Aug 27, 2021): I've tried branch `2.10.x` too and it's not working
Author
Owner

@greg0ire commented on GitHub (Aug 27, 2021):

Oh right, that's unfortunate! Please spl_object_id() is already mentioned above, I need to re-read the whole thread to fully understand where we're at.

@greg0ire commented on GitHub (Aug 27, 2021): Oh right, that's unfortunate! Please `spl_object_id()` is already mentioned above, I need to re-read the whole thread to fully understand where we're at.
Author
Owner

@Matt-PMCT commented on GitHub (Apr 18, 2023):

It seems tests are written, a pull is made at #8994 , is there anything an average user can do to push this issue forward? I'm still encountering random objects in my databases not being saved because they get eaten by the hash collisions...

@Matt-PMCT commented on GitHub (Apr 18, 2023): It seems tests are written, a pull is made at #8994 , is there anything an average user can do to push this issue forward? I'm still encountering random objects in my databases not being saved because they get eaten by the hash collisions...
Author
Owner

@greg0ire commented on GitHub (Apr 18, 2023):

@Matt-PMCT if you are using PHP 8, maybe you could experiment with weak maps, as suggested in https://github.com/doctrine/orm/issues/3037#issuecomment-726360458 ?

I think switching to those would fix the bug without requiring a test for the bug (which is the part that seems to be the issue), because switching to weak maps can also be seen as a performance improvement: if the object has been garbage collected, there is no need to keep its identifier in memory, right? If it does fix the issue, then we can:

  • use weak maps on 3.x (in fact, we probably should, regardless)
  • think of a way to use them on 2.x as well (by moving the bits that change in 2 different traits, and pick a trait or the other based on the current version of PHP)
@greg0ire commented on GitHub (Apr 18, 2023): @Matt-PMCT if you are using PHP 8, maybe you could experiment with weak maps, as suggested in https://github.com/doctrine/orm/issues/3037#issuecomment-726360458 ? I think switching to those would fix the bug without requiring a test for the bug (which is the part that seems to be the issue), because switching to weak maps can also be seen as a performance improvement: if the object has been garbage collected, there is no need to keep its identifier in memory, right? If it does fix the issue, then we can: - use weak maps on 3.x (in fact, we probably should, regardless) - think of a way to use them on 2.x as well (by moving the bits that change in 2 different traits, and pick a trait or the other based on the current version of PHP)
Author
Owner

@Matt-PMCT commented on GitHub (May 10, 2023):

@greg0ire I got a chance to look at this today and looked at the comment you linked. Unfortunately, the comment doesn't provide enough information for me to do any sort of educated implementation. spl_object_id is used 59 times in the UnitOfWork file, and after reading the PHP RFC and the final documentation (https://www.php.net/manual/en/class.weakmap.php) I do not fully understand how this would be implemented.

My first attempt was to transition just one of the arrays to a WeakMap, so I tried modify the UnitOfWork such that the entityInsertions array was converted to use a WeakMap. But what I ran into was that there are several functions that do things like array merges, diffs, and such, so I could not just convert the array to a Weak Map.

I think I am not understanding how it is envisioned that WeakMaps may be used here (also admittedly this is my first ever foray into them). I'm still willing to try, but I need a little guidance. Even the roughest description of how someone would envision this occurring.

My current project has a deployment script that can reliably trigger this hash collision, but it is not easily distilled into a unit test since it is a conglomeration of many different entities spanning my entire project.

@Matt-PMCT commented on GitHub (May 10, 2023): @greg0ire I got a chance to look at this today and looked at the comment you linked. Unfortunately, the comment doesn't provide enough information for me to do any sort of educated implementation. spl_object_id is used 59 times in the UnitOfWork file, and after reading the PHP RFC and the final documentation (https://www.php.net/manual/en/class.weakmap.php) I do not fully understand how this would be implemented. My first attempt was to transition just one of the arrays to a WeakMap, so I tried modify the UnitOfWork such that the entityInsertions array was converted to use a WeakMap. But what I ran into was that there are several functions that do things like array merges, diffs, and such, so I could not just convert the array to a Weak Map. I think I am not understanding how it is envisioned that WeakMaps may be used here (also admittedly this is my first ever foray into them). I'm still willing to try, but I need a little guidance. Even the roughest description of how someone would envision this occurring. My current project has a deployment script that can reliably trigger this hash collision, but it is not easily distilled into a unit test since it is a conglomeration of many different entities spanning my entire project.
Author
Owner

@greg0ire commented on GitHub (May 11, 2023):

@Matt-PMCT thanks for looking into this!

so I tried modify the UnitOfWork such that the entityInsertions array was converted to use a WeakMap

Great! I trust you used true or maybe even null as the value, and the entities as the key right?

But what I ran into was that there are several functions that do things like array merges, diffs, and such, so I could not just convert the array to a Weak Map.

🤔 any specific example? I mean if I had to merge 2 weak maps, I would iterate over one of them and add every element it has to the other, so I don't think array_merge would be a blocker. For a diff, it's more complicated I suppose, but hopefully it's feasible. It would be helpful if you could use github range permalinks to point us to the offending code .

I think I am not understanding how it is envisioned that WeakMaps may be used here (also admittedly this is my first ever foray into them). I'm still willing to try, but I need a little guidance. Even the roughest description of how someone would envision this occurring.

The point of weakmaps is that you can use objects as keys directly. So to answer your question, we would use them as a replacement for all arrays currently keyed with spl_object_id(). You seem to have understood this already so I think I must have misunderstood your question.

@greg0ire commented on GitHub (May 11, 2023): @Matt-PMCT thanks for looking into this! > so I tried modify the UnitOfWork such that the entityInsertions array was converted to use a WeakMap Great! I trust you used `true` or maybe even `null` as the value, and the entities as the key right? > But what I ran into was that there are several functions that do things like array merges, diffs, and such, so I could not just convert the array to a Weak Map. :thinking: any specific example? I mean if I had to merge 2 weak maps, I would iterate over one of them and add every element it has to the other, so I don't think `array_merge` would be a blocker. For a diff, it's more complicated I suppose, but hopefully it's feasible. It would be helpful if you could use [github range permalinks](https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/creating-a-permanent-link-to-a-code-snippet) to point us to the offending code . > I think I am not understanding how it is envisioned that WeakMaps may be used here (also admittedly this is my first ever foray into them). I'm still willing to try, but I need a little guidance. Even the roughest description of how someone would envision this occurring. The point of weakmaps is that you can use objects as keys directly. So to answer your question, we would use them as a replacement for all arrays currently keyed with `spl_object_id()`. You seem to have understood this already so I think I must have misunderstood your question.
Author
Owner

@mpdude commented on GitHub (Jun 22, 2023):

@Matt-PMCT are you still able to run the deployment script that triggers the issue?

  1. Does it make use of detach/merge for entities?
  2. Could you run it with the code from #10785 and report if that throws an exception somewhere?
@mpdude commented on GitHub (Jun 22, 2023): @Matt-PMCT are you still able to run the deployment script that triggers the issue? 1. Does it make use of detach/merge for entities? 2. Could you run it with the code from #10785 and report if that throws an exception somewhere?
Author
Owner

@mpdude commented on GitHub (Jun 23, 2023):

@tomaszmadeyski did not provide enough code to reproduce in https://github.com/doctrine/orm/issues/3037#issuecomment-180062249. It might be that they're using duplicate application-provided IDs, causing this problem down the road.

@sandvige in https://github.com/doctrine/orm/issues/3037#issuecomment-419038351 said:

When an entity is created using the same ids as a previous one, the new entity is tracked by maps oid-based, but it's NOT tracked by the $identityMap (because the check is based on keys, not on oid)

IMHO, that is not valid – you must not create entities with the same ID as another entity used previously. The EM/UoW should reject that, at least as long as it still knows about the previous instance. (Otherwise, it will probably fail when trying to insert into the database.)

#10785 is a WIP that will reject entity instances using the same ID as another instance already in memory.

This leaves us with https://github.com/doctrine/orm/issues/3037#issuecomment-369211489 as the only piece of reproduce code provided, and that code also ended up in #7404 and #8994. It still fails on 2.15. I think that's a bug in merge(), which is creating proxy instances without checking the identity map first. I'll see if I can follow up on that.

@Matt-PMCT might provide another lead/edge case

@mpdude commented on GitHub (Jun 23, 2023): @tomaszmadeyski did not provide enough code to reproduce in https://github.com/doctrine/orm/issues/3037#issuecomment-180062249. It might be that they're using duplicate application-provided IDs, causing this problem down the road. @sandvige in https://github.com/doctrine/orm/issues/3037#issuecomment-419038351 said: > When an entity is created using the same ids as a previous one, the new entity is tracked by maps oid-based, but it's NOT tracked by the $identityMap (because the check is based on keys, not on oid) IMHO, that is not valid – you must not create entities with the same ID as another entity used previously. The EM/UoW should reject that, at least as long as it still knows about the previous instance. (Otherwise, it will probably fail when trying to insert into the database.) #10785 is a WIP that will reject entity instances using the same ID as another instance already in memory. This leaves us with https://github.com/doctrine/orm/issues/3037#issuecomment-369211489 as the only piece of reproduce code provided, and that code also ended up in #7404 and #8994. It still fails on 2.15. I think that's a bug in `merge()`, which is creating proxy instances without checking the identity map first. I'll see if I can follow up on that. @Matt-PMCT might provide another lead/edge case
Author
Owner

@flack commented on GitHub (Jun 23, 2023):

I don't know if it helps, but I did some digging in some tests in my unittest suite which trigger this problem. The code is a bit convoluted, but what happens is more or less this:

1: Entity A persisted & detached
2: Entity B which has an association to A persisted, detached, merged and deleted
3: Entity C which has an association to A persisted, detached, merged and deleted
4: Try to persist a new entity D ==> spl_object_id collision. The new object (created with new, no identifier set) gets the same spl_object_id as the association proxy for the A object in step 3

I don't know why both step 2 and 3 are necessary, but if I remove either of them, the problem doesn't occur. The proxy which uses the same spl_object_id as the object in step 4 is created here:

https://github.com/doctrine/orm/blob/2.15.x/lib/Doctrine/ORM/UnitOfWork.php#L3675

before the call to registerManaged the spl_object_id of $other is not present in UnitOfWork internal variables. Afterwards it appears in three places:

  • entityStates: MANAGED
  • entityIdentifiers: ['id' => Entity A]
  • originalEntityData: []
@flack commented on GitHub (Jun 23, 2023): I don't know if it helps, but I did some digging in some tests in my unittest suite which trigger this problem. The code is a bit convoluted, but what happens is more or less this: 1: Entity A persisted & detached 2: Entity B which has an association to A persisted, detached, merged and deleted 3: Entity C which has an association to A persisted, detached, merged and deleted 4: Try to persist a new entity D ==> `spl_object_id` collision. The new object (created with `new`, no identifier set) gets the same `spl_object_id` as the association proxy for the A object in step 3 I don't know why both step 2 and 3 are necessary, but if I remove either of them, the problem doesn't occur. The proxy which uses the same spl_object_id as the object in step 4 is created here: https://github.com/doctrine/orm/blob/2.15.x/lib/Doctrine/ORM/UnitOfWork.php#L3675 before the call to `registerManaged` the spl_object_id of `$other` is not present in `UnitOfWork` internal variables. Afterwards it appears in three places: - entityStates: MANAGED - entityIdentifiers: ['id' => Entity A] - originalEntityData: []
Author
Owner

@flack commented on GitHub (Jun 23, 2023):

If I apply this patch:

diff --git a/lib/Doctrine/ORM/UnitOfWork.php b/lib/Doctrine/ORM/UnitOfWork.php
index ebb5f12a0..42a376b00 100644
--- a/lib/Doctrine/ORM/UnitOfWork.php
+++ b/lib/Doctrine/ORM/UnitOfWork.php
@@ -3672,11 +3672,7 @@ class UnitOfWork implements PropertyChangedListener
                                 if ($targetClass->subClasses) {
                                     $other = $this->em->find($targetClass->name, $relatedId);
                                 } else {
-                                    $other = $this->em->getProxyFactory()->getProxy(
-                                        $assoc2['targetEntity'],
-                                        $relatedId
-                                    );
-                                    $this->registerManaged($other, $relatedId, []);
+                                    $other = $this->em->getReference($targetClass->name, $relatedId);
                                 }
                             }

The amount of collisions in my test suite goes down from 27 to 1 (and memory usage increases from 100.5 MB to 102.5 MB). All my tests pass, but I haven't tried running the doctrine testsuite with the change.

@flack commented on GitHub (Jun 23, 2023): If I apply this patch: ```diff diff --git a/lib/Doctrine/ORM/UnitOfWork.php b/lib/Doctrine/ORM/UnitOfWork.php index ebb5f12a0..42a376b00 100644 --- a/lib/Doctrine/ORM/UnitOfWork.php +++ b/lib/Doctrine/ORM/UnitOfWork.php @@ -3672,11 +3672,7 @@ class UnitOfWork implements PropertyChangedListener if ($targetClass->subClasses) { $other = $this->em->find($targetClass->name, $relatedId); } else { - $other = $this->em->getProxyFactory()->getProxy( - $assoc2['targetEntity'], - $relatedId - ); - $this->registerManaged($other, $relatedId, []); + $other = $this->em->getReference($targetClass->name, $relatedId); } } ``` The amount of collisions in my test suite goes down from 27 to 1 (and memory usage increases from 100.5 MB to 102.5 MB). All my tests pass, but I haven't tried running the doctrine testsuite with the change.
Author
Owner

@mpdude commented on GitHub (Jun 23, 2023):

@flack please try the change from #10791

@mpdude commented on GitHub (Jun 23, 2023): @flack please try the change from #10791
Author
Owner

@mpdude commented on GitHub (Jun 23, 2023):

@flack regarding this...

1: Entity A persisted & detached
2: Entity B which has an association to A persisted, detached, merged and deleted
3: Entity C which has an association to A persisted, detached, merged and deleted
4: Try to persist a new entity D ==> spl_object_id collision. The new object (created with new, no identifier set) gets the same spl_object_id as the association proxy for the A object in step 3

In order to get the ID collisions, you (+ the code you're passing through) needs to create the right amount of objects and/or trigger GC.

But, from a more qualitative point of view, what is important is that during all those operations,

  • we should not simply replace ::$entityMap entries with different objects
  • during all your back-and-forth operations, there should be ::$entityIdentifiers and ::$entityStates entries exactly for those objects that are in the ::$identityMap at that point of time.

So, from your description, I'd say 1) + 2) should suffice to get the inconsistency somewhere along the way.

  1. seems to just repeat 2), and 4) is what causes the final collision. But I'd suspect the inconsistency happens with 1) or 2) already.

Do you think you could write a test similar to #10791?

@mpdude commented on GitHub (Jun 23, 2023): @flack regarding this... > 1: Entity A persisted & detached > 2: Entity B which has an association to A persisted, detached, merged and deleted > 3: Entity C which has an association to A persisted, detached, merged and deleted > 4: Try to persist a new entity D ==> `spl_object_id` collision. The new object (created with `new`, no identifier set) gets the same `spl_object_id` as the association proxy for the A object in step 3 In order to get the ID collisions, you (+ the code you're passing through) needs to create the right amount of objects and/or trigger GC. But, from a more qualitative point of view, what is important is that during all those operations, * we should not simply replace `::$entityMap` entries with different objects * during all your back-and-forth operations, there should be `::$entityIdentifiers` and `::$entityStates` entries exactly for those objects that are in the `::$identityMap` at that point of time. So, from your description, I'd say 1) + 2) should suffice to get the inconsistency somewhere along the way. 3) seems to just repeat 2), and 4) is what causes the final collision. But I'd suspect the inconsistency happens with 1) or 2) already. Do you think you could write a test similar to #10791?
Author
Owner

@flack commented on GitHub (Jun 23, 2023):

@flack please try the change from #10791

thanks, I gave it a try and it has (in my setup) the same effect as the patch I posted above, i.e. collisions go down from 27 to 1 and memory usage rises slightly. All tests pass, so that's good! :-)

I also noticed that the change fixes the problem I described here: https://github.com/doctrine/orm/pull/10785#issuecomment-1604098742

@flack commented on GitHub (Jun 23, 2023): > @flack please try the change from #10791 thanks, I gave it a try and it has (in my setup) the same effect as the patch I posted above, i.e. collisions go down from 27 to 1 and memory usage rises slightly. All tests pass, so that's good! :-) I also noticed that the change fixes the problem I described here: https://github.com/doctrine/orm/pull/10785#issuecomment-1604098742
Author
Owner

@mpdude commented on GitHub (Jun 23, 2023):

So let's hunt your last one collision :-)?

Can you try do make consistency checks as described in https://github.com/doctrine/orm/issues/3037#issuecomment-1604698758?

@mpdude commented on GitHub (Jun 23, 2023): So let's hunt your last one collision :-)? Can you try do make consistency checks as described in https://github.com/doctrine/orm/issues/3037#issuecomment-1604698758?
Author
Owner

@flack commented on GitHub (Jun 23, 2023):

Do you think you could write a test similar to #10791?

I mean I can try, but it seems #10791 fixes this particular issue, would it still add value in that case?

@flack commented on GitHub (Jun 23, 2023): > Do you think you could write a test similar to #10791? I mean I can try, but it seems #10791 fixes this particular issue, would it still add value in that case?
Author
Owner

@flack commented on GitHub (Jun 23, 2023):

So let's hunt your last one collision :-)?

I'll see what I can do, but this one is really annoying to narrow down, it only occurs if I run my full suite of 778 tests. Might take a while :-)

Can you try do make consistency checks as described in #3037 (comment)?

So if I understand correctly you want me to check if at any point during the runtime there are discrepancies between $entityIdentifiers, $entityStates and $identityMap, and whether or not something writes to entityMap?

@flack commented on GitHub (Jun 23, 2023): > So let's hunt your last one collision :-)? I'll see what I can do, but this one is really annoying to narrow down, it only occurs if I run my full suite of 778 tests. Might take a while :-) > Can you try do make consistency checks as described in [#3037 (comment)](https://github.com/doctrine/orm/issues/3037#issuecomment-1604698758)? So if I understand correctly you want me to check if at any point during the runtime there are discrepancies between `$entityIdentifiers`, `$entityStates` and `$identityMap`, and whether or not something writes to `entityMap`?
Author
Owner

@mpdude commented on GitHub (Jun 23, 2023):

Hm, that’s probably not practical.

Does the last one trigger the exception from #10785 ?

@mpdude commented on GitHub (Jun 23, 2023): Hm, that’s probably not practical. Does the last one trigger the exception from #10785 ?
Author
Owner

@flack commented on GitHub (Jun 23, 2023):

Yes, looks like (I have to apply #10791 as well, otherwise I get dozens of #10785 exceptions). The immediate context is

1: create/persist/detach entity A
2: create/persist entity B which has an association to entity A

On 1) I get the error from addToIdentityMap (which is called from executeInserts)
on 2) I get the the OID collision before calling $em->persist

But like I said, it doesn't happen if I run just this test class in isolation, so before step 1 there are probably a couple more missing. It doesn't happen in a testcase proper, but in setUpBeforeClass, and I suspect some deletion in a tearDownAfterClass from another test class leaves some inconsistency or something.

EDIT: To be more clear, I actually changed the exception into a debug_print_backtrace, otherwise it would stop at 1)

@flack commented on GitHub (Jun 23, 2023): Yes, looks like (I have to apply #10791 as well, otherwise I get dozens of #10785 exceptions). The immediate context is 1: create/persist/detach entity A 2: create/persist entity B which has an association to entity A On 1) I get the error from `addToIdentityMap` (which is called from `executeInserts`) on 2) I get the the OID collision before calling `$em->persist` But like I said, it doesn't happen if I run just this test class in isolation, so before step 1 there are probably a couple more missing. It doesn't happen in a testcase proper, but in `setUpBeforeClass`, and I suspect some deletion in a `tearDownAfterClass` from another test class leaves some inconsistency or something. EDIT: To be more clear, I actually changed the exception into a `debug_print_backtrace`, otherwise it would stop at 1)
Author
Owner

@mpdude commented on GitHub (Jun 23, 2023):

On 1) I get the error from addToIdentityMap (which is called from executeInserts)

That is while trying to persist A for the first time?

So, your entity manager is not clean (clear()ed) at that time when the test starts?
Do you provide IDs for the A entity yourself, or are those database-provided IDs (auto increments)?
If the latter is the case, do you truncate the database so that the same IDs get re-assigned, but at the same time, you do not clear the EM so that it still has other (previous test, older) entities with particular IDs in its identity map?

@mpdude commented on GitHub (Jun 23, 2023): > On 1) I get the error from addToIdentityMap (which is called from executeInserts) That is while trying to persist A for the first time? So, your entity manager is not clean (`clear()`ed) at that time when the test starts? Do you provide IDs for the A entity yourself, or are those database-provided IDs (auto increments)? If the latter is the case, do you truncate the database so that the same IDs get re-assigned, but at the same time, you do not clear the EM so that it still has other (previous test, older) entities with particular IDs in its identity map?
Author
Owner

@mpdude commented on GitHub (Jun 23, 2023):

Here is code that would use a TRUNCATE TABLE to make two entities use the same ID, which corrupts the internal data structures on 2.15 and causes the exception to be raised with the change from #10785.

This might well be something you're doing over the course of your tests, spread out over different test cases or phases.

<?php

declare(strict_types=1);

namespace Doctrine\Tests\ORM\Functional\Ticket;

use Doctrine\ORM\Mapping as ORM;
use Doctrine\Tests\OrmFunctionalTestCase;

class IdReassignmentAfterTruncateTest extends OrmFunctionalTestCase
{
    public function testTruncatingTablesToReassignIdsIsNotSoGood(): void
    {
        $this->createSchemaForModels(SomeEntity::class);

        $entity = new SomeEntity();
        $this->_em->persist($entity);
        $this->_em->flush();

        $this->_em->getConnection()->executeStatement('TRUNCATE TABLE some_table');

        $entity = new SomeEntity();
        $this->_em->persist($entity);
        $this->_em->flush();
    }
}

/**
 * @ORM\Entity
 * @ORM\Table(name="some_table")
 */
class SomeEntity
{
    /**
     * @ORM\Id
     * @ORM\GeneratedValue()
     * @ORM\Column(type="integer")
     *
     * @var int
     */
    private $id;
}
@mpdude commented on GitHub (Jun 23, 2023): Here is code that would use a `TRUNCATE TABLE` to make two entities use the same ID, which corrupts the internal data structures on `2.15` and causes the exception to be raised with the change from #10785. This might well be something you're doing over the course of your tests, spread out over different test cases or phases. ```php <?php declare(strict_types=1); namespace Doctrine\Tests\ORM\Functional\Ticket; use Doctrine\ORM\Mapping as ORM; use Doctrine\Tests\OrmFunctionalTestCase; class IdReassignmentAfterTruncateTest extends OrmFunctionalTestCase { public function testTruncatingTablesToReassignIdsIsNotSoGood(): void { $this->createSchemaForModels(SomeEntity::class); $entity = new SomeEntity(); $this->_em->persist($entity); $this->_em->flush(); $this->_em->getConnection()->executeStatement('TRUNCATE TABLE some_table'); $entity = new SomeEntity(); $this->_em->persist($entity); $this->_em->flush(); } } /** * @ORM\Entity * @ORM\Table(name="some_table") */ class SomeEntity { /** * @ORM\Id * @ORM\GeneratedValue() * @ORM\Column(type="integer") * * @var int */ private $id; } ```
Author
Owner

@flack commented on GitHub (Jun 24, 2023):

That is while trying to persist A for the first time?

Yes, it's a fresh object

So, your entity manager is not clean (clear()ed) at that time when the test starts?

Nope. I mean I could clear it in tearDownAfterClass if I wanted to, and it does fix (or rather mitigate) the issue, but I kind of prefer running the entire suite with the same em state. Tends to find some more obscure issues.

Do you provide IDs for the A entity yourself, or are those database-provided IDs (auto increments)? If the latter is the case, do you truncate the database so that the same IDs get re-assigned, but at the same time, you do not clear the EM so that it still has other (previous test, older) entities with particular IDs in its identity map?

All IDs are auto increments, yes. There is no clearing of the em, the only cleanup between tests is to delete/detach everything that is not needed anymore.

@flack commented on GitHub (Jun 24, 2023): > That is while trying to persist A for the first time? Yes, it's a fresh object > So, your entity manager is not clean (`clear()`ed) at that time when the test starts? Nope. I mean I could clear it in `tearDownAfterClass` if I wanted to, and it does fix (or rather mitigate) the issue, but I kind of prefer running the entire suite with the same em state. Tends to find some more obscure issues. > Do you provide IDs for the A entity yourself, or are those database-provided IDs (auto increments)? If the latter is the case, do you truncate the database so that the same IDs get re-assigned, but at the same time, you do not clear the EM so that it still has other (previous test, older) entities with particular IDs in its identity map? All IDs are auto increments, yes. There is no clearing of the em, the only cleanup between tests is to delete/detach everything that is not needed anymore.
Author
Owner

@flack commented on GitHub (Jun 24, 2023):

Here is code that would use a TRUNCATE TABLE to make two entities use the same ID, which corrupts the internal data structures on 2.15 and causes the exception to be raised with the change from #10785.

This might well be something you're doing over the course of your tests, spread out over different test cases or phases.

No, I don't think so. The testsuite DB (SQLite in memory) gets setup once in the suite's bootstrap code, and after, there are no structural changes, all DB operations are done through EntityManager, there are no executeStatement calls or anything else that bypasses ORM and goes directly to DBAL (but I'll try to doublecheck to be sure)

@flack commented on GitHub (Jun 24, 2023): > Here is code that would use a `TRUNCATE TABLE` to make two entities use the same ID, which corrupts the internal data structures on `2.15` and causes the exception to be raised with the change from #10785. > > This might well be something you're doing over the course of your tests, spread out over different test cases or phases. No, I don't think so. The testsuite DB (SQLite in memory) gets setup once in the suite's bootstrap code, and after, there are no structural changes, all DB operations are done through `EntityManager`, there are no `executeStatement` calls or anything else that bypasses ORM and goes directly to DBAL (but I'll try to doublecheck to be sure)
Author
Owner

@mpdude commented on GitHub (Jun 24, 2023):

At the time this exeception is raised, two object instances of the same class exist that compete for the same @Id value. You should be able to see which class and ID that is.

At that time, where does the second object come from? How has it been created, where did it get its ID?

And, can you find where the first object was created and where it comes from? Set a breakpoint with a condition at the place where the objects are added to the identity map and watch out for that ID – so you should find the origin of the first object as well.

How come both are using the same identifier values?

@mpdude commented on GitHub (Jun 24, 2023): At the time this exeception is raised, two object instances of the same class exist that compete for the same `@Id` value. You should be able to see which class and ID that is. At that time, where does the second object come from? How has it been created, where did it get its ID? And, can you find where the first object was created and where it comes from? Set a breakpoint with a condition at the place where the objects are added to the identity map and watch out for that ID – so you should find the origin of the first object as well. How come both are using the same identifier values?
Author
Owner

@flack commented on GitHub (Jun 24, 2023):

It took a while, but I figured it out now: It turns out there was one test that constructs and entity from an XML string. The XML comes from a hardcoded test asset and it contains a value of 32 for an association field in the entity. My code instantiates a new entity and applies the values from the XML, for associations it does that by calling $entity->linkToClassB = $em->getReference($value). The entity is never persisted, but just calling getReference is enough to poison $identityMap. After that other tests run until the point where there's 32 entries in the ClassB table, and then the exception happens.

So basically it is the same situation as in the testcase in #843. You can argue that this is a user error, but (I was meaning to write that in #843 as well), I have to say that getReference is quite a footgun in that case, because it will happily give you a reference to whatever nonsense you call it with, and then cause random-looking problems way later in completely different places. At least #10785 makes it possible to spot that there is a problem, but the problem might only occur on e.g. some long-running job on a production server and can be next to impossible to reproduce in unit tests (esp. since I bet almost everyone resets state between tests)

@flack commented on GitHub (Jun 24, 2023): It took a while, but I figured it out now: It turns out there was one test that constructs and entity from an XML string. The XML comes from a hardcoded test asset and it contains a value of `32` for an association field in the entity. My code instantiates a new entity and applies the values from the XML, for associations it does that by calling `$entity->linkToClassB = $em->getReference($value)`. The entity is never persisted, but just calling `getReference` is enough to poison `$identityMap`. After that other tests run until the point where there's 32 entries in the ClassB table, and then the exception happens. So basically it is the same situation as in the testcase in #843. You can argue that this is a user error, but (I was meaning to write that in #843 as well), I have to say that `getReference` is quite a footgun in that case, because it will happily give you a reference to whatever nonsense you call it with, and then cause random-looking problems way later in completely different places. At least #10785 makes it possible to spot that there is a problem, but the problem might only occur on e.g. some long-running job on a production server and can be next to impossible to reproduce in unit tests (esp. since I bet almost everyone resets state between tests)
Author
Owner

@Matt-PMCT commented on GitHub (Jun 25, 2023):

@mpdude I had originally responded that I did not get an exception, but I was mistaken... (messed something up in my original run...) Your code does indeed throw an error message for me.

@Matt-PMCT commented on GitHub (Jun 25, 2023): @mpdude I had originally responded that I did not get an exception, but I was mistaken... (messed something up in my original run...) Your code does indeed throw an error message for me.
Author
Owner

@Matt-PMCT commented on GitHub (Jun 25, 2023):

@mpdude diving a bit deeper my problem seems to start with a class I have that extends another class.

image

In this case I have a "Role" class and then an "ApiRole" class that extends the "Role" class. Your error says I am adding an entity of the class "App\Entity\Role" with an ID of 1, but I think I am trying add an ApiRole, not a Role...
image

Now the new exception complains because my ApiRole has an ID of 1, which would collide with my Role that has an ID of 1 (different tables in the database...)

However, most interesting to me is, that it was an ApiRole that was colliding with a completely different Object later in the process...

My original ApiRole was just this:
image

If I remove the "extends" and copy all the code from the Roles Object the exception and collisions disappear. I was trying to avoid having duplicate code for two objects that essentially are the same but I wanted to store in different places. I'm not very good with how to use extends, so I'll look at that, maybe I was doing something dumb...

I think this proves that your addition is working and providing good feedback to the users 😁

@Matt-PMCT commented on GitHub (Jun 25, 2023): @mpdude diving a bit deeper my problem seems to start with a class I have that extends another class. ![image](https://github.com/doctrine/orm/assets/6832191/ef16139a-d16c-4dac-bdb4-da746c05bf37) In this case I have a "Role" class and then an "ApiRole" class that extends the "Role" class. Your error says I am adding an entity of the class "App\Entity\Role" with an ID of 1, but I think I am trying add an ApiRole, not a Role... ![image](https://github.com/doctrine/orm/assets/6832191/58643c53-c4a3-460b-8b82-87d62d8b17c4) Now the new exception complains because my ApiRole has an ID of 1, which would collide with my Role that has an ID of 1 (different tables in the database...) However, most interesting to me is, that it was an ApiRole that was colliding with a completely different Object later in the process... My original ApiRole was just this: ![image](https://github.com/doctrine/orm/assets/6832191/60da4712-319d-42d9-96c3-f89dcfcc96a6) If I remove the "extends" and copy all the code from the Roles Object the exception and collisions disappear. I was trying to avoid having duplicate code for two objects that essentially are the same but I wanted to store in different places. I'm not very good with how to use extends, so I'll look at that, maybe I was doing something dumb... I think this proves that your addition is working and providing good feedback to the users 😁
Author
Owner

@mpdude commented on GitHub (Jun 25, 2023):

@Matt-PMCT could you share the code for both classes? I only need to see the “head” where @Entity etc is written, and the properties with annotations for the id and discriminator columns.

@mpdude commented on GitHub (Jun 25, 2023): @Matt-PMCT could you share the code for both classes? I only need to see the “head” where `@Entity` etc is written, and the properties with annotations for the id and discriminator columns.
Author
Owner

@Matt-PMCT commented on GitHub (Jun 25, 2023):

@Matt-PMCT could you share the code for both classes? I only need to see the “head” where @Entity etc is written, and the properties with annotations for the id and discriminator columns.

ApiRole:

namespace App\Entity\Api;

use App\Entity\Role;
use Doctrine\ORM\Mapping as ORM;

/**
 * @ORM\Entity(repositoryClass="App\Repository\Api\ApiRoleRepository")
 */
class ApiRole extends Role
{
    private $spare;

}

Role:

namespace App\Entity;

use Doctrine\ORM\Mapping as ORM;

/**
 * @ORM\Entity
 * @ORM\Table(name="roles")
 */
class Role
{
    /**
     * @ORM\Column(type="smallint", options={"unsigned":true})
     * @ORM\Id
     * @ORM\GeneratedValue(strategy="AUTO")
     */
    private $id;
    /**
     * @ORM\Column(type="string")
     */
    private $name;
    /**
     * @ORM\Column(type="string")
     */
    private $displayName;
    /**
     * @ORM\Column(type="datetime")
     */
    protected $dateTimeStart;
    /**
     * @ORM\Column(type="datetime", nullable=true)
     */
    protected $dateTimeEnd;
@Matt-PMCT commented on GitHub (Jun 25, 2023): > @Matt-PMCT could you share the code for both classes? I only need to see the “head” where `@Entity` etc is written, and the properties with annotations for the id and discriminator columns. ApiRole: ``` namespace App\Entity\Api; use App\Entity\Role; use Doctrine\ORM\Mapping as ORM; /** * @ORM\Entity(repositoryClass="App\Repository\Api\ApiRoleRepository") */ class ApiRole extends Role { private $spare; } ``` Role: ``` namespace App\Entity; use Doctrine\ORM\Mapping as ORM; /** * @ORM\Entity * @ORM\Table(name="roles") */ class Role { /** * @ORM\Column(type="smallint", options={"unsigned":true}) * @ORM\Id * @ORM\GeneratedValue(strategy="AUTO") */ private $id; /** * @ORM\Column(type="string") */ private $name; /** * @ORM\Column(type="string") */ private $displayName; /** * @ORM\Column(type="datetime") */ protected $dateTimeStart; /** * @ORM\Column(type="datetime", nullable=true) */ protected $dateTimeEnd; ```
Author
Owner

@mpdude commented on GitHub (Jun 25, 2023):

Thanks!

Checking with https://www.doctrine-project.org/projects/doctrine-orm/en/2.15/reference/inheritance-mapping.html#entity-inheritance, you’re using entity inheritance without declaring it. That means the ORM will assume instances of both classes share the ID space and no two instances use the same id. But, in fact, both get their ID from different table auto-increments and so collisions may occur.

I am afraid the best we could do is to find out why the ORM did not reject your configuration in the first place, and I’ll take a look at that.

@mpdude commented on GitHub (Jun 25, 2023): Thanks! Checking with https://www.doctrine-project.org/projects/doctrine-orm/en/2.15/reference/inheritance-mapping.html#entity-inheritance, you’re using entity inheritance without declaring it. That means the ORM will assume instances of both classes share the ID space and no two instances use the same id. But, in fact, both get their ID from different table auto-increments and so collisions may occur. I am afraid the best we could do is to find out why the ORM did not reject your configuration in the first place, and I’ll take a look at that.
Author
Owner

@Matt-PMCT commented on GitHub (Jun 25, 2023):

@mpdude thanks a ton, I had no idea I needed to do anything special. Your code changes in #10785 directly led me to at least identify the problem. Seems like a solid improvement that I hope gets implemented, great work, and thank you!

@Matt-PMCT commented on GitHub (Jun 25, 2023): @mpdude thanks a ton, I had no idea I needed to do anything special. Your code changes in #10785 directly led me to at least identify the problem. Seems like a solid improvement that I hope gets implemented, great work, and thank you!
Author
Owner

@mpdude commented on GitHub (Jun 25, 2023):

@Matt-PMCT could you please confirm that the exception added in #10463 would have spotted your entity configuration as invalid?

It will be an exception in 3.0, and is a deprecation warning starting in 2.15.

@mpdude commented on GitHub (Jun 25, 2023): @Matt-PMCT could you please confirm that the exception added in #10463 would have spotted your entity configuration as invalid? It will be an exception in 3.0, and is a deprecation warning starting in 2.15.
Author
Owner

@Matt-PMCT commented on GitHub (Jun 25, 2023):

@mpdude confirmed #10463 does spot it

@Matt-PMCT commented on GitHub (Jun 25, 2023): @mpdude confirmed #10463 does spot it
Author
Owner

@mpdude commented on GitHub (Jun 25, 2023):

@Matt-PMCT try using a mapped superclass to hold the commonalities of both your Role classes.

@mpdude commented on GitHub (Jun 25, 2023): @Matt-PMCT try using a mapped superclass to hold the commonalities of both your `Role` classes.
Author
Owner

@mpdude commented on GitHub (Jun 25, 2023):

To summarize, we have

as actionable items to follow up on, and no other reproducible examples, leads or hints for this issue here.

@mpdude commented on GitHub (Jun 25, 2023): To summarize, we have - #10785, - #10791 and - #10797 as actionable items to follow up on, and no other reproducible examples, leads or hints for this issue here.
Author
Owner

@mpdude commented on GitHub (Jul 4, 2023):

With

being solved/merged/closed, I don't see what else we could do here to improve the situation. My suggestion is to close this issue.

@mpdude commented on GitHub (Jul 4, 2023): With - #10785, - #10791 and - #10797 being solved/merged/closed, I don't see what else we could do here to improve the situation. My suggestion is to close this issue.
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: doctrine/archived-orm#2929