Problem Description
After migration, the user_registration table contains duplicate records for some ePersons, and one record of each duplicate pair has the organization set to "Unknown".
Where and Why the Problem Occurs
During the import process, ePersons are imported by calling the endpoint clarin/import/eperson.
This endpoint creates an EPersonRest object (see https://github.com/dataquest-dev/DSpace/blob/dtq-dev/dspace-server-webapp/src/main/java/org/dspace/app/rest/repository/ClarinEPersonImportController.java#L99).
When an ePerson is created from the REST object, a corresponding user registration record is also created automatically, with the organization set to "Unknown".
However, during the migration we also import the existing user registration data, which contains more accurate information for some ePersons. This is done via the endpoint clarin/import/userregistration.
This is where the issue occurs: instead of updating the user registration already created during the ePerson import, the endpoint creates a new one. The result is two records for the same ePerson in the user_registration table.
How the Issue Was Fixed
The user registration import no longer creates a new record. It now locates the user registration that was created during the ePerson import and updates its values with the data from the old dump, which prevents duplicate entries and preserves the correct information.
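The update-instead-of-create logic can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual DSpace code: the classes UserRegistration and RegistrationStore, the lookup by ePerson ID, and the fallback that creates a record when none is found are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.UUID;

public class UserRegistrationImportSketch {

    /** Hypothetical, simplified stand-in for a row in user_registration. */
    static final class UserRegistration {
        final UUID epersonId;
        String organization;

        UserRegistration(UUID epersonId, String organization) {
            this.epersonId = epersonId;
            this.organization = organization;
        }
    }

    /** Hypothetical in-memory stand-in for the user_registration table. */
    static final class RegistrationStore {
        private final List<UserRegistration> rows = new ArrayList<>();

        Optional<UserRegistration> findByEperson(UUID epersonId) {
            return rows.stream()
                    .filter(r -> r.epersonId.equals(epersonId))
                    .findFirst();
        }

        UserRegistration create(UUID epersonId, String organization) {
            UserRegistration registration = new UserRegistration(epersonId, organization);
            rows.add(registration);
            return registration;
        }

        int count() {
            return rows.size();
        }
    }

    /** Mimics the ePerson import: a default registration is created automatically. */
    static void importEperson(RegistrationStore store, UUID epersonId) {
        store.create(epersonId, "Unknown");
    }

    /**
     * Mimics the fixed user registration import: reuse the record created
     * during the ePerson import and overwrite its values with the dump data.
     * Creating a record when none exists is an assumption of this sketch.
     */
    static UserRegistration importUserRegistration(RegistrationStore store,
                                                   UUID epersonId,
                                                   String organization) {
        Optional<UserRegistration> existing = store.findByEperson(epersonId);
        if (existing.isPresent()) {
            UserRegistration registration = existing.get();
            registration.organization = organization;
            return registration;
        }
        return store.create(epersonId, organization);
    }

    public static void main(String[] args) {
        RegistrationStore store = new RegistrationStore();
        UUID epersonId = UUID.randomUUID();

        importEperson(store, epersonId);
        UserRegistration updated =
                importUserRegistration(store, epersonId, "Example University");

        // One record remains, now carrying the organization from the dump.
        System.out.println(store.count() + " " + updated.organization);
    }
}
```

Before the fix, the second step would have been the equivalent of calling create unconditionally, leaving both the "Unknown" record and the imported one in the table.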