fix(python): remove redundant debug setter call in __deepcopy__ #23958
The `__deepcopy__` method copies all `__dict__` items (including `_Configuration__debug`) via the for-loop, then calls `result.debug = self.debug`, which fires the property setter with the same value already present.

On Python 3.12+, the setter's `logger.setLevel()` call triggers `logging.Manager._clear_cache()`, which iterates every registered logger: O(n_loggers) per deepcopy. In applications with many loggers (kubernetes, boto3, django, etc.), this causes severe performance degradation when Configuration objects are deepcopied frequently (e.g., per-request in API clients).

Benchmark with 2000 registered loggers, 1000 deepcopy calls:

- Before: 0.190s
- After: 0.004s (45x faster)

The `logger_file` setter is kept because `logger_file_handler` is explicitly excluded from the `__dict__` copy loop and needs to be re-created via the setter.
Thanks for the PR. Can you please review the CI test failures when you have time?
would you mind triggering them one more time? the error looks to be when hitting live |
It shouldn't be the case, as it should be hitting the petstore server running locally, but it looks like my assumption is wrong, as restarting it did fix the issue. I'll take another look later, as we don't want to use the public petstore server (not owned by us). Thanks for your contribution. (https://github.com/OpenAPITools/openapi-generator/actions/runs/27038354307/job/80237543724?pr=23958 is not related to this change.)
Hi! It seems I'm hitting an extremely pathological case during the creation of many openapi-generated objects in a very large (many registered loggers) codebase.
The `__deepcopy__` method on the `Configuration` copies all `__dict__` items (including `_Configuration__debug`), and then the loggers are shallow-copied over. After that, we call `result.debug = self.debug`, which fires the property setter with the same value already present. On Python 3.12+ (possibly earlier as well; 3.12 is just what I'm using here), the setter's `logger.setLevel()` call triggers `logging.Manager._clear_cache()`, which iterates every registered logger: O(n_loggers) per deepcopy.

In applications with many loggers, this causes severe performance degradation when Configuration objects are deepcopied frequently (e.g., per-request in API clients).
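For illustration, here is a trimmed-down sketch of the pattern described above. It is not the generated client code: the class body, the `example_client` logger name, and the `setter_calls` counter are assumptions made up for this example, and the real `__deepcopy__` re-creates the file handler through the `logger_file` setter rather than resetting it to `None`.

```python
import copy
import logging


class Configuration:
    """Hypothetical, simplified stand-in for the generated Configuration."""

    setter_calls = 0  # illustration only: counts debug-setter invocations

    def __init__(self, debug=False):
        self.logger = {"package_logger": logging.getLogger("example_client")}
        self.logger_file_handler = None
        self.debug = debug  # goes through the property setter once

    @property
    def debug(self):
        return self.__debug  # stored in __dict__ as _Configuration__debug

    @debug.setter
    def debug(self, value):
        self.__debug = value
        Configuration.setter_calls += 1
        level = logging.DEBUG if value else logging.WARNING
        for logger in self.logger.values():
            logger.setLevel(level)  # this is the call that touches every logger

    def __deepcopy__(self, memo):
        cls = self.__class__
        result = cls.__new__(cls)
        memo[id(self)] = result
        # Copies every attribute except the loggers, so the mangled
        # _Configuration__debug value is already carried over here.
        for key, value in self.__dict__.items():
            if key not in ("logger", "logger_file_handler"):
                setattr(result, key, copy.deepcopy(value, memo))
        result.logger = copy.copy(self.logger)  # shallow copy of loggers
        result.logger_file_handler = None  # simplification for this sketch
        # No `result.debug = self.debug` here: it would only re-run the setter
        # with the value that the loop above has already copied.
        return result


cfg = Configuration(debug=True)
clone = copy.deepcopy(cfg)
print(clone.debug, Configuration.setter_calls)  # prints: True 1
```

The copy still reports `debug` as `True` even though the setter ran only once, in `__init__`, which is the behavior the change relies on.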
I let my friendly bot create an example perf benchmark to showcase the issue and fix:
https://gist.github.com/markjm/bb5805505d9dfddc5d595cadd0beaff1
    $ python3 scripts/python_debug_setter_repro.py
    Python 3.12.3
    Registered loggers: 20,000
    deepcopy() calls: 1,000
    Before fix: 1.744s
    After fix: 0.004s
    Speedup: 413.2x

The redundant debug setter in deepcopy causes measurable overhead that scales with the number of registered loggers (O(n_loggers) per copy).
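The scaling can also be observed without any generated client. This is a rough sketch, not the linked gist. It registers throwaway loggers and times repeated `setLevel()` calls on one of them. The `time_set_level` helper and the `bench_sketch.*` logger names are made up for this example, and absolute timings will vary by machine and Python version.

```python
import logging
import time


def time_set_level(n_loggers, n_calls=200):
    """Register n_loggers throwaway loggers, then time n_calls setLevel() calls.

    Loggers live in a process-wide registry, so repeated runs accumulate them.
    """
    for i in range(n_loggers):
        logging.getLogger(f"bench_sketch.{n_loggers}.{i}")
    target = logging.getLogger("bench_sketch.target")
    start = time.perf_counter()
    for _ in range(n_calls):
        # Same level every time, mirroring the redundant setter call.
        target.setLevel(logging.WARNING)
    return time.perf_counter() - start


if __name__ == "__main__":
    for n in (100, 2_000, 20_000):
        print(f"{n:>6} extra loggers: {time_set_level(n):.4f}s")
```

If the elapsed time grows roughly in line with the number of registered loggers, that matches the O(n_loggers) cost described above.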
Because we already copy over the parts we want (and explicitly configure loggers how we want on copy), I believe this change to be essentially a noop except for the perf improvement.
PR checklist
These must match the expectations made by your contribution.
You may regenerate an individual generator by passing the relevant config(s) as an argument to the script, for example `./bin/generate-samples.sh bin/configs/java*`. IMPORTANT: Do NOT purge/delete any folders/files (e.g. tests) when regenerating the samples as manually written tests may be removed.
🫡 @cbornet @tomplus @krjakbrjak @fa0311 @multani
Summary by cubic
Remove a redundant debug setter call in Python clients’ `Configuration.__deepcopy__`. This avoids O(n_loggers) logger cache clears on Python 3.12+ and significantly speeds up deepcopy in apps with many loggers.

- Removed `result.debug = self.debug` in `__deepcopy__`; the value is already copied from `__dict__`.
- Kept the `logger_file` setter to re-create the file handler excluded from the dict copy.
- `debug` True/False.

Written for commit b83efd1. Summary will update on new commits.