feat(common): Improve _xorBuffer() performance by dwatteau · Pull Request #66 · dwatteau/scummtr

dwatteau · 2025-04-16T21:46:16Z

Thomas' original code had some logic to xor the values with 4 bytes at once, to improve performance.

However, this caused -Wcast-align issues on some platforms (see issue #10 for more details). This code was then removed in commit ecff30b.

This new approach restores a bit of this logic, but it's now done in a way to avoid unaligned loads. It seems to work on that old mips64el board I have.

TODO:

do more tests
see if it does make a difference, regarding performance (is there a real bottleneck there?)
don't forget to test with different compilers: GCC, Clang, MSVC…
check everything's still OK on a big-endian environment
see if it improves issue [Performance] GCC builds are ~30% faster than Clang/MSVC builds #4, in some way

dwatteau · 2025-04-19T08:04:00Z

It looks like most of the _xorBuffer() calls are made on very small buffers (few bytes), where this optimization pipeline won't bring much (it could even be a bit counter-productive, so I've added a if (X > xxx) check).

==> to be checked a bit more, though.

dwatteau · 2025-04-19T08:14:37Z

I'd need to add a logger to register how often the if (n >= 16) case is triggered, in some typical imports/exports (on various games).

If it's very rare, then this optimization is probably just not worth it, and could be discarded (still, some unrelated parts of this PR would be worth a cherry-pick).

==> also see if 12 or 16 is the best value?

dwatteau · 2025-04-20T23:04:45Z

Well, it's just a single example, but with scummtr -g monkeysega -if original-text.txt I have:

2802 hits for the (n >= 16) case
617778 hits for the (n < 16) case

==> so, at least for a V5 game like Monkey1 Sega CD, it looks like _xorBuffer is called many times, mainly on very small buffers. So there, the optimization is called very rarely, and so it won't much effect, in the end.

And if I move the threshold to (n >= 12) (probably the lowest value where setting up the "optimization pipeline" could be worth it, maybe), I have 3476 hits for 614308 misses.

Thomas' original code had some logic to xor the values with 4 bytes at once, to improve performance. However, this caused -Wcast-align issues on some platforms (see issue ecff30b. This new approach restores a bit of this logic, but it's now done in a way to avoid unaligned loads. It seems to work on that old mips64el board I have. Since it appears that _xorBuffer() is called on a majority of very small buffers, it doesn't make sense to set up this "pipeline" if there are just a few bytes to copy, though.

dwatteau force-pushed the feat/common-xorbuffer-potential-performance-improvement branch 2 times, most recently from 1efd148 to 6c5cba5 Compare April 18, 2025 21:34

dwatteau force-pushed the main branch from f43bfa9 to e307acc Compare May 2, 2025 17:33

dwatteau force-pushed the feat/common-xorbuffer-potential-performance-improvement branch 2 times, most recently from dd9a7f5 to 4ef0862 Compare May 13, 2025 21:54

dwatteau added 2 commits July 8, 2025 10:03

fix(common): Fix MSVC C4146 warning in previous change

273c51d

dwatteau force-pushed the feat/common-xorbuffer-potential-performance-improvement branch from 0966627 to 273c51d Compare July 8, 2025 08:03

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(common): Improve _xorBuffer() performance#66

feat(common): Improve _xorBuffer() performance#66
dwatteau wants to merge 2 commits into
mainfrom
feat/common-xorbuffer-potential-performance-improvement

dwatteau commented Apr 16, 2025

Uh oh!

dwatteau commented Apr 19, 2025

Uh oh!

dwatteau commented Apr 19, 2025 •

edited

Loading

Uh oh!

dwatteau commented Apr 20, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

dwatteau commented Apr 16, 2025

Uh oh!

dwatteau commented Apr 19, 2025

Uh oh!

dwatteau commented Apr 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dwatteau commented Apr 20, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

dwatteau commented Apr 19, 2025 •

edited

Loading