feat(common): Improve _xorBuffer() performance #66
1efd148 to 6c5cba5
It looks like most of the ==> to be checked a bit more, though.
I'd need to add a logger to register how often the If it's very rare, then this optimization is probably just not worth it, and could be discarded (still, some unrelated parts of this PR would be worth a cherry-pick). ==> also see if
Well, it's just a single example, but with
==> so, at least for a V5 game like Monkey1 Sega CD, it looks like And if I move the threshold to
dd9a7f5 to 4ef0862
Thomas' original code had some logic to XOR the values 4 bytes at a time, to improve performance. However, this caused -Wcast-align issues on some platforms (see issue #10 for more details). This code was then removed in commit ecff30b. This new approach restores a bit of this logic, but it's now done in a way that avoids unaligned loads. It seems to work on that old mips64el board I have. However, since _xorBuffer() appears to be called mostly on very small buffers, it doesn't make sense to set up this "pipeline" when there are just a few bytes to process.
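For readers following along, here is a minimal sketch of the general idea described above: XOR byte by byte until the address is 4-byte aligned, then 4 bytes at a time, then finish the tail byte by byte, and skip all of that for small buffers. It is not the code from this PR. The function name `xorBufferSketch`, the constant `kWordXorThreshold` and its value of 16 are hypothetical, and the word accesses go through `std::memcpy` rather than a pointer cast, which is an assumption made here to sidestep both -Wcast-align and strict-aliasing concerns.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical cutoff: buffers shorter than this use the plain byte loop.
// The value 16 is an assumption, not the threshold chosen in the PR.
static const std::size_t kWordXorThreshold = 16;

// Hypothetical stand-in for _xorBuffer(): XORs every byte of buf with key.
void xorBufferSketch(std::uint8_t *buf, std::size_t len, std::uint8_t key) {
    assert(buf != nullptr || len == 0);
    if (key == 0)
        return; // XOR with 0 changes nothing

    std::size_t i = 0;
    if (len >= kWordXorThreshold) {
        // Head: go byte by byte until buf + i is 4-byte aligned.
        while (i < len && (reinterpret_cast<std::uintptr_t>(buf + i) & 3u) != 0)
            buf[i++] ^= key;

        // Body: replicate the key into all 4 bytes of a word, then XOR
        // one word at a time. memcpy keeps the access well-defined and
        // is typically lowered to a single load/store by the compiler.
        const std::uint32_t key32 = key * 0x01010101u;
        for (; i + 4 <= len; i += 4) {
            std::uint32_t word;
            std::memcpy(&word, buf + i, sizeof(word));
            word ^= key32;
            std::memcpy(buf + i, &word, sizeof(word));
        }
    }

    // Tail (or the whole buffer when it is below the threshold).
    for (; i < len; ++i)
        buf[i] ^= key;
}
```

Because the key byte is the same in every position of the word, the result does not depend on endianness. Whether the word path pays off in practice depends on how often large buffers are actually seen, which is the question raised in the comments above.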
0966627 to 273c51d
Thomas' original code had some logic to XOR the values 4 bytes at a time, to improve performance.
However, this caused -Wcast-align issues on some platforms (see issue #10 for more details). This code was then removed in commit ecff30b.
This new approach restores a bit of this logic, but it's now done in a way that avoids unaligned loads. It seems to work on that old mips64el board I have.
TODO: