Convert loaded image to native byteorder #316
Makes a small (but repeatable) performance improvement of around 5% on Intel/AMD architectures for the SKA reference case I have, and also allows the use of libraries that do not handle non-native byte order. (NB: FITS files are always big-endian.)
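For illustration, a minimal sketch of this kind of conversion, assuming the image is held as a NumPy array. The helper name `to_native_byteorder` is hypothetical and this is not necessarily the code in this PR:

```python
import numpy as np


def to_native_byteorder(arr: np.ndarray) -> np.ndarray:
    """Return `arr` in the machine's native byte order.

    Hypothetical helper for illustration only. The array is returned
    unchanged when it is already native; otherwise a converted copy is made.
    """
    if arr.dtype.isnative:
        return arr
    # "=" requests native order; astype rewrites the bytes and keeps the values.
    return arr.astype(arr.dtype.newbyteorder("="))


# Stand-in for FITS image data, which is stored big-endian on disk.
big = np.arange(6, dtype=">f4").reshape(2, 3)
native = to_native_byteorder(big)
```

Note that the conversion makes a full copy of the data on little-endian machines, so it temporarily doubles the memory needed for the image.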
Just to note that we removed reordering in a4a1bf9 because the overhead was making performance worse. So it's important to check that this different approach doesn't have the same problem...
Thanks for pointing this out. Do you still have the image and/or testing script that showed the previous behaviour, and could you share them? I can investigate. The approach looks very similar, but various things may have changed (such as the performance of libraries with non-native byte orders), or perhaps the performance impact depends on the details of the use case...
I think it was just running on 20000 x 20000 LOFAR images (for single pointings, so we needed to load in two images, PB corrected and apparent flux). So not dissimilar to what I would imagine your use case is. A lot has changed in the code base since then, as well as in the libraries, so I'd be willing to believe it might no longer be a problem.
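A rough way to check whether the conversion overhead outweighs the gain is to time the conversion and a representative operation on both byte orders. The sketch below is only an assumption about how such a check might look: the function names, the default array size and the use of `sum()` as the workload are all illustrative, and a real test should use the actual pipeline and image.

```python
import time

import numpy as np


def time_reduction(arr, repeats=3):
    """Best-of-`repeats` wall time (seconds) for a simple reduction over `arr`."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        arr.sum()
        best = min(best, time.perf_counter() - start)
    return best


def compare_byteorders(shape=(2000, 2000), repeats=3):
    """Time the one-off conversion and the same reduction on both byte orders."""
    big = np.ones(shape, dtype=">f4")  # stand-in for big-endian FITS data

    start = time.perf_counter()
    native = big.astype(big.dtype.newbyteorder("="))
    convert_time = time.perf_counter() - start

    return {
        "convert": convert_time,
        "big_endian": time_reduction(big, repeats),
        "native": time_reduction(native, repeats),
    }


if __name__ == "__main__":
    for name, seconds in compare_byteorders().items():
        print(f"{name}: {seconds:.4f} s")
```

The conversion only pays off if its one-off cost is smaller than the time saved across all later operations on the image.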
Yes, I'm using a 24k x 24k image for these tests; I captured it and the processing parameters from an SKA/LOFAR pipeline intermediate stage last year. I'm working on some other performance improvements, so I propose we leave this in draft while doing all that, just to see if something else related to this comes up.
Makes a small (but repeatable) performance improvement of around 10% on Intel/AMD architectures for the SKA reference case I have, and also allows the use of libraries that do not handle non-native byte order. (NB: FITS files are always big-endian.)