When the number of input rows is known ahead of time, it would be better for both fragmentation and performance to pre-allocate the hash table and hash rows as they are received, rather than first gathering them into an expanding array and then hashing into a second array.
Also, as currently coded, the 'gather array' and the hash table occupy separate halves of a single allocation, which means the gather array cannot be released once the hash table is built.
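
As a rough illustration of the proposal (not the existing code), the sketch below sizes an open-addressing table once from the expected row count and hashes each row as it arrives, so no intermediate gather array is ever allocated. Everything here is hypothetical: the names `PreallocatedHashTable` and `build_from_stream`, the `(key, payload)` row shape, the linear probing, and the factor-of-two sizing are assumptions made for the example.

```python
from typing import Hashable, Iterable, List, Optional, Tuple

# Hypothetical row shape for this sketch: (key, payload).
Row = Tuple[Hashable, object]


def next_pow2(n: int) -> int:
    """Return the smallest power of two that is >= n (minimum 1)."""
    p = 1
    while p < n:
        p <<= 1
    return p


class PreallocatedHashTable:
    """Open-addressing table sized once, from a known row count."""

    def __init__(self, expected_rows: int) -> None:
        self.expected_rows = expected_rows
        # Assumed sizing policy: at least twice the row count, so the load
        # factor stays <= 0.5 and probing always reaches an empty slot.
        self.capacity = next_pow2(max(2, 2 * expected_rows))
        self.mask = self.capacity - 1
        # The only allocation: one slot array, never resized.
        self.slots: List[Optional[Row]] = [None] * self.capacity
        self.count = 0

    def insert(self, row: Row) -> None:
        if self.count >= self.expected_rows:
            raise OverflowError("more rows received than were promised")
        i = hash(row[0]) & self.mask
        # Linear probing: step to the next slot until a free one is found.
        while self.slots[i] is not None:
            i = (i + 1) & self.mask
        self.slots[i] = row
        self.count += 1

    def lookup(self, key: Hashable) -> List[Row]:
        """Return every stored row whose key equals `key`."""
        matches: List[Row] = []
        i = hash(key) & self.mask
        while self.slots[i] is not None:
            if self.slots[i][0] == key:
                matches.append(self.slots[i])
            i = (i + 1) & self.mask
        return matches


def build_from_stream(rows: Iterable[Row], expected_rows: int) -> PreallocatedHashTable:
    """Hash rows as they are received; nothing is gathered first."""
    table = PreallocatedHashTable(expected_rows)
    for row in rows:
        table.insert(row)
    return table
```

For the case where the row count is not known in advance, a gather step is still needed; allocating the gather array and the hash table separately would then allow the gather array to be freed as soon as the table is built.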