Skip to content

Persistent IndexMap support? #159

@Dorumin

Description

@Dorumin

Would it be considered within the scope of the crate to have an insertion-ordered persistent HashMap implementation? I was reminded of it by a comment by jplatte (who I do not want to ping twice in one day!) in issue #158, and because I've personally wanted this in the past.

Taking the approach that indexmap uses, a way of designing this struct in userland could be:

struct IndexMap<K, V> {
    indices: imbl::HashMap<K, usize>,
    entries: imbl::Vector<Bucket<K, V>>
}

where Bucket is effectively a tuple of the hash, key, and value; I'm not entirely certain that storing the hash is necessary though. The key is stored in the Vector to be able to scan the matching hashes in the HashMap while also comparing equality with the key until a match, if any, is found.

This approach, however, has fairly atrocious time complexity in the case of shift_removing a key/index.

  • Not only can you not memcpy the Buckets inside the entries Vector, since it's no longer a dense Vec, but you'll also either have to fully scan the indices table to subtract one from the stored mapped indices or do a lookup for each key (the decrement_indices in the indexmap Core does this).
    • At the very least we'd preserve structural sharing and you could mutate in place if you have your own copy that's yet to be cloned
  • It looks pretty bad unless it also implements:
    • some sort of tombstone system
    • or having some sort of globally-incrementing index for insertions and a value we can use to iterate live indices. This could probably be done with an OrdMap and keeping a sequence_id (probably u64 and not usize; we're not constrained by memory in this case but rather how many times you can insert + remove)
      • Thinking about this approach some more, I think it would perform rather well in nearly all cases, but it could not offer a get_index(usize) -> Option<(&K, &V)> like indexmap does unless the issue 158 were implemented in order to have O(log n) access by index. This is not a use case I personally care about for an indexmap, where insertion ordered iteration is more important than getting a property by index

This library seems to try and offer decent-perf for the most common use cases, and worrying about whether to shift or swap does not quite match with that idea.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions