Describe the bug
I noticed some unpredictable behavior in my app while using NodeId as Map keys. I'm not sure if this is actually supported/encouraged, but didn't see anything in the docs to indicate otherwise. At a glance, many of Milo's internals (e.g. DataTypeTree) seem to use NodeId as a hashtable key. I tracked it down to the hashCode method of the NodeId class - the hashCode computation yields a very high likelihood of hash collisions for a server with several active namespaces & number of nodes. For my particular project, there were 38 hash collisions out of 1433 monitored/relevant nodes across 3 namespaces (2.6% collision rate).
Expected behavior
I expect NodeId to be reliably hashable, perhaps not completely uniquely, but at least "unique-enough" across the range of namespaces and nodes for my server, so that 2 NodeIds don't resolve to the same hash.
Logs and Packet Captures
I wrote this small test to see hash collisions for what I figured to be a feasible range of namespaces (8) and node identifiers (32767) and compared it to a couple alternatives to Java's .hashCode()
void checkMiloNodeIdHashes() {
checkNodeIdHashes(NodeId::hashCode, "NodeId::hashCode");
checkNodeIdHashes((nodeId) -> hashNodeId(nodeId, Hashing.goodFastHash(32)), "Hashing.goodFastHash(32)");
checkNodeIdHashes((nodeId) -> hashNodeId(nodeId, Hashing.murmur3_32_fixed()), "Hashing.murmur3_32_fixed()");
checkNodeIdHashes((nodeId) -> hashNodeId(nodeId, Hashing.murmur3_128()), "Hashing.murmur3_128()");
checkNodeIdHashes((nodeId) -> hashNodeId(nodeId, Hashing.sha256()), "Hashing.sha256()");
checkNodeIdHashes((nodeId) -> hashNodeId(nodeId, Hashing.crc32()), "Hashing.crc32()");
}
void checkNodeIdHashes(Function<NodeId, Integer> hashMethod, String method) {
Multimap<Integer, NodeId> seen = ArrayListMultimap.create();
// hash node ids for namespaces 0-7 & node identifiers 0-32767
int nNamespace = 8;
int nIndex = 32768;
for (int ns = 0; ns < nNamespace; ns++) {
for (int nid = 0; nid < nIndex; nid++) {
NodeId nodeId = new NodeId(ns, nid);
int hash = hashMethod.apply(nodeId);
seen.put(hash, nodeId);
}
}
Logger.debug("hash method: {}", method);
checkCollisions(seen, nNamespace * nIndex);
seen = null; // much RAM, garbage collect
}
int hashNodeId(NodeId nodeId, HashFunction hashFunction) {
return hashFunction.newHasher()
.putInt(nodeId.getNamespaceIndex().intValue())
.putInt(((UInteger) nodeId.getIdentifier()).intValue())
.hash()
.asInt();
}
void checkCollisions(Multimap<Integer, NodeId> seen, int nNode) {
int collisions = 0;
int maxCollisions = 0;
for (var entry : seen.asMap().entrySet()) {
int key = entry.getKey();
Collection<NodeId> values = entry.getValue();
int nCollisionForKey = values.size();
if (nCollisionForKey > 1) {
Logger.trace("Hash Collision: {} <- {}", key, values);
collisions += nCollisionForKey;
maxCollisions = Math.max(maxCollisions, nCollisionForKey);
}
}
displayCollisionStats(collisions, nNode, maxCollisions);
}
void displayCollisionStats(int collisions, int nNode, int maxCollisions) {
float collisionPercent = (float) collisions / nNode * 100;
Logger.debug("Total collisions: {}", collisions);
Logger.debug("Collision percentage: {}%", collisionPercent);
Logger.debug("Max collisions: {}\n", maxCollisions);
}
outputs:
DEBUG: hash method: NodeId::hashCode
DEBUG: Total collisions: 262082
DEBUG: Collision percentage: 99.97635%
DEBUG: Max collisions: 8
DEBUG: hash method: Hashing.goodFastHash(32)
DEBUG: Total collisions: 0
DEBUG: Collision percentage: 0.0%
DEBUG: Max collisions: 0
DEBUG: hash method: Hashing.murmur3_32_fixed()
DEBUG: Total collisions: 12
DEBUG: Collision percentage: 0.0045776367%
DEBUG: Max collisions: 2
DEBUG: hash method: Hashing.murmur3_128()
DEBUG: Total collisions: 14
DEBUG: Collision percentage: 0.005340576%
DEBUG: Max collisions: 2
DEBUG: hash method: Hashing.sha256()
DEBUG: Total collisions: 22
DEBUG: Collision percentage: 0.008392334%
DEBUG: Max collisions: 2
DEBUG: hash method: Hashing.crc32()
DEBUG: Total collisions: 0
DEBUG: Collision percentage: 0.0%
DEBUG: Max collisions: 0
Additional context
artifacts = ["org.eclipse.milo:sdk-client:0.6.15", "org.eclipse.milo:stack-core:0.6.15",]
Describe the bug
I noticed some unpredictable behavior in my app while using
NodeIdas Map keys. I'm not sure if this is actually supported/encouraged, but didn't see anything in the docs to indicate otherwise. At a glance, many of Milo's internals (e.g.DataTypeTree) seem to useNodeIdas a hashtable key. I tracked it down to thehashCodemethod of theNodeIdclass - the hashCode computation yields a very high likelihood of hash collisions for a server with several active namespaces & number of nodes. For my particular project, there were 38 hash collisions out of 1433 monitored/relevant nodes across 3 namespaces (2.6% collision rate).Expected behavior
I expect NodeId to be reliably hashable, perhaps not completely uniquely, but at least "unique-enough" across the range of namespaces and nodes for my server, so that 2 NodeIds don't resolve to the same hash.
Logs and Packet Captures
I wrote this small test to see hash collisions for what I figured to be a feasible range of namespaces (8) and node identifiers (32767) and compared it to a couple alternatives to Java's .hashCode()
outputs:
DEBUG: hash method: NodeId::hashCode
DEBUG: Total collisions: 262082
DEBUG: Collision percentage: 99.97635%
DEBUG: Max collisions: 8
DEBUG: hash method: Hashing.goodFastHash(32)
DEBUG: Total collisions: 0
DEBUG: Collision percentage: 0.0%
DEBUG: Max collisions: 0
DEBUG: hash method: Hashing.murmur3_32_fixed()
DEBUG: Total collisions: 12
DEBUG: Collision percentage: 0.0045776367%
DEBUG: Max collisions: 2
DEBUG: hash method: Hashing.murmur3_128()
DEBUG: Total collisions: 14
DEBUG: Collision percentage: 0.005340576%
DEBUG: Max collisions: 2
DEBUG: hash method: Hashing.sha256()
DEBUG: Total collisions: 22
DEBUG: Collision percentage: 0.008392334%
DEBUG: Max collisions: 2
DEBUG: hash method: Hashing.crc32()
DEBUG: Total collisions: 0
DEBUG: Collision percentage: 0.0%
DEBUG: Max collisions: 0
Additional context
artifacts = ["org.eclipse.milo:sdk-client:0.6.15", "org.eclipse.milo:stack-core:0.6.15",]