There's no need to use LSH for threshold=1.0 since that would only match items with the exact same MinHash. Instead, you can store items in a dict keyed by the hash of the entire MinHash, or even skip using MinHash and just use a single hash of the entire item.
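
As a rough illustration of the second option (a single hash of the entire item, no MinHash), here is a minimal sketch using only the Python standard library. It assumes each item is a set of string tokens; the names `item_key` and `group_exact_duplicates` and the sample data are hypothetical and not part of any library.

```python
import hashlib
from collections import defaultdict


def item_key(tokens):
    """Hash a whole item (a collection of string tokens) to one hex digest."""
    h = hashlib.sha256()
    # Sort the distinct tokens so the digest does not depend on input order.
    for tok in sorted(set(tokens)):
        data = tok.encode("utf-8")
        # Length-prefix each token so token boundaries are unambiguous.
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()


def group_exact_duplicates(items):
    """Map each digest to the names of the items that share it."""
    buckets = defaultdict(list)
    for name, tokens in items.items():
        buckets[item_key(tokens)].append(name)
    return buckets


# Hypothetical sample data: "a" and "b" contain the same tokens.
items = {
    "a": {"minhash", "lsh", "dedup"},
    "b": {"dedup", "lsh", "minhash"},
    "c": {"minhash", "lsh"},
}
buckets = group_exact_duplicates(items)
duplicates = [names for names in buckets.values() if len(names) > 1]
print(duplicates)  # [['a', 'b']]
```

The same dict-of-lists pattern should work for the first option as well, with the key computed from the bytes of the item's MinHash instead of from the raw tokens.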

Answer selected by ekzhu

This discussion was converted from issue #243 on November 05, 2025 03:08.