
Add b+ tree secondary indexing for fast search and retrieval #22

Open
bkal01 wants to merge 20 commits into aimhubio:develop from bkal01:feature/bplustree

bkal01 commented Aug 6, 2020

This feature adds a secondary indexing option backed by a B+ tree, closely mirroring the existing primary indexing option. I added Node and BPlusTree classes for the tree structure, along with tests. I also added a secondary_indexing option for adding and reading records. For example, to index by 'val', the user passes secondary_indexing={'val': val} to the append_record method. The tree is saved to .aimrecords_storage/<insert_artifact_name>/val.tree and is loaded and saved using pickle.
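To make the idea concrete, here is a minimal sketch of a secondary index that maps a key such as 'val' to record positions and round-trips through pickle. It is not the code in this PR: a sorted list searched with `bisect` stands in for the B+ tree, and `SortedSecondaryIndex` and all of its methods are hypothetical names chosen for illustration.

```python
import bisect
import pickle


class SortedSecondaryIndex:
    """Illustrative stand-in for a B+ tree (hypothetical, not the PR's class)."""

    def __init__(self, keys=None, offsets=None):
        self.keys = keys if keys is not None else []           # sorted unique keys
        self.offsets = offsets if offsets is not None else {}  # key -> record positions

    def insert(self, key, record_offset):
        # Keep the key list sorted so range scans stay cheap.
        if key not in self.offsets:
            bisect.insort(self.keys, key)
            self.offsets[key] = []
        self.offsets[key].append(record_offset)

    def search(self, key):
        # Exact-match lookup: all record positions stored under this key.
        return list(self.offsets.get(key, []))

    def range_search(self, low, high):
        # Positions for every key in [low, high], in ascending key order.
        start = bisect.bisect_left(self.keys, low)
        end = bisect.bisect_right(self.keys, high)
        found = []
        for key in self.keys[start:end]:
            found.extend(self.offsets[key])
        return found

    def dumps(self):
        # Serialize plain containers only, which keeps the pickle portable.
        return pickle.dumps({"keys": self.keys, "offsets": self.offsets})

    @classmethod
    def loads(cls, blob):
        state = pickle.loads(blob)
        return cls(state["keys"], state["offsets"])


index = SortedSecondaryIndex()
for offset, val in enumerate([30, 10, 20, 10]):
    index.insert(val, offset)

# Round trip through pickle, as an index file on disk would.
restored = SortedSecondaryIndex.loads(index.dumps())
```

A real B+ tree spreads the sorted keys across linked leaf nodes rather than one list, but the lookup contract is the same: exact search by key, plus ordered range scans.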

There are issues with consecutive runs of the secondary_indexing_example. For example, if we save 1000 records with val=1,2,...,1000 and then run the example again, there are 2000 records stored, but the tree still holds only 1000 keys, for the 1000 new records. This causes problems when accessing records using negative indexing.
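The mismatch can be reproduced in isolation. The sketch below is only an assumption about one possible cause, not a diagnosis of this PR's code: it models a record store that keeps growing across runs while the index is either rebuilt from scratch or reused on the second run. `two_runs` and its parameters are hypothetical.

```python
def two_runs(reuse_index, n=1000):
    """Run the 'example' twice and report (stored records, indexed positions)."""
    records = []   # models storage that persists between runs
    index = {}     # models the secondary index: val -> record positions
    for _ in range(2):
        if not reuse_index:
            index = {}  # assumed failure mode: the index starts empty each run
        for val in range(1, n + 1):
            records.append({"val": val})
            index.setdefault(val, []).append(len(records) - 1)
    indexed = sum(len(positions) for positions in index.values())
    return len(records), indexed


rebuilt = two_runs(reuse_index=False)
reused = two_runs(reuse_index=True)
```

Under this assumption, rebuilding leaves half of the stored records without an index entry, so any lookup that expects the index and the record count to agree can resolve to the wrong position. Loading the existing tree before appending keeps the two counts aligned.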

@arkkln arkkln added the on hold label Nov 1, 2020
@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

3 participants