pavanchhatpar/llm_retriever

LLM Information Retrieval

Vector DB + LLM chaining using langchain with open source models, for an information retrieval system on domain-specific data. It improves on the search-engine experience by returning a direct, concise answer along with a pointer to the source document used to generate that answer. I'm using this repository to document my experiments with generative LLMs as new methods and tricks are released in open source.

Requirements

  • Required packages are installed at the beginning of the notebook
  • The notebook was run on a Standard_NC64as_T4_v3 Azure VM node

Overview

Vector Index Setup

  • Collect all documents of your corpus into a single folder, in PDF format
  • The index is built by reading each document page by page; from here on, each page is referred to as a document
  • Embeddings for the vector index are generated by a text embedding model; various sentence-transformers models are available to choose from here
  • FAISS is used to build an index over all these vectors; the index can be made as complex as necessary to trade retrieval speed against retrieval accuracy
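
The indexing steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not the notebook's code: `toy_embed` is a hashed bag-of-words stand-in for a sentence-transformers model, `FlatIndex` is an exhaustive inner-product search standing in for a FAISS index, and `PageDoc`, `build_index`, and the `<file name>#page=<n>` identifier format are hypothetical names chosen for the example.

```python
import hashlib
import math
from dataclasses import dataclass

DIM = 256  # toy embedding width (assumption); real embedding models define their own


@dataclass
class PageDoc:
    doc_id: str  # hypothetical identifier format: "<file name>#page=<n>"
    text: str


def toy_embed(text: str) -> list[float]:
    """Stand-in for a text embedding model: hashed bag of words, L2-normalised."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


class FlatIndex:
    """Stand-in for a vector index: scores the query against every stored vector."""

    def __init__(self) -> None:
        self.docs: list[PageDoc] = []
        self.vectors: list[list[float]] = []

    def add(self, doc: PageDoc) -> None:
        self.docs.append(doc)
        self.vectors.append(toy_embed(doc.text))

    def search(self, query: str, k: int = 3) -> list[tuple[float, PageDoc]]:
        q = toy_embed(query)
        scored = [
            (sum(a * b for a, b in zip(q, v)), doc)
            for v, doc in zip(self.vectors, self.docs)
        ]
        # Highest inner product first; vectors are unit length, so this is cosine similarity.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


def build_index(pages_by_file: dict[str, list[str]]) -> FlatIndex:
    """Treat every page of every file as its own document and add it to the index."""
    index = FlatIndex()
    for file_name, pages in pages_by_file.items():
        for page_no, text in enumerate(pages, start=1):
            index.add(PageDoc(f"{file_name}#page={page_no}", text))
    return index
```

An exhaustive search like this is exact but scales linearly with the number of pages; approximate index structures give up some accuracy to answer faster, which is the trade-off the last bullet refers to.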

Generative LLM Setup

  • Choose a model from this leaderboard
  • Each model comes with its own requirements: the hardware needed to load it and the packages that were used to train it
  • Most leading models on Hugging Face provide guidance on both, and it's best to follow that guidance before trying customizations
  • Tuning generation parameters such as temperature, top_p, top_k, etc. helps control the quality of the generation
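
To make the last bullet concrete, here is a simplified sketch of how temperature, top_k, and top_p typically reshape a next-token distribution before sampling. The function name `filter_next_token_probs` and the toy logits are hypothetical, and real generation libraries may differ in the order and details of these steps.

```python
import math


def filter_next_token_probs(
    logits: dict[str, float],
    temperature: float = 1.0,
    top_k: int = 0,
    top_p: float = 1.0,
) -> dict[str, float]:
    """Return the renormalised distribution left after temperature, top_k, and top_p."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # Temperature: dividing logits by a value below 1 sharpens the distribution,
    # a value above 1 flattens it.
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())
    weights = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(weights.values())
    ranked = sorted(
        ((tok, w / total) for tok, w in weights.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )

    # top_k: keep only the k most likely tokens (0 means no limit), then renormalise.
    if top_k > 0:
        ranked = ranked[:top_k]
        mass = sum(p for _, p in ranked)
        ranked = [(tok, p / mass) for tok, p in ranked]

    # top_p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break

    mass = sum(p for _, p in kept)
    return {tok: p / mass for tok, p in kept}
```

Lower temperature and tighter top_k or top_p values concentrate probability on the most likely tokens, which tends to make answers more deterministic; looser values allow more varied output.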

Steps of running a query

  • The question is first run against the vector index to get the top document hits
  • The number of top-k hits that can be used is limited by the context length the generative LLM supports and by the chunking that determines the length of each document
  • A prompt template explains the task to the model and includes a few examples to set expectations for the generated tokens
  • The model is also prompted to return the document identifier as a source reference, and the template gives explicit instructions on how this should be formatted
  • A limited set of top-k hits is sent to the model in the prompt template, together with the question, to generate the answer
  • Because the model was prompted to follow a format in the answer in order to cite the reference, checking whether that format was used helps discard one class of cases in which the generative LLM definitely hallucinated
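
The query steps above might look roughly like the following sketch. Everything named here is an assumption made for illustration: the template wording, the `SOURCE: <document id>` citation format, the word count used as a crude proxy for the model's token limit, and the helper names `select_hits`, `build_prompt`, and `extract_source`. The check that the cited identifier was actually in the prompt is an extra safeguard beyond the format check described above.

```python
import re

# Hypothetical prompt template with one worked example and an explicit citation format.
PROMPT_TEMPLATE = """Answer the question using only the documents below.
End your answer with a line of the form: SOURCE: <document id>

Example:
Question: What colour is the sky?
Answer: The sky is blue.
SOURCE: example.pdf#page=1

Documents:
{context}

Question: {question}
Answer:"""

SOURCE_LINE = re.compile(r"^SOURCE:\s*(\S+)\s*$", re.MULTILINE)


def select_hits(hits, max_context_words):
    """Keep ranked (doc_id, text) hits, in order, until the word budget is used up."""
    chosen, used = [], 0
    for doc_id, text in hits:
        cost = len(text.split())  # crude stand-in for a token count
        if used + cost > max_context_words:
            break
        chosen.append((doc_id, text))
        used += cost
    return chosen


def build_prompt(question, hits, max_context_words=200):
    """Return the filled prompt and the ids of the documents that fit into it."""
    chosen = select_hits(hits, max_context_words)
    context = "\n\n".join(f"[{doc_id}]\n{text}" for doc_id, text in chosen)
    prompt = PROMPT_TEMPLATE.format(context=context, question=question)
    return prompt, [doc_id for doc_id, _ in chosen]


def extract_source(answer, allowed_ids):
    """Return the cited id, or None if the format is missing or the id was not offered."""
    match = SOURCE_LINE.search(answer)
    if match is None:
        return None
    cited = match.group(1)
    return cited if cited in allowed_ids else None
```

An answer for which `extract_source` returns `None` can be discarded or retried. Passing the check does not prove the answer is grounded; it only filters out the cases where the model clearly ignored the instructions.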

Models used

Text embedding

Generative LLM
