The Enterprise Reliability Guard is a stateful, multi-document Retrieval-Augmented Generation (RAG) agent designed to query dense financial filings with high precision. Built on a LangGraph state machine, the architecture prioritizes data integrity and deterministic execution. It utilizes an LLM-as-a-judge framework paired with heuristic short-circuits to minimize hallucinations and enforce strict fail-closed safety parameters during API degradation.
- Orchestration: LangGraph (Cyclic State Machine)
- Inference: Llama-3-8B (via Groq API)
- Embedding Model: sentence-transformers/all-MiniLM-L6-v2
- Vector Database: ChromaDB (Dense Semantic Search with Metadata Partitioning)
- Cross-Encoder Reranking: FlashRank (ms-marco-MiniLM-L-12-v2)
- Neural Evaluation: RAGAS (Faithfulness metric)
- Observability: LangSmith (Graph execution tracing)
- Zero-Shot Extraction: Utilizes an LLM routing node to extract target entity constraints (e.g., stock tickers) from unstructured user queries.
- Robust Parsing: Implements a regex-based sanitization layer to catch and format conversational LLM outputs into strict JSON.
- Search Space Isolation: Dynamically injects the extracted metadata filters into the ChromaDB retriever, restricting the vector search to the matching documents and preventing context leakage across the multi-document data lake.
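
The parsing and filter-construction steps above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's actual code: the helper names `extract_json` and `build_where_filter`, the `ticker` metadata key, and the ticker pattern are all hypothetical.

```python
import json
import re
from typing import Optional

# Matches from the first "{" to the last "}" so that conversational
# text before or after the JSON payload is discarded.
_JSON_OBJECT = re.compile(r"\{.*\}", re.DOTALL)

# Assumed ticker shape: 1-6 letters, dots or hyphens (e.g. "AAPL", "BRK.B").
_TICKER = re.compile(r"[A-Za-z.\-]{1,6}")


def extract_json(raw: str) -> Optional[dict]:
    """Return the JSON object embedded in an LLM reply, or None if absent or invalid."""
    match = _JSON_OBJECT.search(raw)
    if match is None:
        return None
    try:
        parsed = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None


def build_where_filter(raw: str) -> Optional[dict]:
    """Turn a routing-node reply into a metadata filter, or None if no valid ticker."""
    payload = extract_json(raw)
    if not payload:
        return None
    ticker = payload.get("ticker")  # "ticker" key is an assumption
    if not isinstance(ticker, str) or not _TICKER.fullmatch(ticker.strip()):
        return None
    return {"ticker": ticker.strip().upper()}


reply = 'Sure! Here is the JSON:\n{"ticker": "aapl"}\nHope this helps.'
print(build_where_filter(reply))
```

The returned dictionary has the shape of a ChromaDB `where` filter. With a LangChain Chroma vector store it could plausibly be passed through `as_retriever(search_kwargs={"filter": ...})`, assuming the retriever is wired that way.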
- Latency Optimization: Addresses the inherent API latency and rate-limit bottlenecks associated with LLM-as-a-judge (RAGAS) evaluation frameworks.
- Deterministic Heuristics: Implements a Fast Pass pre-grader that uses regular expressions to verify continuous numeric claims directly against the retrieved context chunks.
- Dynamic Handoff: Bypasses the expensive neural evaluator entirely if heuristic validation passes, reducing evaluation latency by over 80 percent while maintaining strict grounding accuracy.
- Execution Tracing: Leverages LangGraph to capture execution iterations and manage cyclic retries during API rate limits or timeouts.
- Fail-Closed Security: Hardcoded to terminate the evaluation loop if the neural evaluator raises exceptions beyond the maximum retry threshold.
- Hallucination Prevention: The system actively refuses to generate an answer upon evaluation failure, prioritizing enterprise trust and data reliability over conversational fluidity.
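
The fail-closed behavior described above can be sketched as a plain retry loop, independent of LangGraph. The names `guarded_answer`, `MAX_RETRIES`, and `REFUSAL`, the retry limit of 3, and the refusal wording are assumptions for illustration.

```python
from typing import Callable

MAX_RETRIES = 3  # assumed threshold; the project's actual value is not stated
REFUSAL = "I cannot provide a verified answer to this question."  # assumed wording


def guarded_answer(
    draft: str,
    evaluate: Callable[[str], bool],
    max_retries: int = MAX_RETRIES,
) -> str:
    """Release the draft only if the evaluator confirms it is grounded.

    Evaluator exceptions (rate limits, timeouts) trigger a retry. Once the
    retry budget is exhausted the function fails closed and refuses.
    """
    for _ in range(max_retries):
        try:
            grounded = evaluate(draft)
        except Exception:
            continue  # transient evaluator failure: try again
        # The evaluator ran successfully: accept or refuse on its verdict.
        return draft if grounded else REFUSAL
    # Every attempt raised, so the draft is never released.
    return REFUSAL


def always_times_out(_: str) -> bool:
    raise TimeoutError("evaluator unavailable")


print(guarded_answer("Revenue was $5 million.", always_times_out))
```

In a LangGraph state machine the same logic would typically live in a conditional edge that reads a retry counter from the graph state. That mapping is a design assumption, not a description of the repository.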
- Clone the repository:

      git clone https://github.com/your-username/reliability-guard.git
      cd reliability-guard

- Install dependencies:

      pip install -r requirements.txt
- Environment Configuration: Create a .env file in the root directory. To enable LangSmith execution tracing, include the LangChain environment variables:

      GROQ_API_KEY=your_groq_api_key
      LANGCHAIN_TRACING_V2=true
      LANGCHAIN_ENDPOINT=https://api.smith.langchain.com
      LANGCHAIN_API_KEY=your_langsmith_api_key
      LANGCHAIN_PROJECT=reliability-guard
- Data Ingestion: Create a data/ directory and populate it with PDF filings named according to the strict convention TICKER_10K_YEAR.pdf (for example, AAPL_10K_2023.pdf) so that metadata is extracted accurately during vectorization.
- Initialize the Agent:

      python main.py
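
To illustrate why the TICKER_10K_YEAR.pdf naming convention matters, here is a minimal sketch of how metadata could be derived from a file name at ingestion time. The function name `parse_filing_name`, the metadata keys, and the exact pattern are assumptions, not the repository's ingestion code.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed pattern: uppercase ticker, literal "_10K_", four-digit year, ".pdf".
_FILENAME = re.compile(r"^(?P<ticker>[A-Z][A-Z0-9.\-]*)_10K_(?P<year>\d{4})\.pdf$")


def parse_filing_name(path: str) -> Optional[dict]:
    """Return {"ticker", "year"} metadata for a conforming file name, else None."""
    match = _FILENAME.match(Path(path).name)
    if match is None:
        return None  # a non-conforming file would be ingested without usable metadata
    return {"ticker": match.group("ticker"), "year": int(match.group("year"))}


print(parse_filing_name("data/AAPL_10K_2023.pdf"))
print(parse_filing_name("data/apple-annual-report.pdf"))
```

Files that return None here would have no ticker to filter on, which is why the convention is described as strict.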