Genome-AI-Pathway-Designer

An AI-powered metabolic engineering assistant that maps biochemical pathways, extracts enzymatic data, and optimizes protein selection for synthetic biology. It combines a Neo4j knowledge graph, ESM-2 transformer embeddings, and LLMs to produce detailed blueprints for making high-value compounds such as astaxanthin.

Features

**Metabolic Pathfinding:** Automatically identifies the shortest path between a precursor (e.g., beta-carotene) and a target molecule (e.g., astaxanthin).
**Enzyme Extraction:** Integrates with KEGG and UniProt to retrieve the EC numbers and amino acid sequences for each reaction step.
**Transformer Embeddings:** Uses Meta's facebook/esm2_t6_8M_UR50D model to convert protein sequences into 320-dimensional vectors for functional similarity search.
**Graph-Based Intelligence:** Stores complex biological relationships in Neo4j, allowing for the detection of "metabolic leaks" and cofactor requirements.
**Genome-Host Alignment:** Evaluates enzyme compatibility based on source organism data to ensure pathways are stable within a specific microbial host.
**Automated Ingestion:** Streamlined pipeline for crawling KEGG reaction maps and enriching the graph with real-time protein data.
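
To make the pathfinding idea concrete, here is a minimal sketch of a shortest-path search over a reaction graph. It is not the project's implementation: the project stores its graph in Neo4j, whereas this example uses breadth-first search over a plain Python dictionary. The `REACTIONS` table and the `shortest_pathway` function are hypothetical names, and the graph is a small hand-written illustration of the carotenoid routes, not data pulled from KEGG.

```python
from collections import deque

# Hand-written toy graph (an assumption for illustration, not KEGG output):
# each compound maps to the compounds reachable in one enzymatic step.
REACTIONS = {
    "beta-carotene": ["echinenone", "beta-cryptoxanthin"],
    "echinenone": ["canthaxanthin"],
    "beta-cryptoxanthin": ["zeaxanthin"],
    "canthaxanthin": ["adonirubin"],
    "zeaxanthin": ["adonixanthin"],
    "adonirubin": ["astaxanthin"],
    "adonixanthin": ["astaxanthin"],
}


def shortest_pathway(graph, source, target):
    """Return the compound sequence with the fewest reaction steps, or None."""
    parents = {source: None}  # doubles as the visited set
    queue = deque([source])
    while queue:
        compound = queue.popleft()
        if compound == target:
            # Walk the parent links back to the source, then reverse.
            path = []
            while compound is not None:
                path.append(compound)
                compound = parents[compound]
            return path[::-1]
        for product in graph.get(compound, []):
            if product not in parents:
                parents[product] = compound
                queue.append(product)
    return None  # target is not reachable from source


pathway = shortest_pathway(REACTIONS, "beta-carotene", "astaxanthin")
print(" -> ".join(pathway))
```

Breadth-first search returns a path with the fewest reaction steps on an unweighted graph. When several routes tie, as the two four-step routes do here, the one found first depends on the order of the adjacency lists.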

Architecture

**Knowledge Graph (Neo4j):** Acts as the "brain," storing nodes for Compounds, Reactions, and Enzymes, and edges representing metabolic transformations.
**Protein Language Model (ESM-2):** Applies a Transformer architecture to understand the "biological grammar" of enzymes and generate functional embeddings.
**Bio-Informatics Extractor:** A custom ingestion engine that bridges the gap between raw web-based databases (KEGG/UniProt) and a structured graph.
**Vector Search:** Enables mathematical comparison of enzymes to find the most efficient catalysts across different genomes.
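
As an illustration of what "mathematical comparison" of enzymes can look like, the sketch below ranks candidate enzymes by cosine similarity to a query embedding. Cosine similarity is an assumption here, since the README does not name the metric. The function names and the enzyme labels are hypothetical, and the 4-dimensional vectors are made-up stand-ins for the 320-dimensional ESM-2 embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up vectors for illustration only; real embeddings would come from ESM-2.
query = [0.9, 0.1, 0.0, 0.3]
candidates = {
    "enzyme_A": [0.8, 0.2, 0.1, 0.4],
    "enzyme_B": [-0.5, 0.9, 0.3, 0.0],
    "enzyme_C": [0.1, 0.0, 1.0, 0.0],
}

ranking = rank_by_similarity(query, candidates)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

A linear scan like this is fine for a handful of enzymes. For larger collections, a vector database such as the one listed under Prerequisites would typically take over the nearest-neighbour lookup.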

Getting Started

Prerequisites

Python 3.11+
Docker (optional, for containerized deployment)
Access to the OpenAI API or another LLM provider such as Ollama
Vector database setup (e.g., ChromaDB)

Installation

Clone the repository:

git clone https://github.com/SilasPenda/Policy-Compliance-Agent
cd Policy-Compliance-Agent

Create and activate a virtual environment:

python -m venv .venv
source .venv/bin/activate    # Linux & macOS
.venv\Scripts\activate       # Windows

Install requirements:

python -m pip install --upgrade pip
pip install -r requirements.txt

Create a .env file and add your credentials.

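
This README does not list the variable names the project expects, so check the source for the real ones. As a rough sketch of the idea, the snippet below parses `.env`-style text and reports which required entries are missing or empty. Every variable name in `REQUIRED_VARS`, the placeholder values, and both helper functions are assumptions made for illustration.

```python
# Hypothetical variable names; the project's actual names may differ.
REQUIRED_VARS = ["NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD", "OPENAI_API_KEY"]


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_vars(values, required=REQUIRED_VARS):
    """Return the required names that are absent or empty."""
    return [name for name in required if not values.get(name)]


# Placeholder content in the shape of a .env file (not real credentials).
example = """
# placeholder values
NEO4J_URI=bolt://localhost:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=
OPENAI_API_KEY="sk-placeholder"
"""

values = parse_env_text(example)
print(missing_vars(values))
```

Running a check like this before ingestion catches an empty or forgotten credential early, rather than partway through a crawl.
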
Ingest biochemical data and generate AI embeddings:
```
python ingestion/graph_ingestor.py
```
Launch the API:
```
uvicorn deployment.api:app --reload
```
Start the app:
```
streamlit run deployment/app.py
```
