SpaceHealthAgent

A Multi-Modal Health Intelligence Agent for Biomedical Data Analysis

SpaceHealthAgent is an advanced AI-powered system that integrates multiple foundation models and knowledge graphs to analyze diverse health data types and provide evidence-based risk assessments and recommendations.

🏗️ Architecture Overview

┌─────────────────────────────────────────────────────────────────────┐
│                         USER INTERFACE                              │
│   app = SpaceHealthAgent()                                          │
│   report = app.analyze(query, data_files)                          │
└──────────────────────────┬──────────────────────────────────────────┘
                           │
                           ▼
              ┌────────────────────────────────┐
              │  SpaceHealthAgent              │
              │  - Creates orchestrator        │
              │  - Builds workflow             │
              │  - Executes pipeline           │
              └────────────┬───────────────────┘
                           │
                           ▼
              ┌─────────────────────────────────────────────────────┐
              │  HealthIntelligenceOrchestrator                     │
              │  (MASTER CONTROLLER)                                │
              │                                                     │
              │  Initializes All Agents:                            │
              │  ├─ DataValidationAgent                            │
              │  ├─ OmicsModelAgent (scGPT, Geneformer)           │
              │  ├─ ClinicalModelAgent (Clinical-T5)               │
              │  ├─ RadiationAgent (Physics models)                │
              │  ├─ KnowledgeGraphAgent (PrimeKG, UMLS)           │
              │  ├─ IntegrationEngine                              │
              │  └─ ReasoningEngine                                │
              └────────────┬────────────────────────────────────────┘
                           │
                           ▼
              ┌─────────────────────────────────┐
              │  LangGraph Workflow Pipeline    │
              └─────────────────────────────────┘

                     WORKFLOW EXECUTION:

  ┌──────────────┐        ┌──────────────────────────────────┐
  │ route_query  │───────▶│ Classifies data type             │
  └──────┬───────┘        │ (gene_expression, microbiome,    │
         │                │  radiation, clinical)             │
         │                └──────────────────────────────────┘
         ▼
  ┌──────────────┐        ┌──────────────────────────────────┐
  │   validate   │───────▶│ DataValidationAgent              │
  └──────┬───────┘        │ - Quality control                │
         │                │ - Format conversion              │
         │                │ - Confidence scoring             │
         │                └──────────────────────────────────┘
         ▼
  ┌──────────────┐        ┌──────────────────────────────────┐
  │  run_models  │───────▶│ Model Agents (data-type specific)│
  └──────┬───────┘        │ - OmicsModelAgent                │
         │                │ - ClinicalModelAgent             │
         │                │ - RadiationAgent                 │
         │                └──────────────────────────────────┘
         ▼
  ┌──────────────┐        ┌──────────────────────────────────┐
  │  query_kg    │───────▶│ KnowledgeGraphAgent              │
  └──────┬───────┘        │ - Query PrimeKG                  │
         │                │ - Query UMLS                     │
         │                │ - Query DrugBank                 │
         │                └──────────────────────────────────┘
         ▼
  ┌──────────────┐        ┌──────────────────────────────────┐
  │  integrate   │───────▶│ IntegrationEngine                │
  └──────┬───────┘        │ - Fuse multi-modal outputs       │
         │                │ - Resolve conflicts              │
         │                │ - Track uncertainty              │
         │                └──────────────────────────────────┘
         ▼
  ┌──────────────┐        ┌──────────────────────────────────┐
  │ assess_risks │───────▶│ ReasoningEngine                  │
  └──────┬───────┘        │ - LLM-based risk analysis        │
         │                │ - Evidence synthesis             │
         │                │ - Confidence quantification      │
         │                └──────────────────────────────────┘
         ▼
  ┌──────────────┐        ┌──────────────────────────────────┐
  │  recommend   │───────▶│ ReasoningEngine                  │
  └──────┬───────┘        │ - Generate mitigation strategies │
         │                │ - Evidence-based recommendations │
         │                └──────────────────────────────────┘
         ▼
  ┌──────────────┐        ┌──────────────────────────────────┐
  │    report    │───────▶│ Format & add safety disclaimers  │
  └──────┬───────┘        │ - Compile all results            │
         │                │ - Add medical disclaimers        │
         │                │ - Return final report            │
         ▼                └──────────────────────────────────┘
    ┌─────────┐
    │  OUTPUT │
    └─────────┘
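
The workflow above is a linear chain of nodes that each read and extend a shared state. The following is a minimal plain-Python sketch of that control flow; it is not the LangGraph implementation in SpaceHealthAgent.py. The state keys, the keyword rules in `ROUTING_KEYWORDS`, and the stub node bodies are assumptions made for illustration.

```python
from typing import Callable, Dict, List

State = Dict[str, object]

# Hypothetical keyword rules; the real route_query() may classify differently.
ROUTING_KEYWORDS = {
    "radiation": ("radiation", "mgy", "msv", "dose"),
    "microbiome": ("microbiome", "fecal", "shotgun"),
    "gene_expression": ("gene expression", "transcriptom"),
}


def route_query(state: State) -> State:
    """Classify the query into one of the data types named in the diagram."""
    query = str(state["query"]).lower()
    for data_type, keywords in ROUTING_KEYWORDS.items():
        if any(keyword in query for keyword in keywords):
            return {**state, "data_type": data_type}
    return {**state, "data_type": "clinical"}  # fallback bucket (assumption)


def make_stub(name: str) -> Callable[[State], State]:
    """Build a placeholder node that only records that it ran."""
    def node(state: State) -> State:
        trace = list(state.get("trace", [])) + [name]
        return {**state, "trace": trace}
    return node


# Node order copied from the workflow diagram.
PIPELINE: List[Callable[[State], State]] = [route_query] + [
    make_stub(name)
    for name in ("validate", "run_models", "query_kg", "integrate",
                 "assess_risks", "recommend", "report")
]


def run_pipeline(query: str, data_files: dict) -> State:
    """Pass the state through every node in order and return the final state."""
    state: State = {"query": query, "data_files": data_files}
    for node in PIPELINE:
        state = node(state)
    return state


if __name__ == "__main__":
    final = run_pipeline("What are the risks of a 20 mGy dose?", {"dose_mGy": 20})
    print(final["data_type"], final["trace"])
```

Each node returns a new state dictionary rather than mutating its input, which keeps individual steps easy to test in isolation.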

🚀 Quick Start

Prerequisites

- Python 3.10+
- An OpenAI API key or an Anthropic API key
- 8 GB+ RAM recommended
- GPU optional (for local foundation models)

Installation

# Clone or download SpaceHealthAgent.py

# Install dependencies
pip install langchain langgraph langchain-openai langchain-anthropic
pip install openai anthropic
pip install pandas numpy
pip install fastapi uvicorn  # Optional, for API deployment

# Optional: For working with omics data
pip install scanpy anndata
pip install biopython

Environment Setup

# Set your API key (choose one)
export OPENAI_API_KEY="your-openai-api-key"
# OR
export ANTHROPIC_API_KEY="your-anthropic-api-key"
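
Since either key is sufficient, a script can pick the provider from whichever variable is set. The helper below is a sketch; `choose_llm_provider` is a hypothetical name and is not part of SpaceHealthAgent.py. It prefers OpenAI when both keys are present, on the assumption that OpenAI is the default provider.

```python
import os
from typing import Mapping, Optional


def choose_llm_provider(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "openai" or "anthropic" depending on which API key is set."""
    env = os.environ if env is None else env
    if env.get("OPENAI_API_KEY"):
        return "openai"  # assumed default when both keys are present
    if env.get("ANTHROPIC_API_KEY"):
        return "anthropic"
    raise RuntimeError(
        "Set OPENAI_API_KEY or ANTHROPIC_API_KEY before creating the app."
    )


if __name__ == "__main__":
    try:
        print("Using provider:", choose_llm_provider())
    except RuntimeError as exc:
        print("Configuration problem:", exc)
```

The returned string can be passed as the `llm_provider` argument when constructing the app.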

📖 Usage Examples

Example 1: Gene Expression Analysis

Input:

from SpaceHealthAgent import HealthIntelligenceApp

# Initialize the agent
app = HealthIntelligenceApp(llm_provider="openai")

# Prepare your data
gene_expression_data = {
    "gene_expression_matrix": {
        "IL6": [10.5, 12.3, 9.8, 11.2],
        "TNF": [8.2, 9.1, 7.8, 8.9],
        "BRCA1": [3.2, 3.0, 2.9, 3.1],
        "ATM": [2.5, 2.3, 2.1, 2.2],
        # ... more genes
    }
}

# Run analysis
report = app.analyze(
    query="Given gene expression data from a blood sample, determine potential health risks and provide mitigation strategies",
    data_files=gene_expression_data
)

print(report)

Output:

======================================================================
HEALTH INTELLIGENCE ASSESSMENT REPORT
======================================================================

QUERY: Given gene expression data from a blood sample, determine 
       potential health risks and provide mitigation strategies
DATA TYPE: gene_expression
TIMESTAMP: 2025-10-07 14:32:15

======================================================================
MODEL ANALYSIS RESULTS
======================================================================

{
  "cell_types": ["T cells", "B cells", "Monocytes"],
  "cell_type_proportions": {
    "T cells": 0.45,
    "B cells": 0.30,
    "Monocytes": 0.25
  },
  "dysregulated_pathways": [
    {
      "name": "Inflammatory response",
      "score": 0.85,
      "direction": "up"
    },
    {
      "name": "DNA repair",
      "score": -0.65,
      "direction": "down"
    }
  ],
  "disease_signatures": [
    {
      "disease": "Chronic inflammation",
      "similarity": 0.78
    }
  ],
  "model_used": "scGPT",
  "confidence": 0.82
}

======================================================================
KNOWLEDGE GRAPH INSIGHTS
======================================================================

{
  "diseases": [
    {
      "name": "Chronic inflammation",
      "confidence": 0.78,
      "evidence_count": 150,
      "source": "PrimeKG"
    },
    {
      "name": "Autoimmune disorder risk",
      "confidence": 0.65,
      "evidence_count": 89,
      "source": "PrimeKG"
    }
  ],
  "treatments": [
    {
      "drug": "Anti-inflammatory agents",
      "mechanism": "COX-2 inhibition",
      "evidence_level": "A"
    },
    {
      "intervention": "Mediterranean diet",
      "mechanism": "Omega-3 fatty acids",
      "evidence_level": "B"
    }
  ]
}

======================================================================
RISK ASSESSMENT
======================================================================

{
  "risks": [
    {
      "risk": "Chronic inflammatory state",
      "severity": "moderate",
      "confidence": 0.82,
      "evidence": [
        "Elevated IL6 and TNF expression",
        "Increased inflammatory pathway activation",
        "Knowledge graph links to cardiovascular disease"
      ]
    },
    {
      "risk": "Reduced DNA repair capacity",
      "severity": "low-moderate",
      "confidence": 0.65,
      "evidence": [
        "Downregulation of BRCA1 and ATM",
        "May increase cancer susceptibility"
      ]
    }
  ],
  "overall_confidence": 0.73
}

======================================================================
RECOMMENDATIONS
======================================================================

IMMEDIATE ACTIONS (0-1 week):
1. Schedule comprehensive inflammatory panel with physician
   - CRP, ESR, complete blood count
   - Evidence: Standard clinical practice for inflammatory markers

2. Avoid pro-inflammatory triggers
   - Reduce processed foods, excess sugar
   - Evidence: Multiple meta-analyses show dietary impact on inflammation

SHORT-TERM ACTIONS (1-3 months):
1. Implement anti-inflammatory diet
   - Mediterranean diet pattern
   - Increase omega-3 intake (fish, flax seeds)
   - Evidence Level: A (Multiple RCTs)

2. Regular moderate exercise
   - 150 minutes/week of aerobic activity
   - Evidence: Reduces inflammatory markers (CRP, IL-6)

3. Stress reduction techniques
   - Meditation, yoga, adequate sleep
   - Evidence: Shown to reduce inflammatory cytokines

LONG-TERM MONITORING:
1. Repeat blood work in 3 months to assess changes
2. Consider genetic counseling for BRCA1/ATM variants
3. Age-appropriate cancer screening (personalized schedule)

⚠️ CRITICAL: Consult your healthcare provider before making any 
health decisions. This analysis is for informational purposes only.

======================================================================
CONFIDENCE & UNCERTAINTY
======================================================================

Overall Confidence: 73%

Confidence Breakdown:
[
  {"source": "Data Validation", "confidence": 0.80},
  {"source": "scGPT Analysis", "confidence": 0.82},
  {"source": "Knowledge Graphs", "confidence": 0.75},
  {"source": "Integration", "confidence": 0.70}
]

Limitations:
- Single timepoint measurement (no trend data)
- No genomic variant data available
- No clinical history provided
- Simulated foundation model outputs (demo version)

======================================================================
⚠️ IMPORTANT MEDICAL DISCLAIMER:
This analysis is generated by AI models for informational purposes only.
It is NOT medical advice and should not replace consultation with qualified
healthcare professionals. Always consult your doctor before making health 
decisions.
======================================================================
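
The report does not say how the overall confidence is derived. In the example above, the value in the risk assessment (0.73) is close to the arithmetic mean of the per-risk confidences (0.82 and 0.65). The sketch below assumes a plain mean; the actual aggregation rule in SpaceHealthAgent.py may differ, and `overall_confidence` is an illustrative name.

```python
from statistics import fmean
from typing import Iterable, Mapping


def overall_confidence(risks: Iterable[Mapping[str, object]]) -> float:
    """Average the "confidence" field of each risk entry (assumed rule)."""
    values = [float(risk["confidence"]) for risk in risks]
    if not values:
        raise ValueError("at least one risk entry is required")
    if any(not 0.0 <= value <= 1.0 for value in values):
        raise ValueError("confidence values must lie in [0, 1]")
    return fmean(values)


if __name__ == "__main__":
    example_risks = [
        {"risk": "Chronic inflammatory state", "confidence": 0.82},
        {"risk": "Reduced DNA repair capacity", "confidence": 0.65},
    ]
    print(f"{overall_confidence(example_risks):.3f}")
```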

Example 2: Radiation Exposure Analysis

Input:

# Initialize the agent
app = HealthIntelligenceApp(llm_provider="openai")

# Radiation exposure data
radiation_data = {
    "dose_mGy": 20,
    "type": "gamma",
    "duration_hours": 2
}

# Run analysis
report = app.analyze(
    query="20mGray radiation is detected in the atmosphere, what are the health risks and what mitigation strategies should be tried?",
    data_files=radiation_data
)

print(report)

Output:

======================================================================
HEALTH INTELLIGENCE ASSESSMENT REPORT
======================================================================

QUERY: 20mGray radiation is detected in the atmosphere, what are the 
       health risks and what mitigation strategies should be tried?
DATA TYPE: radiation
TIMESTAMP: 2025-10-07 14:45:22

======================================================================
MODEL ANALYSIS RESULTS
======================================================================

{
  "absorbed_dose_mGy": 20,
  "effective_dose_mSv": 20.0,
  "lifetime_cancer_risk_increase": "0.1100%",
  "risk_category": "Low",
  "acute_effects_expected": false,
  "model_used": "ICRP/BEIR physics models",
  "confidence": 0.90
}

======================================================================
RISK ASSESSMENT
======================================================================

{
  "risks": [
    {
      "risk": "Increased lifetime cancer risk",
      "severity": "low",
      "confidence": 0.90,
      "evidence": [
        "20 mSv effective dose",
        "BEIR VII model: ~0.11% additional lifetime risk",
        "Well-established linear no-threshold model"
      ]
    },
    {
      "risk": "No acute radiation syndrome expected",
      "severity": "none",
      "confidence": 0.95,
      "evidence": [
        "Dose well below 1000 mSv threshold",
        "ICRP guidelines for acute effects"
      ]
    }
  ],
  "overall_confidence": 0.92
}

======================================================================
RECOMMENDATIONS
======================================================================

IMMEDIATE ACTIONS (0-1 week):
1. Minimize further exposure
   - Move to area with lower radiation levels if possible
   - Limit time in contaminated area
   - Evidence: ALARA principle (As Low As Reasonably Achievable)

2. Monitor for any acute symptoms
   - Nausea, fatigue, skin changes (unlikely at this dose)
   - Seek medical attention if symptoms develop

SHORT-TERM ACTIONS (1-3 months):
1. Medical evaluation and blood work
   - Complete blood count to check blood cell levels
   - Evidence: Standard protocol for radiation exposure

2. Consider potassium iodide if thyroid exposure
   - Only if radioactive iodine is present
   - Must be taken within 24 hours of exposure
   - Evidence: FDA and WHO guidelines

3. Maintain healthy immune function
   - Adequate nutrition, sleep, stress management
   - Antioxidant-rich foods (may help with oxidative stress)

LONG-TERM MONITORING:
1. Annual medical checkups
2. Age-appropriate cancer screening
3. Inform healthcare providers of exposure history

RISK IN CONTEXT:
- This dose (20 mSv) is equivalent to roughly 6-8 years of natural
  background radiation (about 2.4-3 mSv per year)
- For comparison: CT scan abdomen/pelvis = 10-15 mSv
- Annual occupational limit for radiation workers = 50 mSv (US limit;
  ICRP recommends 20 mSv/year averaged over 5 years)
- Risk is LOW but non-negligible

⚠️ Report exposure to public health authorities if part of larger 
   incident

======================================================================
CONFIDENCE & UNCERTAINTY
======================================================================

Overall Confidence: 92%

This confidence is HIGH because:
- Physics-based calculations (well-established models)
- ICRP and BEIR models extensively validated
- Dose measurement assumed accurate

======================================================================
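
The dose and risk figures in this report can be reproduced with a short calculation. The sketch below assumes uniform whole-body exposure, so effective dose is absorbed dose times the radiation weighting factor. It uses the ICRP Publication 103 nominal cancer risk coefficient of 5.5% per Sv, which yields the 0.11% figure shown. Whether RadiationAgent uses these exact constants is an assumption, and the function names are illustrative.

```python
# Radiation weighting factors from ICRP Publication 103. Neutrons are omitted
# because their factor depends on neutron energy.
RADIATION_WEIGHTING = {"gamma": 1.0, "beta": 1.0, "alpha": 20.0}

# ICRP 103 nominal cancer risk coefficient, whole population: 5.5% per Sv.
NOMINAL_CANCER_RISK_PER_SV = 0.055


def effective_dose_msv(dose_mgy: float, radiation_type: str) -> float:
    """Approximate effective dose (mSv) for uniform whole-body exposure."""
    try:
        weight = RADIATION_WEIGHTING[radiation_type]
    except KeyError:
        raise ValueError(f"unsupported radiation type: {radiation_type!r}") from None
    if dose_mgy < 0:
        raise ValueError("dose must be non-negative")
    return dose_mgy * weight


def lifetime_cancer_risk_increase(dose_msv: float) -> float:
    """Linear no-threshold estimate as a fraction (0.0011 means 0.11%)."""
    return (dose_msv / 1000.0) * NOMINAL_CANCER_RISK_PER_SV


if __name__ == "__main__":
    dose = effective_dose_msv(20, "gamma")
    risk = lifetime_cancer_risk_increase(dose)
    print(f"effective dose: {dose:.1f} mSv, added risk: {risk:.4%}")
```

The same absorbed dose from alpha radiation would give a 20 times larger effective dose under these weighting factors, which is why the `type` field matters.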

Example 3: Microbiome Analysis

Input:

# Initialize the agent
app = HealthIntelligenceApp(llm_provider="openai")

# Microbiome data (simplified - normally would be FASTQ files)
microbiome_data = {
    "sequences": ["ATCGATCG" * 1000],  # Placeholder
    "read_count": 10000000,
    "sample_type": "fecal"
}

# Run analysis
report = app.analyze(
    query="Given shotgun whole genome sequence data from a fecal sample, what is my health status, and if there are signs of diseases what are mitigation strategies I should attempt?",
    data_files=microbiome_data
)

print(report)

Output:

======================================================================
HEALTH INTELLIGENCE ASSESSMENT REPORT
======================================================================

QUERY: Given shotgun whole genome sequence data from a fecal sample,
       what is my health status...
DATA TYPE: microbiome
TIMESTAMP: 2025-10-07 15:02:18

======================================================================
MODEL ANALYSIS RESULTS
======================================================================

{
  "taxonomy": {
    "phylum": {
      "Firmicutes": 0.45,
      "Bacteroidetes": 0.35,
      "Proteobacteria": 0.15
    },
    "genus": {
      "Bacteroides": 0.25,
      "Faecalibacterium": 0.18,
      "Escherichia": 0.08
    }
  },
  "diversity": {
    "shannon_index": 3.2,
    "simpson_index": 0.85
  },
  "dysbiosis_score": 0.35,
  "functional_pathways": {
    "SCFA_production": "reduced",
    "inflammatory_markers": "elevated"
  },
  "model_used": "gNOMO pipeline",
  "confidence": 0.75
}

======================================================================
RISK ASSESSMENT
======================================================================

{
  "risks": [
    {
      "risk": "Mild-moderate gut dysbiosis",
      "severity": "moderate",
      "confidence": 0.75,
      "evidence": [
        "Reduced Faecalibacterium (beneficial SCFA producer)",
        "Elevated Escherichia (potential inflammatory marker)",
        "Below-optimal diversity (Shannon index 3.2, healthy >3.5)",
        "Dysbiosis score 0.35 (0-1 scale)"
      ]
    },
    {
      "risk": "Reduced short-chain fatty acid production",
      "severity": "low-moderate",
      "confidence": 0.70,
      "evidence": [
        "Low Faecalibacterium abundance",
        "SCFA-producing pathways reduced",
        "May affect gut barrier function"
      ]
    }
  ],
  "overall_confidence": 0.72
}

======================================================================
RECOMMENDATIONS
======================================================================

IMMEDIATE ACTIONS (0-1 week):
1. Increase dietary fiber intake
   - Target 25-35g fiber per day
   - Evidence: Feeds beneficial bacteria

2. Consider probiotic foods
   - Yogurt, kefir, sauerkraut, kimchi
   - Evidence: May improve gut diversity

SHORT-TERM ACTIONS (1-3 months):
1. Prebiotic supplementation
   - Inulin, FOS (fructooligosaccharides)
   - Dosage: 5-10g daily
   - Evidence: Increases Bifidobacterium and Lactobacillus

2. Specific probiotic strains targeting dysbiosis:
   - Lactobacillus rhamnosus GG
   - Bifidobacterium longum
   - Faecalibacterium prausnitzii (if available)
   - Evidence Level: B (Multiple clinical trials)

3. Dietary modifications:
   - Reduce processed foods and refined sugars
   - Increase whole grains, legumes, vegetables
   - Mediterranean diet pattern
   - Evidence: Improves microbiome diversity

4. Consider reducing if applicable:
   - Unnecessary antibiotic use
   - Artificial sweeteners
   - Emulsifiers in processed foods

LONG-TERM MONITORING:
1. Repeat microbiome analysis in 3-6 months
2. Track symptoms (digestive health, energy, immunity)
3. Continue healthy dietary patterns

POSITIVE INDICATORS:
- Overall diversity still reasonable (3.2)
- Good Bacteroides levels (fiber metabolism)
- No pathogenic species detected

======================================================================
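
The diversity values in the report are standard ecological indices computed from relative abundances. The sketch below shows the usual definitions: Shannon H = -Σ p ln p and Gini-Simpson 1 - Σ p². It makes no claim about how the gNOMO pipeline computes them, and the function names are illustrative. The three phyla listed above cannot reproduce the reported 3.2 on their own, since three taxa give a Shannon index of at most ln 3 ≈ 1.10; the reported value would need a finer taxonomic level.

```python
import math
from typing import Mapping


def normalize(abundances: Mapping[str, float]) -> dict:
    """Rescale non-negative abundances so they sum to 1."""
    total = sum(abundances.values())
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    return {taxon: value / total for taxon, value in abundances.items()}


def shannon_index(abundances: Mapping[str, float]) -> float:
    """Shannon diversity H = -sum(p * ln p), skipping absent taxa."""
    proportions = normalize(abundances).values()
    return -sum(p * math.log(p) for p in proportions if p > 0)


def simpson_index(abundances: Mapping[str, float]) -> float:
    """Gini-Simpson diversity 1 - sum(p^2)."""
    proportions = normalize(abundances).values()
    return 1.0 - sum(p * p for p in proportions)


if __name__ == "__main__":
    phyla = {"Firmicutes": 0.45, "Bacteroidetes": 0.35, "Proteobacteria": 0.15}
    print(f"Shannon: {shannon_index(phyla):.2f}, Simpson: {simpson_index(phyla):.2f}")
```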

🔧 Configuration Options

Choose Your LLM Provider

# Option 1: OpenAI (default)
app = HealthIntelligenceApp(llm_provider="openai")

# Option 2: Anthropic Claude
app = HealthIntelligenceApp(llm_provider="anthropic")

Advanced Features

from SpaceHealthAgent import AdvancedHealthAgent

# Enable advanced features
app = AdvancedHealthAgent(
    llm_provider="openai",
    enable_caching=True
)

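# Async processing: analyze_async is a coroutine, so drive it with an event loop.
# Inside an already running loop (e.g. a notebook), use `await` directly instead.
import asyncio
report = asyncio.run(app.analyze_async(query, data_files))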

# Batch processing
queries = [
    {"query": "...", "data_files": {...}},
    {"query": "...", "data_files": {...}}
]
reports = app.analyze_batch(queries)

# With patient context
report = app.analyze_with_context(
    query="...",
    data_files={...},
    patient_history={"conditions": ["diabetes"], "age": 45}
)
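
The `enable_caching` option is not described further here. One plausible reading is memoization of identical requests, sketched below. `CachedAnalyzer` is a hypothetical wrapper, not the mechanism used by AdvancedHealthAgent, and it assumes the input data is JSON-serializable.

```python
import hashlib
import json
from typing import Callable, Dict


class CachedAnalyzer:
    """Memoize an analyze(query, data_files) callable by request content."""

    def __init__(self, analyze: Callable[[str, dict], str]) -> None:
        self._analyze = analyze
        self._cache: Dict[str, str] = {}
        self.hits = 0

    @staticmethod
    def _key(query: str, data_files: dict) -> str:
        # sort_keys makes the key independent of dictionary ordering.
        payload = json.dumps({"query": query, "data": data_files}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def analyze(self, query: str, data_files: dict) -> str:
        """Return the cached report if this exact request was seen before."""
        key = self._key(query, data_files)
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        report = self._analyze(query, data_files)
        self._cache[key] = report
        return report


if __name__ == "__main__":
    def fake_analyze(query: str, data_files: dict) -> str:
        return f"report for {query!r} with {len(data_files)} field(s)"

    cached = CachedAnalyzer(fake_analyze)
    cached.analyze("radiation risk", {"dose_mGy": 20, "type": "gamma"})
    cached.analyze("radiation risk", {"type": "gamma", "dose_mGy": 20})
    print("cache hits:", cached.hits)
```

Hashing the serialized request keeps the cache key small even when the input matrices are large.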

🧪 Testing

from SpaceHealthAgent import TestHealthAgent

# Run test suite
tester = TestHealthAgent()
tester.run_all_tests()

Expected output:

======================================================================
RUNNING TEST SUITE
======================================================================

Testing gene expression pipeline...
✓ Gene expression pipeline test passed
Testing radiation pipeline...
✓ Radiation pipeline test passed
Testing microbiome pipeline...
✓ Microbiome pipeline test passed

✓ All tests passed!

🌐 API Deployment

Running as FastAPI Service

# In SpaceHealthAgent.py, use the FastAPI configuration at the bottom

# Run the API server
uvicorn SpaceHealthAgent:api_app --host 0.0.0.0 --port 8000

API Usage

# POST request to analyze health data
curl -X POST "http://localhost:8000/analyze" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Analyze gene expression for health risks",
    "data": {"gene_1": [10.5, 12.3], "gene_2": [5.2, 4.9]}
  }'
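
The same request can be sent from Python using only the standard library. This sketch mirrors the curl call above. It assumes the service returns JSON, since the response schema is not documented here, and the helper names are illustrative.

```python
import json
import urllib.request


def build_analyze_request(query: str, data: dict,
                          base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build the POST /analyze request with the same JSON body as the curl call."""
    body = json.dumps({"query": query, "data": data}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(query: str, data: dict, timeout: float = 60.0):
    """Send the request and decode the response, assumed to be JSON."""
    request = build_analyze_request(query, data)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only builds the request; call analyze(...) once the server is running.
    request = build_analyze_request(
        "Analyze gene expression for health risks",
        {"gene_1": [10.5, 12.3], "gene_2": [5.2, 4.9]},
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```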

🐳 Docker Deployment

Dockerfile:

FROM python:3.10-slim

WORKDIR /app

# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy agent
COPY SpaceHealthAgent.py .

# Set environment variables
ENV OPENAI_API_KEY=""
ENV ANTHROPIC_API_KEY=""

# Run the agent
CMD ["python", "SpaceHealthAgent.py"]

docker-compose.yml:

version: '3.8'

services:
  spacehealthagent:
    build: .
    ports:
      - "8000:8000"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
    volumes:
      - ./data:/app/data
      - ./output:/app/output

Build and run:

docker-compose up --build

📊 Supported Data Types

| Data Type | Foundation Models | Status | Example Input |
|---|---|---|---|
| Gene Expression | scGPT, Geneformer | ✅ Supported | CSV/AnnData matrix |
| Microbiome | gNOMO, MintTea pipelines | ✅ Supported | FASTQ files, taxonomy tables |
| Radiation | Physics-based (ICRP/BEIR) | ✅ Supported | Dose, type, duration |
| Clinical Records | Clinical-T5, CEHR-BERT | ⚠️ Simulated | EHR data, clinical notes |
| Medical Imaging | BiomedCLIP, UNI | 🔄 Planned | DICOM, PNG, JPG |
| Wearables | Google Wearable FM | 🔄 Planned | Time-series sensor data |

⚠️ Important Limitations

Current Demo Version

The provided SpaceHealthAgent.py uses simulated foundation model outputs for demonstration purposes. To use actual foundation models:

1. Install model packages:
```
pip install scgpt geneformer
```

2. Download model weights:

# Example for scGPT (illustrative only; confirm the loading API and
# checkpoint name against the scGPT documentation before use)
from scgpt import load_model
model = load_model("scgpt_whole_human")

3. Replace the placeholder code in OmicsModelAgent.analyze_gene_expression() with actual model inference calls.

Medical Disclaimer

THIS IS NOT MEDICAL ADVICE. SpaceHealthAgent is a research tool for informational purposes only. Always consult qualified healthcare professionals for medical decisions.

📝 Input Data Formats

Gene Expression Data

# Format 1: Dictionary
data = {
    "gene_name_1": [expr_sample1, expr_sample2, ...],
    "gene_name_2": [expr_sample1, expr_sample2, ...],
}

# Format 2: File path to CSV
data = "path/to/expression_matrix.csv"

# Format 3: AnnData object (recommended)
import anndata
adata = anndata.read_h5ad("data.h5ad")
data = adata
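
For the dictionary format, every gene must carry the same number of sample values. The check below is a sketch of that constraint plus a pivot to one record per sample. Both helper names are hypothetical, and DataValidationAgent may apply different or additional checks.

```python
from typing import Dict, List, Mapping, Sequence


def validate_expression_matrix(matrix: Mapping[str, Sequence[float]]) -> int:
    """Check the dictionary format and return the number of samples."""
    if not matrix:
        raise ValueError("expression matrix is empty")
    lengths = {gene: len(values) for gene, values in matrix.items()}
    sample_counts = set(lengths.values())
    if len(sample_counts) != 1:
        raise ValueError(f"genes have differing sample counts: {lengths}")
    return sample_counts.pop()


def to_sample_rows(matrix: Mapping[str, Sequence[float]]) -> List[Dict[str, float]]:
    """Pivot gene -> values into one {gene: value} record per sample."""
    n_samples = validate_expression_matrix(matrix)
    return [{gene: values[i] for gene, values in matrix.items()}
            for i in range(n_samples)]


if __name__ == "__main__":
    matrix = {"IL6": [10.5, 12.3, 9.8, 11.2], "TNF": [8.2, 9.1, 7.8, 8.9]}
    for row in to_sample_rows(matrix):
        print(row)
```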

Microbiome Data

# Format 1: FASTQ file paths
data = {
    "sequences": "path/to/sample.fastq",
    "sample_type": "fecal"
}

# Format 2: Pre-computed taxonomy
data = {
    "taxonomy": {
        "Firmicutes": 0.45,
        "Bacteroidetes": 0.35
    },
    "read_count": 10000000
}

Radiation Data

data = {
    "dose_mGy": 20.0,           # Absorbed dose in milliGray
    "type": "gamma",             # Type: gamma, beta, alpha, neutron
    "duration_hours": 2.0        # Exposure duration
}
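
A small validated record can catch malformed radiation input before analysis. The sketch below uses the three fields and the four radiation types listed above. The class and function names are hypothetical, and the non-negativity and positivity checks are assumptions.

```python
from dataclasses import dataclass

# Allowed values taken from the comment in the format above.
ALLOWED_TYPES = ("gamma", "beta", "alpha", "neutron")


@dataclass(frozen=True)
class RadiationExposure:
    dose_mGy: float
    radiation_type: str
    duration_hours: float

    def __post_init__(self) -> None:
        if self.dose_mGy < 0:
            raise ValueError("dose_mGy must be non-negative")
        if self.radiation_type not in ALLOWED_TYPES:
            raise ValueError(f"type must be one of {ALLOWED_TYPES}")
        if self.duration_hours <= 0:
            raise ValueError("duration_hours must be positive")

    @property
    def dose_rate_mGy_per_hour(self) -> float:
        """Average dose rate over the exposure window."""
        return self.dose_mGy / self.duration_hours


def parse_radiation_data(data: dict) -> RadiationExposure:
    """Build a validated record from the dictionary format shown above."""
    missing = {"dose_mGy", "type", "duration_hours"} - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return RadiationExposure(
        dose_mGy=float(data["dose_mGy"]),
        radiation_type=str(data["type"]),
        duration_hours=float(data["duration_hours"]),
    )


if __name__ == "__main__":
    exposure = parse_radiation_data(
        {"dose_mGy": 20.0, "type": "gamma", "duration_hours": 2.0}
    )
    print(exposure, exposure.dose_rate_mGy_per_hour)
```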

🔍 Troubleshooting

Common Issues

1. API Key Not Found

Error: OpenAI API key not found
Solution: export OPENAI_API_KEY="your-key-here"

2. Import Errors

Error: No module named 'langgraph'
Solution: pip install langgraph langchain langchain-openai

3. Memory Issues

Error: Out of memory
Solution: Process smaller datasets or use GPU for large foundation models

4. Model Loading Fails

Error: Model weights not found
Solution: Ensure you've downloaded the foundation model weights
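
Import errors like the one in issue 2 can be detected before running an analysis. The sketch below checks whether each package is importable without actually importing it. The module list is an assumption based on the installation commands, and `missing_modules` is an illustrative helper.

```python
import importlib.util
from typing import Iterable, List

# Assumed core dependencies, taken from the pip install commands.
REQUIRED_MODULES = ("langchain", "langgraph", "pandas", "numpy")


def missing_modules(modules: Iterable[str] = REQUIRED_MODULES) -> List[str]:
    """Return the modules that cannot be found on the current Python path."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing packages, install with: pip install " + " ".join(missing))
    else:
        print("All required packages are importable.")
```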

📚 Additional Resources

Foundation Models Documentation:
Knowledge Graphs:
- PrimeKG
- UMLS
Related Papers:
- scGPT: Nature Methods 2024
- Nucleotide Transformer: Nature Methods 2025
- AlphaFold2: Nature 2021

🤝 Contributing

To extend SpaceHealthAgent with new models or data types:

1. Add a new agent class in the appropriate section.
2. Add a node method in HealthIntelligenceOrchestrator.
3. Update build_workflow() to include the new node.
4. Update route_query() to detect the new data type.

Example:

class NewModelAgent:
    def analyze_new_datatype(self, data):
        # Your implementation: populate results with the model outputs
        results = {}
        return results
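
One way to keep the dispatch step open to new data types is a registry keyed by data-type name, sketched below. This is not how HealthIntelligenceOrchestrator is known to be structured. The registry, the decorator, and the wearables example, including its `heart_rate_bpm` field, are all hypothetical.

```python
from typing import Callable, Dict

# Hypothetical registry mapping a data type to its analysis callable.
AGENT_REGISTRY: Dict[str, Callable[[dict], dict]] = {}


def register_agent(data_type: str):
    """Decorator that records an analysis function under a data-type name."""
    def decorator(func: Callable[[dict], dict]) -> Callable[[dict], dict]:
        AGENT_REGISTRY[data_type] = func
        return func
    return decorator


@register_agent("wearables")
def analyze_wearables(data: dict) -> dict:
    """Toy analysis: average a list of heart-rate readings."""
    heart_rates = data["heart_rate_bpm"]
    return {"mean_heart_rate_bpm": sum(heart_rates) / len(heart_rates)}


def run_models(data_type: str, data: dict) -> dict:
    """Dispatch to the agent registered for the routed data type."""
    try:
        agent = AGENT_REGISTRY[data_type]
    except KeyError:
        raise ValueError(f"no agent registered for {data_type!r}") from None
    return agent(data)


if __name__ == "__main__":
    print(run_models("wearables", {"heart_rate_bpm": [60, 70, 80]}))
```

With this layout, adding a data type means registering one function and teaching route_query() to emit its name.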

📄 License

This software is provided for research and educational purposes. Commercial use requires separate licensing for foundation models and knowledge graph databases.

📧 Support

For questions or issues, please refer to the inline documentation in SpaceHealthAgent.py or create an issue in your repository.

Version: 1.0.0
Last Updated: October 2025
Maintainer: Your Name/Organization
