A Multi-Modal Health Intelligence Agent for Biomedical Data Analysis
SpaceHealthAgent is an advanced AI-powered system that integrates multiple foundation models and knowledge graphs to analyze diverse health data types and provide evidence-based risk assessments and recommendations.
┌───────────────────────────────────────────────┐
│ USER INTERFACE                                │
│ app = SpaceHealthAgent()                      │
│ report = app.analyze(query, data_files)       │
└───────────────────────┬───────────────────────┘
                        │
                        ▼
┌───────────────────────────────────────────────┐
│ SpaceHealthAgent                              │
│ - Creates orchestrator                        │
│ - Builds workflow                             │
│ - Executes pipeline                           │
└───────────────────────┬───────────────────────┘
                        │
                        ▼
┌───────────────────────────────────────────────┐
│ HealthIntelligenceOrchestrator                │
│ (MASTER CONTROLLER)                           │
│                                               │
│ Initializes All Agents:                       │
│ ├─ DataValidationAgent                        │
│ ├─ OmicsModelAgent (scGPT, Geneformer)        │
│ ├─ ClinicalModelAgent (Clinical-T5)           │
│ ├─ RadiationAgent (Physics models)            │
│ ├─ KnowledgeGraphAgent (PrimeKG, UMLS)        │
│ ├─ IntegrationEngine                          │
│ └─ ReasoningEngine                            │
└───────────────────────┬───────────────────────┘
                        │
                        ▼
┌───────────────────────────────────────────────┐
│ LangGraph Workflow Pipeline                   │
└───────────────────────────────────────────────┘

WORKFLOW EXECUTION:
┌──────────────┐        ┌───────────────────────────────────┐
│ route_query  │───────▶│ Classifies data type              │
└──────┬───────┘        │ (gene_expression, microbiome,     │
       │                │ radiation, clinical)              │
       │                └───────────────────────────────────┘
       ▼
┌──────────────┐        ┌───────────────────────────────────┐
│ validate     │───────▶│ DataValidationAgent               │
└──────┬───────┘        │ - Quality control                 │
       │                │ - Format conversion               │
       │                │ - Confidence scoring              │
       │                └───────────────────────────────────┘
       ▼
┌──────────────┐        ┌───────────────────────────────────┐
│ run_models   │───────▶│ Model Agents (data-type specific) │
└──────┬───────┘        │ - OmicsModelAgent                 │
       │                │ - ClinicalModelAgent              │
       │                │ - RadiationAgent                  │
       │                └───────────────────────────────────┘
       ▼
┌──────────────┐        ┌───────────────────────────────────┐
│ query_kg     │───────▶│ KnowledgeGraphAgent               │
└──────┬───────┘        │ - Query PrimeKG                   │
       │                │ - Query UMLS                      │
       │                │ - Query DrugBank                  │
       │                └───────────────────────────────────┘
       ▼
┌──────────────┐        ┌───────────────────────────────────┐
│ integrate    │───────▶│ IntegrationEngine                 │
└──────┬───────┘        │ - Fuse multi-modal outputs        │
       │                │ - Resolve conflicts               │
       │                │ - Track uncertainty               │
       │                └───────────────────────────────────┘
       ▼
┌──────────────┐        ┌───────────────────────────────────┐
│ assess_risks │───────▶│ ReasoningEngine                   │
└──────┬───────┘        │ - LLM-based risk analysis         │
       │                │ - Evidence synthesis              │
       │                │ - Confidence quantification       │
       │                └───────────────────────────────────┘
       ▼
┌──────────────┐        ┌───────────────────────────────────┐
│ recommend    │───────▶│ ReasoningEngine                   │
└──────┬───────┘        │ - Generate mitigation strategies  │
       │                │ - Evidence-based recommendations  │
       │                └───────────────────────────────────┘
       ▼
┌──────────────┐        ┌───────────────────────────────────┐
│ report       │───────▶│ Format & add safety disclaimers   │
└──────┬───────┘        │ - Compile all results             │
       │                │ - Add medical disclaimers         │
       │                │ - Return final report             │
       │                └───────────────────────────────────┘
       ▼
  ┌─────────┐
  │ OUTPUT  │
  └─────────┘

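The workflow above is a linear chain of nodes, each of which reads and extends a shared state. The following is a minimal pure-Python sketch of that pattern. It does not use LangGraph, and the `State` alias, the node bodies, and the placeholder values are assumptions made for illustration, not the code in SpaceHealthAgent.py.

```python
from typing import Callable, Dict, List

State = Dict[str, object]  # hypothetical shared state passed between nodes


def route_query(state: State) -> State:
    # Placeholder classification; a real node would inspect query and data.
    data = state["data_files"]
    state["data_type"] = "radiation" if "dose_mGy" in data else "unknown"
    return state


def validate(state: State) -> State:
    # Placeholder quality check: only records whether any data was supplied.
    state["valid"] = bool(state["data_files"])
    return state


def report(state: State) -> State:
    # Compile a final string and append a disclaimer, as the last node does.
    state["report"] = (
        f"DATA TYPE: {state['data_type']}\n"
        "This analysis is for informational purposes only."
    )
    return state


# Node order mirrors the diagram; the middle nodes are omitted for brevity.
PIPELINE: List[Callable[[State], State]] = [route_query, validate, report]


def run_pipeline(query: str, data_files: dict) -> State:
    state: State = {"query": query, "data_files": data_files}
    for node in PIPELINE:
        state = node(state)
    return state


result = run_pipeline("What are the risks?", {"dose_mGy": 20, "type": "gamma"})
print(result["report"])
```

Keeping every node to the same signature (state in, state out) is what lets nodes be added or reordered without touching the runner.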
- Python 3.10+
- OpenAI API key or Anthropic API key
- 8GB+ RAM recommended
- GPU optional (for local foundation models)
# Clone or download SpaceHealthAgent.py
# Install dependencies
pip install langchain langgraph langchain-openai langchain-anthropic
pip install openai anthropic
pip install pandas numpy
pip install fastapi uvicorn # Optional, for API deployment
# Optional: For working with omics data
pip install scanpy anndata
pip install biopython

# Set your API key (choose one)
export OPENAI_API_KEY="your-openai-api-key"
# OR
export ANTHROPIC_API_KEY="your-anthropic-api-key"

Input:
from SpaceHealthAgent import HealthIntelligenceApp
# Initialize the agent
app = HealthIntelligenceApp(llm_provider="openai")
# Prepare your data
gene_expression_data = {
    "gene_expression_matrix": {
        "IL6": [10.5, 12.3, 9.8, 11.2],
        "TNF": [8.2, 9.1, 7.8, 8.9],
        "BRCA1": [3.2, 3.0, 2.9, 3.1],
        "ATM": [2.5, 2.3, 2.1, 2.2],
        # ... more genes
    }
}
# Run analysis
report = app.analyze(
    query="Given gene expression data from a blood sample, determine potential health risks and provide mitigation strategies",
    data_files=gene_expression_data
)
print(report)

Output:
======================================================================
HEALTH INTELLIGENCE ASSESSMENT REPORT
======================================================================
QUERY: Given gene expression data from a blood sample, determine
potential health risks and provide mitigation strategies
DATA TYPE: gene_expression
TIMESTAMP: 2025-10-07 14:32:15
======================================================================
MODEL ANALYSIS RESULTS
======================================================================
{
"cell_types": ["T cells", "B cells", "Monocytes"],
"cell_type_proportions": {
"T cells": 0.45,
"B cells": 0.30,
"Monocytes": 0.25
},
"dysregulated_pathways": [
{
"name": "Inflammatory response",
"score": 0.85,
"direction": "up"
},
{
"name": "DNA repair",
"score": -0.65,
"direction": "down"
}
],
"disease_signatures": [
{
"disease": "Chronic inflammation",
"similarity": 0.78
}
],
"model_used": "scGPT",
"confidence": 0.82
}
======================================================================
KNOWLEDGE GRAPH INSIGHTS
======================================================================
{
"diseases": [
{
"name": "Chronic inflammation",
"confidence": 0.78,
"evidence_count": 150,
"source": "PrimeKG"
},
{
"name": "Autoimmune disorder risk",
"confidence": 0.65,
"evidence_count": 89,
"source": "PrimeKG"
}
],
"treatments": [
{
"drug": "Anti-inflammatory agents",
"mechanism": "COX-2 inhibition",
"evidence_level": "A"
},
{
"intervention": "Mediterranean diet",
"mechanism": "Omega-3 fatty acids",
"evidence_level": "B"
}
]
}
======================================================================
RISK ASSESSMENT
======================================================================
{
"risks": [
{
"risk": "Chronic inflammatory state",
"severity": "moderate",
"confidence": 0.82,
"evidence": [
"Elevated IL6 and TNF expression",
"Increased inflammatory pathway activation",
"Knowledge graph links to cardiovascular disease"
]
},
{
"risk": "Reduced DNA repair capacity",
"severity": "low-moderate",
"confidence": 0.65,
"evidence": [
"Downregulation of BRCA1 and ATM",
"May increase cancer susceptibility"
]
}
],
"overall_confidence": 0.73
}
======================================================================
RECOMMENDATIONS
======================================================================
IMMEDIATE ACTIONS (0-1 week):
1. Schedule comprehensive inflammatory panel with physician
- CRP, ESR, complete blood count
- Evidence: Standard clinical practice for inflammatory markers
2. Avoid pro-inflammatory triggers
- Reduce processed foods, excess sugar
- Evidence: Multiple meta-analyses show dietary impact on inflammation
SHORT-TERM ACTIONS (1-3 months):
1. Implement anti-inflammatory diet
- Mediterranean diet pattern
- Increase omega-3 intake (fish, flax seeds)
- Evidence Level: A (Multiple RCTs)
2. Regular moderate exercise
- 150 minutes/week of aerobic activity
- Evidence: Reduces inflammatory markers (CRP, IL-6)
3. Stress reduction techniques
- Meditation, yoga, adequate sleep
- Evidence: Shown to reduce inflammatory cytokines
LONG-TERM MONITORING:
1. Repeat blood work in 3 months to assess changes
2. Consider genetic counseling for BRCA1/ATM variants
3. Age-appropriate cancer screening (personalized schedule)
⚠️ CRITICAL: Consult your healthcare provider before making any
health decisions. This analysis is for informational purposes only.
======================================================================
CONFIDENCE & UNCERTAINTY
======================================================================
Overall Confidence: 73%
Confidence Breakdown:
[
{"source": "Data Validation", "confidence": 0.80},
{"source": "scGPT Analysis", "confidence": 0.82},
{"source": "Knowledge Graphs", "confidence": 0.75},
{"source": "Integration", "confidence": 0.70}
]
Limitations:
- Single timepoint measurement (no trend data)
- No genomic variant data available
- No clinical history provided
- Simulated foundation model outputs (demo version)
======================================================================
⚠️ IMPORTANT MEDICAL DISCLAIMER:
This analysis is generated by AI models for informational purposes only.
It is NOT medical advice and should not replace consultation with qualified
healthcare professionals. Always consult your doctor before making health
decisions.
======================================================================
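The gene expression example passes a dictionary that maps gene names to per-sample values. Below is a minimal sketch of the kind of shape check a validation step might run on such input before any model sees it. The function name `validate_expression_matrix` and the returned summary are hypothetical; the actual checks in DataValidationAgent are not shown in this README.

```python
from numbers import Real


def validate_expression_matrix(matrix: dict) -> dict:
    """Check a gene -> list-of-values mapping and return a small QC summary."""
    if not matrix:
        raise ValueError("expression matrix is empty")
    # Every gene must be measured in the same number of samples.
    lengths = {len(values) for values in matrix.values()}
    if len(lengths) != 1:
        raise ValueError("all genes must have the same number of samples")
    # Reject strings, None, and booleans masquerading as numbers.
    for gene, values in matrix.items():
        for value in values:
            if isinstance(value, bool) or not isinstance(value, Real):
                raise ValueError(f"non-numeric expression value for gene {gene}")
    return {"n_genes": len(matrix), "n_samples": lengths.pop()}


summary = validate_expression_matrix({
    "IL6": [10.5, 12.3, 9.8, 11.2],
    "TNF": [8.2, 9.1, 7.8, 8.9],
})
print(summary)  # {'n_genes': 2, 'n_samples': 4}
```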
Input:
# Initialize the agent
app = HealthIntelligenceApp(llm_provider="openai")
# Radiation exposure data
radiation_data = {
    "dose_mGy": 20,
    "type": "gamma",
    "duration_hours": 2
}
# Run analysis
report = app.analyze(
    query="20mGray radiation is detected in the atmosphere, what are the health risks and what mitigation strategies should be tried?",
    data_files=radiation_data
)
print(report)

Output:
======================================================================
HEALTH INTELLIGENCE ASSESSMENT REPORT
======================================================================
QUERY: 20mGray radiation is detected in the atmosphere, what are the
health risks and what mitigation strategies should be tried?
DATA TYPE: radiation
TIMESTAMP: 2025-10-07 14:45:22
======================================================================
MODEL ANALYSIS RESULTS
======================================================================
{
"absorbed_dose_mGy": 20,
"effective_dose_mSv": 20.0,
"lifetime_cancer_risk_increase": "0.1100%",
"risk_category": "Low",
"acute_effects_expected": false,
"model_used": "ICRP/BEIR physics models",
"confidence": 0.90
}
======================================================================
RISK ASSESSMENT
======================================================================
{
"risks": [
{
"risk": "Increased lifetime cancer risk",
"severity": "low",
"confidence": 0.90,
"evidence": [
"20 mSv effective dose",
"BEIR VII model: ~0.11% additional lifetime risk",
"Well-established linear no-threshold model"
]
},
{
"risk": "No acute radiation syndrome expected",
"severity": "none",
"confidence": 0.95,
"evidence": [
"Dose well below 1000 mSv threshold",
"ICRP guidelines for acute effects"
]
}
],
"overall_confidence": 0.92
}
======================================================================
RECOMMENDATIONS
======================================================================
IMMEDIATE ACTIONS (0-1 week):
1. Minimize further exposure
- Move to area with lower radiation levels if possible
- Limit time in contaminated area
- Evidence: ALARA principle (As Low As Reasonably Achievable)
2. Monitor for any acute symptoms
- Nausea, fatigue, skin changes (unlikely at this dose)
- Seek medical attention if symptoms develop
SHORT-TERM ACTIONS (1-3 months):
1. Medical evaluation and blood work
- Complete blood count to check blood cell levels
- Evidence: Standard protocol for radiation exposure
2. Consider potassium iodide if thyroid exposure
- Only if radioactive iodine is present
- Must be taken within 24 hours of exposure
- Evidence: FDA and WHO guidelines
3. Maintain healthy immune function
- Adequate nutrition, sleep, stress management
- Antioxidant-rich foods (may help with oxidative stress)
LONG-TERM MONITORING:
1. Annual medical checkups
2. Age-appropriate cancer screening
3. Inform healthcare providers of exposure history
RISK IN CONTEXT:
- This dose (20 mSv) is equivalent to roughly 8 years of natural background
radiation (global average about 2.4 mSv per year)
- For comparison: CT scan abdomen/pelvis = 10-15 mSv
- Annual occupational limit for radiation workers = 50 mSv
- Risk is LOW but non-negligible
⚠️ Report exposure to public health authorities if part of larger
incident
======================================================================
CONFIDENCE & UNCERTAINTY
======================================================================
Overall Confidence: 92%
This confidence is HIGH because:
- Physics-based calculations (well-established models)
- ICRP and BEIR models extensively validated
- Dose measurement assumed accurate
======================================================================
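The report does not show how the 0.11% figure is derived. One calculation that reproduces it is a linear no-threshold estimate with a nominal risk coefficient of about 5.5% per sievert, the whole-population value in ICRP Publication 103. The sketch below works under that assumption; the function name and constants are illustrative and have not been checked against SpaceHealthAgent.py. Multiplying absorbed dose by the radiation weighting factor strictly gives equivalent dose, which matches effective dose only for uniform whole-body exposure.

```python
# Radiation weighting factors from ICRP Publication 103. Neutrons are left
# out because their factor depends on neutron energy.
RADIATION_WEIGHTING = {"gamma": 1.0, "beta": 1.0, "alpha": 20.0}

# Assumed nominal risk coefficient: about 5.5% per sievert.
RISK_PER_SV = 0.055


def estimate_risk(dose_mgy: float, radiation_type: str) -> dict:
    """Linear no-threshold estimate for a uniform whole-body exposure."""
    if radiation_type not in RADIATION_WEIGHTING:
        raise ValueError(f"unsupported radiation type: {radiation_type}")
    dose_msv = dose_mgy * RADIATION_WEIGHTING[radiation_type]
    risk_fraction = (dose_msv / 1000.0) * RISK_PER_SV
    return {
        "effective_dose_mSv": dose_msv,
        "lifetime_cancer_risk_increase": f"{risk_fraction * 100:.4f}%",
    }


estimate = estimate_risk(20, "gamma")
print(estimate)
```

For the 20 mGy gamma example this gives 0.020 Sv x 0.055 per Sv = 0.0011, or 0.11%.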
Input:
# Initialize the agent
app = HealthIntelligenceApp(llm_provider="openai")
# Microbiome data (simplified - normally would be FASTQ files)
microbiome_data = {
    "sequences": ["ATCGATCG" * 1000],  # Placeholder
    "read_count": 10000000,
    "sample_type": "fecal"
}
# Run analysis
report = app.analyze(
    query="Given shotgun whole genome sequence data from a fecal sample, what is my health status, and if there are signs of diseases what are mitigation strategies I should attempt?",
    data_files=microbiome_data
)
print(report)

Output:
======================================================================
HEALTH INTELLIGENCE ASSESSMENT REPORT
======================================================================
QUERY: Given shotgun whole genome sequence data from a fecal sample,
what is my health status...
DATA TYPE: microbiome
TIMESTAMP: 2025-10-07 15:02:18
======================================================================
MODEL ANALYSIS RESULTS
======================================================================
{
"taxonomy": {
"phylum": {
"Firmicutes": 0.45,
"Bacteroidetes": 0.35,
"Proteobacteria": 0.15
},
"genus": {
"Bacteroides": 0.25,
"Faecalibacterium": 0.18,
"Escherichia": 0.08
}
},
"diversity": {
"shannon_index": 3.2,
"simpson_index": 0.85
},
"dysbiosis_score": 0.35,
"functional_pathways": {
"SCFA_production": "reduced",
"inflammatory_markers": "elevated"
},
"model_used": "gNOMO pipeline",
"confidence": 0.75
}
======================================================================
RISK ASSESSMENT
======================================================================
{
"risks": [
{
"risk": "Mild-moderate gut dysbiosis",
"severity": "moderate",
"confidence": 0.75,
"evidence": [
"Reduced Faecalibacterium (beneficial SCFA producer)",
"Elevated Escherichia (potential inflammatory marker)",
"Below-optimal diversity (Shannon index 3.2, healthy >3.5)",
"Dysbiosis score 0.35 (0-1 scale)"
]
},
{
"risk": "Reduced short-chain fatty acid production",
"severity": "low-moderate",
"confidence": 0.70,
"evidence": [
"Low Faecalibacterium abundance",
"SCFA-producing pathways reduced",
"May affect gut barrier function"
]
}
],
"overall_confidence": 0.72
}
======================================================================
RECOMMENDATIONS
======================================================================
IMMEDIATE ACTIONS (0-1 week):
1. Increase dietary fiber intake
- Target 25-35g fiber per day
- Evidence: Feeds beneficial bacteria
2. Consider probiotic foods
- Yogurt, kefir, sauerkraut, kimchi
- Evidence: May improve gut diversity
SHORT-TERM ACTIONS (1-3 months):
1. Prebiotic supplementation
- Inulin, FOS (fructooligosaccharides)
- Dosage: 5-10g daily
- Evidence: Increases Bifidobacterium and Lactobacillus
2. Specific probiotic strains targeting dysbiosis:
- Lactobacillus rhamnosus GG
- Bifidobacterium longum
- Faecalibacterium prausnitzii (if available)
- Evidence Level: B (Multiple clinical trials)
3. Dietary modifications:
- Reduce processed foods and refined sugars
- Increase whole grains, legumes, vegetables
- Mediterranean diet pattern
- Evidence: Improves microbiome diversity
4. Consider reducing if applicable:
- Unnecessary antibiotic use
- Artificial sweeteners
- Emulsifiers in processed foods
LONG-TERM MONITORING:
1. Repeat microbiome analysis in 3-6 months
2. Track symptoms (digestive health, energy, immunity)
3. Continue healthy dietary patterns
POSITIVE INDICATORS:
- Overall diversity still reasonable (3.2)
- Good Bacteroides levels (fiber metabolism)
- No pathogenic species detected
======================================================================
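The microbiome report lists a Shannon index and a Simpson index. As a reference for how such indices are usually computed from relative abundances, here is a small sketch using the standard definitions (natural-log Shannon, Gini-Simpson form). The taxon names and counts are made up, and the sketch does not attempt to reproduce the 3.2 and 0.85 shown in the report, which would need the full taxonomic profile.

```python
import math


def normalize(counts: dict) -> list:
    """Convert raw counts per taxon into relative abundances."""
    total = sum(counts.values())
    return [count / total for count in counts.values()]


def shannon_index(proportions: list) -> float:
    """Shannon diversity H = -sum(p * ln p), skipping zero entries."""
    return -sum(p * math.log(p) for p in proportions if p > 0)


def simpson_index(proportions: list) -> float:
    """Gini-Simpson diversity 1 - sum(p^2)."""
    return 1.0 - sum(p * p for p in proportions)


# Four equally abundant, made-up taxa: the most even community of size four.
counts = {"taxon_a": 25, "taxon_b": 25, "taxon_c": 25, "taxon_d": 25}
props = normalize(counts)
print(round(shannon_index(props), 3))  # 1.386 (ln 4)
print(round(simpson_index(props), 3))  # 0.75
```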
# Option 1: OpenAI (default)
app = HealthIntelligenceApp(llm_provider="openai")
# Option 2: Anthropic Claude
app = HealthIntelligenceApp(llm_provider="anthropic")

from SpaceHealthAgent import AdvancedHealthAgent
# Enable advanced features
app = AdvancedHealthAgent(
    llm_provider="openai",
    enable_caching=True
)
# Async processing: a plain script cannot use a bare `await`, so run the
# coroutine in an event loop (in a notebook, `await app.analyze_async(...)` works)
import asyncio
report = asyncio.run(app.analyze_async(query, data_files))
# Batch processing
queries = [
    {"query": "...", "data_files": {...}},
    {"query": "...", "data_files": {...}}
]
reports = app.analyze_batch(queries)
# With patient context
report = app.analyze_with_context(
    query="...",
    data_files={...},
    patient_history={"conditions": ["diabetes"], "age": 45}
)

from SpaceHealthAgent import TestHealthAgent
# Run test suite
tester = TestHealthAgent()
tester.run_all_tests()

Expected output:
======================================================================
RUNNING TEST SUITE
======================================================================
Testing gene expression pipeline...
✓ Gene expression pipeline test passed
Testing radiation pipeline...
✓ Radiation pipeline test passed
Testing microbiome pipeline...
✓ Microbiome pipeline test passed
✓ All tests passed!
# In SpaceHealthAgent.py, use the FastAPI configuration at the bottom
# Run the API server
uvicorn SpaceHealthAgent:api_app --host 0.0.0.0 --port 8000

# POST request to analyze health data
curl -X POST "http://localhost:8000/analyze" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Analyze gene expression for health risks",
    "data": {"gene_1": [10.5, 12.3], "gene_2": [5.2, 4.9]}
  }'

Dockerfile:
FROM python:3.10-slim
WORKDIR /app
# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy agent
COPY SpaceHealthAgent.py .
# Set environment variables
ENV OPENAI_API_KEY=""
ENV ANTHROPIC_API_KEY=""
# Run the agent
CMD ["python", "SpaceHealthAgent.py"]

docker-compose.yml:
version: '3.8'
services:
  spacehealthagent:
    build: .
    ports:
      - "8000:8000"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
    volumes:
      - ./data:/app/data
      - ./output:/app/output

Build and run:
docker-compose up --build

| Data Type | Foundation Models | Status | Example Input |
|---|---|---|---|
| Gene Expression | scGPT, Geneformer | Supported | CSV/AnnData matrix |
| Microbiome | gNOMO, MintTea pipelines | Supported | FASTQ files, taxonomy tables |
| Radiation | Physics-based (ICRP/BEIR) | Supported | Dose, type, duration |
| Clinical Records | Clinical-T5, CEHR-BERT | | EHR data, clinical notes |
| Medical Imaging | BiomedCLIP, UNI | Planned | DICOM, PNG, JPG |
| Wearables | Google Wearable FM | Planned | Time-series sensor data |

The provided SpaceHealthAgent.py uses simulated foundation model outputs for demonstration purposes. To use actual foundation models:
- Install model packages:
pip install scgpt geneformer
- Download model weights:
# Example for scGPT
from scgpt import load_model
model = load_model("scgpt_whole_human")
- Replace the placeholder code in OmicsModelAgent.analyze_gene_expression() with actual model inference calls
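Before switching from simulated outputs to real inference, it can help to confirm that the optional model packages are importable. The sketch below uses the standard library's `importlib.util.find_spec` for that check. The helper name `missing_packages` is hypothetical, and the import names are assumed to match the pip package names given in the install step.

```python
from importlib.util import find_spec


def missing_packages(names: list) -> list:
    """Return the top-level package names that cannot be imported."""
    return [name for name in names if find_spec(name) is None]


# Import names assumed to match the pip package names from the install step.
missing = missing_packages(["scgpt", "geneformer"])
if missing:
    print("Falling back to simulated outputs; missing:", ", ".join(missing))
else:
    print("Foundation model packages found.")
```

The same check gives a clearer message than a bare `No module named ...` traceback when a dependency is absent.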
THIS IS NOT MEDICAL ADVICE. SpaceHealthAgent is a research tool for informational purposes only. Always consult qualified healthcare professionals for medical decisions.
# Format 1: Dictionary
data = {
    "gene_name_1": [expr_sample1, expr_sample2, ...],
    "gene_name_2": [expr_sample1, expr_sample2, ...],
}
# Format 2: File path to CSV
data = "path/to/expression_matrix.csv"
# Format 3: AnnData object (recommended)
import anndata
adata = anndata.read_h5ad("data.h5ad")
data = adata

# Format 1: FASTQ file paths
data = {
    "sequences": "path/to/sample.fastq",
    "sample_type": "fecal"
}
# Format 2: Pre-computed taxonomy
data = {
    "taxonomy": {
        "Firmicutes": 0.45,
        "Bacteroidetes": 0.35
    },
    "read_count": 10000000
}

data = {
    "dose_mGy": 20.0,  # Absorbed dose in milliGray
    "type": "gamma",  # Type: gamma, beta, alpha, neutron
    "duration_hours": 2.0  # Exposure duration
}

1. API Key Not Found
Error: OpenAI API key not found
Solution: export OPENAI_API_KEY="your-key-here"
2. Import Errors
Error: No module named 'langgraph'
Solution: pip install langgraph langchain langchain-openai
3. Memory Issues
Error: Out of memory
Solution: Process smaller datasets or use a GPU for large foundation models
4. Model Loading Fails
Error: Model weights not found
Solution: Ensure you've downloaded the foundation model weights

- Foundation Models Documentation:
- Knowledge Graphs:
- Related Papers:
  - scGPT: Nature Methods 2024
  - Nucleotide Transformer: Nature Methods 2025
  - AlphaFold2: Nature 2021

To extend SpaceHealthAgent with new models or data types:
- Add a new agent class in the appropriate section
- Add a node method in HealthIntelligenceOrchestrator
- Update build_workflow() to include the new node
- Update route_query() for new data type detection
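For the last step, one simple way to detect a data type is to look at the keys of the input dictionary. The sketch below uses the keys that appear in this README's examples. The function name `detect_data_type` is hypothetical, and the real `route_query()` may classify inputs differently, for example by asking the LLM to read the query.

```python
def detect_data_type(data_files: dict) -> str:
    """Guess the data type from the keys of the input dictionary."""
    if "gene_expression_matrix" in data_files:
        return "gene_expression"
    if "sequences" in data_files or "taxonomy" in data_files:
        return "microbiome"
    if "dose_mGy" in data_files:
        return "radiation"
    # A new data type would add its own key check above this line.
    return "unknown"


print(detect_data_type({"dose_mGy": 20, "type": "gamma"}))  # radiation
```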
Example:
class NewModelAgent:
    def analyze_new_datatype(self, data):
        # Your implementation goes here; fill in the results dictionary
        results = {}
        return results

This software is provided for research and educational purposes. Commercial use requires separate licensing for foundation models and knowledge graph databases.
For questions or issues, please refer to the inline documentation in SpaceHealthAgent.py or create an issue in your repository.
Version: 1.0.0
Last Updated: October 2025
Maintainer: Your Name/Organization