🧬 Clinical Trials Knowledge Graph

This project loads clinical trial data from ClinicalTrials.gov into a Neo4j graph database and models the relationships between trials, conditions, interventions, sponsors, and collaborators, so the data can be explored as a graph.

🚀 Quick Start with Docker

# Clone the repository
git clone https://github.com/Gabrielm3/clinical-trials-knowledge-graph
cd clinical-trials-knowledge-graph

# Start Neo4j database
docker-compose up -d neo4j

# Follow the logs until Neo4j reports it has started (about 30 seconds), then press Ctrl+C
docker-compose logs -f neo4j

# Install Python dependencies (local development)
pip install -r requirements.txt

# Load data into Neo4j
python scripts/load_to_neo4j.py

# Access Neo4j Browser at http://localhost:7474
# Login: neo4j / clinicaltrials123
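
If you would rather script the wait than watch the logs, a small polling helper is one option. The sketch below is not part of this repository: `wait_for_port` is a hypothetical helper, and it assumes the Bolt port is published on Neo4j's default, 7687 (check `docker-compose.yml`). An open port only shows that the server is listening, so treat it as a rough readiness check.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False


# Example usage (assumes Bolt is exposed on localhost:7687):
#   if wait_for_port("localhost", 7687):
#       print("Neo4j is accepting connections")
```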

📋 Prerequisites

  • Docker & Docker Compose
  • Python 3.11+ (for local development)
  • Git

📊 Graph Schema

Each Trial node is linked to its related entities through four relationship types:

graph TD
    T[Trial] --> |HAS_CONDITION| C[Condition]
    T --> |HAS_INTERVENTION| I[Intervention]
    T --> |SPONSORED_BY| S[Sponsor]
    T --> |COLLABORATED_BY| CB[Collaborator]

Cypher Schema:

(Trial)-[:HAS_CONDITION]->(Condition)
(Trial)-[:HAS_INTERVENTION]->(Intervention)
(Trial)-[:SPONSORED_BY]->(Sponsor)
(Trial)-[:COLLABORATED_BY]->(Collaborator)
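
As an illustration of how this schema could be written to the database, the sketch below builds one parameterized `MERGE` statement per relationship type. It is not taken from `scripts/load_to_neo4j.py`: `SCHEMA` and `merge_statement` are hypothetical names, and the `nct_id` property on `Trial` is an assumption (the `name` property matches the sample queries in this README).

```python
# Relationship type -> target node label, taken from the schema above.
SCHEMA = {
    "HAS_CONDITION": "Condition",
    "HAS_INTERVENTION": "Intervention",
    "SPONSORED_BY": "Sponsor",
    "COLLABORATED_BY": "Collaborator",
}


def merge_statement(rel_type: str) -> str:
    """Build a parameterized Cypher MERGE for one relationship type."""
    label = SCHEMA[rel_type]  # raises KeyError for types outside the schema
    return (
        # nct_id is an assumed property name for the trial identifier.
        "MERGE (t:Trial {nct_id: $nct_id}) "
        f"MERGE (n:{label} {{name: $name}}) "
        f"MERGE (t)-[:{rel_type}]->(n)"
    )
```

Using `MERGE` rather than `CREATE` means a condition or sponsor shared by many trials is stored as a single node. Labels and relationship types cannot be passed as Cypher parameters, which is why they are interpolated from a fixed dictionary while the values stay as `$nct_id` and `$name`.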

🐳 Docker Commands

# Start only Neo4j
docker-compose up -d neo4j

# Start everything (Neo4j + App)
docker-compose --profile app up -d

# View logs
docker-compose logs -f neo4j

# Stop services
docker-compose down

# Clean up (removes volumes)
docker-compose down -v

🧪 Example Mapping

A sample CSV row like:

| Field         | Value             |
|---------------|-------------------|
| NCT Number    | NCT001            |
| Conditions    | Diabetes; Obesity |
| Interventions | Metformin         |
| Sponsor       | NIH               |
| Collaborators | Harvard; UCSF     |

Creates this graph structure:

graph LR
    T["Trial<br/>NCT001"] --> |HAS_CONDITION| D[Diabetes]
    T --> |HAS_CONDITION| O[Obesity]
    T --> |HAS_INTERVENTION| M[Metformin]
    T --> |SPONSORED_BY| N[NIH]
    T --> |COLLABORATED_BY| H[Harvard]
    T --> |COLLABORATED_BY| U[UCSF]
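
The mapping above can be sketched in plain Python. This is a minimal illustration, not the code in `scripts/load_to_neo4j.py`: `split_values`, `COLUMN_TO_REL`, and `row_to_edges` are hypothetical names, and it assumes the CSV columns are named as in the sample row and that multi-valued cells are separated by semicolons.

```python
# CSV column -> relationship type (column names assumed from the sample row).
COLUMN_TO_REL = {
    "Conditions": "HAS_CONDITION",
    "Interventions": "HAS_INTERVENTION",
    "Sponsor": "SPONSORED_BY",
    "Collaborators": "COLLABORATED_BY",
}


def split_values(cell: str) -> list[str]:
    """Split a semicolon-separated cell into trimmed, non-empty values."""
    return [value.strip() for value in cell.split(";") if value.strip()]


def row_to_edges(row: dict[str, str]) -> list[tuple[str, str, str]]:
    """Turn one CSV row into (trial, relationship, target) triples."""
    trial = row["NCT Number"]
    edges = []
    for column, rel_type in COLUMN_TO_REL.items():
        # Missing or empty cells simply produce no edges.
        for value in split_values(row.get(column) or ""):
            edges.append((trial, rel_type, value))
    return edges
```

For the sample row this yields six triples, one per arrow in the diagram.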

🔍 Visualize in Neo4j Browser

  1. Open Neo4j Browser at http://localhost:7474
  2. Log in with the credentials neo4j / clinicaltrials123
  3. Run this Cypher query to view a subgraph:
MATCH (t:Trial)-[r]->(n)
RETURN t, r, n
LIMIT 50

📈 Sample Queries

// Find trials for a specific condition
MATCH (t:Trial)-[:HAS_CONDITION]->(c:Condition {name: "Coronavirus Infections"})
RETURN t.title, t.status
LIMIT 10

// Most common sponsors
MATCH (s:Sponsor)<-[:SPONSORED_BY]-(t:Trial)
RETURN s.name, count(t) as trial_count
ORDER BY trial_count DESC
LIMIT 10

// Trials with multiple conditions
MATCH (t:Trial)-[:HAS_CONDITION]->(c:Condition)
WITH t, count(c) as condition_count
WHERE condition_count > 1
RETURN t.title, condition_count
ORDER BY condition_count DESC
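
The same queries can be run from Python. The sketch below parameterizes the first query and runs it with the official `neo4j` driver; it assumes that package is installed (it may or may not be what `requirements.txt` pins), and `trials_for_condition` and `run_query` are hypothetical helpers rather than functions from this repository.

```python
def trials_for_condition(condition: str, limit: int = 10) -> tuple[str, dict]:
    """Return the Cypher text and parameters for the condition lookup."""
    query = (
        "MATCH (t:Trial)-[:HAS_CONDITION]->(c:Condition {name: $condition}) "
        "RETURN t.title AS title, t.status AS status "
        "LIMIT $limit"
    )
    return query, {"condition": condition, "limit": limit}


def run_query(uri: str, user: str, password: str, query: str, params: dict) -> list[dict]:
    """Run a query and return each record as a plain dict."""
    # Imported here so the query builder above works without the driver installed.
    from neo4j import GraphDatabase

    with GraphDatabase.driver(uri, auth=(user, password)) as driver:
        with driver.session() as session:
            return [record.data() for record in session.run(query, params)]


# Example usage (assumes Bolt is exposed on the default port 7687):
#   query, params = trials_for_condition("Coronavirus Infections")
#   rows = run_query("bolt://localhost:7687", "neo4j", "clinicaltrials123", query, params)
```

Passing the condition name as `$condition` instead of formatting it into the string avoids quoting problems and lets Neo4j reuse the query plan.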

🛠️ Development

# Local development setup
python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows
pip install -r requirements.txt

# Copy environment file
cp .env.example .env
# Edit .env with your credentials

# Run locally
python scripts/load_to_neo4j.py

📁 Project Structure

├── data/
│   └── ctg-studies.csv          # Clinical trials dataset
├── scripts/
│   └── load_to_neo4j.py         # Data loading script
├── docker-compose.yml           # Docker services configuration
├── Dockerfile                   # Application container
├── requirements.txt             # Python dependencies
├── config.py                    # Configuration management
└── .env                         # Environment variables (copied from .env.example)

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Submit a pull request

📜 License

MIT License
