Complete documentation for using SentinelOS to analyze your organization's codebase and customer data.
- Getting Started
- Core Concepts
- API Reference
- Common Workflows
- Understanding Analysis Results
- Configuration
- Troubleshooting
- Python 3.9 or higher
- 4GB RAM minimum (8GB recommended)
- Network access to GitHub (for repository analysis)
Option 1: Docker (Recommended)
git clone https://github.com/yourorg/sentinelos.git
cd sentinelos
cp .env.example .env
docker-compose up -d
Option 2: Local Installation
git clone https://github.com/yourorg/sentinelos.git
cd sentinelos
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
# Local
python -m uvicorn backend.api.main:app --host 0.0.0.0 --port 8000
# Docker
docker-compose up -d
curl http://localhost:8000/health
Expected response:
{"status": "healthy", "version": "0.1.0"}
SentinelOS ingests data from multiple sources:
| Source | Description | Use Case |
|---|---|---|
| GitHub Repository | Code, commits, contributors | Bug risk analysis |
| CSV Feedback | Customer feedback data | Churn prediction |
| System Logs | Application logs | Anomaly detection |
| Analysis | Description | Output |
|---|---|---|
| Bug Risk | Identifies code likely to contain bugs | Risk score 0-100% per file |
| Churn Prediction | Predicts customer churn probability | Churn score by segment |
| Sentiment Analysis | Analyzes customer feedback tone | Positive/Neutral/Negative |
| Trend Analysis | Detects patterns over time | Time-series data |
- Risk Score: Probability (0-1) that a code module contains bugs
- Churn Risk: Probability (0-1) that a customer segment will churn
- Confidence: Model's certainty in its prediction (0-1)
- Health Score: Overall system health (0-100)
GET /health
Returns service status.
POST /api/v1/ingest/github
Content-Type: application/json
{
  "repo_url": "https://github.com/organization/repository",
  "branch": "main"
}
Parameters:
| Field | Type | Required | Description |
|---|---|---|---|
| repo_url | string | Yes | Full GitHub repository URL |
| branch | string | No | Branch to analyze (default: main) |
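As a minimal sketch, the request described above could be assembled in Python with the standard library. The helper name `build_github_ingest_request` and the `base_url` default are assumptions for illustration, not part of SentinelOS. The function only constructs the request; sending it requires a running instance.

```python
import json
import urllib.request


def build_github_ingest_request(repo_url, branch="main", base_url="http://localhost:8000"):
    """Build (but do not send) a POST request for /api/v1/ingest/github.

    Hypothetical helper: the name and the base_url default are illustrative.
    """
    body = json.dumps({"repo_url": repo_url, "branch": branch}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/ingest/github",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_github_ingest_request("https://github.com/organization/repository")
print(req.get_method(), req.full_url)

# To send it against a running instance:
#   with urllib.request.urlopen(req) as resp:
#       job = json.load(resp)
```

Keeping construction separate from sending makes the payload easy to inspect before any network call is made.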
Response:
{
  "job_id": "job_20260113120000",
  "status": "processing",
  "estimated_time_seconds": 60
}
POST /api/v1/ingest/csv
Content-Type: multipart/form-data
file: @customer_feedback.csv
CSV Format:
| Column | Required | Description |
|---|---|---|
| text | Yes | Feedback text content |
| date | No | Feedback timestamp |
| rating | No | Customer rating (1-5) |
| customer_id | No | Customer identifier |
| region | No | Geographic region |
| category | No | Feedback category |
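Before uploading, a file can be checked locally against the column table above. The following is a sketch of such a pre-check, not SentinelOS's own validation: the function name `check_feedback_csv` is hypothetical, and whether the server rejects unrecognized columns or out-of-range ratings is an assumption.

```python
import csv
import io

REQUIRED_COLUMNS = {"text"}
OPTIONAL_COLUMNS = {"date", "rating", "customer_id", "region", "category"}


def check_feedback_csv(csv_text):
    """Return a list of problems found in a feedback CSV (empty if none).

    Client-side sketch only; the server's actual rules may differ.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    header = set(reader.fieldnames or [])
    problems = []

    missing = REQUIRED_COLUMNS - header
    if missing:
        problems.append(f"missing required column(s): {sorted(missing)}")

    unknown = header - REQUIRED_COLUMNS - OPTIONAL_COLUMNS
    if unknown:
        problems.append(f"unrecognized column(s): {sorted(unknown)}")

    # Data rows start on line 2, after the header line.
    for line_no, row in enumerate(reader, start=2):
        rating = row.get("rating")
        if rating and not (rating.isdigit() and 1 <= int(rating) <= 5):
            problems.append(f"line {line_no}: rating {rating!r} is outside 1-5")
        if "text" in header and not (row.get("text") or "").strip():
            problems.append(f"line {line_no}: empty text")
    return problems


sample = 'text,date,rating,customer_id,region\n"Great product",2026-01-10,5,C001,US\n'
print(check_feedback_csv(sample))  # prints []
```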
Example CSV:
text,date,rating,customer_id,region
"Great product",2026-01-10,5,C001,US
"Login keeps failing",2026-01-11,2,C002,Germany
GET /api/v1/dashboard
Returns comprehensive metrics including:
- Health scores (overall, engineering, customer, operations)
- Key metrics with trends
- Top risks
- Top churn drivers
- AI system health
POST /api/v1/ask
Content-Type: application/json
{
  "question": "Which code modules have the highest bug risk?",
  "include_sources": true
}
Example Questions:
- "Which modules should we prioritize for refactoring?"
- "What are the top customer complaints this week?"
- "Why is churn increasing in Germany?"
- "What patterns predict customer churn?"
Response includes:
- Answer text
- Confidence score
- Source citations
- Suggested follow-up questions
POST /api/v1/reports/generate
Content-Type: application/json
{
  "report_type": "cto_weekly",
  "format": "markdown"
}
Report Types:
| Type | Description |
|---|---|
| cto_weekly | Executive summary for leadership |
| risk_assessment | Detailed risk analysis |
| action_plan | Prioritized action items |
| churn_analysis | Customer churn deep-dive |
| engineering_health | Engineering metrics report |
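A client can catch a mistyped report type before sending the request. The sketch below builds the JSON body for `/api/v1/reports/generate` from the types in the table above. The function name `build_report_payload` is hypothetical, and leaving `format` out when it is not given is an assumption: the documented examples show requests both with and without it.

```python
import json

# Report types taken from the table above.
REPORT_TYPES = {
    "cto_weekly",
    "risk_assessment",
    "action_plan",
    "churn_analysis",
    "engineering_health",
}


def build_report_payload(report_type, fmt=None):
    """Return the JSON body for a report request as a string.

    Hypothetical helper; raises ValueError for an unknown report type.
    """
    if report_type not in REPORT_TYPES:
        raise ValueError(
            f"unknown report_type {report_type!r}; expected one of {sorted(REPORT_TYPES)}"
        )
    payload = {"report_type": report_type}
    if fmt is not None:
        payload["format"] = fmt
    return json.dumps(payload)


print(build_report_payload("cto_weekly", fmt="markdown"))
```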
# 1. Ingest the repository
curl -X POST http://localhost:8000/api/v1/ingest/github \
-H "Content-Type: application/json" \
-d '{"repo_url": "https://github.com/your-org/your-repo"}'
# 2. Wait for processing (check status)
curl http://localhost:8000/api/v1/jobs/job_xxx
# 3. View dashboard
curl http://localhost:8000/api/v1/dashboard
# 4. Ask questions
curl -X POST http://localhost:8000/api/v1/ask \
-H "Content-Type: application/json" \
-d '{"question": "Which files have the highest bug risk?"}'
# 5. Generate report
curl -X POST http://localhost:8000/api/v1/reports/generate \
-H "Content-Type: application/json" \
-d '{"report_type": "engineering_health"}'
# 1. Upload feedback CSV
curl -X POST http://localhost:8000/api/v1/ingest/csv \
-F "file=@feedback.csv"
# 2. View dashboard for sentiment
curl http://localhost:8000/api/v1/dashboard
# 3. Ask about churn
curl -X POST http://localhost:8000/api/v1/ask \
-H "Content-Type: application/json" \
-d '{"question": "Which regions have the highest churn risk?"}'
# 4. Generate churn report
curl -X POST http://localhost:8000/api/v1/reports/generate \
-H "Content-Type: application/json" \
-d '{"report_type": "churn_analysis"}'
# Generate weekly CTO report
curl -X POST http://localhost:8000/api/v1/reports/generate \
-H "Content-Type: application/json" \
-d '{"report_type": "cto_weekly", "format": "markdown"}'
| Score | Level | Interpretation |
|---|---|---|
| 0.0-0.3 | Low | Normal maintenance |
| 0.3-0.5 | Medium | Monitor for changes |
| 0.5-0.7 | High | Consider refactoring |
| 0.7-1.0 | Critical | Immediate attention needed |
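The bands in the table share their edge values (0.3 appears in both Low and Medium, for example), so any code that maps a score to a level has to pick a side. The sketch below assumes each boundary belongs to the higher band; that choice and the function name `risk_level` are illustrative, not confirmed SentinelOS behavior.

```python
def risk_level(score):
    """Map a 0-1 risk score to the level names used in the table.

    Assumption: a score exactly on a boundary falls into the higher band.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0 and 1, got {score}")
    if score < 0.3:
        return "Low"
    if score < 0.5:
        return "Medium"
    if score < 0.7:
        return "High"
    return "Critical"


for s in (0.12, 0.3, 0.65, 0.9):
    print(s, risk_level(s))
```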
Factors that increase risk:
- High cyclomatic complexity
- Frequent changes (high churn)
- Low test coverage
- Many bug-fix commits
- Single author dependency
| Score | Level | Interpretation |
|---|---|---|
| 0.0-0.3 | Low | Stable segment |
| 0.3-0.5 | Medium | Monitor sentiment |
| 0.5-0.7 | High | Proactive outreach needed |
| 0.7-1.0 | Critical | Immediate intervention required |
Factors that increase churn risk:
- Negative sentiment in feedback
- Low ratings (1-2 stars)
- Churn keywords ("cancel", "refund", "leaving")
- Declining engagement
| Score | Interpretation |
|---|---|
| 0.9+ | High confidence, based on substantial data |
| 0.7-0.9 | Good confidence, typical result |
| 0.5-0.7 | Moderate confidence, limited data |
| <0.5 | Low confidence, results need verification |
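One way to act on this table is to set aside any result below 0.5 for manual review. The sketch below does that for a list of predictions. The dictionary shape and the name `split_by_confidence` are hypothetical; the actual response schema is not documented here.

```python
def split_by_confidence(predictions, threshold=0.5):
    """Separate predictions into (usable, needs_verification) by confidence."""
    usable = [p for p in predictions if p["confidence"] >= threshold]
    needs_verification = [p for p in predictions if p["confidence"] < threshold]
    return usable, needs_verification


# Hypothetical results; the field names are illustrative, not the API's schema.
predictions = [
    {"item": "module_a", "confidence": 0.92},
    {"item": "module_b", "confidence": 0.41},
]
usable, flagged = split_by_confidence(predictions)
print([p["item"] for p in flagged])  # prints ['module_b']
```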
| Variable | Description | Default |
|---|---|---|
| DATABASE_URL | PostgreSQL connection string | - |
| REDIS_URL | Redis connection string | - |
| OPENAI_API_KEY | OpenAI API key for LLM features | - |
| GITHUB_TOKEN | GitHub token for private repos | - |
| APP_ENV | Environment (development/production) | development |
| LOG_LEVEL | Logging level | INFO |
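The variables and defaults listed above could be read from the environment along these lines. This is a sketch, not SentinelOS's actual settings loader: the name `load_settings` is hypothetical, and the connection string in the usage line is a placeholder.

```python
import os

# Variable names and defaults from the table above; None means no default.
VARIABLES = {
    "DATABASE_URL": None,
    "REDIS_URL": None,
    "OPENAI_API_KEY": None,
    "GITHUB_TOKEN": None,
    "APP_ENV": "development",
    "LOG_LEVEL": "INFO",
}


def load_settings(environ=None):
    """Return (settings, unset): resolved values and names that have no value."""
    environ = os.environ if environ is None else environ
    settings = {name: environ.get(name, default) for name, default in VARIABLES.items()}
    unset = [name for name, value in settings.items() if value is None]
    return settings, unset


# Placeholder value, for illustration only.
settings, unset = load_settings({"DATABASE_URL": "postgresql://localhost/sentinelos"})
print(settings["APP_ENV"], unset)
```

Listing the unset names at startup makes it easier to spot a missing key before an LLM or private-repository feature fails.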
For LLM-powered features (Q&A, report generation):
OPENAI_API_KEY=sk-your-key-here
OPENAI_MODEL=gpt-4-turbo-preview
LLM_TEMPERATURE=0.1
LLM_MAX_TOKENS=4096
To analyze private repositories:
GITHUB_TOKEN=ghp_your-token-here
Issue: API fails to start
Solution:
# Check logs
docker-compose logs api
# Verify database is running
docker-compose ps db
Issue: GitHub ingestion returns error
Solutions:
- Verify repository URL is correct
- For private repos, ensure GITHUB_TOKEN is set
- Check repository size (default limit: 500MB)
- Verify network connectivity
Issue: All predictions have low confidence
Solutions:
- Ensure sufficient data is ingested
- For code analysis: ingest more commit history
- For feedback: provide more feedback records
Issue: Q&A or reports return generic responses
Solutions:
- Verify OPENAI_API_KEY is set correctly
- Check API key has remaining quota
- Review logs for API errors
For technical issues, check:
- API documentation: http://localhost:8000/docs
- Health endpoint: http://localhost:8000/health
- Logs: ./logs/sentinelos.log