Arxchibobo/skill-self-evolution

skill-self-evolution

A meta-skill for Claude Code and OpenClaw that automatically monitors, analyzes, and optimizes other skills through continuous data-driven self-improvement.

License: MIT Python OpenClaw


Overview

skill-self-evolution is a background meta-skill that runs alongside Claude Code and OpenClaw to collect execution data, learn from user feedback, discover usage patterns, and automatically tune the weights and configurations of other skills — all without interrupting your workflow.

It closes the loop between skill execution and improvement:

Skill Execution → Quality Assessment → Pattern Discovery → Weight Optimization → Framework Evolution
       ↑                                                                                 |
       └─────────────────────────── Improved Skill Configuration ────────────────────────┘

Data is stored locally only. No external uploads. Sensitive information is anonymized automatically.


Features

Feature | Description
--- | ---
Quality Evaluation | Scores skill outputs across 5 dimensions: efficiency, tool diversity, error rate, completeness, and token efficiency
Feedback Learning | Extracts improvement signals from user edits and explicit feedback
Pattern Discovery | Mines frequent itemsets and sequences using Apriori/FP-Growth algorithms
Weight Optimization | Adjusts element, tool, and skill weights daily using time-decay and smoothing
Framework Evolution | Runs A/B tests with statistical significance validation to evolve skill rules
Knowledge Transfer | Adapts patterns across domains based on similarity scoring

How It Works

Three JavaScript hooks capture data automatically:

  • hooks/record-execution.js — PostToolUse hook: records skill execution metadata and quality scores
  • hooks/detect-modifications.js — OnFileEdit hook: tracks user code modifications as feedback signals
  • hooks/collect-feedback.js — SessionEnd hook: analyzes session satisfaction

Python scripts then process this data through a pipeline:

  1. openclaw_adapter.py — Converts OpenClaw session JSONL data into the internal format
  2. quality_evaluator.py — Computes weighted multi-dimensional quality scores
  3. pattern_discovery.py — Discovers high-confidence patterns (support ≥ 5%, confidence ≥ 60%)
  4. weight_optimizer.py — Applies exponential time decay (60-day half-life) and smoothing
  5. framework_evolver.py — Runs Welch's t-tests and applies improvements above a 10% threshold
  6. knowledge_transfer.py — Finds similar domains and adapts validated patterns
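
The support and confidence thresholds in step 3 can be illustrated with a small pure-Python sketch. It is not the Apriori/FP-Growth code in pattern_discovery.py. It only shows how the two measures are defined, restricted to item pairs. The function name mine_pair_rules and the sample tool names are hypothetical.

```python
from collections import Counter
from itertools import combinations

MIN_SUPPORT = 0.05     # README threshold: support >= 5%
MIN_CONFIDENCE = 0.60  # README threshold: confidence >= 60%


def mine_pair_rules(transactions, min_support=MIN_SUPPORT,
                    min_confidence=MIN_CONFIDENCE):
    """Return sorted (lhs, rhs, support, confidence) rules for item pairs."""
    n = len(transactions)
    if n == 0:
        return []
    item_counts = Counter()
    pair_counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # ignore duplicates within a session
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of sessions containing both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            # confidence of lhs -> rhs: P(rhs present | lhs present)
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Hypothetical sessions, each listing the tools that were used.
sessions = [
    ["Read", "Edit", "Bash"],
    ["Read", "Edit"],
    ["Read", "Grep"],
    ["Bash"],
]
rules = mine_pair_rules(sessions)
```

With these four sessions, the rule Edit -> Read has support 0.5 and confidence 1.0, while Read -> Bash is dropped because its confidence (1/3) falls below 60%.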

Requirements

  • Python 3.8+
  • Node.js (for hooks)

Python dependencies (requirements.txt):

PyYAML>=6.0
numpy>=1.24.0
pandas>=2.0.0
scipy>=1.10.0
scikit-learn>=1.3.0

Installation

# 1. After cloning or installing the skill into your skills directory, change into it
cd ~/.claude/skills/skill-self-evolution

# 2. Install Python dependencies
pip3 install -r requirements.txt

# 3. Verify setup
python3 scripts/analyze.py

The skill is configured for auto-activation in skill.json (priority 1) — once installed, it runs in the background automatically.


Usage

CLI

# Run the full analysis pipeline (last 30 days)
python cli.py analyze --window 30

# Optimize weights (last 7 days)
python cli.py optimize --window 7

# Generate code templates from high-quality patterns
python cli.py template --min-quality 0.75

# Apply framework evolution (A/B test results)
python cli.py evolve --apply

# Show system status
python cli.py status

# Open the web dashboard
python cli.py dashboard

# Start the background scheduler daemon
python cli.py schedule --daemon

# Clean up data older than 90 days (dry run)
python cli.py cleanup --days 90 --dry-run

OpenClaw Integration

# Convert OpenClaw session data and evaluate quality
python3 scripts/openclaw_adapter.py --days 30 --evaluate --summary

# Run the complete analysis pipeline
python3 scripts/analyze.py --days 30

OpenClaw session data is read from ~/.openclaw/agents/main/sessions/ in JSONL format.
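
As a rough sketch of the first step an adapter has to perform, the snippet below walks that directory and parses each JSONL line. The .jsonl file extension, the function name iter_session_records, and the choice to skip malformed lines are assumptions. The record schema is not documented here, so records are returned as plain dictionaries.

```python
import json
from pathlib import Path

# Directory named in the README; the "*.jsonl" pattern below is an assumption.
SESSIONS_DIR = Path.home() / ".openclaw" / "agents" / "main" / "sessions"


def iter_session_records(sessions_dir=SESSIONS_DIR):
    """Yield (file name, parsed record) for every valid JSON line found."""
    for path in sorted(Path(sessions_dir).glob("*.jsonl")):
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if not line:
                    continue  # blank lines carry no record
                try:
                    yield path.name, json.loads(line)
                except json.JSONDecodeError:
                    continue  # skip truncated or malformed lines
```

Calling list(iter_session_records()) returns an empty list when the directory does not exist, so the sketch is safe to run on a machine without OpenClaw.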

Automated Scheduling

Linux/macOS (cron):

# Daily analysis at 2 AM
0 2 * * * cd ~/.claude/skills/skill-self-evolution && python3 scripts/analyze.py

# Weekly report at 3 AM every Sunday
0 3 * * 0 cd ~/.claude/skills/skill-self-evolution && python3 scripts/weekly_report.py

Windows (Task Scheduler):

schtasks /create /tn "SelfEvolutionDaily" /tr "python scripts\analyze.py" /sc daily /st 02:00

Configuration

Edit config.yaml to control all components. Key options:

Data Collection

data_collection:
  enabled: true
  storage_path: "data/"
  retention_days: 90
  anonymize: true
  sampling_rate: 1.0
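
One plausible reading of sampling_rate and retention_days is sketched below. The helper names should_record and is_expired are hypothetical. The real collection and cleanup code may apply these settings differently.

```python
import random
from datetime import datetime, timedelta, timezone


def should_record(sampling_rate=1.0, rng=random):
    """Keep roughly `sampling_rate` of all executions (1.0 keeps everything)."""
    # random() returns a value in [0, 1), so 1.0 always records and 0.0 never does.
    return rng.random() < sampling_rate


def is_expired(recorded_at, retention_days=90, now=None):
    """True when a record is older than the retention window."""
    now = now or datetime.now(timezone.utc)
    return now - recorded_at > timedelta(days=retention_days)
```

Passing now explicitly keeps the retention check deterministic, which is convenient for a dry-run cleanup or for tests.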

Quality Evaluator

quality_evaluator:
  weights:
    efficiency: 0.25
    tool_diversity: 0.20
    error_rate: 0.25
    completeness: 0.15
    token_efficiency: 0.15
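
The weights above sum to 1.0, so the overall score can be read as a weighted average. The sketch below assumes every dimension has already been normalized to [0, 1] with higher meaning better (for example, the error_rate dimension would be scored as 1 minus the observed error rate). The function name overall_quality is hypothetical, and quality_evaluator.py may normalize differently.

```python
# Weights copied from the config.yaml excerpt above.
WEIGHTS = {
    "efficiency": 0.25,
    "tool_diversity": 0.20,
    "error_rate": 0.25,
    "completeness": 0.15,
    "token_efficiency": 0.15,
}


def overall_quality(scores, weights=WEIGHTS):
    """Weighted average of per-dimension scores, each assumed to lie in [0, 1]."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    total_weight = sum(weights.values())
    # Dividing by the total keeps the result in [0, 1] even if weights are edited.
    return sum(weights[name] * scores[name] for name in weights) / total_weight


# Hypothetical per-dimension scores for one execution.
example = {
    "efficiency": 0.8,
    "tool_diversity": 0.5,
    "error_rate": 1.0,
    "completeness": 0.6,
    "token_efficiency": 0.4,
}
score = overall_quality(example)  # approximately 0.70
```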

Weight Optimizer

weight_optimizer:
  update_frequency: "daily"
  smoothing_factor: 0.3      # 0 = no smoothing, 1 = full smoothing
  time_decay_half_life: 60   # days
  weight_bounds: [0.1, 10.0]
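
A minimal sketch of how these settings could combine: each observation is discounted by an exponential decay with a 60-day half-life, the decayed average is blended with the previous weight, and the result is clamped to weight_bounds. The blending formula (smoothing_factor multiplies the old weight) is an interpretation of the "0 = no smoothing, 1 = full smoothing" comment, and the names decay and updated_weight are hypothetical.

```python
HALF_LIFE_DAYS = 60   # time_decay_half_life
SMOOTHING = 0.3       # smoothing_factor
BOUNDS = (0.1, 10.0)  # weight_bounds


def decay(age_days, half_life=HALF_LIFE_DAYS):
    """Relative importance of an observation that is `age_days` old."""
    return 0.5 ** (age_days / half_life)


def updated_weight(old_weight, observations, smoothing=SMOOTHING, bounds=BOUNDS):
    """Blend the old weight with a decay-weighted average of observed targets.

    `observations` is an iterable of (age_days, target_weight) pairs.
    """
    factors = [(decay(age), target) for age, target in observations]
    total = sum(factor for factor, _ in factors)
    if total == 0:
        return old_weight  # nothing observed, keep the current weight
    target = sum(factor * value for factor, value in factors) / total
    # smoothing = 0 jumps straight to the target; smoothing = 1 never moves.
    blended = smoothing * old_weight + (1 - smoothing) * target
    low, high = bounds
    return min(max(blended, low), high)
```

Under this reading, an observation from 60 days ago counts half as much as one from today, and a single fresh observation of 2.0 moves a weight of 1.0 to about 1.7.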

Framework Evolution (A/B Testing)

framework_evolver:
  ab_testing:
    enabled: true
    traffic_split: 0.20      # 20% to variant
    significance_level: 0.05
  auto_apply_threshold: 0.10  # apply if improvement > 10%
  rollback_threshold: 0.05    # rollback if degradation > 5%
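
The decision logic implied by these thresholds can be sketched with the standard library alone. The code computes the Welch t statistic and the Welch-Satterthwaite degrees of freedom, then applies the thresholds to a p-value supplied by the caller. Treating the thresholds as relative changes in the mean score and requiring significance before a rollback are both assumptions, and the names welch_t and decide are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(control, variant):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    va = variance(control) / len(control)  # squared standard error, control
    vb = variance(variant) / len(variant)  # squared standard error, variant
    t = (mean(variant) - mean(control)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (
        va ** 2 / (len(control) - 1) + vb ** 2 / (len(variant) - 1)
    )
    return t, df


def decide(control, variant, p_value, alpha=0.05,
           apply_at=0.10, rollback_at=0.05):
    """Map an A/B result to 'apply', 'rollback', or 'keep-testing'."""
    change = (mean(variant) - mean(control)) / mean(control)
    if p_value < alpha and change > apply_at:
        return "apply"
    if p_value < alpha and change < -rollback_at:
        return "rollback"
    return "keep-testing"


# Hypothetical quality scores for the two arms of a test.
control = [0.60, 0.62, 0.58, 0.61, 0.59]
variant = [0.70, 0.72, 0.69, 0.71, 0.73]
t_stat, dof = welch_t(control, variant)
```

With SciPy installed, the p-value can come from scipy.stats.ttest_ind(control, variant, equal_var=False).pvalue, which performs Welch's test.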

Monitored Skills

Configure which skills are tracked in skill.json:

{
  "monitored_skills": ["ui-ux-pro-max", "browser-use", "frontend-design", "code-review", "commit-commands"]
}

Project Structure

skill-self-evolution/
├── cli.py                    # Unified CLI entry point
├── config.yaml               # Main configuration
├── skill.json                # Skill definition and auto-activation
├── requirements.txt          # Python dependencies
├── dashboard.html            # Web dashboard
├── scripts/
│   ├── analyze.py            # Main pipeline orchestrator
│   ├── openclaw_adapter.py   # OpenClaw session data converter
│   ├── quality_evaluator.py  # Multi-dimensional quality scoring
│   ├── pattern_discovery.py  # Apriori/FP-Growth pattern mining
│   ├── weight_optimizer.py   # Time-decay weight optimization
│   ├── framework_evolver.py  # A/B testing and rule evolution
│   ├── knowledge_transfer.py # Cross-domain pattern adaptation
│   ├── scheduler.py          # Background task scheduler
│   ├── template_generator.py # Code template generation
│   ├── weekly_report.py      # Report generation
│   ├── cleanup.py            # Data retention and archival
│   └── ab_testing.py         # A/B testing utilities
├── hooks/
│   ├── record-execution.js   # PostToolUse hook
│   ├── detect-modifications.js # OnFileEdit hook
│   └── collect-feedback.js   # SessionEnd hook
├── tests/
│   ├── test_weight_optimizer.py
│   └── test_ab_testing.py
├── data/                     # Local data storage (gitignored)
│   ├── executions/
│   ├── feedback/
│   ├── modifications/
│   ├── patterns/
│   └── weights/
└── reports/                  # Generated analysis reports

Privacy

  • All data is stored locally — nothing is sent to external services
  • Sensitive content (credentials, API keys, personal data) is excluded from collection
  • Anonymization is enabled by default
  • Users retain full access to and control over all collected data

License

MIT License — Copyright (c) 2026 Bobo (Arxchibobo). See LICENSE for details.
