Skip to content
This repository was archived by the owner on Jun 26, 2026. It is now read-only.

TWN-Systems/prompt-polyglot

Repository files navigation

Prompt Compress

Optimize prompts with multilingual token compression and Bayesian confidence scoring

Version: v0.4 (Consolidated - Database-Backed Patterns) | Status: Production-Ready (62/62 tests passing)

NEW in v0.4: Database-backed pattern optimization with HITL feedback integration! See CONSOLIDATED-ARCHITECTURE.md for details.

Overview

prompt-compress is a Rust-based tool that aggressively optimizes verbose prompts by:

  • Aggressive phrase-level compression (15 new v0.3 patterns)
  • Removing boilerplate and filler words (19+ patterns)
  • Eliminating 31+ common filler words
  • Consolidating redundant synonyms and phrases
  • Compressing verbose instructions (6 patterns)
  • Evidence-based Mandarin substitution (only 7 proven token-equal replacements)
  • Structural optimizations (units, formatting, JSON keys)
  • Protected regions (never corrupts code, templates, URLs)
  • Maintaining semantic meaning with Bayesian confidence scoring
  • Proper capitalization and no orphaned phrases (v0.2+)

Key Features:

  • 70-85% token savings on boilerplate-heavy prompts (aggressive mode)
  • 40-60% savings on typical prompts
  • Zero semantic loss - preserves all key information
  • Bayesian confidence scoring (87-97% per pattern)
  • Multi-tokenizer support (GPT-4, Claude, Llama3)
  • REST API with webhook support for automated parsing
  • CLI for batch processing and analysis
  • Protected regions prevent code/instruction corruption

Real-World Example (v0.3 Aggressive):

Original (127 words, ~98 tokens):
"I would really appreciate it if you could please take the time to carefully
analyze this code snippet that I'm working on. I want you to provide a very
detailed and thorough explanation of what the code does, how it works, and why
it was implemented in this particular way. Please make sure to look into any
potential bugs or issues that you might find, and also check for any performance
problems or areas where the code could be improved or optimized. I would also
like you to research and explain whether this code follows best practices and
coding standards. If you find any problems or issues, please provide detailed
suggestions on how to fix them. Thank you so much in advance for your help!"

Optimized (17 words, ~16 tokens) - 83.7% reduction:
"Analyze this code. Explain: functionality, implementation, rationale.
Identify: bugs, performance issues, improvements. Verify best practices.
Suggest fixes."

Installation

Prerequisites

Build from source

git clone https://github.com/your-org/prompt-polyglot.git
cd prompt-polyglot
cargo build --release

Binaries will be available in target/release/:

  • prompt-compress - CLI tool
  • prompt-compress-server - API server

Quick Start

Database-Backed Pattern System (v0.4+)

NEW: Patterns are now stored in SQLite and can be updated via HITL feedback!

Setup (One-Time)

  1. Run pattern migration:

    cargo run --bin migrate_patterns -- atlas.db

    This migrates all 102 patterns from code into the database.

  2. Use database-backed optimizer:

    use prompt_compress::init_database_optimizer;
    
    let mut optimizer = init_database_optimizer("atlas.db")?;

Benefits:

  • ✅ Patterns stored in database, not hardcoded
  • ✅ HITL feedback updates confidence automatically
  • ✅ Pattern usage tracking and statistics
  • ✅ Hot reload patterns without restart
  • ✅ Filter patterns by confidence threshold

See CONSOLIDATED-ARCHITECTURE.md for full details.

CLI Usage

Basic Optimization

# Optimize a prompt
prompt-compress optimize \
  --input prompt.txt \
  --output optimized.txt \
  --output-lang english

# With custom confidence threshold
prompt-compress optimize \
  --input prompt.txt \
  --threshold 0.90 \
  --output-lang mandarin

# Aggressive mode (lower threshold, more compression)
prompt-compress optimize \
  --input prompt.txt \
  --aggressive

Analyze Without Optimizing

prompt-compress analyze \
  --input prompt.txt \
  --report savings_report.json

Batch Processing

prompt-compress batch \
  --input prompts/ \
  --output optimized/ \
  --output-lang english

API Server

Start the Server

prompt-compress-server

The server will start on http://0.0.0.0:8080

API Endpoints

Health Check

curl http://localhost:8080/api/v1/health

Optimize Prompt

curl -X POST http://localhost:8080/api/v1/optimize \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "I would really appreciate it if you could please help me with this task.",
    "output_language": "english",
    "confidence_threshold": 0.85,
    "aggressive_mode": false
  }'

Response:

{
  "result": {
    "original_prompt": "I would really appreciate it if you could please help me with this task.",
    "optimized_prompt": "Help me with this task.\n\n[output_language: english]",
    "original_tokens": 18,
    "optimized_tokens": 12,
    "token_savings": 6,
    "savings_percentage": 33.3,
    "optimizations": [...],
    "requires_review": [],
    "output_language": "english"
  },
  "review_session_id": null
}

Webhook for Automated Parsing

curl -X POST http://localhost:8080/api/v1/webhook/optimize \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Please analyze this code carefully and provide detailed feedback.",
    "output_language": "english",
    "callback_url": "https://your-service.com/webhook/callback"
  }'

Response:

{
  "request_id": "550e8400-e29b-41d4-a716-446655440000",
  "optimized_prompt": "Analyze this code: detailed feedback.\n\n[output_language: english]",
  "original_tokens": 12,
  "optimized_tokens": 9,
  "token_savings": 3,
  "savings_percentage": 25.0,
  "status": "completed"
}

If callback_url is provided, the same response will be POSTed to that URL asynchronously.

Analyze Prompt

curl -X POST http://localhost:8080/api/v1/analyze \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Your prompt here...",
    "output_language": "english"
  }'

Optimization Strategies

1. Boilerplate Removal (High Confidence: 90-98%)

Common patterns removed:

  • "I would really appreciate if you could..."
  • "Please make sure to..."
  • "Thank you in advance for..."

2. Filler Word Removal (80-90%)

Removes:

  • "really", "very", "quite", "just"
  • "actually", "basically", "essentially"

3. Synonym Consolidation (85-95%)

Examples:

  • "analyze and examine" → "analyze"
  • "check and verify" → "verify"
  • "improve and enhance" → "improve"

4. Mandarin Substitution (90-94%) Evidence-Based Only

v0.2+ uses ONLY proven token-equal substitutions (never increases tokens):

  • "verify" → "验证" (1 token → 1 token)
  • "comprehensive" → "全面" (2 tokens → 2 tokens)
  • "optimization" → "优化" (2 tokens → 2 tokens)
  • "step by step" → "逐步" (3 tokens → 3 tokens)
  • "issues" → "问题" (1 token → 1 token)
  • "bugs" → "错误" (1 token → 1 token)
  • "code" → "代码" (1 token → 1 token)

Note: Only 7 substitutions are used (tested with cl100k_base tokenizer). Substitutions that increase token count were removed in v0.2 based on empirical evidence.

5. Instruction Compression (88-95%)

  • "I would like you to provide" → "Provide"
  • "Can you please explain" → "Explain"

Confidence Scoring

Uses Bayesian inference to calculate confidence:

Confidence Action Example
95-100% Auto-apply "I would appreciate if" → DELETE
85-94% Auto-apply + log "look into/research" → "research"
70-84% Require HITL review Context-dependent synonym consolidation
50-69% Suggest, don't apply Ambiguous pattern matches
<50% Ignore Low-confidence matches

Example Transformations

Light Optimization (15% savings)

Before (52 tokens):

I would really appreciate it if you could please analyze this Python
function and explain what it does. I want you to provide a detailed
explanation of the algorithm and also look into potential performance
issues. Thank you!

After (44 tokens, 15.4% savings):

Analyze this Python function: algorithm explanation + performance issues.
要详细。

[output_language: english]

Heavy Optimization (40% savings)

Before (128 tokens):

I would really appreciate it if you could please take the time to
carefully review and analyze this code snippet. I want you to provide
a very thorough and detailed explanation of what it does, how it works,
and why it was implemented this way. Please make sure to look into any
potential bugs, performance issues, or areas for improvement.

After (76 tokens, 40.6% savings):

Analyze code: functionality, implementation rationale. Identify: bugs,
performance issues, improvements. Research best practices compliance.
Provide fix suggestions. 要详细和全面。

[output_language: english]

Webhook Integration

The webhook endpoint allows seamless integration with other systems:

Use Cases

  1. CI/CD Pipeline: Automatically optimize prompts in your test suite
  2. Content Management: Optimize user-submitted prompts before processing
  3. Analytics: Track token savings across your organization

Integration Example

import requests

# Optimize a prompt via webhook
response = requests.post(
    'http://localhost:8080/api/v1/webhook/optimize',
    json={
        'prompt': 'Your verbose prompt here...',
        'output_language': 'english',
        'confidence_threshold': 0.85,
        'callback_url': 'https://your-app.com/webhook/receive'
    }
)

result = response.json()
print(f"Saved {result['token_savings']} tokens ({result['savings_percentage']:.1f}%)")

Architecture

Input Prompt
     ↓
[1. Tokenize & Count]
     ↓
[2. Pattern Detection]
     ↓
[3. Confidence Scoring] ←─ Bayesian Priors
     ↓
[4. Auto-apply High-Confidence]
     ↓
[5. Queue Low-Confidence for HITL]
     ↓
[6. Apply Approved Optimizations]
     ↓
[7. Add Output Language Directive]
     ↓
Output Optimized Prompt

Configuration

Create a prompt-compress.toml file:

[optimization]
confidence_threshold = 0.85
aggressive_mode = false
output_language = "english"
directive_format = "bracketed"

[hitl]
enabled = true
auto_accept_threshold = 0.95

[patterns]
boilerplate_enabled = true
synonym_consolidation = true
filler_removal = true
mandarin_substitution = true

[bayesian]
prior_corpus_path = "data/priors.json"
update_priors_on_feedback = true
min_confidence = 0.50

Development

Run Tests

cargo test

Run with Logging

RUST_LOG=debug cargo run -- optimize --input test.txt

API Development

RUST_LOG=info cargo run --bin prompt-compress-server

Testing & Verification

Running Tests

# Run all tests (62 tests)
cargo test

# Run specific test suites
cargo test patterns
cargo test concept_optimizer
cargo test protected_regions
cargo test mandarin_efficiency  # Validates Mandarin token counts

Testing Without Building

If you cannot build the project due to dependency/network issues, you can verify the optimization logic using Python simulations:

# Test the optimization patterns
python3 manual_test.py

# Verify optimization goals are met
python3 test_optimization_goals.py

# Generate correct optimized output
python3 generate_correct_optimized.py

These scripts simulate the v0.2+ optimization behavior and verify:

  • ✓ Boilerplate removal
  • ✓ Filler word elimination
  • ✓ Proper capitalization
  • ✓ No orphaned phrases
  • ✓ Token savings achieved
  • ✓ Semantic preservation

Example Test Output

$ cargo test test_no_orphaned_phrases
running 1 test
test optimizer::tests::test_no_orphaned_phrases ... ok

test result: ok. 1 passed; 0 failed

Quality Assurance

All optimizations maintain:

  1. Grammatical correctness - Proper capitalization, no fragments
  2. Semantic preservation - All key information retained
  3. No corruption - Code blocks, URLs, identifiers protected
  4. Measurable savings - 15-40% token reduction verified
  5. Evidence-based - All patterns tested and validated

Documentation

See the comprehensive documentation in the /docs folder:

License

MIT

Contributing

Contributions welcome! Please see CONTRIBUTORS.md for developer guidelines.

Support

  • Issues: GitHub Issues
  • Documentation: See /docs folder for comprehensive documentation
  • Specification: See CLAUDE.md for detailed specification

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors