Releases: StressTestor/PromptPressure
PromptPressure v3.3.0-macos.2
PromptPressure macOS DMG for v3.3.0-macos.2.
- commit: 9077ca9
- macOS target: 14+
- package state: unsigned
- checksum: attached as PromptPressure-v3.3.0-macos.2.dmg.sha256
v3.1.0 - multi-turn behavioral drift infrastructure
multi-turn behavioral drift detection infrastructure. the foundation for converting promptpressure from single-turn eval to multi-turn drift detection.
what's new
4-tier run system
- `--tier smoke|quick|full|deep` with `--smoke` and `--quick` shortcuts
- cumulative filtering: `--tier quick` runs smoke + quick entries
- exits non-zero when 0 entries match (prevents CI false-passes)
- `tier` validated at config load via pydantic Literal type
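
a minimal sketch of the cumulative filtering and the non-zero exit described above. it is not the project's actual code: `filter_by_tier`, `exit_code`, and the `tier` key on each entry are hypothetical names, and it uses `typing.Literal` from the standard library rather than a pydantic model. it also assumes `full` and `deep` accumulate the same way `quick` does.

```python
import sys
from typing import Literal, get_args

# Tier names in ascending order of coverage (order taken from the --tier flag).
Tier = Literal["smoke", "quick", "full", "deep"]
TIER_ORDER = get_args(Tier)  # ("smoke", "quick", "full", "deep")


def filter_by_tier(entries: list[dict], tier: str) -> list[dict]:
    """Keep entries whose tier is at or below the selected tier (cumulative)."""
    if tier not in TIER_ORDER:
        raise ValueError(f"unknown tier: {tier!r}")
    allowed = set(TIER_ORDER[: TIER_ORDER.index(tier) + 1])
    return [e for e in entries if e.get("tier") in allowed]


def exit_code(entries: list[dict], tier: str) -> int:
    """Return 1 when nothing matches so a CI job cannot pass on an empty run."""
    selected = filter_by_tier(entries, tier)
    if not selected:
        print(f"no entries match tier {tier!r}", file=sys.stderr)
        return 1
    return 0
```

with entries tagged `smoke`, `quick`, and `deep`, calling `filter_by_tier(entries, "quick")` keeps only the first two, and `exit_code([], "smoke")` returns 1.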
per-turn metrics
- `response_length_ratio` computed after each turn without LLM calls
- metrics attached to turn_responses and aggregated per-sequence
- detects terse/verbose drift across conversation turns
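
a rough sketch of how a length-based drift metric could work. the release notes do not define `response_length_ratio`, so this assumes it is each turn's response length in characters divided by the first turn's length. the function names and the mean/min/max summary are illustrative, not the project's implementation.

```python
from statistics import mean


def response_length_ratios(responses: list[str]) -> list[float]:
    """Ratio of each response's length to the first response's length.

    Assumption: turn 1 is the baseline, so its ratio is 1.0.
    """
    if not responses:
        return []
    baseline = max(len(responses[0]), 1)  # guard against an empty first response
    return [len(r) / baseline for r in responses]


def summarize(ratios: list[float]) -> dict[str, float]:
    """Aggregate per-turn ratios into one per-sequence summary."""
    if not ratios:
        return {}
    return {"mean": mean(ratios), "min": min(ratios), "max": max(ratios)}
```

under this assumption, ratios well below 1.0 in later turns suggest terse drift and ratios well above 1.0 suggest verbose drift. nothing here calls a model, which matches the "without LLM calls" note above.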
multi-turn hardening
- per-turn timeout scaling capped at 5x base (prevents indefinite hangs)
- context window token estimation with warning at ~6k tokens
- traceback preservation on timeout errors
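
a sketch of the first two hardening items, under stated assumptions. the notes give the 5x cap and the ~6k-token warning threshold but not the scaling rule or the estimator, so this assumes linear scaling by turn number and a rough 4-characters-per-token heuristic. all names are hypothetical.

```python
import warnings

MAX_TIMEOUT_MULTIPLIER = 5    # cap from the release notes
TOKEN_WARN_THRESHOLD = 6000   # approximate threshold from the release notes
CHARS_PER_TOKEN = 4           # assumption: crude heuristic, not a real tokenizer


def turn_timeout(base_seconds: float, turn_number: int) -> float:
    """Scale the timeout linearly with the 1-based turn number, capped at 5x base."""
    multiplier = min(max(turn_number, 1), MAX_TIMEOUT_MULTIPLIER)
    return base_seconds * multiplier


def estimate_tokens(messages: list[str]) -> int:
    """Estimate the token count of the conversation from its character count."""
    return sum(len(m) for m in messages) // CHARS_PER_TOKEN


def check_context(messages: list[str]) -> int:
    """Warn when the estimated context size reaches the threshold."""
    tokens = estimate_tokens(messages)
    if tokens >= TOKEN_WARN_THRESHOLD:
        warnings.warn(f"context is roughly {tokens} tokens; responses may degrade")
    return tokens
```

with a 30-second base, turn 3 gets 90 seconds and every turn from 5 onward gets 150 seconds, so a long sequence cannot stretch a single turn's timeout indefinitely.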
dataset changes
- 30 refusal sensitivity entries archived to `archive/adversarial/` (run with `--dataset archive/adversarial/refusal_sensitivity.json`)
- all 190 entries tagged with tier/subcategory/difficulty
- `schema.json` documents the full entry format
cleanup
- removed 19 tracked `.pyc` files (already in .gitignore)
what's next (v3.2)
multi-turn content generation. converting all 9 categories to graduated pressure sequences and adding 3 new categories (hallucination under pressure, epistemic calibration, multi-turn behavioral drift). 305 total entries targeted.
install
pip install -e .