fix(honesty): future-tense completed-autonomy overclaims on $0/proof=0 surfaces#553
Closed
Victor "David" Medina (Victor-David-Medina) wants to merge 1 commit into
Closed
Conversation
…0 surfaces
The console asserted COMPLETED autonomous sends ("ran overnight", a hardcoded ranOvernight:2 mock pill, "scanned overnight and found these") on a fresh $0/proof_events=0 account — contradicting the locked "AI prepares. You approve. We never auto-send." A send never runs without owner approval, and none has run at all (proof_events=0), so "ran overnight" is an overclaim.
Fix (per the adversarial cascade: gate/future-tense, keep the reassurance):
- DailyPulse: ranOvernight -> preparedOvernight
- MorningBriefCard: "Ran overnight" badge + caption -> "Prepared overnight"
- FocusModeClient: "scans overnight" -> "prepares overnight" (7AM-tomorrow promise kept)
- SmartInsight: "scanned overnight and found these" -> "reviewed ... and surfaced these for your approval"
- WelcomeTour: "AI scans overnight while you sleep" -> "AI prepares your next actions overnight"
Honest future promises ("first brief lands tomorrow at 7am", "we never auto-send anything") are unchanged.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Someone is attempting to deploy this pull request to the davidmedina-8534's projects Team on Vercel. No GitHub account was found matching the commit author email address. To deploy this pull request, the commit author's email address needs to be associated with a GitHub account. Learn more about how to change the commit author information. |
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
🛡️ Cascade Quality Score: 100/100
Threshold: 85/100 | Result: PASS ✅ |
Collaborator
Author
|
Superseded by the clean-author replacement PR with the same honesty copy fix. #553 was blocked by Vercel preview deployment. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Routed to claude-console by the adversarial hardening cascade (HIGH honesty-rail finding).
What was wrong
The console asserted completed autonomous sends on a fresh $0 /
proof_events=0account — directly contradicting the locked positioning "AI prepares. You approve. We never auto-send."DailyPulse— a hardcodedranOvernight: 2mock rendered as a live emerald "2 ran overnight" pillMorningBriefCard— "Ran overnight" badge + "Ran overnight. Tap to review"FocusModeClient— "Your AI team scans overnight..."SmartInsight— "Your AI team scanned overnight and found these"WelcomeTour— "AI scans overnight while you sleep"A send never runs without owner approval, and none has run at all (proof_events=0). "Ran overnight" is an overclaim a sharp buyer (or an honesty audit) catches.
The fix (cascade guidance: gate/future-tense, keep the reassurance)
All completed-/active-autonomy claims → "prepared/prepares overnight" (true in every account state — the AI does prepare; only the send awaits approval).
Unchanged (honest): "first brief lands tomorrow at 7am", "we never auto-send anything", empty-state "actions will appear ... overnight".
Scope discipline
Claim-boarded; collision-checked (the 3 open console PRs #517/#430/#429 don't touch these files); branched off fresh
origin/main. Copy-only — no color/contrast change, so it won't trip the brand ratchet.Co-authored with Claude (claude-console lane).