openai provider: pinned in_memory prompt cache rejects gpt-5.x models (require 24h tier)

## Summary

On the `openai` provider (`api.openai.com`), Gini hard-pins `prompt_cache_retention: "in_memory"` on every request. OpenAI's gpt-5.x models reportedly accept **only** the `24h` extended prompt-cache tier and reject `in_memory` with an HTTP 400:

> This model is compatible only with 24h extended prompt caching

So selecting a gpt-5.x model on the standard `openai` provider fails.

(Surfaced by an agent investigating the codex caching story; the `api.openai.com` gpt-5.x behavior is from that report and not yet independently reproduced here. The codex path is a separate backend — see below.)

## Not affected

- **codex** (`chatgpt.com/backend-api/codex`): the codex `/responses` builders deliberately **omit** `prompt_cache_retention` — that backend 400s on the parameter itself (`{"detail":"Unsupported parameter: prompt_cache_retention"}`, verified). So codex/gpt-5.5 is unaffected and works today.
- gpt-4o / gpt-4.1 / o-series on the `openai` provider accept `in_memory`, so they're fine.

## Root cause

- The pin is hard-coded on the OpenAI-compatible builders in `src/provider.ts` (chat-completions + the openai `/responses` path).
- It's deliberately listed in `RESERVED_EXTRA_BODY_KEYS` (`src/provider.ts`), so a user **cannot** override it via `extraBody` — the sanitizer strips any `prompt_cache_retention` they pass. That was intentional: keep the Zero-Data-Retention (ZDR) opt-out from being silently shadowed by user config. See ADR `docs/adr/prompt-cache-in-memory-tier.md`.

So fixing gpt-5.x on the `openai` path needs a **code change**, not a setting.

## Why it's a tradeoff, not just a bug

`in_memory` is **ZDR-eligible**; `24h` is **not** (the prompt is retained up to 24 hours). So gpt-5.x can't simply be flipped to `24h` silently — that drops the privacy posture for those models.

## Options

1. **Special-case gpt-5.x to `24h`** on the `openai` path — makes the models work, but silently loses ZDR for them (silent privacy regression).
2. **Make the retention tier configurable per provider** (and drop it from the reserved-key blocklist on that path) so the user opts into `24h` knowingly. Preferred — keeps ZDR a conscious choice.
3. **Leave it** — gpt-5.x stays unusable on the `openai` provider; document the limitation.

## Suggested next step

Option 2, behind its own ADR (the ZDR opt-out becomes user-selectable for the `openai` provider), gated on a `cached_tokens` measurement confirming `24h` actually engages on gpt-5.x before relying on it.

## Context

Surfaced while removing the cache warmer (#265). That PR is orthogonal — it keeps the `in_memory` pin unchanged — so this is a pre-existing limitation, filed separately.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

openai provider: pinned in_memory prompt cache rejects gpt-5.x models (require 24h tier) #279

Summary

Not affected

Root cause

Why it's a tradeoff, not just a bug

Options

Suggested next step

Context

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Uh oh!

openai provider: pinned in_memory prompt cache rejects gpt-5.x models (require 24h tier) #279

Description

Summary

Not affected

Root cause

Why it's a tradeoff, not just a bug

Options

Suggested next step

Context

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions