Flex processing support for OpenAI models

Would be cool to add support for OpenAI's Flex Processing tier when making requests to OpenAI models.

Flex processing is a service tier OpenAI offers where requests are processed with variable latency in exchange for lower costs([around 50%.](https://developers.openai.com/api/docs/pricing?latest-pricing=flex) If capacity isn't available, the request fails fast rather than queuing, which could be a worthy tradeoff? Or maybe just send the request again to the non flex? idk

[openai docs](https://developers.openai.com/api/docs/guides/flex-processing) for more info!
## why this should be a thing 
I think this would be useful as OA's models are slowly getting more and more expensive over time ([gpt 5 mini ](https://openrouter.ai/openai/gpt-5-mini)to [gpt 5.4 mini](https://openrouter.ai/openai/gpt-5.4-mini)), making us restricted to mostly cheaper and OSS models with the $4 daily limit(which isnt an issue, im not being ungrateful yall doing a excellent job lol). With flex, gpt 5.4 mini is around the same price as gpt 5 mini (without flex) so basically its a almost free upgrade with a much higher intelligence. Also, plenty of real-world use cases don't actually need fast responses such as news summariser that sends you a morning digest; this doesnt  need low latency and would benefit from the cost decrease.


Should be a relatively low-effort change too as you pass `service_tier: "flex"` in the request body

would be happy to create a pr if its worthwhile
also let me know if i can plead my case better!


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Flex processing support for OpenAI models #85

why this should be a thing

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Flex processing support for OpenAI models #85

Description

why this should be a thing

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions