[FEATURE] SLO contracts on InferenceService + SLO-aware routing and scheduling

## What

Make SLOs a first-class, *enforced* concern, not just a dashboard.

- Declare SLO targets per InferenceService (e.g. `ttft_p99_ms`, error budget, `min_throughput_tok_s`). Use **Pyrra** (#415) to generate the Prometheus recording rules and multi-window burn-rate alerting rules from the declared SLO. The operator surfaces the evaluated result as an `SLOBreached` condition on the InferenceService status.
- **ModelRouter circuit breaker consumes `SLOBreached`:** a breached backend is treated as degraded and traffic is redistributed away from it. Today the breaker checks HTTP health only, so a slow-but-alive backend keeps receiving traffic.
- **Foreman scheduler defers** dispatching new AgenticTasks to a fleet/InferenceService that is SLO-breached (load-shed rather than pile onto a degraded backend).
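
To make the last two bullets concrete, here is a minimal sketch of how a router and a scheduler might consume the condition. Everything in it is an assumption for illustration: the `Backend` type, the `routable` and `dispatch` helpers, and the representation of conditions as a plain dict are hypothetical and do not reflect the actual ModelRouter or Foreman code.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Backend:
    """Hypothetical view of an InferenceService as seen by router and scheduler."""
    name: str
    http_healthy: bool = True
    # Condition type -> status, e.g. {"SLOBreached": True}. Assumed shape.
    conditions: dict[str, bool] = field(default_factory=dict)

    @property
    def slo_breached(self) -> bool:
        # A missing condition is treated as "not breached".
        return self.conditions.get("SLOBreached", False)


def routable(backends: list[Backend]) -> list[Backend]:
    """Backends eligible for traffic: HTTP-healthy and not SLO-breached."""
    return [b for b in backends if b.http_healthy and not b.slo_breached]


def dispatch(tasks: list[str], target: Backend) -> tuple[list[str], list[str]]:
    """Return (dispatched, deferred) for new tasks aimed at one backend.

    New work is deferred while the target is SLO-breached instead of
    being piled onto a degraded backend.
    """
    if target.slo_breached:
        return [], list(tasks)
    return list(tasks), []


fast = Backend("fast")
slow = Backend("slow-but-alive", conditions={"SLOBreached": True})
down = Backend("down", http_healthy=False)

print([b.name for b in routable([fast, slow, down])])
print(dispatch(["task-1", "task-2"], slow))
```

With an HTTP-only breaker, `slow-but-alive` would stay in rotation; here it is excluded alongside the backend that is down. What to do when every backend is breached is left open in this sketch.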

## Why

This is the production-readiness piece for the B200 + edge deployment: an SLO breach triggers automatic traffic and work redistribution without a human having to read Grafana. It turns SLOs from observation into enforcement.

## Approach / dependencies

- Depends on #409 (the TTFT and error-rate metrics from which SLOs are computed) and #415 (Pyrra declaration + alerting). Both are M2.
- Couples into #437 (ModelRouter) for the circuit-breaker change.
- Supersedes the *enforcement* intent of #10; #10's auto-model-downgrade remediation is deferred as an opt-in last resort.
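
For context on how the metrics from #409 feed a breach decision, the following is a rough sketch of multi-window burn-rate evaluation in the general style described in the Google SRE Workbook. It is not Pyrra's generated rule set: the function names are hypothetical, and the default factor of 14.4 (2% of a 30-day budget spent in one hour) is only the commonly cited example value, so the windows and factors Pyrra actually emits may differ.

```python
def burn_rate(error_ratio: float, slo_target: float) -> float:
    """Speed at which the error budget is being spent.

    1.0 means the budget is consumed exactly at the sustainable pace;
    higher values mean it runs out before the SLO window ends.
    """
    budget = 1.0 - slo_target
    if budget <= 0.0:
        raise ValueError("slo_target must be below 1.0")
    return error_ratio / budget


def is_breached(long_window_ratio: float,
                short_window_ratio: float,
                slo_target: float,
                factor: float = 14.4) -> bool:
    """Multi-window check: both windows must burn faster than `factor`.

    The long window shows the burn is significant; the short window
    shows it is still happening, so the signal clears soon after recovery.
    """
    return (burn_rate(long_window_ratio, slo_target) >= factor
            and burn_rate(short_window_ratio, slo_target) >= factor)


# 99.9% target, 2% of requests failing in both windows: burning about 20x.
print(is_breached(0.02, 0.02, slo_target=0.999))
# Same long-window history, but the short window has recovered.
print(is_breached(0.02, 0.001, slo_target=0.999))
```

A latency target such as `ttft_p99_ms` can be handled with the same arithmetic if it is first expressed as a ratio, for example the share of requests whose TTFT exceeds the threshold.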

## Definition of done

- SLO targets are declarable per InferenceService.
- The `SLOBreached` condition is surfaced from Pyrra-evaluated state.
- ModelRouter redistributes traffic off breached backends.
- Foreman scheduler defers dispatch on breach.

[FEATURE] SLO contracts on InferenceService + SLO-aware routing and scheduling #629
