Health & Routing

> Minimum-discount routing: The /min{N}/ URL segment adds a minimum estimated-discount filter on marketplace seller offers, applied *after* health and trust filters and *before* the provider call. It never filters buyer-owned priority/fallback providers. See Minimum-Discount Routing.

The marketplace uses passive health monitoring — no proactive pings, no cost to sellers.

Routing Logic

1. Resolve model name (canonical, alias, provider-specific, or display name)

2. Check buyer's priority provider → try first if configured

3. Get all active marketplace offers for the requested model

4. If a request includes a provider hint, keep only offers from matching provider hosts

5. Skip unhealthy offers (in backoff period)

6. Skip offers that have hit spending caps

7. Apply price threshold filter (X-Max-Price-Per-1M header or body param)

8. Estimate per-request cost using the buyer's input tokens + max_tokens

9. Sort by estimated total cost (not just unit price — accounts for input/output ratio)

10. Try cheapest first

11. If seller uses Anthropic API format → auto-translate request/response

12. On failure → mark unhealthy, try next seller

13. If all sellers fail → try the buyer's catch-all fallback provider (a provider registered with model: null via /v1/buyer/providers)

14. Buyer never sees provider failures — failover is transparent
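The filtering and ranking steps above (4 through 10) can be sketched roughly as follows. This is a minimal illustration, not the marketplace's actual code: the `Offer` fields, the function names, and the way the price threshold is compared against unit prices are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Offer:
    # Hypothetical fields, named for illustration only.
    seller: str
    provider_host: str
    input_price_per_1m: float   # price per 1M input tokens
    output_price_per_1m: float  # price per 1M output tokens
    healthy: bool = True
    cap_reached: bool = False


def estimated_cost(offer: Offer, input_tokens: int, max_tokens: int) -> float:
    """Worst-case request cost: input tokens plus the full max_tokens budget."""
    return (input_tokens * offer.input_price_per_1m
            + max_tokens * offer.output_price_per_1m) / 1_000_000


def rank_offers(offers: List[Offer], input_tokens: int, max_tokens: int,
                provider_host: Optional[str] = None,
                max_price_per_1m: Optional[float] = None) -> List[Offer]:
    """Filter offers (steps 4-7), then sort by estimated total cost (steps 8-9)."""
    eligible = []
    for offer in offers:
        if provider_host is not None and offer.provider_host != provider_host:
            continue  # provider hint: keep matching hosts only
        if not offer.healthy or offer.cap_reached:
            continue  # in backoff, or spending cap hit
        # Assumption: the threshold is checked against the higher unit price.
        if max_price_per_1m is not None and max(
                offer.input_price_per_1m, offer.output_price_per_1m) > max_price_per_1m:
            continue
        eligible.append(offer)
    return sorted(eligible, key=lambda o: estimated_cost(o, input_tokens, max_tokens))


offers = [
    Offer("a", "host-1.example", input_price_per_1m=1.0, output_price_per_1m=10.0),
    Offer("b", "host-2.example", input_price_per_1m=3.0, output_price_per_1m=5.0),
]
# Input-heavy request: "a" is cheaper. Output-heavy request: "b" is cheaper.
input_heavy = [o.seller for o in rank_offers(offers, input_tokens=1000, max_tokens=100)]
output_heavy = [o.seller for o in rank_offers(offers, input_tokens=100, max_tokens=4000)]
```

The two sample requests show why sorting uses estimated total cost rather than a single unit price: the same pair of offers ranks differently depending on the input/output ratio.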

Routing order: Priority provider → marketplace sellers (cheapest first) → catch-all fallback provider (model: null) → universal fallback.

Cross-provider translation: Requests are always accepted in OpenAI format. OpenAI-compatible sellers receive normalized OpenAI requests. Native Anthropic sellers use the OpenAI ⇄ Anthropic translator described in Models & Providers. Current active Claude liquidity is mainly OpenAI-compatible providers, not native Anthropic offers.
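The routing order (priority provider, then marketplace sellers, then the fallbacks) might look like the sketch below. Everything named here is hypothetical: `ProviderError`, `route`, and the callable-per-provider shape are illustration only. Two behaviors are assumptions rather than documented facts: that a failed priority provider falls through to the marketplace, and that an error is raised once every tier is exhausted.

```python
from typing import Callable, List, Optional


class ProviderError(Exception):
    """Hypothetical error raised when an upstream provider call fails."""


Provider = Callable[[dict], dict]


def route(request: dict,
          priority: Optional[Provider],
          sellers: List[Provider],
          catch_all: Optional[Provider],
          universal: Optional[Provider],
          mark_unhealthy: Callable[[Provider, Exception], None]) -> dict:
    """Try each tier in the documented order and return the first success."""
    if priority is not None:
        try:
            return priority(request)
        except ProviderError:
            pass  # assumption: a failed priority provider falls through
    for seller in sellers:  # assumed to be pre-sorted, cheapest first
        try:
            return seller(request)
        except ProviderError as exc:
            mark_unhealthy(seller, exc)  # start backoff, then try the next seller
    for fallback in (catch_all, universal):
        if fallback is None:
            continue
        try:
            return fallback(request)
        except ProviderError:
            pass
    raise ProviderError("all routes exhausted")
```

Because each seller failure is caught inside the loop, the caller only sees the first successful response, which is the sense in which failover is transparent to the buyer.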

Provider Pinning

The normal buyer path is provider-agnostic: request a model and let the marketplace route to the cheapest healthy eligible offer. Advanced clients can optionally pin routing to a provider by passing provider, provider_url, or provider_base_url in a chat completion request.

Pinning does not select a specific seller offer. It narrows the eligible order book to matching provider hosts, then the router still chooses the cheapest healthy offer within that filtered set. If no matching provider offer is active and healthy, the request can fail even if other providers have liquidity for the same model.
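As a rough sketch of that behavior, the example below reduces a hint to a hostname, filters offers to that host, and picks the cheapest healthy one. The offer dictionary keys (`base_url`, `est_cost`, `healthy`) and both function names are hypothetical, and the source does not say how a bare `provider` value is matched, so treating it like a hostname is an assumption.

```python
from typing import List, Optional
from urllib.parse import urlparse


def hint_host(value: str) -> str:
    """Reduce a bare host or a full URL to a lowercase hostname."""
    parsed = urlparse(value if "://" in value else "//" + value)
    return (parsed.hostname or "").lower()


def pick_pinned(offers: List[dict], hint: str) -> Optional[dict]:
    """Return the cheapest healthy offer on the pinned host, or None."""
    host = hint_host(hint)
    candidates = [o for o in offers
                  if o["healthy"] and hint_host(o["base_url"]) == host]
    return min(candidates, key=lambda o: o["est_cost"], default=None)
```

A `None` result corresponds to the failure case described above: the pinned host has no active, healthy offer, even though cheaper offers may exist on other hosts.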

See POST /v1/chat/completions for request examples.

Health States

active = seller intent only (create/cancel). The health system never touches it.

healthy = system-managed, self-healing via exponential backoff.

    [healthy] ──failure──→ [unhealthy, backoff timer set]
        (backoff expires)
            ──next request succeeds──→ [healthy, failures reset]
            ──next request fails────→ [unhealthy, longer backoff]

Offers with dead keys (permanently revoked, closed account) will cycle through hourly retries indefinitely. This is an intentional tradeoff — one wasted attempt per hour per dead offer vs. the risk of permanently losing offers that could recover (e.g. DIEM credit resets at midnight UTC).
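The healthy/unhealthy cycle can be modeled as a small state holder. This is a sketch under assumptions: the class and field names are hypothetical, and the doubling formula is one plausible reading of "exponential backoff" with a cap, not a confirmed implementation.

```python
from dataclasses import dataclass


@dataclass
class OfferHealth:
    healthy: bool = True
    failures: int = 0        # consecutive failures since the last success
    retry_at: float = 0.0    # time (seconds) when the current backoff expires

    def eligible(self, now: float) -> bool:
        """Healthy offers are routable; unhealthy ones only after backoff expires."""
        return self.healthy or now >= self.retry_at

    def record_failure(self, now: float, base: float, cap: float) -> float:
        """Mark unhealthy and set a backoff that doubles per failure, up to cap."""
        self.failures += 1
        delay = min(base * 2 ** (self.failures - 1), cap)
        self.healthy = False
        self.retry_at = now + delay
        return delay

    def record_success(self) -> None:
        """A successful request restores health and resets the failure count."""
        self.healthy = True
        self.failures = 0
        self.retry_at = 0.0


# A dead key on a 10 min base / 1 hr cap schedule settles into one retry
# per hour and never leaves the cycle.
dead = OfferHealth()
delays = [dead.record_failure(now=0.0, base=600, cap=3600) for _ in range(5)]
```

Note that nothing in this model ever removes an offer: once the delay reaches the cap, a permanently failing offer simply costs one attempt per cap interval, which matches the tradeoff described above.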

Failure Categories

| Category | Examples | Backoff | Self-heals? |
|---|---|---|---|
| Server error | 5xx, connection refused | 30s → 60s → 120s → 10min cap | Usually |
| Timeout | Request timeout | 30s → 60s → 120s → 10min cap | Usually |
| Auth/config | 401, 402, 403 | 10min → 20min → 40min → 1hr cap | Often (DIEM resets, key renewals) |
| Rate limited | 429 | 10-30s | Yes |
| Bad response | Malformed body | 60s → 120s → 10min cap | Maybe |
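One way to encode the table is a per-category (base, cap) pair plus a doubling rule. The names below are hypothetical, and the sketch assumes doubling continues between the last listed step and the cap (for example 240s and 480s before the 10 min cap), which the table does not spell out. Rate limiting is left out of the schedule because the table gives a range rather than a progression.

```python
# (base seconds, cap seconds) per failure category, taken from the table above.
BACKOFF = {
    "server_error": (30, 600),
    "timeout": (30, 600),
    "auth_config": (600, 3600),
    "bad_response": (60, 600),
}


def classify(status: int) -> str:
    """Map an HTTP status to a failure category from the table."""
    if status == 429:
        return "rate_limited"
    if status in (401, 402, 403):
        return "auth_config"
    if 500 <= status <= 599:
        return "server_error"
    # Timeouts and malformed bodies have no status code to classify here.
    raise ValueError(f"status {status} is not covered by the table")


def backoff_seconds(category: str, consecutive_failures: int) -> int:
    """Backoff for the Nth consecutive failure (N >= 1): base doubled, capped."""
    base, cap = BACKOFF[category]
    return min(base * 2 ** (consecutive_failures - 1), cap)
```

For instance, `backoff_seconds("server_error", 3)` gives 120 and `backoff_seconds("auth_config", 4)` gives 3600, in line with the listed progressions.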

Health Log

All health events are logged with:

  • Error category and HTTP status
  • Truncated error message
  • Action taken (backoff duration)

Sellers can view their health log via GET /v1/seller/health-log.
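For illustration, a health event carrying the three logged items could be shaped like this. The field names and the 200-character truncation limit are hypothetical; the actual response schema of GET /v1/seller/health-log is not described here.

```python
from dataclasses import dataclass, asdict
from typing import Optional

MAX_MESSAGE_CHARS = 200  # hypothetical limit; the real cutoff is not documented


@dataclass
class HealthEvent:
    category: str               # e.g. "server_error", "auth_config"
    http_status: Optional[int]  # None when no response arrived (timeout)
    message: str                # truncated error message
    backoff_seconds: int        # action taken


def make_event(category: str, http_status: Optional[int],
               message: str, backoff_seconds: int) -> HealthEvent:
    """Build an event, truncating long error messages to the limit."""
    if len(message) > MAX_MESSAGE_CHARS:
        message = message[:MAX_MESSAGE_CHARS - 3] + "..."
    return HealthEvent(category, http_status, message, backoff_seconds)


event = make_event("auth_config", 401, "invalid api key " * 50, 600)
record = asdict(event)  # plain dict, ready to serialize
```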

Seller Quality and Model Authenticity

Health routing measures operational health: provider errors, auth failures, timeouts, backoff, and recovery after successful requests. It does not prove model identity.

Today SI does not cryptographically or semantically verify that a seller advertising claude-opus-4.6 is serving that exact model rather than proxying to another model. Trusted provider domains and health stats reduce risk but are not model attestation. Trusted-only routing is the default for new accounts; allowing untrusted providers requires an explicit opt-in via Routing preferences on /buy (account-level) or per-API-key in Key Management. Buyers who need stronger guarantees should keep the default or use a BYOK priority provider until reputation/model-quality scoring ships.