Minimum-Discount Routing
Minimum-discount base URLs filter SI marketplace seller offers by estimated buyer discount before routing. The threshold is evaluated before the provider request using estimated request tokens or media units. Final realized discount may differ when actual usage differs from the estimate. Buyer-owned priority/fallback providers are not marketplace seller offers and are not covered by the minimum-discount guarantee.
How it works
Prefix the request path with a min{N} segment (the same form is used on x402 paths):
https://api.surplusintelligence.ai/min30/v1
For every marketplace seller candidate the router computes:
estimated_discount_pct = (1 - estimated_buyer_cost / estimated_direct_cost) * 100
An offer qualifies only when estimated_discount_pct >= N. The estimated buyer cost includes the marketplace fee (and, on x402 paths, the x402 convenience fee and any conservative buffer). If the direct/reference cost is unknown or zero, the offer cannot prove a discount and is skipped.
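The per-offer check above can be sketched as follows. This is a minimal illustration, not the router's actual code: the function names, the fee breakdown fields (`sellerCost`, `marketplaceFee`, `x402Fee`, `buffer`), and the example prices are all hypothetical.

```javascript
// Hypothetical sketch of the per-offer check described above.
// Field names and example prices are assumptions, not the router's real schema.

// Estimated buyer cost: seller price plus the marketplace fee, plus the
// x402 convenience fee and conservative buffer when those apply.
function estimatedBuyerCost({ sellerCost, marketplaceFee, x402Fee = 0, buffer = 0 }) {
  return sellerCost + marketplaceFee + x402Fee + buffer;
}

// Returns the estimated discount in percent, or null when the direct/reference
// cost is unknown or zero (the offer cannot prove a discount).
function estimatedDiscountPct(buyerCost, directCost) {
  if (!Number.isFinite(directCost) || directCost <= 0) return null;
  return (1 - buyerCost / directCost) * 100;
}

// An offer qualifies only when estimated_discount_pct >= N.
function qualifies(offer, directCost, minPct) {
  const pct = estimatedDiscountPct(estimatedBuyerCost(offer), directCost);
  return pct !== null && pct >= minPct;
}

const offer = { sellerCost: 0.6, marketplaceFee: 0.05 };
console.log(estimatedDiscountPct(estimatedBuyerCost(offer), 1.0)); // roughly 35
console.log(qualifies(offer, 1.0, 30)); // true
console.log(qualifies(offer, 1.0, 40)); // false
console.log(qualifies(offer, 0, 30)); // false: no reference cost to compare against
```

Returning `null` rather than `0` for a missing reference cost keeps "cannot prove a discount" distinct from "exactly zero discount", which matters for `min0`.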
Grammar
- Valid: min0 through min100.
- No leading zeros: min030 is invalid; min0 is the only zero form.
- No decimals or signs: min30.5, min-1 are invalid.
- Invalid segments return 400 invalid_minimum_discount.
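A client can validate a segment before sending a request. The sketch below encodes the grammar rules listed above as a single regular expression; `parseMinSegment` is a hypothetical helper name, not part of any SI SDK.

```javascript
// Hypothetical helper: parse a path segment such as "min30".
// Returns the integer threshold, or null for anything the grammar rejects.
function parseMinSegment(segment) {
  // Accepts 0, 1-99 without leading zeros, or exactly 100.
  const match = /^min(0|[1-9][0-9]?|100)$/.exec(segment);
  return match ? Number(match[1]) : null;
}

for (const s of ["min0", "min30", "min100", "min030", "min30.5", "min-1", "min101"]) {
  console.log(s, parseMinSegment(s));
}
```

The first three segments parse to 0, 30, and 100; the remaining four return `null`, which corresponds to the cases where the router would answer 400 invalid_minimum_discount.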
Supported endpoints
| Method | Path |
|---|---|
| POST | /min{N}/v1/chat/completions |
| POST | /min{N}/v1/responses |
| POST | /min{N}/v1/completions |
| POST | /min{N}/v1/images/generations |
| POST | /min{N}/v1/audio/speech |
| POST | /min{N}/v1/audio/transcriptions |
| POST | /min{N}/v1/video/generations (async submit) |
| GET/DELETE | /min{N}/v1/video/generations/:id (poll/cancel) |
| POST | /min{N}/v1/music/generations (async submit) |
| GET/DELETE | /min{N}/v1/music/generations/:id (poll/cancel) |
| GET | /v1/models (shared catalog read; no min variant) |
| GET | /v1/prices (shared catalog read; no min variant) |
x402-paid inference uses these same /min{N}/v1/* paths — the payment rail is selected by the x402 headers on the request, not by a separate path prefix. (There is no /x402/min{N}/... URL; /x402/* is reserved for the resource relays.)
The min-discount mirror applies to inference submit endpoints. Catalog reads (/v1/models, /v1/prices) are shared routes with no min{N} variant — they return the same payload regardless of threshold, and the threshold applies only at inference routing time.
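The split between inference routes and shared catalog reads can be captured in a small URL builder. This is a client-side sketch under stated assumptions: `buildUrl` and `CATALOG_PATHS` are hypothetical names, and the helper does not handle the other non-min routes (embeddings, media artifacts).

```javascript
// Hypothetical client-side helper; the host is the one used in this page's examples.
const HOST = "https://api.surplusintelligence.ai";

// Catalog reads are shared routes with no min{N} variant.
const CATALOG_PATHS = new Set(["/v1/models", "/v1/prices"]);

// Build the URL for an API path such as "/v1/chat/completions".
// minPct is an integer from 0 to 100, or undefined for plain /v1 routing.
function buildUrl(path, minPct) {
  if (minPct === undefined || CATALOG_PATHS.has(path)) return HOST + path;
  if (!Number.isInteger(minPct) || minPct < 0 || minPct > 100) {
    throw new RangeError("minPct must be an integer from 0 to 100");
  }
  return `${HOST}/min${minPct}${path}`;
}

console.log(buildUrl("/v1/chat/completions", 30));
console.log(buildUrl("/v1/models", 30));
```

The first call yields the `/min30/v1/chat/completions` URL; the second yields plain `/v1/models`, since the threshold has no effect on catalog payloads.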
Scope and limits (v1)
- BYOK providers are not covered. Buyer-owned priority/fallback providers are not marketplace seller offers, so they keep the normal routing order and are never subject to the discount threshold. A min30 request can still be served by a priority provider before the marketplace, or by a fallback provider after the marketplace fails.
- Embeddings are excluded. /v1/embeddings has no min-discount route yet; it ships with the embeddings marketplace work.
- Async video/music enforce the threshold at submit time only. Poll/cancel do not re-apply it. Submit responses preserve the min{N} segment in poll_url/cancel_url. Idempotency is scoped by threshold, so a min50 submit will not return a job created earlier by a min0 submit with the same idempotency key.
- Artifact URLs stay non-min: /v1/media/artifacts/....
- x402 text wrappers preserve the min{N} resource URL in payment challenges. MPP / dual-402 does not: the MPP side advertises the non-min resource, so min-discount applies to x402 and API-key paths only in v1.
- min0 means offers must be no more expensive than direct/reference pricing. It is not identical to plain /v1 routing if above-direct offers exist.
- min100 requires the estimated buyer cost to be zero; with any flat fee, even a zero seller-cost offer fails.
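The interaction between buyer-owned providers and the threshold can be pictured as below. This is an illustrative sketch only: the provider objects, their fields, and `planCandidates` are hypothetical, and marketplace offers are left in input order because this page does not specify how the router ranks them.

```javascript
// Hypothetical sketch of candidate ordering under a min{N} threshold.
// Provider objects and their fields are illustrative, not a real schema.
function planCandidates({ priority = [], marketplace = [], fallback = [] }, minPct) {
  // Only marketplace seller offers are filtered by the threshold.
  // The explicit null check matters for min0, since `null >= 0` is true in JavaScript.
  const discounted = marketplace.filter(
    (o) => o.estimatedDiscountPct !== null && o.estimatedDiscountPct >= minPct
  );
  // Buyer-owned providers keep their normal position: priority first, fallback last.
  return [...priority, ...discounted, ...fallback];
}

const plan = planCandidates(
  {
    priority: [{ name: "my-byok-key" }],
    marketplace: [
      { name: "seller-a", estimatedDiscountPct: 42 },
      { name: "seller-b", estimatedDiscountPct: 18.4 },
      { name: "seller-c", estimatedDiscountPct: null }, // unknown direct cost
    ],
    fallback: [{ name: "my-fallback-key" }],
  },
  30
);
console.log(plan.map((p) => p.name).join(", ")); // my-byok-key, seller-a, my-fallback-key
```

Both buyer-owned entries survive a min30 plan untouched, which is why a response served by one of them carries no minimum-discount guarantee.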
Error shape
When no marketplace seller offer meets the threshold and no buyer-owned route applies, the router returns:
{
  "error": {
    "message": "No marketplace sellers currently meet the minimum 30% estimated discount for 'claude-opus-4.6'. Best otherwise-eligible estimated discount is 18.4%.",
    "type": "no_discounted_offer",
    "code": "minimum_discount_not_met"
  },
  "minimum_discount_pct": 30,
  "best_available_discount_pct": 18.4
}
This is returned (HTTP 503) only when minimum discount is provably the blocking condition. Trust filters, caps, health backoff, and price ceilings retain their existing error semantics and precedence.
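On the client side, this error can be told apart from other 503s by its `type` and `code`. The sketch below uses only the fields shown in the error body above; the helper names and the step-down policy (retrying at the best reported discount) are assumptions, not an SI feature, and a retry may still fail because estimates can change between requests.

```javascript
// Hypothetical client-side helpers based on the error body shown above.
function isMinimumDiscountError(status, body) {
  return (
    status === 503 &&
    body?.error?.type === "no_discounted_offer" &&
    body?.error?.code === "minimum_discount_not_met"
  );
}

// Suggest the highest integer threshold the best available offer would satisfy,
// or null when the error is something else or no usable figure is reported.
function nextThreshold(status, body) {
  if (!isMinimumDiscountError(status, body)) return null;
  const best = body.best_available_discount_pct;
  if (typeof best !== "number" || best < 0) return null;
  return Math.floor(best);
}

const sample = {
  error: { type: "no_discounted_offer", code: "minimum_discount_not_met" },
  minimum_discount_pct: 30,
  best_available_discount_pct: 18.4,
};
console.log(nextThreshold(503, sample)); // 18
```

Rounding down keeps the suggested threshold at or below the reported discount, so the sample body maps to a min18 retry rather than min19.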
Examples
OpenAI SDK:
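import OpenAI from 'openai'

// Point the SDK at the min30 base URL; every request made through this
// client is routed with a 30% minimum estimated discount.
const client = new OpenAI({
  apiKey: process.env.SURPLUS_API_KEY,
  baseURL: 'https://api.surplusintelligence.ai/min30/v1',
})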
curl:
curl https://api.surplusintelligence.ai/min30/v1/chat/completions \
  -H "Authorization: Bearer $SURPLUS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4.6",
    "messages": [{"role":"user","content":"Hello"}],
    "stream": true
  }'