How the Marketplace Works

Surplus Intelligence is a decentralized exchange for LLM inference. Each model has its own order book where sellers compete on price.

The Flow

Buyer sends request

→ Marketplace checks order book for the requested model

→ Routes to the cheapest healthy seller

→ Seller's provider processes the request

→ Response streams back to the buyer in real time

→ Token usage is metered

→ USDC settlement executes on-chain (Base)

→ Seller gets paid instantly

Key Concepts

Order Book

Each model has a sorted list of sellers, cheapest first. The order book reloads from DynamoDB every 30 seconds and lives in memory for fast routing.
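The order book described above can be sketched as a small in-memory cache. This is a minimal illustration, not the marketplace's actual code: the names Offer, OrderBook, load_offers, and price_per_mtok are hypothetical, the loader is a stand-in for whatever reads offers from DynamoDB, and the sketch refreshes lazily on read once the 30 seconds have elapsed, where the real service may reload on a timer instead.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Offer:
    # Hypothetical shape of one seller offer; field names are assumptions.
    seller_id: str
    model: str
    price_per_mtok: float  # assumed unit: USDC per million tokens


class OrderBook:
    """In-memory per-model order books, refreshed when older than ttl_seconds."""

    def __init__(
        self,
        load_offers: Callable[[], List[Offer]],  # stand-in for the DynamoDB read
        ttl_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load_offers
        self._ttl = ttl_seconds
        self._clock = clock
        self._books: Dict[str, List[Offer]] = {}
        self._loaded_at: Optional[float] = None

    def offers_for(self, model: str) -> List[Offer]:
        """Return the offers for one model, cheapest first."""
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            self._reload(now)
        return list(self._books.get(model, []))

    def _reload(self, now: float) -> None:
        # Group offers by model, then sort each group by ascending price.
        books: Dict[str, List[Offer]] = {}
        for offer in self._load():
            books.setdefault(offer.model, []).append(offer)
        for offers in books.values():
            offers.sort(key=lambda o: o.price_per_mtok)
        self._books = books
        self._loaded_at = now
```

Keeping the sorted lists in memory means a routing decision never waits on a database read; the trade-off is that a price change can take up to one refresh interval to become visible.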

Routing

The router tries the cheapest seller first. If that seller fails (5xx, timeout, or auth error), the router immediately falls through to the next cheapest seller. The buyer never sees the failure; failover is transparent.
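The failover behavior can be sketched as a loop over offers in ascending price order. This is a hedged illustration under stated assumptions: route, call_seller, is_healthy, and SellerError are hypothetical names, and offers are simplified to (seller_id, price) pairs.

```python
from typing import Callable, Iterable, List, Tuple


class SellerError(Exception):
    """Hypothetical error raised when a seller call fails.

    category is assumed to be one of "5xx", "timeout", or "auth".
    """

    def __init__(self, category: str) -> None:
        super().__init__(category)
        self.category = category


def route(
    offers: Iterable[Tuple[str, float]],
    call_seller: Callable[[str], str],
    is_healthy: Callable[[str], bool] = lambda seller_id: True,
) -> Tuple[str, str, List[Tuple[str, str]]]:
    """Try sellers cheapest first and return the first successful response.

    Returns (seller_id, response, failures), where failures lists the
    sellers that were tried and failed before the successful one.
    """
    failures: List[Tuple[str, str]] = []
    for seller_id, _price in sorted(offers, key=lambda o: o[1]):
        if not is_healthy(seller_id):
            continue  # skip offers that are currently backing off
        try:
            return seller_id, call_seller(seller_id), failures
        except SellerError as err:
            # Record the failure and fall through to the next cheapest seller.
            failures.append((seller_id, err.category))
    raise RuntimeError("no healthy seller could serve the request")
```

The buyer only receives the final response; the failures list exists so the caller can feed failure categories into health tracking.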

Settlement

Every completed request settles on-chain immediately. The settlement contract pulls USDC from the buyer's wallet and sends it to the seller's wallet in a single Base transaction (~$0.002 gas).
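As a rough sketch of how metered token usage might turn into a settlement amount: USDC uses 6 decimal places on-chain, so amounts are naturally expressed as integers in micro-USDC. Everything else here is an assumption for illustration only: the function name, separate input and output prices, prices quoted per million tokens, and rounding up. Marketplace fees and the cost multiplier are not modeled; see the Pricing and Settlement sections for the actual rules.

```python
MICRO_PER_USDC = 1_000_000  # USDC has 6 decimals, so 1 USDC = 1_000_000 units
TOKENS_PER_MTOK = 1_000_000


def settlement_amount_micro_usdc(
    input_tokens: int,
    output_tokens: int,
    input_price_micro_per_mtok: int,   # assumed: micro-USDC per million input tokens
    output_price_micro_per_mtok: int,  # assumed: micro-USDC per million output tokens
) -> int:
    """Return the amount owed in micro-USDC, rounded up to a whole unit."""
    if min(input_tokens, output_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    numerator = (
        input_tokens * input_price_micro_per_mtok
        + output_tokens * output_price_micro_per_mtok
    )
    # Integer ceiling division avoids floating-point rounding errors.
    return -(-numerator // TOKENS_PER_MTOK)


# Example with made-up prices: 1,200 input tokens at 3 USDC per million and
# 800 output tokens at 15 USDC per million.
amount = settlement_amount_micro_usdc(1_200, 800, 3 * MICRO_PER_USDC, 15 * MICRO_PER_USDC)
print(amount / MICRO_PER_USDC)  # 0.0156
```

Working in integer base units mirrors how token amounts are represented on-chain and keeps the off-chain estimate free of float drift.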

Health

Sellers are monitored passively, so no health-check pings burn your quota. One failure marks the offer unhealthy and applies a category-specific backoff. After the backoff expires, the next successful request marks the offer healthy again.
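The passive health model can be sketched as a small state tracker. This is an illustration under assumptions, not the service's implementation: OfferHealth and its method names are hypothetical, and the backoff durations are made-up placeholders, since the real per-category values are not stated here.

```python
import time
from typing import Callable, Dict

# Hypothetical backoff durations in seconds; the real values are not documented here.
BACKOFF_SECONDS: Dict[str, float] = {"5xx": 30.0, "timeout": 60.0, "auth": 300.0}
DEFAULT_BACKOFF_SECONDS = 60.0


class OfferHealth:
    """Tracks offer health from real request outcomes only (no probe traffic)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._retry_at: Dict[str, float] = {}  # offer_id -> earliest retry time

    def record_failure(self, offer_id: str, category: str) -> None:
        # A single failure marks the offer unhealthy and starts its backoff.
        backoff = BACKOFF_SECONDS.get(category, DEFAULT_BACKOFF_SECONDS)
        self._retry_at[offer_id] = self._clock() + backoff

    def record_success(self, offer_id: str) -> None:
        # A successful request fully recovers the offer.
        self._retry_at.pop(offer_id, None)

    def is_healthy(self, offer_id: str) -> bool:
        return offer_id not in self._retry_at

    def is_routable(self, offer_id: str) -> bool:
        # Healthy offers are routable; unhealthy ones only after backoff expires.
        retry_at = self._retry_at.get(offer_id)
        return retry_at is None or self._clock() >= retry_at
```

In this sketch an offer whose backoff has expired is routable but still counted as unhealthy until a real request succeeds, which matches the recover-on-next-success behavior described above.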

Sections

  • Models & Providers — what models are available and who serves them
  • Pricing — how pricing works, cost multiplier, per-request cost estimation
  • Settlement — on-chain USDC settlement mechanics and fee structure
  • Health & Routing — circuit breaker, failover, and seller health
  • Minimum-Discount Routing — min{N} base URLs that require a minimum estimated discount before routing

Security and Privacy

See Security & Privacy for current seller key storage, prompt/response logging, deletion behavior, seller spend protections, quality-scoring limits, and settlement admin trust assumptions.