How the Marketplace Works
Surplus Intelligence is a decentralized exchange for LLM inference. Each model has its own order book where sellers compete on price.
The Flow
Buyer sends request
→ Marketplace checks order book for the requested model
→ Routes to the cheapest healthy seller
→ Seller's provider processes the request
→ Response streams back to buyer in real time
→ Token usage is metered
→ USDC settlement executes on-chain (Base)
→ Seller gets paid instantly
Key Concepts
Order Book
Each model has a sorted list of sellers, cheapest first. The order book reloads from DynamoDB every 30 seconds and lives in memory for fast routing.
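A minimal sketch of such a cache, assuming a lazy refresh on read rather than whatever timer the marketplace actually uses. The names `Offer`, `OrderBookCache`, `load_offers`, and the `price_per_mtok` unit are hypothetical and stand in for the real DynamoDB query and pricing fields.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Offer:
    seller_id: str
    model: str
    price_per_mtok: float  # hypothetical unit: USDC per million tokens


class OrderBookCache:
    """In-memory order books, rebuilt from a backing store once the TTL expires."""

    def __init__(
        self,
        load_offers: Callable[[], List[Offer]],  # stand-in for the DynamoDB read
        ttl_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load_offers = load_offers
        self._ttl = ttl_seconds
        self._clock = clock
        self._books: Dict[str, List[Offer]] = {}
        self._loaded_at: Optional[float] = None

    def _refresh_if_stale(self) -> None:
        now = self._clock()
        if self._loaded_at is not None and now - self._loaded_at < self._ttl:
            return  # cached books are still fresh
        books: Dict[str, List[Offer]] = {}
        for offer in self._load_offers():
            books.setdefault(offer.model, []).append(offer)
        for offers in books.values():
            offers.sort(key=lambda o: o.price_per_mtok)  # cheapest first
        self._books = books
        self._loaded_at = now

    def offers_for(self, model: str) -> List[Offer]:
        """Return the model's offers, cheapest first (empty if the model is unknown)."""
        self._refresh_if_stale()
        return list(self._books.get(model, []))
```

Routing reads only from memory, so a request never waits on the database unless it happens to be the one that triggers a reload.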
Routing
The router tries the cheapest seller first. If that seller fails (5xx, timeout, or auth error), the router immediately falls through to the next cheapest. The failover is transparent: the buyer never sees the failure.
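The failover loop can be sketched roughly as follows. This is an illustration under assumptions, not the marketplace's router: `SellerError`, `route`, and the `send` callback are hypothetical names, offers are plain dicts, and the sketch only covers failures raised before a response is returned.

```python
from typing import Any, Callable, Dict, List, Tuple


class SellerError(Exception):
    """A retryable seller-side failure, such as a 5xx, a timeout, or an auth error."""

    def __init__(self, category: str) -> None:
        super().__init__(category)
        self.category = category


def route(
    offers: List[Dict[str, Any]],
    send: Callable[[Dict[str, Any]], Any],
) -> Tuple[Dict[str, Any], Any]:
    """Try offers cheapest-first and return (offer, response) for the first success."""
    failures = []
    for offer in sorted(offers, key=lambda o: o["price"]):
        try:
            return offer, send(offer)
        except SellerError as err:
            # Remember the failure and fall through to the next cheapest seller.
            failures.append((offer["seller_id"], err.category))
    raise RuntimeError(f"all sellers failed: {failures}")
```

The buyer sees an error only when every seller in the book has failed; any other exception type from `send` propagates unchanged.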
Settlement
Every completed request settles on-chain immediately. The settlement contract pulls USDC from the buyer's wallet and sends it to the seller's wallet in a single Base transaction (~$0.002 gas).
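As a rough illustration of how metered usage could become an on-chain amount, the sketch below converts token counts into USDC base units (USDC uses 6 decimals). The per-million-token price fields, the function name, and the rounding mode are assumptions, and any marketplace fee or cost multiplier is left out.

```python
from decimal import Decimal, ROUND_HALF_UP

USDC_DECIMALS = 6  # USDC amounts on-chain are integers with 6 decimal places


def settlement_amount(
    input_tokens: int,
    output_tokens: int,
    input_price_per_mtok: str,   # hypothetical: USDC per million input tokens
    output_price_per_mtok: str,  # hypothetical: USDC per million output tokens
) -> int:
    """Return the request cost in USDC base units (1 USDC = 1_000_000 units)."""
    cost_usdc = (
        Decimal(input_tokens) * Decimal(input_price_per_mtok)
        + Decimal(output_tokens) * Decimal(output_price_per_mtok)
    ) / Decimal(1_000_000)
    # Assumed rounding: half-up to the nearest base unit.
    units = (cost_usdc * 10**USDC_DECIMALS).quantize(Decimal(1), rounding=ROUND_HALF_UP)
    return int(units)
```

Using `Decimal` with string prices avoids binary floating-point drift before the value is turned into an integer for the transfer.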
Health
Sellers are monitored passively from live traffic, so no health-check pings burn your quota. One failure marks the offer unhealthy with a category-specific backoff. After the backoff expires, the next successful request recovers it.
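One way to picture this passive circuit breaker is the sketch below. The backoff durations, category names, and the `PassiveHealth` class are hypothetical; the actual values and state live in the marketplace and are not documented here.

```python
import time
from typing import Callable, Dict

# Hypothetical backoff per failure category, in seconds.
BACKOFF_SECONDS: Dict[str, float] = {
    "server_error": 30.0,
    "timeout": 60.0,
    "auth": 300.0,
}
DEFAULT_BACKOFF_SECONDS = 60.0  # assumed fallback for unknown categories


class PassiveHealth:
    """Tracks offer health from real request outcomes only, with no probe traffic."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._retry_at: Dict[str, float] = {}  # offer_id -> earliest retry time

    def record_failure(self, offer_id: str, category: str) -> None:
        # A single failure marks the offer unhealthy until its backoff expires.
        backoff = BACKOFF_SECONDS.get(category, DEFAULT_BACKOFF_SECONDS)
        self._retry_at[offer_id] = self._clock() + backoff

    def record_success(self, offer_id: str) -> None:
        # A successful request fully recovers the offer.
        self._retry_at.pop(offer_id, None)

    def is_routable(self, offer_id: str) -> bool:
        retry_at = self._retry_at.get(offer_id)
        return retry_at is None or self._clock() >= retry_at
```

Once the backoff has elapsed the offer becomes routable again, and the first real request that succeeds clears its failure record.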
Sections
- Models & Providers — what models are available and who serves them
- Pricing — how pricing works, cost multiplier, per-request cost estimation
- Settlement — on-chain USDC settlement mechanics and fee structure
- Health & Routing — circuit breaker, failover, and seller health
- Minimum-Discount Routing — min{N} base URLs that require a minimum estimated discount before routing
Security and Privacy
See Security & Privacy for current seller key storage, prompt/response logging, deletion behavior, seller spend protections, quality-scoring limits, and settlement admin trust assumptions.