# Surplus Intelligence

> LLM inference marketplace. Cheapest seller wins, USDC settlement on Base.
> Pay per-request via x402 (no account needed) or use an API key.

## Base URL

https://www.surplusintelligence.ai

## Quick Start: x402 Per-Request Payment (Recommended for Agents)

No account and no API key. Just a wallet with USDC on Base.

Surplus Intelligence advertises two x402 schemes:

- `upto` (preferred): authorize a maximum, then pay only actual usage after the response completes.
- `exact` (fallback): sign and settle the fixed estimated amount.

For `upto`, the buyer needs a one-time USDC approval to Permit2 (`0x000000000022D473030F116dDEE9F6B43aC78BA3`). After that, requests are gasless for the buyer; CDP sponsors the x402 settlement gas when available.

### Step 1: Send request → get 402 challenge

```
POST /x402/api/inference/v1/chat/completions
Content-Type: application/json

{"model": "llama-3.3-70b", "messages": [{"role": "user", "content": "Hello"}], "max_tokens": 100}
```

Returns HTTP 402. Payment requirements are in:

- **Header:** `PAYMENT-REQUIRED` (base64-encoded JSON, x402 v2 standard)
- **Compatibility header:** `x-payment-required` contains the same base64 payload for legacy clients
- **Body:** `accepts[]` array with scheme, network, asset, amount, payTo

Do not rely on the order of `accepts[]`. Select by `scheme`: prefer `upto` when your client supports it, otherwise use `exact` as compatibility fallback.

### Step 2: Sign payment and retry

For `upto`, sign a Permit2 witness authorization for the maximum `amount` in the 402 response. SI settles only the actual usage after inference succeeds. For `exact`, sign an EIP-3009 `TransferWithAuthorization` for the fixed amount.
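
Before signing, a client has to pick one entry from the decoded `accepts[]` array by `scheme` rather than by position. The following is a minimal sketch of that selection, assuming the challenge has already been base64-decoded and parsed; `PaymentRequirement` and `selectRequirement` are hypothetical names for illustration, not part of any x402 SDK.

```typescript
// One entry of the 402 challenge's `accepts[]` array (fields as listed in Step 1).
interface PaymentRequirement {
  scheme: string;
  network: string;
  asset: string;
  amount: string;
  payTo: string;
}

// Hypothetical helper: return the requirement for the first scheme in the
// caller's preference order, ignoring where entries sit in `accepts[]`.
function selectRequirement(
  accepts: PaymentRequirement[],
  preferredSchemes: string[] = ['upto', 'exact'],
): PaymentRequirement | undefined {
  for (const scheme of preferredSchemes) {
    for (const requirement of accepts) {
      if (requirement.scheme === scheme) return requirement;
    }
  }
  return undefined;
}

// Sample challenge entries with `exact` listed before `upto`.
const sampleAccepts: PaymentRequirement[] = [
  {
    scheme: 'exact',
    network: 'eip155:8453',
    asset: '0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913',
    amount: '1000',
    payTo: '0x8581784D3E598cCa3482375CFF2409Ac9DD8c402',
  },
  {
    scheme: 'upto',
    network: 'eip155:8453',
    asset: '0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913',
    amount: '1000',
    payTo: '0x8581784D3E598cCa3482375CFF2409Ac9DD8c402',
  },
];

const chosen = selectRequirement(sampleAccepts);
console.log(chosen ? chosen.scheme : 'no supported scheme'); // prints "upto"
```

A client that only implements `exact` can pass `['exact']` as the preference list and get the fallback entry from the same call.
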

Resend the same request with the signed payload:

```
POST /x402/api/inference/v1/chat/completions
Content-Type: application/json
PAYMENT-SIGNATURE: <base64-encoded payment payload>

{"model": "llama-3.3-70b", "messages": [{"role": "user", "content": "Hello"}], "max_tokens": 100}
```

Returns 200 with an OpenAI-compatible chat completion response.

### Using @x402 SDKs (simplest)

```typescript
import { wrapFetchWithPayment } from '@x402/fetch';
import { x402Client } from '@x402/core/client';
import { ExactEvmScheme } from '@x402/evm/exact/client';
import { privateKeyToAccount } from 'viem/accounts';

const signer = privateKeyToAccount(process.env.EVM_PRIVATE_KEY);
const client = new x402Client();
client.register('eip155:*', new ExactEvmScheme(signer));

const fetchWithPayment = wrapFetchWithPayment(fetch, client);

const response = await fetchWithPayment(
  'https://www.surplusintelligence.ai/x402/api/inference/v1/chat/completions',
  {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model: 'llama-3.3-70b',
      messages: [{ role: 'user', content: 'Hello' }],
      max_tokens: 100,
    }),
  }
);

const data = await response.json();
console.log(data.choices[0].message.content);
```

### Direct `upto` signing with @x402/evm + viem

If you sign `upto` manually, use a signer object with an explicit `address`. Do **not** pass a raw viem `walletClient` to a helper if `walletClient.address` is undefined; the resulting Permit2 payload will be rejected by the facilitator.

```typescript
import { createPublicClient, createWalletClient, http, publicActions } from 'viem';
import { base } from 'viem/chains';
import { privateKeyToAccount } from 'viem/accounts';
import { UptoEvmScheme } from '@x402/evm';

const endpoint = 'https://www.surplusintelligence.ai/x402/api/inference/v1/chat/completions';
const account = privateKeyToAccount(process.env.PRIVATE_KEY);
const publicClient = createPublicClient({ chain: base, transport: http() });
createWalletClient({ account, chain: base, transport: http() }).extend(publicActions);

const signer = {
  address: account.address,
  signTypedData: (msg) => account.signTypedData(msg),
  readContract: publicClient.readContract.bind(publicClient),
  getTransactionCount: publicClient.getTransactionCount.bind(publicClient),
  estimateFeesPerGas: publicClient.estimateFeesPerGas.bind(publicClient),
};

const body = {
  model: 'llama-3.3-70b',
  messages: [{ role: 'user', content: 'Hello' }],
  max_tokens: 100
};

const challenge = await fetch(endpoint, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify(body)
});
const paymentRequired = JSON.parse(Buffer.from(challenge.headers.get('PAYMENT-REQUIRED'), 'base64').toString());
const upto = paymentRequired.accepts.find((a) => a.scheme === 'upto');
const paymentPayload = await new UptoEvmScheme(signer).createPaymentPayload(2, upto);
const paymentHeader = Buffer.from(JSON.stringify(paymentPayload)).toString('base64');

const paid = await fetch(endpoint, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json', 'PAYMENT-SIGNATURE': paymentHeader },
  body: JSON.stringify(body)
});
console.log(await paid.json());
```

### Using Python

```python
from x402 import x402_fetch

response = x402_fetch(
    'https://www.surplusintelligence.ai/x402/api/inference/v1/chat/completions',
    wallet=wallet,
    method='POST',
    json={
        'model': 'llama-3.3-70b',
        'messages': [{'role': 'user', 'content': 'Hello'}],
        'max_tokens': 100,
    }
)
```

### 402 Response Format

The response
includes both `exact` and `upto` payment requirements. Current production responses emit `exact` first and `upto` second, but clients should select by `scheme` rather than array position.

```json
{
  "x402Version": 2,
  "accepts": [
    {
      "scheme": "exact",
      "network": "eip155:8453",
      "asset": "0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913",
      "amount": "1000",
      "payTo": "0x8581784D3E598cCa3482375CFF2409Ac9DD8c402",
      "maxTimeoutSeconds": 120,
      "extra": { "name": "USD Coin", "version": "2" }
    },
    {
      "scheme": "upto",
      "network": "eip155:8453",
      "asset": "0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913",
      "amount": "1000",
      "payTo": "0x8581784D3E598cCa3482375CFF2409Ac9DD8c402",
      "maxTimeoutSeconds": 120,
      "extra": { "name": "USD Coin", "version": "2", "facilitatorAddress": "0x..." }
    }
  ],
  "resource": {
    "url": "/api/inference/v1/chat/completions",
    "description": "LLM inference request for model llama-3.3-70b",
    "mimeType": "application/json"
  }
}
```

- `amount` is in USDC micro-units (6 decimals). "1000" = $0.001 USDC.
- `payTo` is the settlement address on Base.
- `asset` is the USDC contract on Base.
- The same data is in the `PAYMENT-REQUIRED` response header (base64-encoded).

### Pricing

Pricing is model-specific and market-based. SI routes to the cheapest available seller for the requested model. Prices are often below direct provider rates, but the final amount is always returned in the 402 challenge. For `upto`, `amount` is a maximum; actual settlement is based on real usage plus any x402 flat fee. For `exact`, the full estimated amount is settled.
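
Because `amount` arrives as a string of USDC micro-units, an agent usually converts it before logging or comparing it with a spending limit. The sketch below shows that arithmetic under stated assumptions: `formatUsdc` and `withinBudget` are hypothetical helper names, and the budget check is an illustrative agent-side policy, not something the API requires.

```typescript
const MICRO_UNITS_PER_USDC = 1000000; // USDC uses 6 decimals

// Convert an x402 `amount` string (USDC micro-units) to a 6-decimal string.
// Uses Number arithmetic, which is adequate for per-request amounts far below 2^53.
function formatUsdc(amount: string): string {
  if (!/^\d+$/.test(amount)) {
    throw new Error('amount must be a non-negative integer string: ' + amount);
  }
  return (Number(amount) / MICRO_UNITS_PER_USDC).toFixed(6);
}

// Hypothetical agent-side guard: accept a challenge only if its amount
// (the maximum for `upto`, the fixed price for `exact`) fits the budget.
function withinBudget(amount: string, maxMicroUnits: number): boolean {
  return /^\d+$/.test(amount) && Number(amount) <= maxMicroUnits;
}

console.log(formatUsdc('1000')); // prints "0.001000"
console.log(withinBudget('1000', 5000)); // prints true
```

For `upto`, the value checked here is only the authorized ceiling, so the settled amount can be lower than what the guard approved.
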

---

## Endpoints

| Method | Path | Description |
|--------|------|-------------|
| GET | /api/inference/v1/models | List models with live pricing |
| GET | /api/inference/v1/prices | Direct provider price comparison matrix |
| POST | /api/inference/v1/chat/completions | Chat completions (OpenAI-compatible) |
| POST | /api/inference/v1/completions | Legacy completions |
| GET | /api/inference/markets | Market summary per model |
| GET | /api/inference/markets/:model | Order book + recent trades |

## Available Models

Call `GET /api/inference/v1/models` for the live catalog and `GET /api/inference/markets` for models with active seller liquidity. Use `GET /api/inference/v1/prices` for the live provider price comparison matrix.

Current live public API snapshot: 104 catalog models and 73 active marketplace models. Active production seller liquidity is currently concentrated on Venice and Bankr LLM Gateway.

Examples: Claude Opus 4.6, Claude Sonnet 4.5, GPT-5.4, GPT-5.2, Gemini 3 Pro/Flash, DeepSeek V3.2, Qwen3 235B, Llama 3.3 70B, Mistral Large, Grok 4, Kimi K2.

## Payment Methods

### 1. x402: Per-Request (see Quick Start above)

No account needed. Agent signs USDC payment per request. `upto` is best for autonomous agents because it pays actual usage; `exact` remains available as fallback.

### 2. API Key: Persistent auth

Sign in with a wallet (SIWE) → get `inf_xxx` buyer API key → use like OpenAI:

```
Authorization: Bearer inf_xxx
```

Requires USDC allowance on the settlement contract (`0x0770d2124C0a581C28Cfc47a659817145e6Cc137`).

### 3. MPP (Tempo): Per-Request

`Authorization: Payment ` header. See 402 response `www-authenticate` header.
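
For the API key method (option 2 above), the request is an ordinary OpenAI-style call with a bearer token. The sketch below only assembles that request so it can be inspected before sending. It assumes the key-authenticated route is the `/api/inference/v1/chat/completions` path from the Endpoints table under the Base URL; `PreparedRequest` and `buildChatRequest` are hypothetical names, and `inf_xxx` is the placeholder key from this document.

```typescript
const BASE_URL = 'https://www.surplusintelligence.ai';

// Plain description of an HTTP request, independent of any fetch typings.
interface PreparedRequest {
  url: string;
  method: string;
  headers: { [name: string]: string };
  body: string;
}

// Hypothetical helper: assemble a chat completion request authenticated
// with a buyer API key in the Authorization header.
function buildChatRequest(
  apiKey: string,
  model: string,
  prompt: string,
  maxTokens: number,
): PreparedRequest {
  return {
    // Assumption: API-key requests use the non-x402 path from the Endpoints table.
    url: BASE_URL + '/api/inference/v1/chat/completions',
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: 'Bearer ' + apiKey,
    },
    body: JSON.stringify({
      model: model,
      messages: [{ role: 'user', content: prompt }],
      max_tokens: maxTokens,
    }),
  };
}

const prepared = buildChatRequest('inf_xxx', 'llama-3.3-70b', 'Hello', 100);
console.log(prepared.url);
```

Sending it is then a single call such as `fetch(prepared.url, { method: prepared.method, headers: prepared.headers, body: prepared.body })`.
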

## Contracts (Base, chain ID 8453)

- Settlement proxy: `0x0770d2124C0a581C28Cfc47a659817145e6Cc137`
- USDC: `0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913`
- Current fee multiplier: `10000` (1.0x)
- Contract has role-based admin/upgrade controls but no contract-level timelock today

## For Sellers

Sell inference capacity via the marketplace. SIWE auth, no browser needed.

1. `GET /api/inference/sellers/auth/challenge?address=0xYourWallet` → SIWE message
2. `POST /api/inference/sellers/auth/key` → `si_seller_xxx` API key
3. `POST /api/inference/sellers/offers` → list your model + price

Full seller API: /docs/api-reference/seller-endpoints.md
Security/privacy notes: /docs/marketplace/security-privacy.md

Seller key handling summary: provider API keys are encrypted in Postgres with AES-256-GCM using an application secret currently stored as a Vercel env var, not KMS/HSM. Offer DELETE is a soft delete today; revoke the upstream provider key for immediate invalidation. Normal chat/completions usage logs store metadata/token/cost/settlement fields, not full prompt/response bodies. There is no marketplace per-hour seller quota cap or model-authenticity attestation today.

## Discovery

- `/.well-known/x402.json`: x402 resource discovery
- `/api/x402/info`: x402 inference payment info
- `/api/docs/payments/x402`: x402 integration guide
- `/openapi.json`: OpenAPI spec
- `/llms.txt`: this file

## Deprecated Credit Purchases

`/api/x402/purchase` previously sold Venice API credit packages via x402. That flow is deprecated and now returns `410 Gone`. Use x402 per-request inference endpoints instead.

## x402 API Resource Endpoints

Surplus Intelligence also exposes seller-backed API resources via x402. Sellers list upstream API keys and configure a discount/multiplier versus official upstream prices; agents pay per request in USDC on Base.

### Twitter/X API v2 (read-only)

Base path: `https://www.surplusintelligence.ai/x402/api/twitter/v2/...`

Examples:

```bash
curl -i "https://www.surplusintelligence.ai/x402/api/twitter/v2/tweets/search/recent?query=x402&max_results=10"
curl -i "https://www.surplusintelligence.ai/x402/api/twitter/v2/users/by/username/mac_eth"
```

The first request returns an x402 402 challenge. Sign an `accepts[]` requirement and retry with `PAYMENT-SIGNATURE`. Write/action endpoints are intentionally not supported. User-context engagement reads such as liking/retweeted-by require the buyer to provide `X-Twitter-OAuth-Token`; seller credentials are never used for account-control access.

### Venice Crypto RPC

Base path: `https://www.surplusintelligence.ai/x402/api/venice/crypto/rpc/{network}`

Example:

```bash
curl -i https://www.surplusintelligence.ai/x402/api/venice/crypto/rpc/base-mainnet \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","method":"eth_chainId","params":[],"id":1}'
```

Supported networks mirror Venice's crypto RPC network slugs, including `ethereum-mainnet`, `base-mainnet`, `arbitrum-mainnet`, `optimism-mainnet`, `polygon-mainnet`, `linea-mainnet`, `avalanche-mainnet`, `bsc-mainnet`, `blast-mainnet`, `zksync-mainnet`, and `starknet-mainnet` plus their documented testnets.

Discovery: see `/.well-known/x402.json`, `/.well-known/x402`, `/api/x402/info`, and `/openapi.json`.
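
For the Venice Crypto RPC resource, the same JSON-RPC call shown with `curl` can be assembled programmatically. This is a small sketch that only builds the URL and body for the unpaid first request; `buildRpcCall` and `isListedMainnet` are hypothetical helpers, and the slug list covers just the mainnets named above, not the testnets.

```typescript
const RPC_BASE = 'https://www.surplusintelligence.ai/x402/api/venice/crypto/rpc/';

// Mainnet slugs named in this document; documented testnets are not listed here.
const LISTED_MAINNETS: string[] = [
  'ethereum-mainnet', 'base-mainnet', 'arbitrum-mainnet', 'optimism-mainnet',
  'polygon-mainnet', 'linea-mainnet', 'avalanche-mainnet', 'bsc-mainnet',
  'blast-mainnet', 'zksync-mainnet', 'starknet-mainnet',
];

// True when the slug is one of the mainnets listed above.
function isListedMainnet(network: string): boolean {
  return LISTED_MAINNETS.indexOf(network) !== -1;
}

// Hypothetical helper: build the URL and JSON-RPC 2.0 body for one call.
function buildRpcCall(
  network: string,
  method: string,
  params: unknown[],
  id: number,
): { url: string; body: string } {
  return {
    url: RPC_BASE + network,
    body: JSON.stringify({ jsonrpc: '2.0', method: method, params: params, id: id }),
  };
}

const rpcCall = buildRpcCall('base-mainnet', 'eth_chainId', [], 1);
console.log(rpcCall.url);
console.log(rpcCall.body);
```

Posting `rpcCall.body` to `rpcCall.url` without a payment header should produce the 402 challenge, which is then signed and retried as in the Quick Start.
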