# Surplus Intelligence

> LLM inference marketplace. Cheapest seller wins, USDC settlement on Base.
> Pay per-request via x402 (no account needed) or use an API key.

## Base URL

https://www.surplusintelligence.ai

## Quick Start — x402 Per-Request Payment (Recommended for Agents)

No account and no API key. Just a wallet with USDC on Base.

Surplus Intelligence advertises two x402 schemes:
- `upto` (preferred): authorize a maximum, then pay only actual usage after the response completes.
- `exact` (fallback): sign and settle the fixed estimated amount.

For `upto`, the buyer needs a one-time USDC approval to Permit2 (`0x000000000022D473030F116dDEE9F6B43aC78BA3`). After that, requests are gasless for the buyer; CDP sponsors the x402 settlement gas when available.

### Step 1: Send request → get 402 challenge

```
POST /x402/api/inference/v1/chat/completions
Content-Type: application/json

{"model": "llama-3.3-70b", "messages": [{"role": "user", "content": "Hello"}], "max_tokens": 100}
```

Returns HTTP 402. Payment requirements are in:
- **Header:** `PAYMENT-REQUIRED` (base64-encoded JSON, x402 v2 standard)
- **Compatibility header:** `x-payment-required` contains the same base64 payload for legacy clients
- **Body:** `accepts[]` array with scheme, network, asset, amount, payTo

Do not rely on the order of `accepts[]`. Select by `scheme`: prefer `upto` when your client supports it, otherwise use `exact` as compatibility fallback.
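The decode-and-select step can be sketched as below. This is a minimal illustration, not the SDK's implementation: the helper names `decodePaymentRequired` and `selectRequirement` are hypothetical, and the interfaces cover only the fields this document lists.

```typescript
// One entry of `accepts[]`, limited to the fields listed in this document.
interface PaymentRequirement {
  scheme: string;
  network: string;
  asset: string;
  amount: string;
  payTo: string;
}

interface PaymentRequired {
  x402Version: number;
  accepts: PaymentRequirement[];
}

// Decode the base64-encoded JSON carried in the PAYMENT-REQUIRED header.
function decodePaymentRequired(headerValue: string): PaymentRequired {
  const bytes = Uint8Array.from(atob(headerValue), (ch) => ch.charCodeAt(0));
  return JSON.parse(new TextDecoder().decode(bytes)) as PaymentRequired;
}

// Pick a requirement by scheme name in the caller's order of preference,
// ignoring where each entry sits in `accepts[]`.
function selectRequirement(
  accepts: PaymentRequirement[],
  preferred: string[] = ['upto', 'exact'],
): PaymentRequirement | undefined {
  for (const scheme of preferred) {
    const match = accepts.find((a) => a.scheme === scheme);
    if (match) return match;
  }
  return undefined;
}
```

A client that only supports `exact` would call `selectRequirement(accepts, ['exact'])`.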

### Step 2: Sign payment and retry

For `upto`, sign a Permit2 witness authorization for the maximum `amount` in the 402 response. SI settles only the actual usage after inference succeeds. For `exact`, sign an EIP-3009 `TransferWithAuthorization` for the fixed amount.

Resend the same request with the signed payload:

```
POST /x402/api/inference/v1/chat/completions
Content-Type: application/json
PAYMENT-SIGNATURE: <base64-encoded signed payment payload>

{"model": "llama-3.3-70b", "messages": [{"role": "user", "content": "Hello"}], "max_tokens": 100}
```

Returns 200 with an OpenAI-compatible chat completion response.
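A minimal sketch of building the retry, assuming the signed payload is already available as a plain object. `encodePaymentSignature` and `buildPaidRequest` are hypothetical helper names; the payload's internal shape depends on the scheme and is not modeled here.

```typescript
// Encode a signed payment payload as base64 JSON for the PAYMENT-SIGNATURE header.
function encodePaymentSignature(signedPayload: unknown): string {
  const bytes = new TextEncoder().encode(JSON.stringify(signedPayload));
  let binary = '';
  bytes.forEach((b) => {
    binary += String.fromCharCode(b);
  });
  return btoa(binary);
}

// Build the fetch options for the paid retry. The body must match the
// original unpaid request.
function buildPaidRequest(body: unknown, signedPayload: unknown) {
  return {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'PAYMENT-SIGNATURE': encodePaymentSignature(signedPayload),
    },
    body: JSON.stringify(body),
  };
}
```

The returned object can be passed as the second argument to `fetch` together with the endpoint URL.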

### Using @x402 SDKs (simplest)

```typescript
import { wrapFetchWithPayment } from '@x402/fetch';
import { x402Client } from '@x402/core/client';
import { ExactEvmScheme } from '@x402/evm/exact/client';
import { privateKeyToAccount } from 'viem/accounts';

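// viem expects a 0x-prefixed hex key; the cast satisfies its `Hex` type.
const signer = privateKeyToAccount(process.env.EVM_PRIVATE_KEY as `0x${string}`);
const client = new x402Client();
// Registers the `exact` scheme for all EVM networks.
client.register('eip155:*', new ExactEvmScheme(signer));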

const fetchWithPayment = wrapFetchWithPayment(fetch, client);

const response = await fetchWithPayment(
  'https://www.surplusintelligence.ai/x402/api/inference/v1/chat/completions',
  {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model: 'llama-3.3-70b',
      messages: [{ role: 'user', content: 'Hello' }],
      max_tokens: 100,
    }),
  }
);

const data = await response.json();
console.log(data.choices[0].message.content);
```


### Direct `upto` signing with @x402/evm + viem

If you sign `upto` manually, use a signer object with an explicit `address`. Do **not** pass a raw viem `walletClient` to a helper if `walletClient.address` is undefined — the resulting Permit2 payload will be rejected by the facilitator.

```typescript
import { createPublicClient, http } from 'viem';
import { base } from 'viem/chains';
import { privateKeyToAccount } from 'viem/accounts';
import { UptoEvmScheme } from '@x402/evm';

const endpoint = 'https://www.surplusintelligence.ai/x402/api/inference/v1/chat/completions';
// viem expects a 0x-prefixed hex key; the cast satisfies its `Hex` type.
const account = privateKeyToAccount(process.env.EVM_PRIVATE_KEY as `0x${string}`);
// Read-only client for the on-chain lookups the signer exposes below.
const publicClient = createPublicClient({ chain: base, transport: http() });

const signer = {
  address: account.address,
  signTypedData: (msg) => account.signTypedData(msg),
  readContract: publicClient.readContract.bind(publicClient),
  getTransactionCount: publicClient.getTransactionCount.bind(publicClient),
  estimateFeesPerGas: publicClient.estimateFeesPerGas.bind(publicClient),
};

const body = { model: 'llama-3.3-70b', messages: [{ role: 'user', content: 'Hello' }], max_tokens: 100 };
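// 1. Unpaid request: the 402 response carries the challenge in PAYMENT-REQUIRED.
const challenge = await fetch(endpoint, { method: 'POST', headers: { 'Content-Type': 'application/json' }, body: JSON.stringify(body) });
const challengeHeader = challenge.headers.get('PAYMENT-REQUIRED');
if (!challengeHeader) throw new Error(`Expected a 402 challenge, got HTTP ${challenge.status}`);
const paymentRequired = JSON.parse(Buffer.from(challengeHeader, 'base64').toString());

// 2. Select `upto` by scheme name, never by array position.
const upto = paymentRequired.accepts.find((a) => a.scheme === 'upto');
if (!upto) throw new Error('The 402 challenge did not offer the `upto` scheme');

// 3. Sign the Permit2 authorization and retry with PAYMENT-SIGNATURE.
const paymentPayload = await new UptoEvmScheme(signer).createPaymentPayload(2, upto);
const paymentHeader = Buffer.from(JSON.stringify(paymentPayload)).toString('base64');
const paid = await fetch(endpoint, { method: 'POST', headers: { 'Content-Type': 'application/json', 'PAYMENT-SIGNATURE': paymentHeader }, body: JSON.stringify(body) });
console.log(await paid.json());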
```

### Using Python

```python
from x402 import x402_fetch

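# `wallet` is your signer for a wallet holding USDC on Base; create it beforehand (setup not shown).
response = x402_fetch(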
    'https://www.surplusintelligence.ai/x402/api/inference/v1/chat/completions',
    wallet=wallet,
    method='POST',
    json={
        'model': 'llama-3.3-70b',
        'messages': [{'role': 'user', 'content': 'Hello'}],
        'max_tokens': 100,
    }
)
```

### 402 Response Format

The response includes both `exact` and `upto` payment requirements. Current production responses emit `exact` first and `upto` second, but clients should select by `scheme` rather than array position.

```json
{
  "x402Version": 2,
  "accepts": [
    {
      "scheme": "exact",
      "network": "eip155:8453",
      "asset": "0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913",
      "amount": "1000",
      "payTo": "0x8581784D3E598cCa3482375CFF2409Ac9DD8c402",
      "maxTimeoutSeconds": 120,
      "extra": { "name": "USD Coin", "version": "2" }
    },
    {
      "scheme": "upto",
      "network": "eip155:8453",
      "asset": "0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913",
      "amount": "1000",
      "payTo": "0x8581784D3E598cCa3482375CFF2409Ac9DD8c402",
      "maxTimeoutSeconds": 120,
      "extra": {
        "name": "USD Coin",
        "version": "2",
        "facilitatorAddress": "0x..."
      }
    }
  ],
  "resource": {
    "url": "/api/inference/v1/chat/completions",
    "description": "LLM inference request for model llama-3.3-70b",
    "mimeType": "application/json"
  }
}
```

- `amount` is in USDC micro-units (6 decimals): `"1000"` is 0.001 USDC.
- `payTo` is the settlement address on Base.
- `asset` is the USDC contract on Base.
- The same data is in the `PAYMENT-REQUIRED` response header (base64-encoded).
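The micro-unit conversion can be sketched with plain string arithmetic, which avoids floating-point rounding. `formatUsdc` is a hypothetical helper name, not part of any SDK.

```typescript
const USDC_DECIMALS = 6;

// Convert a micro-unit amount string (as found in `amount`) to a decimal USDC string.
function formatUsdc(microUnits: string): string {
  if (!/^\d+$/.test(microUnits)) {
    throw new Error(`Not a non-negative integer string: ${microUnits}`);
  }
  // Left-pad so there is always at least one digit before the decimal point.
  let digits = microUnits;
  while (digits.length <= USDC_DECIMALS) digits = '0' + digits;
  const whole = digits.slice(0, -USDC_DECIMALS);
  // Drop trailing zeros from the fractional part.
  const fraction = digits.slice(-USDC_DECIMALS).replace(/0+$/, '');
  return fraction ? `${whole}.${fraction}` : whole;
}

console.log(formatUsdc('1000')); // prints "0.001"
```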

### Pricing

Pricing is model-specific and market-based. SI routes to the cheapest available seller for the requested model. Prices are often below direct provider rates, but the final amount is always returned in the 402 challenge.

For `upto`, `amount` is a maximum; actual settlement is based on real usage plus any x402 flat fee. For `exact`, the full estimated amount is settled.
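As a rough illustration of the difference between the two schemes, the sketch below estimates what a buyer should expect to settle. The function name and the exact formula are assumptions: this document states only that `upto` settles actual usage plus any flat fee under an authorized maximum, so the cap via `Math.min` is an inference rather than the contract's confirmed logic.

```typescript
type Scheme = 'upto' | 'exact';

// All values are USDC micro-units held as integers.
// ASSUMPTION: `upto` settlement is capped at the authorized maximum.
function expectedSettlement(
  scheme: Scheme,
  amount: number,
  usageCost: number,
  flatFee: number = 0,
): number {
  if (scheme === 'exact') return amount; // full estimated amount is settled
  return Math.min(amount, usageCost + flatFee);
}

console.log(expectedSettlement('upto', 1000, 400, 50)); // prints 450
console.log(expectedSettlement('exact', 1000, 400, 50)); // prints 1000
```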

---

## Endpoints

| Method | Path | Description |
|--------|------|-------------|
| GET | /api/inference/v1/models | List models with live pricing |
| GET | /api/inference/v1/prices | Direct provider price comparison matrix |
| POST | /api/inference/v1/chat/completions | Chat completions (OpenAI-compatible) |
| POST | /api/inference/v1/completions | Legacy completions |
| GET | /api/inference/markets | Market summary per model |
| GET | /api/inference/markets/:model | Order book + recent trades |
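The read-only GET endpoints can be queried with a small helper like the one below. It is a sketch: `getJson`, `marketPath`, and `FetchLike` are hypothetical names, and the response bodies are returned untyped because their shapes are not specified in this file.

```typescript
const BASE_URL = 'https://www.surplusintelligence.ai';

// Minimal subset of the fetch contract this sketch relies on.
type FetchLike = (
  url: string,
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// GET a path under the base URL and return the parsed JSON body.
async function getJson(path: string, fetchImpl: FetchLike): Promise<unknown> {
  const res = await fetchImpl(`${BASE_URL}${path}`);
  if (!res.ok) throw new Error(`GET ${path} failed with HTTP ${res.status}`);
  return res.json();
}

// Path for one model's order book and recent trades (fills the `:model` segment).
function marketPath(model: string): string {
  return `/api/inference/markets/${encodeURIComponent(model)}`;
}

// Usage (performs network requests):
//   const models = await getJson('/api/inference/v1/models', fetch);
//   const book = await getJson(marketPath('llama-3.3-70b'), fetch);
```

Passing the fetch implementation as a parameter keeps the helper easy to exercise against a stub.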

## Available Models

Call `GET /api/inference/v1/models` for the live catalog and `GET /api/inference/markets` for models with active seller liquidity. Use `GET /api/inference/v1/prices` for the live provider price comparison matrix.
Current live public API snapshot: 104 catalog models and 73 active marketplace models. Active production seller liquidity is currently concentrated on Venice and Bankr LLM Gateway.

Examples: Claude Opus 4.6, Claude Sonnet 4.5, GPT-5.4, GPT-5.2, Gemini 3 Pro/Flash,
DeepSeek V3.2, Qwen3 235B, Llama 3.3 70B, Mistral Large, Grok 4, Kimi K2.

## Payment Methods

### 1. x402 — Per-Request (see Quick Start above)
No account needed. Agent signs USDC payment per request. `upto` is best for autonomous agents because it pays actual usage; `exact` remains available as fallback.

### 2. API Key — Persistent auth
Sign in with a wallet (SIWE) → get `inf_xxx` buyer API key → use like OpenAI:
```
Authorization: Bearer inf_xxx
```
Requires USDC allowance on the settlement contract (`0x0770d2124C0a581C28Cfc47a659817145e6Cc137`).
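A minimal sketch of an API-key request. `buildApiKeyRequest` is a hypothetical helper, and the use of the non-x402 path from the Endpoints table for key-authenticated calls is an assumption.

```typescript
// ASSUMPTION: API-key requests go to the non-x402 chat completions path.
const CHAT_COMPLETIONS_URL =
  'https://www.surplusintelligence.ai/api/inference/v1/chat/completions';

// Build the URL and fetch options for a chat completion authenticated by a buyer key.
function buildApiKeyRequest(apiKey: string, body: unknown) {
  return {
    url: CHAT_COMPLETIONS_URL,
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify(body),
    },
  };
}

// Usage (performs a network request):
//   const { url, init } = buildApiKeyRequest('inf_xxx', { model: 'llama-3.3-70b', messages: [] });
//   const res = await fetch(url, init);
```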

### 3. MPP (Tempo) — Per-Request
Send an `Authorization: Payment <credential>` header. The challenge details are in the `www-authenticate` header of the 402 response.

## Contracts (Base, chain ID 8453)

- Settlement proxy: `0x0770d2124C0a581C28Cfc47a659817145e6Cc137`
- USDC: `0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913`
- Current fee multiplier: `10000` (1.0x)
- Contract has role-based admin/upgrade controls but no contract-level timelock today

## For Sellers

Sell inference capacity via the marketplace. SIWE auth, no browser needed.

1. `GET /api/inference/sellers/auth/challenge?address=0xYourWallet` → SIWE message
2. `POST /api/inference/sellers/auth/key` → `si_seller_xxx` API key
3. `POST /api/inference/sellers/offers` → list your model + price

Full seller API: /docs/api-reference/seller-endpoints.md
Security/privacy notes: /docs/marketplace/security-privacy.md

Seller key handling summary:
- Provider API keys are encrypted in Postgres with AES-256-GCM using an application secret currently stored as a Vercel env var, not KMS/HSM.
- Offer DELETE is a soft delete today; revoke the upstream provider key for immediate invalidation.
- Normal chat/completions usage logs store metadata/token/cost/settlement fields, not full prompt/response bodies.
- There is no marketplace per-hour seller quota cap or model-authenticity attestation today.

## Discovery

- `/.well-known/x402.json` — x402 resource discovery
- `/api/x402/info` — x402 inference payment info
- `/api/docs/payments/x402` — x402 integration guide
- `/openapi.json` — OpenAPI spec
- `/llms.txt` — this file

## Deprecated Credit Purchases

`/api/x402/purchase` previously sold Venice API credit packages via x402. That flow is deprecated and now returns `410 Gone`. Use x402 per-request inference endpoints instead.

## x402 API Resource Endpoints

Surplus Intelligence also exposes seller-backed API resources via x402. Sellers list upstream API keys and configure a discount/multiplier versus official upstream prices; agents pay per request in USDC on Base.

### Twitter/X API v2 — read-only

Base path: `https://www.surplusintelligence.ai/x402/api/twitter/v2/...`

Examples:

```bash
curl -i "https://www.surplusintelligence.ai/x402/api/twitter/v2/tweets/search/recent?query=x402&max_results=10"
curl -i "https://www.surplusintelligence.ai/x402/api/twitter/v2/users/by/username/mac_eth"
```

The first request returns an x402 402 challenge. Sign one of the `accepts[]` requirements and retry with `PAYMENT-SIGNATURE`. Write/action endpoints are intentionally not supported. User-context engagement reads, such as liking-users and retweeted-by lookups, require the buyer to provide `X-Twitter-OAuth-Token`; seller credentials are never used for account-control access.
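A sketch of assembling a read request, including the optional buyer token. `buildTwitterRead` is a hypothetical helper; it only builds the URL and headers and leaves the 402 handling to the x402 flow described in the Quick Start.

```typescript
const TWITTER_BASE = 'https://www.surplusintelligence.ai/x402/api/twitter/v2';

// Build a GET request description for a Twitter/X API v2 read path.
// `oauthToken` is only needed for user-context engagement reads.
function buildTwitterRead(
  path: string,
  query: Record<string, string> = {},
  oauthToken?: string,
) {
  const url = new URL(`${TWITTER_BASE}/${path.replace(/^\/+/, '')}`);
  Object.keys(query).forEach((key) => url.searchParams.set(key, query[key]));
  const headers: Record<string, string> = {};
  if (oauthToken) headers['X-Twitter-OAuth-Token'] = oauthToken;
  return { url: url.toString(), method: 'GET', headers };
}

// Usage:
//   buildTwitterRead('tweets/search/recent', { query: 'x402', max_results: '10' });
```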

### Venice Crypto RPC

Base path: `https://www.surplusintelligence.ai/x402/api/venice/crypto/rpc/{network}`

Example:

```bash
curl -i https://www.surplusintelligence.ai/x402/api/venice/crypto/rpc/base-mainnet \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","method":"eth_chainId","params":[],"id":1}'
```

Supported networks mirror Venice's crypto RPC network slugs, including `ethereum-mainnet`, `base-mainnet`, `arbitrum-mainnet`, `optimism-mainnet`, `polygon-mainnet`, `linea-mainnet`, `avalanche-mainnet`, `bsc-mainnet`, `blast-mainnet`, `zksync-mainnet`, and `starknet-mainnet` plus their documented testnets.
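The curl example above generalizes to any network slug and JSON-RPC method. A minimal sketch, with `buildRpcCall` as a hypothetical helper name:

```typescript
const RPC_BASE = 'https://www.surplusintelligence.ai/x402/api/venice/crypto/rpc';

// Build the URL and fetch options for a JSON-RPC 2.0 call on a given network slug.
function buildRpcCall(
  network: string,
  method: string,
  params: unknown[] = [],
  id: number = 1,
) {
  return {
    url: `${RPC_BASE}/${encodeURIComponent(network)}`,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ jsonrpc: '2.0', method, params, id }),
    },
  };
}

// Usage (the first, unpaid request returns the 402 challenge):
//   const { url, init } = buildRpcCall('base-mainnet', 'eth_chainId');
//   const res = await fetch(url, init);
```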

Discovery: see `/.well-known/x402.json`, `/.well-known/x402`, `/api/x402/info`, and `/openapi.json`.