Music Generations

> Minimum-discount routing: Prefix the path with a min{N} segment (e.g. /min30/v1/music/generations) to require marketplace seller offers to meet a minimum estimated buyer discount before routing. Buyer-owned providers are not covered. See Minimum-Discount Routing.
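
As a minimal sketch of the prefixing rule above, the helper below turns a normal path into its minimum-discount form. The function name `with_min_discount` is hypothetical, and the check that N is a non-negative integer is an assumption; the accepted range of N is not stated here.

```python
def with_min_discount(path: str, min_discount: int) -> str:
    """Prefix an API path with a min{N} segment.

    Assumption: N is a non-negative integer. The valid range is not
    documented in this section.
    """
    if not path.startswith("/"):
        raise ValueError("path must start with '/'")
    if min_discount < 0:
        raise ValueError("min_discount must be non-negative")
    return f"/min{min_discount}{path}"


print(with_min_discount("/v1/music/generations", 30))
# /min30/v1/music/generations
```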

Async music and text-to-audio generation via the marketplace. Same flow as video: submit, poll, retrieve.

Submit Job

POST /v1/music/generations

Auth: API key required (Authorization: Bearer *).

Request Body:

| Field | Type | Required | Description |
|---|---|---|---|
| model | string | | Model ID, e.g. venice-ace-step-15; legacy venice-music-gen maps to ACE-Step |
| prompt | string | | Prompt or narration text (max 2000 chars) |
| duration_seconds | integer | | Track/audio duration. Optional; omitted values use the model's default duration when known |
| lyrics_prompt | string | | Lyrics text (max 5000 chars) for models that support lyrics |
| force_instrumental | boolean | | Generate without vocals when supported |
| voice | string | | Voice selection for TTS-like audio models |
| language_code | string | | Language for lyrics or speech models when supported |
| webhook_url | string | | URL for terminal state callbacks |
| metadata | object | | Client pass-through metadata |

Current Venice-backed examples include venice-ace-step-15, venice-elevenlabs-music, venice-minimax-music-v26, venice-stable-audio-25, venice-elevenlabs-sound-effects-v2, venice-mmaudio-v2-text-to-audio, and Venice audio TTS models. Query /v1/models and /api/markets for the live catalog and active order book.
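
A submit request can be sketched with the standard library as follows. This is an illustration under assumptions, not the official client: `BASE_URL` is a placeholder host, `build_submit_request` is a hypothetical helper, the JSON `Content-Type` header is assumed, and the length limits are checked with `len()` on the assumption that "chars" means Unicode characters. The request is only built here, not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder host (assumption)


def build_submit_request(api_key: str, model: str, prompt: str, **optional):
    """Build (but do not send) a POST /v1/music/generations request."""
    if len(prompt) > 2000:
        raise ValueError("prompt exceeds 2000 characters")
    lyrics = optional.get("lyrics_prompt")
    if lyrics is not None and len(lyrics) > 5000:
        raise ValueError("lyrics_prompt exceeds 5000 characters")

    body = {"model": model, "prompt": prompt}
    # Only send optional fields that were actually provided.
    body.update({k: v for k, v in optional.items() if v is not None})

    return urllib.request.Request(
        BASE_URL + "/v1/music/generations",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumed
        },
        method="POST",
    )


req = build_submit_request(
    "YOUR_API_KEY",
    "venice-ace-step-15",
    "warm lo-fi beat with soft piano",
    duration_seconds=30,
    force_instrumental=True,
)
# Send with urllib.request.urlopen(req) and expect 202 Accepted.
```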

Response: 202 Accepted

Same envelope as video:

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "object": "media.job",
  "kind": "music",
  "status": "queued",
  "created": 1713200000,
  "expires_at": 1713200900,
  "estimated_cost_usdc": "30000",
  "max_cost_usdc": "33000",
  "poll_url": "/v1/music/generations/550e8400-...",
  "cancel_url": "/v1/music/generations/550e8400-...",
  "job_token": "mjt_abc123..."
}
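
The envelope above can be read as in the sketch below. It assumes `created` and `expires_at` are Unix timestamps in seconds (the 900-second gap in the example matches a 15-minute expiry) and keeps the cost strings as raw integers, since their unit is not stated here. `JobSummary` and `summarize_job` are hypothetical names.

```python
import json
from dataclasses import dataclass


@dataclass
class JobSummary:
    id: str
    status: str
    ttl_seconds: int     # expires_at - created (assumed Unix seconds)
    estimated_cost: int  # raw integer from the string field, unit not converted
    max_cost: int
    job_token: str


def summarize_job(raw: str) -> JobSummary:
    """Parse a 202 Accepted body into the fields a client needs for polling."""
    job = json.loads(raw)
    if job.get("object") != "media.job":
        raise ValueError("not a media.job envelope")
    return JobSummary(
        id=job["id"],
        status=job["status"],
        ttl_seconds=job["expires_at"] - job["created"],
        estimated_cost=int(job["estimated_cost_usdc"]),
        max_cost=int(job["max_cost_usdc"]),
        job_token=job["job_token"],
    )


EXAMPLE = """{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "object": "media.job",
  "kind": "music",
  "status": "queued",
  "created": 1713200000,
  "expires_at": 1713200900,
  "estimated_cost_usdc": "30000",
  "max_cost_usdc": "33000",
  "job_token": "mjt_abc123..."
}"""

summary = summarize_job(EXAMPLE)
print(summary.ttl_seconds)  # 900
```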

Poll Status

GET /v1/music/generations/:id

Auth: API key OR X-Job-Token header.

{
  "id": "550e8400-...",
  "object": "media.job",
  "kind": "music",
  "status": "succeeded",
  "results": [
    {
      "artifact_index": 0,
      "url": "https://media-artifacts.s3.amazonaws.com/async-media/job-id/0.mp3?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Expires=900&X-Amz-Signature=…",
      "download_url": "/v1/media/artifacts/job-id/0",
      "content_type": "audio/mpeg",
      "duration_seconds": 60,
      "bytes": 1200000
    }
  ],
  "usage": {
    "seller_cost_usdc": "30000",
    "buyer_cost_usdc": "30000",
    "settlement_status": "confirmed"
  }
}
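
The two pieces a polling client needs, the auth header and the artifact links, might look like the sketch below. Both helper names are hypothetical. Preferring `download_url` over the presigned `url` is a design choice in this example, not a documented requirement: the presigned URL in the sample carries `X-Amz-Expires=900`, so it is short-lived.

```python
from typing import Optional


def poll_headers(api_key: Optional[str] = None,
                 job_token: Optional[str] = None) -> dict:
    """Return headers for GET /v1/music/generations/:id.

    Either credential is accepted; the API key is used if both are given.
    """
    if api_key:
        return {"Authorization": f"Bearer {api_key}"}
    if job_token:
        return {"X-Job-Token": job_token}
    raise ValueError("an API key or a job token is required")


def artifact_links(job: dict) -> list:
    """Return download_url values in artifact_index order.

    Returns an empty list unless the job status is "succeeded".
    """
    if job.get("status") != "succeeded":
        return []
    results = sorted(job.get("results", []), key=lambda r: r["artifact_index"])
    return [r["download_url"] for r in results]


job = {
    "status": "succeeded",
    "results": [
        {"artifact_index": 0, "download_url": "/v1/media/artifacts/job-id/0"},
    ],
}
print(artifact_links(job))  # ['/v1/media/artifacts/job-id/0']
```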

Cancel Job

DELETE /v1/music/generations/:id

Same as video.
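
For illustration, a cancel request could be built like this. `BASE_URL` is a placeholder host and `build_cancel_request` is a hypothetical helper. Authenticating with the API key is an assumption, since the auth rules for cancellation are documented with the video endpoint rather than here.

```python
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder host (assumption)


def build_cancel_request(api_key: str, job_id: str) -> urllib.request.Request:
    """Build (but do not send) a DELETE /v1/music/generations/:id request."""
    return urllib.request.Request(
        f"{BASE_URL}/v1/music/generations/{job_id}",
        headers={"Authorization": f"Bearer {api_key}"},  # assumed auth scheme
        method="DELETE",
    )


cancel = build_cancel_request(
    "YOUR_API_KEY", "550e8400-e29b-41d4-a716-446655440000"
)
```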

Differences from Video

  • Default expiry: 15 minutes.
  • Poll schedule is faster: 2s initial, 10s cap.
  • Supports lyrics_prompt, force_instrumental, voice, and language_code where the selected Venice model supports them.
  • Seller validation and offer health use /audio/quote so onboarding does not spend provider credits on full generations.
  • Download artifacts are cached/proxied for buyer access. Default artifact retention is 3 hours (MEDIA_ARTIFACT_TTL_SECONDS can override it).
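
The poll schedule in the list above (2s initial, 10s cap) can be sketched as a delay generator. The doubling factor is an assumption: only the initial delay and the cap are given, not how the interval grows between them.

```python
from itertools import islice
from typing import Iterator


def poll_delays(initial: float = 2.0, cap: float = 10.0,
                factor: float = 2.0) -> Iterator[float]:
    """Yield wait times between polls: start at `initial`, grow by `factor`
    (assumed), and never exceed `cap`."""
    delay = initial
    while True:
        yield delay
        delay = min(delay * factor, cap)


print(list(islice(poll_delays(), 5)))  # [2.0, 4.0, 8.0, 10.0, 10.0]
```

A client would sleep for each yielded delay between GET requests and stop once the job reaches a terminal state or its `expires_at` time passes.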