API Reference
Base URL in local dev: http://localhost:4100.
The API is split into:
- OpenAI-compatible gateway: /v1/chat/completions.
- Anthropic-compatible gateway: /v1/messages.
- Anthropic-compatible token counting: /v1/messages/count_tokens.
- Chorus governed intelligence route: model: "punk/chorus" on supported gateway wires.
- Punk management/runtime API: /api/v1/*.
- Health check: /health.
- Static dashboard and docs: /, /docs.
Authentication
If PUNK_API_KEY is not set and login mode has not been activated by a user account or PUNK_REQUIRE_LOGIN=true, Punk runs in open dev mode:
- Headerless protected requests are allowed.
- The default tenant is used.
- Requests are treated as admin.
- Private web fetches are allowed for local demos.
If PUNK_API_KEY is set:
- Protected /v1/* and /api/* routes require Authorization: Bearer <token>.
- The bootstrap token is admin for the default tenant.
- Tenant API keys created through /api/v1/keys are hashed at rest.
Dashboard login sessions can also authenticate /api/v1/* through the HttpOnly punk_session cookie. Gateway traffic under /v1/* stays API-key oriented. See Configuration for open dev, protected, and login mode details.
Identity Headers
| Header | Purpose |
|---|---|
| X-Punk-App | Logical app id. Optional but recommended. |
| X-Punk-Agent | Agent id/name for trust and audit. |
| X-Punk-Subject | Pseudonymous user/subject; also a cache-key safety dimension. |
If an API key is pinned to an app, the pinned app overrides X-Punk-App.
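As a rough illustration of the authentication and identity headers described above, the sketch below assembles a header set for a request. The function name `punk_headers` and its parameters are hypothetical helpers, not part of Punk or its SDK; only the header names come from this reference.

```python
from typing import Dict, Optional


def punk_headers(
    token: Optional[str] = None,
    app: Optional[str] = None,
    agent: Optional[str] = None,
    subject: Optional[str] = None,
) -> Dict[str, str]:
    """Build request headers for a Punk call (hypothetical helper).

    The bearer token is only added when one is supplied, which matches
    open dev mode, where headerless protected requests are allowed.
    """
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    # Identity headers are optional; skip any that were not provided.
    identity = (
        ("X-Punk-App", app),
        ("X-Punk-Agent", agent),
        ("X-Punk-Subject", subject),
    )
    for name, value in identity:
        if value:
            headers[name] = value
    return headers
```

Note that if the key is pinned to an app, the server-side pin still wins over whatever `X-Punk-App` value the client sends.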
Response Headers
| Header | Meaning |
|---|---|
| x-punk-run-id | Run id for trace lookup, feedback, and replay bundle export. |
| x-punk-route | Selected route such as live, exact_cache, artifact, or blocked. |
These headers are exposed through CORS.
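A client will usually want to keep the run id so it can look up traces or submit feedback later. The sketch below pulls both response headers out of any header mapping; `RouteInfo` and `route_info` are illustrative names, and the lookup is made case-insensitive because HTTP libraries differ in how they normalize header names.

```python
from typing import Mapping, NamedTuple, Optional


class RouteInfo(NamedTuple):
    run_id: Optional[str]
    route: Optional[str]


def route_info(headers: Mapping[str, str]) -> RouteInfo:
    """Extract x-punk-run-id and x-punk-route from response headers."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return RouteInfo(
        run_id=lowered.get("x-punk-run-id"),
        route=lowered.get("x-punk-route"),
    )
```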
Gateway
POST /v1/chat/completions
OpenAI-style chat completions wire format.
Minimum body:
{
"model": "gpt-4o",
"messages": [{ "role": "user", "content": "hello" }]
}
Behavior:
- Uses live OpenAI backend when OPENAI_API_KEY is set for OpenAI/default model ids.
- Uses OpenRouter, DeepSeek, or Moonshot/Kimi backends for their model id families when configured.
- Uses deterministic mock provider when no matching live provider is configured.
- Preserves OpenAI-shaped response body.
- Supports streaming through gateway clients.
- Records run, traces, route explanation, cost, latency, policy, and cache/artifact decisions.
- Set model: "punk/chorus" to activate Chorus while preserving the OpenAI-shaped response body.
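The minimum body above can be sent with nothing but the standard library. The following is a minimal sketch that only constructs the request against the local dev base URL; `chat_request` is a hypothetical helper, and sending is left as a comment so the snippet runs without a gateway listening.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:4100"  # local dev base URL from this reference


def chat_request(
    model: str, prompt: str, token: Optional[str] = None
) -> urllib.request.Request:
    """Build a POST /v1/chat/completions request with the minimum body."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    headers = {"Content-Type": "application/json"}
    if token:  # only needed when PUNK_API_KEY or login mode is active
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


req = chat_request("gpt-4o", "hello")
# To send it against a running gateway:
# with urllib.request.urlopen(req) as resp:
#     reply = json.load(resp)
#     text = reply["choices"][0]["message"]["content"]  # OpenAI-shaped body
```

Passing `"punk/chorus"` as the model would select Chorus on the same endpoint.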
POST /v1/messages
Anthropic-compatible Messages endpoint.
Minimum body:
{
"model": "claude-3-5-sonnet-latest",
"max_tokens": 256,
"messages": [{ "role": "user", "content": "hello" }]
}
Behavior:
- Uses live Anthropic backend when ANTHROPIC_API_KEY is set for claude-* models.
- Uses the configured provider registry for Chorus final answers, including Anthropic, OpenAI, OpenRouter, DeepSeek, and Moonshot/Kimi when their keys are present.
- Uses deterministic mock provider when no matching live provider is configured.
- Requires model, non-empty messages, and positive max_tokens.
- Returns Anthropic-shaped responses and validation errors.
- Supports Anthropic streaming events, including structured tool_use blocks and input_json_delta chunks when the upstream provider emits them.
- Records the same runs, traces, route explanations, policy, cache, artifact, and cost data as the OpenAI-compatible gateway.
- Set model: "punk/chorus" to activate Chorus while preserving the Anthropic-shaped response body.
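Because the endpoint rejects bodies without a model, a non-empty message list, or a positive max_tokens, it can be convenient to check those three requirements before sending. This is a client-side sketch only; `messages_body` is an illustrative name, and the server's own validation remains authoritative.

```python
from typing import Any, Dict, List


def messages_body(
    model: str, messages: List[Dict[str, Any]], max_tokens: int
) -> Dict[str, Any]:
    """Build a /v1/messages body, enforcing the documented requirements."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages must be non-empty")
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(max_tokens, bool) or not isinstance(max_tokens, int):
        raise ValueError("max_tokens must be an integer")
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    return {"model": model, "max_tokens": max_tokens, "messages": messages}
```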
POST /v1/messages/count_tokens
Anthropic-compatible token count endpoint. This is useful for Anthropic SDKs and Claude Code clients that preflight prompt size before sending a message request.
Minimum body:
{
"model": "claude-3-5-sonnet-latest",
"messages": [{ "role": "user", "content": "hello" }]
}
Behavior:
- Accepts the same Anthropic-shaped system, messages, tools, tool_choice, and structured content blocks Punk understands for /v1/messages.
- Returns { "input_tokens": number }.
- Does not create a run, write trace events, charge usage, or route to a live provider.
- Applies the same /v1/* auth boundary as the gateway.
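One way to use this endpoint is as a preflight: derive the counting body from the message body you intend to send, then compare the returned count with a limit of your own. The sketch below assumes the client picks that limit; `count_tokens_body` and `fits_budget` are hypothetical helper names.

```python
from typing import Any, Dict

# Fields this reference lists as accepted by the token counting endpoint.
COUNTED_FIELDS = ("model", "system", "messages", "tools", "tool_choice")


def count_tokens_body(message_body: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the fields relevant to counting (drops max_tokens etc.)."""
    return {k: message_body[k] for k in COUNTED_FIELDS if k in message_body}


def fits_budget(count_response: Dict[str, Any], max_input_tokens: int) -> bool:
    """Check a { "input_tokens": number } response against a client limit."""
    return int(count_response["input_tokens"]) <= max_input_tokens
```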
Chorus: model: "punk/chorus"
Chorus is Punk's governed intelligence route for harder work that benefits from routing, evidence, verification, cost controls, and receipts.
It is selected by model id, not by a separate endpoint:
| Wire | Endpoint | Response shape |
|---|---|---|
| OpenAI-style chat | POST /v1/chat/completions | OpenAI chat completion |
| Anthropic-style messages | POST /v1/messages | Anthropic message |
The route records Chorus-specific trace events and receipts:
| Trace event | Purpose |
|---|---|
| chorus.contract | Request classification, policy, budget, and evidence requirements. |
| chorus.claim_graph | Claim-level work plan. |
| chorus.route_selected | Sparse solver assignments and rejected alternatives. |
| chorus.verifier | Verifier results. |
| chorus.research_pack | Source-backed research plan, source cards, and evidence gaps when research mode is enabled. |
| chorus.live_synthesis | Final-answer model, provider, token, cost, and latency metadata when live answer routing is requested. |
| chorus.agent_delegate | Delegate model, provider, key source, wire, and tool count for Anthropic tool-declaring agent steps. |
| chorus.tool_plan | Tool calls returned by the delegate before they are serialized as Anthropic tool_use blocks. |
| chorus.ledger | Accepted/rejected evidence, costs, latency, and confidence. |
| chorus.receipt | Exportable receipt linked to the final answer hash. |
Retrieve receipts through /api/v1/receipts/:runId, /api/v1/runs/:runId/receipt, or /api/v1/runs/:runId/evidence-packet.
Chorus variants are selected with request fields, not separate model ids:
| Focus | Typical controls |
|---|---|
| Fast | latency_mode: "fast", optional quality_mode: "economy" |
| Balanced | latency_mode: "balanced", quality_mode: "balanced" |
| Deep reasoning | latency_mode: "deep", quality_mode: "frontier_optional" |
| Source-backed research | research_mode: "som", receipt_mode: "full" |
| Maximum quality | latency_mode: "maximum_quality", quality_mode: "maximum_quality", optional live_synthesis_model |
| Private/local | local_only: true, optional allowed_model_classes: ["local", "open_weight"] |
| Shadow evaluation | shadow_mode: true, circuit_mode: "learn" |
Core request controls:
| Field | Values |
|---|---|
| budget_limit_usd | number |
| latency_mode | fast, balanced, deep, maximum_quality |
| quality_mode | economy, balanced, frontier_optional, maximum_quality |
| receipt_mode | off, summary, full |
| circuit_mode | off, reuse, learn |
| shadow_mode | boolean |
| research_mode | off, som, deep |
| research_max_queries / research_max_sources | numbers |
| live_synthesis_model / live_synthesis_required / live_synthesis_max_tokens | final-answer controls |
| chorus_agent_model | delegate model for Anthropic tool-declaring agent steps |
| local_only / allowed_model_classes / blocked_providers | model supply and policy constraints |
| chorus | customer metadata preserved in receipts and evidence packets |
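To make the controls concrete, here is a sketch that builds an OpenAI-style Chorus body and checks the enumerated fields against the values in the table above. It assumes the controls sit at the top level of the request body next to model and messages, which this reference implies but does not spell out; `chorus_body` is a hypothetical helper.

```python
from typing import Any, Dict

# Allowed values copied from the "Core request controls" table.
CHORUS_ENUMS = {
    "latency_mode": {"fast", "balanced", "deep", "maximum_quality"},
    "quality_mode": {"economy", "balanced", "frontier_optional", "maximum_quality"},
    "receipt_mode": {"off", "summary", "full"},
    "circuit_mode": {"off", "reuse", "learn"},
    "research_mode": {"off", "som", "deep"},
}


def chorus_body(prompt: str, **controls: Any) -> Dict[str, Any]:
    """Build a chat-completions body that selects Chorus by model id."""
    for field, value in controls.items():
        allowed = CHORUS_ENUMS.get(field)
        if allowed is not None and value not in allowed:
            raise ValueError(f"{field} must be one of {sorted(allowed)}")
    body: Dict[str, Any] = {
        "model": "punk/chorus",
        "messages": [{"role": "user", "content": prompt}],
    }
    # Assumption: controls are top-level request fields.
    body.update(controls)
    return body
```

For example, `chorus_body("Summarize the incident", research_mode="som", receipt_mode="full")` corresponds to the source-backed research row in the variants table.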
Health
GET /health
Returns gateway health and current provider information.
{
"ok": true,
"version": "0.1.0",
"provider": "mock",
"plasmate": false
}
GET /api/v1/readiness
Admin endpoint used by the dashboard's Production readiness panel. It summarizes whether the current deployment is ready to expose to real users.
Checks include:
- Dashboard/API auth.
- Credential vault encryption.
- Live model provider configuration.
- Provider failover posture.
- Background job draining.
- Private web fetch/webhook posture.
- Public marketing/app host split.
- Billing and quota enforcement.
Example response:
{
"generatedAt": 1760000000000,
"summary": { "ready": 6, "attention": 1, "info": 1, "publicReady": false },
"items": [
{
"id": "providers",
"label": "Live model providers",
"status": "ready",
"message": "Configured: OpenAI, Anthropic.",
"action": null,
"actionHref": "#/governance",
"docsHref": "/docs/configuration"
}
]
}
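A deployment script might gate a rollout on this response. The sketch below lists the checks that are not yet ready and reads the overall flag; it assumes item statuses other than "ready" (such as "attention" or "info", inferred from the summary keys) mean the item deserves a look. The function names are illustrative.

```python
from typing import Any, Dict, List


def items_needing_review(readiness: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return readiness items whose status is anything other than 'ready'."""
    return [
        item
        for item in readiness.get("items", [])
        if item.get("status") != "ready"
    ]


def is_public_ready(readiness: Dict[str, Any]) -> bool:
    """Read summary.publicReady, treating a missing flag as not ready."""
    return bool(readiness.get("summary", {}).get("publicReady", False))
```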
API Keys
Admin required.
| Method | Path | Purpose |
|---|---|---|
POST | /api/v1/keys | Create a tenant API key; token returned once. |
GET | /api/v1/keys | List tenant API keys. |
POST | /api/v1/keys/:id/revoke | Revoke a key. |
Create key body:
{
"name": "staging observe",
"mode": "observe",
"appId": "support-agent",
"admin": false
}
Modes:
- observe: record what Punk would have done, but return live response and do not block.
- optimize: allow caches, artifacts, and policy enforcement.
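The create-key body can be assembled with a small check on the mode. In this sketch, `create_key_body` is a hypothetical helper and the default of "observe" is a choice made here for illustration, not a documented server default.

```python
from typing import Any, Dict, Optional

KEY_MODES = {"observe", "optimize"}  # the two modes described above


def create_key_body(
    name: str,
    mode: str = "observe",  # illustrative default, not a documented one
    app_id: Optional[str] = None,
    admin: bool = False,
) -> Dict[str, Any]:
    """Build the JSON body for POST /api/v1/keys."""
    if mode not in KEY_MODES:
        raise ValueError(f"mode must be one of {sorted(KEY_MODES)}")
    body: Dict[str, Any] = {"name": name, "mode": mode, "admin": admin}
    if app_id is not None:
        body["appId"] = app_id  # pins the key to one app
    return body
```

Since the token is returned only once, the caller should store it as soon as the create call succeeds.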
Auth Sessions And Users
Session login is for dashboard humans. User management is admin-only.
| Method | Path | Purpose |
|---|---|---|
POST | /api/v1/auth/login | Create a punk_session cookie. Body: { email, password }. |
POST | /api/v1/auth/logout | Delete the active session cookie. |
GET | /api/v1/auth/me | Return the current user, tenant, admin flag, and password-change state. |
POST | /api/v1/auth/change-password | Change the logged-in user's password. Body: { current, new }. |
GET | /api/v1/users | List users; admin required. |
POST | /api/v1/users | Create a user; admin required. Body: { email, name?, role?, tempPassword }. |
DELETE | /api/v1/users/:id | Delete a user; admin required. |
POST | /api/v1/users/:id/reset-password | Reset password; returns a one-time tempPassword; admin required. |
Organizations And Invites
Organizations are the tenant boundary for dashboard users. A session has one active organization; API keys remain tenant-scoped.
| Method | Path | Purpose |
|---|---|---|
GET | /api/v1/orgs | List organizations for the current user. |
POST | /api/v1/orgs/switch | Switch the session's active organization. Body: { orgId }. |
POST | /api/v1/orgs | Create an organization. |
GET | /api/v1/orgs/active | Read active org, members, and current role. |
PATCH | /api/v1/orgs/active | Rename active org; owner/admin required. |
DELETE | /api/v1/orgs/active/members/:userId | Remove a member and invalidate their sessions; owner required. |
DELETE | /api/v1/orgs/active | Delete the active org and cascade tenant data; owner required. |
POST | /api/v1/orgs/active/invites | Invite a member by email; owner/admin required. |
GET | /api/v1/orgs/active/invites | List active org invites; owner/admin required. |
POST | /api/v1/orgs/active/invites/:id/revoke | Revoke an invite; owner/admin required. |
GET | /api/v1/invites/:token | Inspect an invite before accepting. |
POST | /api/v1/invites/:token/accept | Accept an invite and create or attach a user. |
POST | /api/v1/auth/signup | Public signup when enabled with PUNK_ALLOW_PUBLIC_SIGNUP=true. |
GET | /api/v1/auth/verify/:token | Verify signup email. |
GET | /api/v1/orgs/active/onboarding | Read zero-state onboarding checklist. |
POST | /api/v1/orgs/active/seed-demo | Seed demo workflow and agent for the active org. |
Billing And Usage
| Method | Path | Purpose |
|---|---|---|
GET | /api/v1/usage | Month-to-date usage, quota, spend, savings, and trend. |
GET | /api/v1/usage/attribution | Usage grouped by route, app, agent, model, and workflow. |
GET | /api/v1/plans | Available plans and limits. |
POST | /api/v1/orgs/active/plan | Change the active org plan directly when Stripe is not enabled; admin required. |
POST | /api/v1/billing/checkout | Create a Stripe Checkout session when Stripe is enabled. |
POST | /api/v1/billing/portal | Create a Stripe customer portal session when Stripe is enabled. |
POST | /api/v1/billing/webhook | Stripe webhook endpoint. |
Settings
| Method | Path | Purpose |
|---|---|---|
GET | /api/v1/settings | List tenant settings; secrets are redacted as [set]. |
PUT | /api/v1/settings | Update one setting; admin required. |
Known settings:
| Key | Value |
|---|---|
| retention_days | Positive number. |
| redaction | Boolean. |
| webhook_url | Public HTTP(S) URL or null. |
| webhook_secret | String or null; never echoed back. |
| approval_exception_ttl_hours | Positive number. |
| canary_enabled | Boolean. |
| model_substitutions | Object map of requested model to cheaper model, such as { "gpt-4o": "gpt-4o-mini" }. |
| model_substitution_enabled | Boolean; required before earned model substitutions can serve traffic. |
| semantic_cache | "off", "shadow", or "serve". |
| tripwire_action | "alert" or "block". |
| streaming_dlp | Boolean; masks sensitive values in live response chunks and non-streaming responses. |
| memory_quarantine | Boolean; gates high-impact actions influenced by low-trust memory. |
| memory_quarantine_min_level | Integer side-effect level 0-4; default is 3. |
| cross_tenant_learning | Boolean; opt in to anonymized shape-level aggregate learning. |
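The value rules in the table can be mirrored in a client-side check before calling PUT /api/v1/settings. This is only a sketch of the documented types: it cannot tell whether a webhook URL is actually public, and the server may apply stricter rules. `is_valid_setting` is an illustrative name.

```python
from numbers import Real
from typing import Any

BOOLEAN_KEYS = {
    "redaction", "canary_enabled", "model_substitution_enabled",
    "streaming_dlp", "memory_quarantine", "cross_tenant_learning",
}
POSITIVE_NUMBER_KEYS = {"retention_days", "approval_exception_ttl_hours"}
ENUM_KEYS = {
    "semantic_cache": {"off", "shadow", "serve"},
    "tripwire_action": {"alert", "block"},
}


def is_valid_setting(key: str, value: Any) -> bool:
    """Return True when value matches the documented type for key."""
    if key in BOOLEAN_KEYS:
        return isinstance(value, bool)
    if key in POSITIVE_NUMBER_KEYS:
        return isinstance(value, Real) and not isinstance(value, bool) and value > 0
    if key in ENUM_KEYS:
        return isinstance(value, str) and value in ENUM_KEYS[key]
    if key == "memory_quarantine_min_level":
        return isinstance(value, int) and not isinstance(value, bool) and 0 <= value <= 4
    if key == "webhook_url":
        # Scheme check only; whether the host is public is decided server-side.
        return value is None or (
            isinstance(value, str) and value.startswith(("http://", "https://"))
        )
    if key == "webhook_secret":
        return value is None or isinstance(value, str)
    if key == "model_substitutions":
        return isinstance(value, dict) and all(
            isinstance(k, str) and isinstance(v, str) for k, v in value.items()
        )
    return False  # unknown keys are rejected by this sketch
```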
Savings And Opportunities
| Method | Path | Purpose |
|---|---|---|
GET | /api/v1/savings | Tenant value rollup: spend, savings, optimized share, cache/artifact hit rates, and web-context token savings. |
GET | /api/v1/opportunities | Rank not-yet-promoted patterns by estimated value and next unblocker. |
Runs
| Method | Path | Purpose |
|---|---|---|
GET | /api/v1/runs | List runs. Query: limit, offset, route, status. |
GET | /api/v1/runs/:id | Get run, trace events, and side effects. |
GET | /api/v1/runs/:id/integrity | Verify trace integrity hash chain. |
GET | /api/v1/runs/:id/receipt | Get the Chorus receipt for a run when present. |
GET | /api/v1/receipts/:id | Direct Chorus receipt lookup by run id. |
GET | /api/v1/runs/:id/replay-bundle | Export replay evidence for a run. |
GET | /api/v1/runs/:id/evidence-packet | Export a support/security evidence packet: run, route explanation, integrity result, replay bundle when available, side effects, audit rows, and trace events. |
POST | /api/v1/runs/:id/feedback | Submit rating/correction. |
Feedback body:
{
"type": "rating",
"rating": -1,
"correction": "The correct classification is billing."
}
Negative feedback on artifact-served runs creates a failed live evaluation for that artifact.
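Feedback is tied to a run, so the run id from the x-punk-run-id response header is what goes into the path. The sketch below builds the path and body for a rating; `feedback_request` is a hypothetical helper, and only the "rating" type shown in the example body is covered.

```python
from typing import Any, Dict, Optional, Tuple
from urllib.parse import quote


def feedback_request(
    run_id: str, rating: int, correction: Optional[str] = None
) -> Tuple[str, Dict[str, Any]]:
    """Return (path, body) for POST /api/v1/runs/:id/feedback."""
    path = f"/api/v1/runs/{quote(run_id, safe='')}/feedback"
    body: Dict[str, Any] = {"type": "rating", "rating": rating}
    if correction:
        body["correction"] = correction
    return path, body
```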
Patterns
| Method | Path | Purpose |
|---|---|---|
GET | /api/v1/patterns | List discovered patterns. |
GET | /api/v1/patterns/:id | Pattern detail with artifacts and example runs. |
GET | /api/v1/patterns/:id/evidence | Pattern evidence: attempts, evaluations, preference, and aggregate signal when opted in. |
POST | /api/v1/patterns/:id/synthesize | Synthesize candidate artifact; admin required. |
POST | /api/v1/patterns/:id/ignore | Mark pattern negative and cache that decision; admin required. |
Artifacts
| Method | Path | Purpose |
|---|---|---|
GET | /api/v1/artifacts | List artifacts. |
GET | /api/v1/artifacts/:id | Artifact detail, evaluations, and pattern. |
POST | /api/v1/artifacts/:id/promote | Promote artifact; admin required. |
POST | /api/v1/artifacts/:id/rollback | Retire artifact; admin required. |
POST | /api/v1/artifacts/:id/quarantine | Quarantine artifact; admin required. |
POST | /api/v1/artifacts/:id/replay | Re-run replay suite; admin required. |
Replay body:
{
"runIds": ["run_..."]
}
If runIds is omitted, Punk uses artifact provenance and holdout runs.
Approvals
| Method | Path | Purpose |
|---|---|---|
GET | /api/v1/approvals | List approvals. Query: status. |
POST | /api/v1/approvals/:id/approve | Approve; admin required. |
POST | /api/v1/approvals/:id/reject | Reject; admin required. |
Decision body:
{
"reason": "Approved for this deployment window."
}
Learning
| Method | Path | Purpose |
|---|---|---|
GET | /api/v1/learning/report | Last learning report and history. |
GET | /api/v1/learning/attempts | Synthesis attempt log. Query: patternId, limit. |
GET | /api/v1/learning/global-insights | Opt-in anonymized aggregate-learning insights. |
POST | /api/v1/learning/tick | Run learning loop now; admin required. |
Cache
| Method | Path | Purpose |
|---|---|---|
GET | /api/v1/cache/stats | Cache hit/miss stats. |
POST | /api/v1/cache/invalidate | Invalidate cache entries; admin required. |
Invalidate body:
{
"cacheType": "som"
}
Omit cacheType to clear all cache types for the tenant.
Governance And Audit
| Method | Path | Purpose |
|---|---|---|
GET | /api/v1/audit | List audit events. Query: limit, decision. |
GET | /api/v1/agent-identities | List agent trust identities and trust state. |
GET | /api/v1/policies | List loaded policies. |
GET | /api/v1/tripwires | List planted tripwires. |
POST | /api/v1/tripwires | Plant a tripwire; admin required. |
DELETE | /api/v1/tripwires/:id | Delete a tripwire; admin required. |
POST | /api/v1/tripwires/:id/arm | Arm a tripwire; admin required. |
POST | /api/v1/tripwires/:id/disarm | Disarm a tripwire; admin required. |
GET | /api/v1/tripwire-events | List tripwire firing events. |
POST | /api/v1/runs/:runId/memory | Record memory/context influence for memory quarantine. |
Workflows And Credentials
Workflows are interpreted graphs. llm nodes loop back through the gateway; web_fetch nodes create compact page context; tool_call nodes use registered MCP servers. See Workflows for graph configuration, node config, scheduling, and template details.
| Method | Path | Purpose |
|---|---|---|
GET | /api/v1/workflows | List workflows. Query: kind=workflow or kind=agent. |
POST | /api/v1/workflows | Create workflow; admin required. |
GET | /api/v1/workflows/:id | Read workflow. |
PATCH | /api/v1/workflows/:id | Update workflow and bump version; admin required. |
DELETE | /api/v1/workflows/:id | Delete workflow and unschedule it; admin required. |
POST | /api/v1/workflows/:id/run | Execute synchronously; admin required. Body: { input?, trigger? }. |
GET | /api/v1/workflows/:id/runs | Workflow run history. Query: limit. |
GET | /api/v1/workflows/:id/savings | Cost/savings rollup with optimizedShare. |
GET | /api/v1/workflow-runs/:id | Workflow run plus node-level trace events. |
GET | /api/v1/workflow-templates | Built-in templates. |
POST | /api/v1/workflow-templates/:id/instantiate | Create from a template; admin required. Body: { name? }. |
POST | /api/v1/workflows/export | Export all workflows, or selected ids; read-only. |
POST | /api/v1/workflows/import | Atomic import; admin required. |
GET | /api/v1/credentials | List stored credentials, masked. Query: provider. |
POST | /api/v1/credentials | Store credential; admin required. Body: { name, provider, secret }. |
DELETE | /api/v1/credentials/:id | Delete credential; admin required. |
Chat And Agents
Conversations are chat threads whose assistant replies are real gateway runs with route, cost, and savings recorded per message. Agents are cron-schedulable single-task runners built on the workflow engine. See Chat & Agents for full semantics.
| Method | Path | Purpose |
|---|---|---|
GET | /api/v1/conversations | List conversations. |
POST | /api/v1/conversations | Create conversation. Body: { title?, model?, system? }. |
GET | /api/v1/conversations/:id | Read conversation with messages. |
DELETE | /api/v1/conversations/:id | Delete conversation and its messages. |
POST | /api/v1/conversations/:id/messages | Send a turn and store the assistant reply. Body: { content }. |
GET | /api/v1/agents | List agents. |
POST | /api/v1/agents | Create agent; admin required. Body: { name, instructions, prompt, model?, scheduleCron?, description?, enabled? }. |
GET | /api/v1/agents/:id | Read agent. |
PATCH | /api/v1/agents/:id | Update agent; admin required. |
DELETE | /api/v1/agents/:id | Delete agent and schedule; admin required. |
POST | /api/v1/agents/:id/run | Execute now; admin required. Body: { input? }. |
MCP Servers
Registered MCP servers back workflow tool_call nodes. Registry mutations are admin-only. Tool execution is governed before connection.
| Method | Path | Purpose |
|---|---|---|
GET | /api/v1/mcp/servers | List registered MCP servers. |
POST | /api/v1/mcp/servers | Register stdio or HTTP server; admin required. |
GET | /api/v1/mcp/servers/:id | Read server config. |
PATCH | /api/v1/mcp/servers/:id | Update server config and evict pooled connection; admin required. |
DELETE | /api/v1/mcp/servers/:id | Delete server; admin required. |
POST | /api/v1/mcp/servers/:id/test | Connect, list tools, persist status/tool count; admin required. |
GET | /api/v1/mcp/servers/:id/tools | List tools, cached from the last test and refreshed when stale. |
Jobs
| Method | Path | Purpose |
|---|---|---|
GET | /api/v1/jobs | List jobs and stats. Query: status, limit. |
Web Context
| Method | Path | Purpose |
|---|---|---|
GET | /api/v1/web/snapshots | List stored structured page snapshots. |
POST | /api/v1/web/fetch | Fetch a URL and return compact page context. |
POST | /api/v1/web/sessions | Open a governed stateful web session. Body: { url }. |
GET | /api/v1/web/sessions | List active tenant sessions. |
POST | /api/v1/web/sessions/:id/act | Execute a governed action. Body: { intent: { action, target, value? } }. |
DELETE | /api/v1/web/sessions/:id | Close a session. |
Fetch body:
{
"url": "https://example.com",
"bypassCache": false
}
Response includes:
- som: structured page snapshot
- source: adapter name, builtin, or cache
- cached
- htmlBytes
- somBytes
- tokensSavedEstimate
- diff when a previous snapshot exists
- context: compact prompt-ready rendering
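As a small illustration of the fetch body and response fields, the sketch below builds the request body and estimates how much smaller the snapshot is than the raw page. It assumes htmlBytes and somBytes are byte counts for the raw HTML and the structured snapshot respectively, which the names suggest but this reference does not state. Both function names are hypothetical.

```python
from typing import Any, Dict


def fetch_body(url: str, bypass_cache: bool = False) -> Dict[str, Any]:
    """Build the JSON body for POST /api/v1/web/fetch."""
    return {"url": url, "bypassCache": bypass_cache}


def size_reduction(response: Dict[str, Any]) -> float:
    """Fraction by which the snapshot is smaller than the raw HTML.

    Assumes htmlBytes and somBytes are byte counts; returns 0.0 when the
    HTML size is missing or zero to avoid dividing by zero.
    """
    html_bytes = response.get("htmlBytes") or 0
    som_bytes = response.get("somBytes") or 0
    if html_bytes <= 0:
        return 0.0
    return 1.0 - som_bytes / html_bytes
```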
SDK Trace And Tool Cache
| Method | Path | Purpose |
|---|---|---|
POST | /api/v1/ingest/prompt | Side-load an external prompt as an observed run. Body: { source, prompt, sessionId?, metadata? }. |
POST | /api/v1/trace | Append an SDK trace event to a run. |
POST | /api/v1/tool-cache/check | Check read-only tool-result cache. |
POST | /api/v1/tool-cache/store | Store read-only tool result. |
Most users should call these through @punk/sdk instead of raw HTTP.
Errors
Errors are JSON objects with an error key. Provider-compatible validation errors use the corresponding OpenAI-style or Anthropic-style error shape.
Common statuses:
- 400: invalid request body or unsupported setting.
- 401: missing or invalid bearer token.
- 403: policy block, private URL blocked, or admin key required.
- 404: tenant-scoped record not found.
- 429: rate limit exceeded; retry-after header included.
- 502: upstream web fetch failure.
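A client can handle these statuses with two small helpers: one that extracts a readable message from the error body, and one that decides how long to wait after a 429. This sketch assumes retry-after carries a number of seconds and falls back to a default when it is missing or in another format; the helper names and the one-second default are illustrative.

```python
from typing import Any, Dict, Mapping, Optional


def error_message(body: Dict[str, Any]) -> str:
    """Read the error key, which may be a string or a provider-style object."""
    err = body.get("error")
    if isinstance(err, dict):
        # OpenAI-style and Anthropic-style errors nest a "message" field.
        return str(err.get("message", err))
    return str(err)


def retry_delay_seconds(
    status: int, headers: Mapping[str, str], default: float = 1.0
) -> Optional[float]:
    """Seconds to wait before retrying, or None when a retry is not advised."""
    if status != 429:
        return None
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("retry-after")
    try:
        return max(0.0, float(raw))
    except (TypeError, ValueError):
        # Header missing or not a plain number of seconds.
        return default
```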