What's SACTL API's base_url and how do I authenticate?

base_url is https://api.sactl.ai. Authentication supports two headers: Anthropic native uses x-api-key: sk-xa-prod-***; OpenAI-compatible uses Authorization: Bearer sk-xa-prod-***. They are equivalent inside SACTL — both look up the same Virtual Key table.

Which API endpoints does SACTL support?

Inference: Anthropic native POST /v1/messages, OpenAI-compatible POST /v1/chat/completions, GET /v1/models. Files API: POST/GET/DELETE /v1/files. Batch API: POST/GET /v1/messages/batches, fetch results, cancel. System: GET /health, GET /metrics (ops-network only).

What's the error response format?

Unified error envelope: error.code is a stable machine-readable string, error.message is a human-readable description (in EN/zh as appropriate), error.trace_id is a ULID for support to look up the request. We never proxy raw upstream Anthropic bodies.

Which usage headers does the API return?

x-sactl-usage-prompt-tokens, x-sactl-usage-completion-tokens, x-sactl-usage-cost-usd (the actual USD cost for this request, including the × 0.30 discount), x-sactl-budget-remaining-usd (the VK's remaining budget), x-sactl-trace-id (the trace ID that matches our logs).

SACTL Gateway API — API Reference

Authentication

Every protected endpoint requires Authorization: Bearer sk-xa-{env}-{base62}. Keys are generated in the dashboard or sent directly by Telegram support; the database stores an HMAC digest (never plaintext). Each key carries four properties — allowed-model list, IP allowlist, monthly budget cap, owning tenant — checked one by one before each call. Mismatches are rejected on the spot.

Common request headers

Field	Type		Description
Authorization	string	required	`Bearer sk-xa-{env}-{base62}`. env is one of `dev`/`stg`/`prod`.
Content-Type	string	required	POST must be `application/json`; `/v1/files` uploads use `multipart/form-data`.
X-Request-Id	string		Client-supplied trace id. If absent, SACTL generates a UUIDv7 and writes it back in the response header.
anthropic-beta	string		Applies only to `/v1/messages` / Files / Batches; passed verbatim to Anthropic.
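
The header table above can be turned into a small client-side helper. The following is a minimal sketch, not SACTL's own client code: the function name `build_headers` is hypothetical, and the regular expression assumes the base62 suffix is one or more characters from `[0-9A-Za-z]`, since the reference does not state its length.

```python
import re
from typing import Dict, List, Optional

# Assumption: the base62 suffix is any non-empty run of [0-9A-Za-z];
# the reference does not specify a length.
VK_PATTERN = re.compile(r"^sk-xa-(dev|stg|prod)-[0-9A-Za-z]+$")


def build_headers(
    vk: str,
    request_id: Optional[str] = None,
    beta_flags: Optional[List[str]] = None,
) -> Dict[str, str]:
    """Build the common headers for a JSON POST request."""
    if not VK_PATTERN.match(vk):
        raise ValueError("virtual key does not match sk-xa-{env}-{base62}")
    headers = {
        "Authorization": f"Bearer {vk}",
        "Content-Type": "application/json",
    }
    if request_id:
        # Optional client-supplied trace id.
        headers["X-Request-Id"] = request_id
    if beta_flags:
        # Beta flags are sent as a comma-separated list.
        headers["anthropic-beta"] = ",".join(beta_flags)
    return headers
```

Checking the key shape locally only catches typos early; the gateway still performs the authoritative check.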

Response headers

Every 2xx response carries SACTL accounting headers. Front-ends can read X-SACTL-Usage-Cost-USD directly to build a client-side ledger — no extra /v1/usage call required.

Header	Type	When	Description
X-SACTL-Usage-Prompt-Tokens	int	2xx	Prompt tokens for this request (as reported by upstream).
X-SACTL-Usage-Completion-Tokens	int	2xx	Completion tokens for this request.
X-SACTL-Usage-Cost-USD	decimal	2xx	USD cost for this request, settled against SACTL's discount table (Claude × 0.30; GPT supported at as low as 1/30 of the official price; Gemini coming soon), to 6 decimal places.
X-SACTL-Budget-Remaining-USD	decimal	when capped	Present when the VK has a monthly cap; reports remaining budget for the month.
Retry-After	int	429	Returned when GCRA rate-limiting fires. In seconds, computed from bucket recovery rate.
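
A client-side ledger built on these headers could start from a parser like the one below. This is a sketch under stated assumptions: the `Usage` dataclass and `parse_usage` are hypothetical names, and `Decimal` is chosen here so the 6-decimal amounts are not rounded by binary floats.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Mapping, Optional


@dataclass(frozen=True)
class Usage:
    prompt_tokens: int
    completion_tokens: int
    cost_usd: Decimal
    budget_remaining_usd: Optional[Decimal]  # None when the VK has no monthly cap


def parse_usage(headers: Mapping[str, str]) -> Usage:
    """Extract the SACTL accounting headers from a 2xx response."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    h = {k.lower(): v for k, v in headers.items()}
    remaining = h.get("x-sactl-budget-remaining-usd")
    return Usage(
        prompt_tokens=int(h["x-sactl-usage-prompt-tokens"]),
        completion_tokens=int(h["x-sactl-usage-completion-tokens"]),
        cost_usd=Decimal(h["x-sactl-usage-cost-usd"]),
        budget_remaining_usd=Decimal(remaining) if remaining is not None else None,
    )
```

Summing `cost_usd` across responses gives a running total without calling any extra endpoint.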

Error envelope

All 4xx/5xx responses use the envelope below. We never proxy the upstream raw error body — upstream credentials, internal URLs, and trace details never leak to clients. Use trace_id for support follow-up.

{
  "error": {
    "code": "key_invalid",
    "message": "virtual key is invalid or revoked",
    "trace_id": "01JX7W3Z5A8M9E2Q1P4K6R8TVB"
  }
}

Registered error codes

HTTP	code	Trigger
401	key_invalid	VK missing, malformed, or revoked.
403	model_forbidden	The requested model is not on this VK's `allowed_models` list.
403	ip_forbidden	Request source IP is not on this VK's IP allowlist.
403	signature_invalid	VK HMAC verification failed (e.g. pepper mismatch).
402	budget_exhausted	Pre-debit determined this call would exceed the VK's monthly USD cap.
402	context_too_long	Prompt tokens exceed the model's context window registered in the pricing registry.
429	rate_limited	GCRA rate-limit fired (any of tenant / vk / vk×model / vk×ip).
400	ssrf_forbidden	Multimodal `image_url` points at private / internal address; blocked by the SSRF filter.
400	model_unknown	Model ID not found in the pricing registry.
400	bad_request	Malformed JSON or missing required fields.
413	payload_too_large	Request body exceeds `SIDECAR_BODY_MAX_MB` (default 32MB).
415	unsupported_media	Multipart upload MIME not on the allowlist.
502	upstream_error	Anthropic or OpenAI returned a 5xx. (Gemini upstream coming soon.)
504	upstream_timeout	Upstream request timed out.
500	internal_error	Gateway-internal exception.
503	service_unavailable	All keys in the pool are in cooldown (circuit breaker tripped). preview
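
One way a client might consume the envelope and the code table is sketched below. The set of codes treated as retryable and the exponential backoff defaults are assumptions made for illustration; the reference lists the codes but does not prescribe a retry policy. `SactlError`, `parse_error`, and `retry_delay` are hypothetical names.

```python
import json
from typing import Mapping, Optional

# Assumption: these codes are transient from the client's point of view.
RETRYABLE_CODES = {"rate_limited", "upstream_error", "upstream_timeout", "service_unavailable"}


class SactlError(Exception):
    def __init__(self, status: int, code: str, message: str, trace_id: str):
        super().__init__(f"{status} {code}: {message} (trace_id={trace_id})")
        self.status = status
        self.code = code
        self.message = message
        self.trace_id = trace_id


def parse_error(status: int, body: str) -> SactlError:
    """Turn a 4xx/5xx response body into a typed exception."""
    err = json.loads(body)["error"]
    return SactlError(status, err["code"], err["message"], err["trace_id"])


def retry_delay(
    err: SactlError,
    headers: Mapping[str, str],
    attempt: int,
    base: float = 1.0,
    cap: float = 30.0,
) -> Optional[float]:
    """Return seconds to wait before retrying, or None if the error is not retryable."""
    if err.code not in RETRYABLE_CODES:
        return None
    h = {k.lower(): v for k, v in headers.items()}
    if "retry-after" in h:
        # Prefer the server-provided hint (sent with 429 responses).
        return float(h["retry-after"])
    # Otherwise fall back to capped exponential backoff.
    return min(cap, base * 2 ** attempt)
```

Logging `err.trace_id` alongside the failure keeps the support lookup path intact.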

Inference

Core inference endpoints — Anthropic native plus OpenAI-compatible (so OpenAI SDKs reach Claude or GPT with zero client code changes). Claude + GPT upstream live; Gemini coming soon.

POST /v1/messages stable

Create message (Anthropic native)

Anthropic native Messages endpoint. Request and response bodies are compatible with the official Anthropic Messages API — Claude Code talks to this endpoint. We only add auth, usage caps, rate limiting, and billing logs in front; the request body semantics are unchanged.

Headers

Field	Type		Description
Authorization	string	required	`Bearer sk-xa-...`
Content-Type	string	required	`application/json`
anthropic-beta	string		Comma-separated list of beta flags. Passed verbatim to Anthropic.
X-Request-Id	string		Client-side trace id.

Request body

Field	Type		Description
model	string	required	Model ID. Must be on the VK's `allowed_models` list — see GET /v1/models.
max_tokens	int	required	Upper bound is the model's token window as recorded in the pricing registry.
messages	array	required	Anthropic-style message array. Role is `user` or `assistant`; content is a string or a block array (text / image / tool_use / tool_result).
system	string \| array		System prompt. Array form supports `cache_control`.
temperature	number		0.0 – 1.0.
top_p	number		Nucleus sampling.
top_k	int		Top-k sampling.
stop_sequences	string[]		Custom stop sequences.
tools	array		Anthropic native tools format.
tool_choice	object		`{type: "auto"\|"any"\|"tool", name?: "..."}`
stream	bool		Default false. When true, response is SSE (`text/event-stream`) with Anthropic native event names (`message_start` / `content_block_delta` / …).
thinking	object		`{type: "enabled", budget_tokens: 32000}` for extended thinking. SACTL auto-injects `anthropic-beta` if you didn't set it.
cache_control	object		Embedded in content block / tools / system. SACTL can auto-inject `{type: "ephemeral"}` on system and tools blocks (controlled by the VK's `auto_prompt_cache` flag).
metadata	object		`{user_id: "..."}` passed verbatim to upstream and logged to audit.
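
When `stream` is true, the client has to reassemble text from SSE events. The sketch below is a simplified parser, assuming the event payloads follow the official Anthropic streaming format (a `content_block_delta` event whose `delta` has `type: "text_delta"` and a `text` field). The function names are hypothetical, and the parser handles only `event:` and `data:` fields.

```python
import json
from typing import Iterable, Iterator, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, payload) pairs from an iterable of SSE lines."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield (event or "message", json.loads("\n".join(data)))
            event, data = None, []
        elif line.startswith(":"):
            continue  # SSE comment / keep-alive line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].lstrip())
    if data:
        # Flush a trailing event that was not followed by a blank line.
        yield (event or "message", json.loads("\n".join(data)))


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate the text deltas of a streamed message."""
    return "".join(
        payload["delta"]["text"]
        for name, payload in iter_sse_events(lines)
        if name == "content_block_delta"
        and payload.get("delta", {}).get("type") == "text_delta"
    )
```

In practice `lines` would come from the HTTP client's line iterator over the response body.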

Request example

{
  "model": "claude-sonnet-4-6",
  "max_tokens": 1024,
  "messages": [
    { "role": "user", "content": "Hello, Claude." }
  ]
}

Response

HTTP/1.1 200 OK
X-SACTL-Usage-Prompt-Tokens: 12
X-SACTL-Usage-Completion-Tokens: 28
X-SACTL-Usage-Cost-USD: 0.000336
X-SACTL-Budget-Remaining-USD: 49.832104

{
  "id": "msg_01AbCdEfGhIjKlMnOpQrStUv",
  "type": "message",
  "role": "assistant",
  "model": "claude-sonnet-4-6",
  "content": [
    { "type": "text", "text": "Hello! How can I help you today?" }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 12,
    "output_tokens": 28
  }
}

curl

curl https://api.sactl.ai/v1/messages \
  -H "Authorization: Bearer YOUR_VK" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "messages": [
      { "role": "user", "content": "Hello, Claude." }
    ]
  }'

Errors

HTTP	code	Scenario
401	key_invalid	VK missing or revoked.
403	model_forbidden	Model not on the allowlist.
402	budget_exhausted	Pre-debit exceeds the cap.
402	context_too_long	Prompt exceeds the token window.
429	rate_limited	GCRA fired; carries `Retry-After`.
502	upstream_error	Anthropic returned 5xx.
504	upstream_timeout	Upstream timed out.

POST /v1/chat/completions stable

Create chat completion (OpenAI-compat)

OpenAI Chat Completions-compatible endpoint. Clients using openai-python / openai-node / LangChain or any OpenAI SDK can keep their code unchanged — just flip the base URL. SACTL translates OpenAI shape to Anthropic shape upstream, then translates the Anthropic response back to OpenAI shape. Bidirectional support for tool_calls, image_url, stream, and reasoning_effort (maps to Anthropic thinking).

Translation switch: set SIDECAR_TRANSLATE_OPENAI_MODE=openai-to-anthropic (default on). Set to passthrough to forward OpenAI-shape requests directly to the OpenAI upstream.

Request body (key fields)

Field	Type		Description
model	string	required	Accepts OpenAI model IDs and Claude model IDs (in translation mode).
messages	array	required	OpenAI message array. `role` is `system`/`user`/`assistant`/`tool`.
max_tokens	int		Aliased to `max_completion_tokens`.
temperature	number		0.0 – 2.0 (OpenAI semantics).
top_p	number
tools	array		OpenAI tools format; translated to Anthropic `tools`.
tool_choice	string \| object		`"auto"` / `"none"` / `"required"` / `{type:"function", function:{name}}`.
stream	bool		OpenAI SSE format (`data: {"choices":[...]}`); SACTL rewrites Anthropic events into OpenAI delta events.
reasoning_effort	string		`"low" \| "medium" \| "high"` → Anthropic `thinking.budget_tokens: 2048 \| 8000 \| 32000`
response_format	object		Only `{type: "json_object"}` is honored; on Claude this is mapped to a system-prompt injection.

Translation cheat sheet

OpenAI	Anthropic	Notes
messages[].role="system"	system	Merged into the top-level `system`, preserving order.
tools	tools	Field names match; parameter schema copied as-is.
tool_calls	tool_use	Extracted from the response's block array.
role="tool"	tool_result block	Merged with the prior assistant message into the user turn.
image_url	image block	URL passes through SACTL's SSRF filter; data URLs convert directly to base64.
finish_reason	stop_reason	`stop`/`length`/`tool_calls`/`content_filter` mapped pairwise.
reasoning_effort	thinking.budget_tokens	low=2048, medium=8000, high=32000.
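
To make the mapping concrete, here is a partial sketch of the request-side translation covering only three rows of the table: system messages, plain-text user/assistant turns, and `reasoning_effort`. It is an illustration, not the gateway's implementation. Joining multiple system messages with a blank line and the `default_max_tokens` fallback are assumptions, and `temperature` is left out because the reference does not say how the 0.0 – 2.0 range is mapped.

```python
from typing import Any, Dict, List

# Budget values taken from the cheat sheet above.
EFFORT_TO_BUDGET = {"low": 2048, "medium": 8000, "high": 32000}


def openai_to_anthropic(req: Dict[str, Any], default_max_tokens: int = 1024) -> Dict[str, Any]:
    """Translate a text-only OpenAI chat request into Anthropic Messages shape."""
    system_parts: List[str] = []
    messages: List[Dict[str, Any]] = []
    for msg in req["messages"]:
        role = msg["role"]
        if role == "system":
            system_parts.append(msg["content"])  # hoisted to top-level `system`
        elif role in ("user", "assistant"):
            messages.append({"role": role, "content": msg["content"]})
        else:
            # Tool turns and multimodal content are out of scope for this sketch.
            raise NotImplementedError(f"role not handled in this sketch: {role}")
    out: Dict[str, Any] = {
        "model": req["model"],
        "messages": messages,
        # Assumption: fall back to a default when the client sends neither field.
        "max_tokens": req.get("max_completion_tokens", req.get("max_tokens", default_max_tokens)),
    }
    if system_parts:
        # Assumption: order-preserving join with a blank line between parts.
        out["system"] = "\n\n".join(system_parts)
    effort = req.get("reasoning_effort")
    if effort is not None:
        out["thinking"] = {"type": "enabled", "budget_tokens": EFFORT_TO_BUDGET[effort]}
    return out
```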

Request example

{
  "model": "claude-sonnet-4-6",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "Write a haiku about Go channels." }
  ],
  "max_tokens": 256,
  "temperature": 0.7
}

Response

{
  "id": "chatcmpl-0A1B2C3D4E5F",
  "object": "chat.completion",
  "created": 1745145600,
  "model": "claude-sonnet-4-6",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Silent channels hum,\nGoroutines pass gifts in dark,\nSelect waits for dawn."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 22,
    "completion_tokens": 31,
    "total_tokens": 53
  }
}

curl

curl https://api.sactl.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_VK" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "Write a haiku about Go channels." }
    ],
    "max_tokens": 256,
    "temperature": 0.7
  }'

Errors

HTTP	code	Scenario
401	key_invalid	VK authentication failed.
403	model_forbidden	Model not on the VK allowlist.
400	ssrf_forbidden	`image_url` points at a private network.
400	model_unknown	Model ID not in the pricing registry.
429	rate_limited	Rate-limit fired.
502	upstream_error	Upstream 5xx.

POST /v1/completions coming soon legacy

Create completion (legacy) — available once GPT upstream is wired

Legacy OpenAI completions (non-chat) pass-through endpoint. The route is registered in the gateway, but at this stage SACTL only fronts the Claude upstream — Claude doesn't have non-chat completions, so this endpoint is not exposed today and calls return 503 service_unavailable. It will auto-enable when GPT upstream is wired up. For new integrations, use /v1/chat/completions directly.

Request body

Field	Type		Description
model	string	required	OpenAI completions model ID.
prompt	string \| array	required	String or array of strings.
max_tokens	int
temperature	number
stream	bool		OpenAI native SSE.

curl

curl https://api.sactl.ai/v1/completions \
  -H "Authorization: Bearer YOUR_VK" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo-instruct",
    "prompt": "Say hello in three languages.",
    "max_tokens": 64
  }'

POST /v1/embeddings stable

Create embeddings

Embeddings (OpenAI-compatible schema). Anthropic does not offer embeddings, so this endpoint routes to GPT (text-embedding-3-small / text-embedding-3-large) upstream. Gemini (text-embedding-004) upstream coming soon. Per-token billing, with the same usage caps and rate limiting as inference endpoints.

Request body

Field	Type		Description
model	string	required	Embedding model ID.
input	string \| array	required	Single string or array of strings.
encoding_format	string		`"float"` (default) / `"base64"`.
dimensions	int		Only valid on `text-embedding-3-*`; truncates embedding dimensions.

Response

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0023, -0.0091, 0.0412, /* ... 1536 floats */]
    }
  ],
  "model": "text-embedding-3-small",
  "usage": {
    "prompt_tokens": 7,
    "total_tokens": 7
  }
}

curl

curl https://api.sactl.ai/v1/embeddings \
  -H "Authorization: Bearer YOUR_VK" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-small",
    "input": "The quick brown fox jumps over the lazy dog."
  }'

Errors

HTTP	code	Scenario
401	key_invalid	VK invalid.
400	model_unknown	Model ID not registered.
413	payload_too_large	Input array too large.
429	rate_limited	Rate-limited.

GET /v1/models stable

List models

Lists every model the VK is authorized to call. Returns Anthropic native shape {data: [{id, type, ...}]}. The list is computed dynamically from the intersection of the VK's allowed_models array and the pricing registry — even if a model is in the pricing registry, it won't appear here unless the VK has it enabled.
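
The intersection rule can be expressed in a few lines. This is only an illustration of the rule as described; `visible_models` is a hypothetical name, and keeping the order of `allowed_models` is an assumption.

```python
from typing import Iterable, List


def visible_models(allowed_models: Iterable[str], pricing_registry: Iterable[str]) -> List[str]:
    """Return the models a VK can see: enabled on the VK AND present in the registry."""
    registered = set(pricing_registry)
    # Assumption: preserve the order in which the VK lists its allowed models.
    return [m for m in allowed_models if m in registered]
```

A model that is priced but not enabled on the VK is filtered out, and so is a model enabled on the VK but missing from the registry.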

Response

{
  "data": [
    { "type": "model", "id": "claude-opus-4-7",     "display_name": "Claude Opus 4.7",   "created_at": "2025-09-29T00:00:00Z" },
    { "type": "model", "id": "claude-opus-4-6",     "display_name": "Claude Opus 4.6",   "created_at": "2025-08-05T00:00:00Z" },
    { "type": "model", "id": "claude-sonnet-4-6",   "display_name": "Claude Sonnet 4.6", "created_at": "2025-07-10T00:00:00Z" },
    { "type": "model", "id": "claude-haiku-4-5",    "display_name": "Claude Haiku 4.5",  "created_at": "2025-05-03T00:00:00Z" }
  ],
  "has_more": false,
  "first_id": "claude-opus-4-7",
  "last_id": "claude-haiku-4-5"
}
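# GPT models are also included in this list (intersected with the VK's allowed_models). Gemini models will appear automatically once the Gemini upstream is connected.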

curl

curl https://api.sactl.ai/v1/models \
  -H "Authorization: Bearer YOUR_VK"

Files API

Anthropic Files API pass-through. Upload documents / images and reference them in subsequent /v1/messages calls. Max file size is governed by SIDECAR_FILES_MAX_MB (default 32MB); the MIME allowlist is configurable. Every Files operation writes a billing log entry (file.uploaded / file.deleted) for reconciliation.

POST /v1/files stable

Upload file

Upload a file via multipart/form-data. The returned file id can be referenced in /v1/messages content blocks as {type: "document", source: {type: "file", file_id: "..."}}.

Form fields

Field	Type		Description
file	file	required	File binary. MIME allowlist: `application/pdf`, `image/png`, `image/jpeg`, `image/gif`, `image/webp`, `text/plain`, `text/csv`, `text/markdown`.
purpose	string		Optional tag, written to audit log.

Response

{
  "id": "file_01AbCdEfGhIjKlMnOpQrStUv",
  "type": "file",
  "filename": "contract.pdf",
  "mime_type": "application/pdf",
  "size_bytes": 183204,
  "created_at": "2026-04-20T08:12:31.441Z",
  "downloadable": false
}

curl

curl https://api.sactl.ai/v1/files \
  -H "Authorization: Bearer YOUR_VK" \
  -F "file=@contract.pdf;type=application/pdf" \
  -F "purpose=context"

Errors

HTTP	code	Scenario
401	key_invalid	VK invalid.
413	payload_too_large	File exceeds `SIDECAR_FILES_MAX_MB`.
415	unsupported_media	MIME not on the allowlist.

GET /v1/files stable

List files

List files uploaded by the current VK (files from other VKs are invisible). Cursor-based pagination.

Query params

Field	Type	Description
limit	int	1 – 1000, default 20.
before_id	string	Cursor; returns files before this id.
after_id	string	Cursor; returns files after this id.

Response

{
  "data": [
    { "id": "file_01AbCd...", "filename": "contract.pdf", "size_bytes": 183204, "mime_type": "application/pdf", "created_at": "2026-04-20T08:12:31.441Z" }
  ],
  "has_more": false,
  "first_id": "file_01AbCd...",
  "last_id": "file_01AbCd..."
}
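
Walking all pages with the cursor fields might look like the sketch below. It assumes that passing the previous page's `last_id` as `after_id` advances to the next page, which the reference does not spell out. `fetch_page` is a hypothetical callable that performs the actual GET and returns the decoded JSON body, so the loop stays independent of any HTTP library.

```python
from typing import Callable, Dict, Iterator


def iter_files(fetch_page: Callable[[Dict[str, str]], dict], limit: int = 20) -> Iterator[dict]:
    """Yield every file entry, following the cursor until has_more is false."""
    if not 1 <= limit <= 1000:
        raise ValueError("limit must be between 1 and 1000")
    params = {"limit": str(limit)}
    while True:
        page = fetch_page(params)
        yield from page["data"]
        if not page.get("has_more") or not page.get("last_id"):
            return
        # Assumption: last_id of this page is the cursor for the next one.
        params = {"limit": str(limit), "after_id": page["last_id"]}
```

The same loop shape should apply to the batch listing endpoint, which uses the same three query parameters.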

curl

curl "https://api.sactl.ai/v1/files?limit=20" \
  -H "Authorization: Bearer YOUR_VK"

GET /v1/files/{id} stable

Get file metadata

Read metadata for a single file. SACTL does not store the file binary itself; this endpoint passes through the metadata returned by the Anthropic upstream.

curl

curl https://api.sactl.ai/v1/files/file_01AbCdEfGhIjKlMnOpQrStUv \
  -H "Authorization: Bearer YOUR_VK"

Errors

HTTP	code	Scenario
401	key_invalid	VK invalid.
404	bad_request	file id does not exist or does not belong to this VK.

DELETE /v1/files/{id} stable

Delete file

Delete a file. After deletion, /v1/messages calls referencing the file_id return bad_request. The audit log emits file.deleted.

Response

{
  "id": "file_01AbCdEfGhIjKlMnOpQrStUv",
  "type": "file_deleted"
}

curl

curl -X DELETE https://api.sactl.ai/v1/files/file_01AbCdEfGhIjKlMnOpQrStUv \
  -H "Authorization: Bearer YOUR_VK"

Batch API

Submit large numbers of Messages requests as a single batch. State machine: validating → in_progress → ended | canceled | failed. Results are returned as JSONL and are downloadable only in the ended state.

Billing: batch calls take Anthropic's 50% batch discount stacked with our × 0.30 — Claude batch price = official × 0.15 (see the pricing page). Billing settlement happens only at batch completion or cancellation — canceled / failed requests are not billed.
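
The stacked discount is plain multiplication, shown here as a small worked example. The function names are hypothetical, the official price is an input the caller must supply, and rounding to 6 decimal places with Python's default rounding mode is an assumption about settlement rather than documented behavior.

```python
from decimal import Decimal

SACTL_CLAUDE_MULTIPLIER = Decimal("0.30")   # SACTL's Claude discount
ANTHROPIC_BATCH_DISCOUNT = Decimal("0.50")  # Anthropic's 50% batch discount


def batch_multiplier() -> Decimal:
    """Combined multiplier applied to the official Claude price for batch calls."""
    return SACTL_CLAUDE_MULTIPLIER * ANTHROPIC_BATCH_DISCOUNT  # 0.30 x 0.50 = 0.15


def batch_cost_usd(official_cost_usd: Decimal) -> Decimal:
    """Estimate the billed amount for a batch request from its official price."""
    # Assumption: round to 6 decimal places, matching the cost header precision.
    return (official_cost_usd * batch_multiplier()).quantize(Decimal("0.000001"))
```

For example, a request that would cost 10 USD at the official non-batch price comes to 1.5 USD under this calculation.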

POST /v1/messages/batches stable

Create message batch

Submit a batch of Messages requests. Each request body is equivalent to a single /v1/messages call, wrapped with a custom_id for reconciliation.

Request body

Field	Type		Description
requests	array	required	1 – 10,000 `{custom_id, params}` items.
requests[].custom_id	string	required	Unique within the batch; used to map results back to your business id.
requests[].params	object	required	Same shape as the `/v1/messages` request body (model / max_tokens / messages / …).

Request example

{
  "requests": [
    {
      "custom_id": "job-001",
      "params": {
        "model": "claude-haiku-4-5",
        "max_tokens": 256,
        "messages": [ { "role": "user", "content": "Summarize Go channels." } ]
      }
    },
    {
      "custom_id": "job-002",
      "params": {
        "model": "claude-haiku-4-5",
        "max_tokens": 256,
        "messages": [ { "role": "user", "content": "What is GCRA?" } ]
      }
    }
  ]
}
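
A payload like the one above can be generated from application data. The sketch below enforces the two constraints stated in the request body table (1 to 10,000 items, unique `custom_id`) before anything is sent; `build_batch` is a hypothetical helper and it only produces single-turn user prompts.

```python
from typing import Any, Dict, Iterable, List, Tuple

MAX_BATCH_SIZE = 10_000


def build_batch(items: Iterable[Tuple[str, str]], model: str, max_tokens: int = 256) -> Dict[str, Any]:
    """Build a batch request body from (custom_id, prompt) pairs."""
    requests: List[Dict[str, Any]] = []
    seen = set()
    for custom_id, prompt in items:
        if custom_id in seen:
            raise ValueError(f"duplicate custom_id: {custom_id}")
        seen.add(custom_id)
        requests.append({
            "custom_id": custom_id,
            # params has the same shape as a /v1/messages request body.
            "params": {
                "model": model,
                "max_tokens": max_tokens,
                "messages": [{"role": "user", "content": prompt}],
            },
        })
    if not 1 <= len(requests) <= MAX_BATCH_SIZE:
        raise ValueError("a batch must contain between 1 and 10,000 requests")
    return {"requests": requests}
```

Serializing the result with `json.dumps` yields a file suitable for `-d @batch.json`.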

Response

{
  "id": "msgbatch_01Wx9...",
  "type": "message_batch",
  "processing_status": "in_progress",
  "request_counts": { "processing": 2, "succeeded": 0, "errored": 0, "canceled": 0, "expired": 0 },
  "ended_at": null,
  "created_at": "2026-04-20T08:12:31.441Z",
  "expires_at": "2026-04-21T08:12:31.441Z",
  "cancel_initiated_at": null,
  "results_url": null
}

curl

curl https://api.sactl.ai/v1/messages/batches \
  -H "Authorization: Bearer YOUR_VK" \
  -H "Content-Type: application/json" \
  -d @batch.json

GET /v1/messages/batches stable

List message batches

List all batches for the current VK. Supports cursor pagination via limit / before_id / after_id.

curl

curl "https://api.sactl.ai/v1/messages/batches?limit=20" \
  -H "Authorization: Bearer YOUR_VK"

GET /v1/messages/batches/{id} stable

Retrieve message batch

Read the status of a single batch. processing_status transitions through validating → in_progress → ended / canceled / failed.

Response

{
  "id": "msgbatch_01Wx9...",
  "processing_status": "ended",
  "request_counts": { "processing": 0, "succeeded": 2, "errored": 0, "canceled": 0, "expired": 0 },
  "ended_at": "2026-04-20T08:14:02.112Z",
  "results_url": "https://api.sactl.ai/v1/messages/batches/msgbatch_01Wx9.../results"
}

curl

curl https://api.sactl.ai/v1/messages/batches/msgbatch_01Wx9... \
  -H "Authorization: Bearer YOUR_VK"

GET /v1/messages/batches/{id}/results stable

Retrieve message batch results

Returns per-request results as JSONL (application/x-ndjson). Only available when processing_status=ended; other states return bad_request.

Response (one line per request)

{"custom_id":"job-001","result":{"type":"succeeded","message":{"id":"msg_...","content":[...],"usage":{...}}}}
{"custom_id":"job-002","result":{"type":"succeeded","message":{"id":"msg_...","content":[...],"usage":{...}}}}
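
Because the results arrive as one JSON object per line, they can be indexed by `custom_id` as they are read. This is a minimal sketch with hypothetical function names; it assumes non-succeeded entries carry a different `result.type` and that succeeded messages use the same text content blocks as `/v1/messages`.

```python
import json
from typing import Dict, Optional


def index_results(jsonl: str) -> Dict[str, dict]:
    """Map each custom_id to its result object."""
    results: Dict[str, dict] = {}
    for line in jsonl.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        results[record["custom_id"]] = record["result"]
    return results


def succeeded_text(result: dict) -> Optional[str]:
    """Return the concatenated text of a succeeded result, or None otherwise."""
    if result["type"] != "succeeded":
        return None
    return "".join(
        block["text"]
        for block in result["message"]["content"]
        if block["type"] == "text"
    )
```

Joining on `custom_id` is what lets results be mapped back to business records regardless of the order in which lines appear.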

curl

curl https://api.sactl.ai/v1/messages/batches/msgbatch_01Wx9.../results \
  -H "Authorization: Bearer YOUR_VK"

POST /v1/messages/batches/{id}/cancel stable

Cancel message batch

Move an in_progress / validating batch to canceling. In-flight requests may still complete and get billed normally; canceled requests are not billed.

curl

curl -X POST https://api.sactl.ai/v1/messages/batches/msgbatch_01Wx9.../cancel \
  -H "Authorization: Bearer YOUR_VK"

System

Health check and metrics. No VK required — but in production these should be restricted to ops-network CIDRs at the ingress layer.

GET /health stable

Health check

Health check. Returns 200 + JSON; fields report connectivity to each dependency. Suitable as a Kubernetes liveness / readiness probe target. No auth required.

Response

HTTP/1.1 200 OK
Content-Type: application/json

{
  "status": "ok",
  "redis": "ok",
  "vault": "ok",
  "version": "0.9.3",
  "commit": "a1b2c3d",
  "uptime_seconds": 84221
}

curl

curl https://api.sactl.ai/health

GET /metrics stable

Prometheus metrics

Prometheus metrics endpoint. Returns all business metrics in text/plain Prometheus format: rl_* (rate limiting), forward_* (forwarding), upstream_* (upstream latency / errors), breaker_* (circuit breakers), audit_* (audit pipeline). Add it to your Prometheus scrape config.

Production guidance: this endpoint should be restricted to the ops network / Prometheus scraper in production (enforce a CIDR allowlist at upstream ingress / nginx). Metrics themselves carry no credentials but reveal traffic patterns.

Response (excerpt)

# HELP rl_reject_total Number of requests rejected by rate limiter
# TYPE rl_reject_total counter
rl_reject_total{dim="tenant"} 12
rl_reject_total{dim="vk"} 3
rl_reject_total{dim="vk_model"} 1
rl_reject_total{dim="vk_ip"} 0

# HELP forward_request_duration_seconds Request duration histogram
# TYPE forward_request_duration_seconds histogram
forward_request_duration_seconds_bucket{route="/v1/messages",le="0.5"} 4821
forward_request_duration_seconds_bucket{route="/v1/messages",le="1"} 5910
forward_request_duration_seconds_bucket{route="/v1/messages",le="+Inf"} 6024

# HELP upstream_5xx_total Upstream 5xx responses
# TYPE upstream_5xx_total counter
upstream_5xx_total{upstream="anthropic"} 2

# HELP audit_worm_append_total WORM audit log appends
# TYPE audit_worm_append_total counter
audit_worm_append_total{event="message.ok"} 5932
audit_worm_append_total{event="file.uploaded"} 18

curl

curl https://api.sactl.ai/metrics

Preview / Coming soon

The capabilities below are code-complete but not wired to the hot path, or are still in canary. Production customers should not rely on their behavior.

Multi-key pool PickMiddleware preview

Multiple API keys are pooled under a single upstream provider; key selection is based on health / cost / usage. When a key gets a 429 / 401 from upstream, it enters a cooldown window. When every key in the pool is cooling down, the provider's circuit breaker trips and calls return 503 service_unavailable.

Current status: middleware code is implemented but not on the forward path by default.
Expected GA: 2026 Q2. Wiring it in does not change the API contract — it just turns 503 service_unavailable from "theoretically possible" into actually observed.

Markup multiplier settle preview

For multi-tier reseller scenarios, VKs can carry a markup_multiplier (e.g. 1.20 = add a 20% channel margin); at settlement the margin is routed to the channel account automatically.

Current status: the schema field is in the database; the settlement step does not yet apply the multiplier, so actual billing equals the underlying price.
Expected GA: same batch as Multi-key pool, 2026 Q2.
Note: this does not expose any new endpoint to the client; it only changes what the X-SACTL-Usage-Cost-USD header amount means.