BlockRun

Chat Completions

The Chat Completions API is OpenAI-compatible, making it easy to migrate existing code. Point your client at BlockRun, pick a model, and pay per request in USDC over x402.

Endpoint

POST https://blockrun.ai/api/v1/chat/completions

Request

Headers

HeaderRequiredDescription
Content-TypeYesMust be application/json
PAYMENT-SIGNATUREConditionalBase64-encoded x402 payment payload (required after 402, x402 v2)

Body Parameters

ParameterTypeRequiredDescription
modelstringYesModel ID (e.g., openai/gpt-5.5)
messagesarrayYesArray of message objects
max_tokensintegerNoMaximum tokens to generate (default: 1024)
temperaturenumberNoSampling temperature (0-2)
top_pnumberNoNucleus sampling parameter

Message Object

FieldTypeDescription
rolestringOne of: system, user, assistant
contentstringThe message content

Response

Success (200)

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1703123456,
  "model": "gpt-5.5",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 8,
    "total_tokens": 20
  }
}

Usage Fields

FieldAlways presentNotes
prompt_tokensyesFull prompt size; cache reads already folded in
completion_tokensyesOutput tokens; includes thinking/reasoning for all providers
total_tokensyesSum of prompt + completion
prompt_tokens_details.cached_tokenswhen cache hitPrompt tokens read from cache (OpenAI convention)
prompt_tokens_details.cached_creation_tokenswhen cache writePrompt tokens written to cache this turn (BlockRun extension)
cache_read_input_tokenswhen cache hitSame as prompt_tokens_details.cached_tokens — Anthropic-native label
cache_creation_input_tokenswhen cache writeSame as prompt_tokens_details.cached_creation_tokens — Anthropic-native label
completion_tokens_details.reasoning_tokenswhen reasoningGPT-5.x / o-series only; forwarded verbatim. Not surfaced for Anthropic Claude models (thinking tokens are already included in completion_tokens)

Protocol naming convention:

  • Claude native /v1/messages: usage.input_tokens, usage.output_tokens, usage.cache_creation_input_tokens, usage.cache_read_input_tokens
  • OpenAI-compat /v1/chat/completions: usage.prompt_tokens_details.cached_tokens (reads), usage.prompt_tokens_details.cached_creation_tokens (writes), usage.completion_tokens_details.reasoning_tokens (reasoning)

Enabling prompt caching on Anthropic models: Pass "prompt_cache": true in the request body, or embed cache_control blocks in your message content directly — both are honored.

No phantom identity tokens in your billed input

BlockRun does not prepend a hidden identity/system directive to your prompt by default. The input_tokens (prompt_tokens) you are billed for reflect exactly the messages you sent — there are no extra phantom input tokens added by the gateway.

Claude-native context_management requires the anthropic-beta header

If you use the Claude-native POST /v1/messages endpoint with the context_management field, you must also send the matching anthropic-beta header. A context_management body without that header is rejected at the edge with a 400 (rather than silently ignored).

Sampling params on Claude Opus 4.7 / 4.8

temperature, top_p, and top_k are not honored for anthropic/claude-opus-4.7 and anthropic/claude-opus-4.8 — these models reject sampling params upstream, so the gateway strips them so your request still succeeds (it does not fail). Set behavior through your prompt instead.

Payment Required (402)

When you first make a request without payment, you'll receive:

{
  "error": "Payment Required",
  "message": "This endpoint requires x402 payment",
  "price": {
    "amount": "0.001000",
    "currency": "USD",
    "breakdown": {
      "inputCost": "0.000012",
      "outputCost": "0.000600",
      "margin": "0%"
    }
  },
  "paymentInfo": {
    "network": "base",
    "asset": "USDC",
    "x402Version": 2
  }
}

The X-Payment-Required header contains the full payment requirements.

402 is the normal flow, not an error

The first request returns 402 Payment Required with a price quote. Sign it and retry with the PAYMENT-SIGNATURE header to get your completion. The SDKs do this round-trip automatically.

Example

What's next?