Chat Completions

POST /v1/chat/completions is the core endpoint. It takes a list of messages and returns a model completion. Recovea proxies OpenAI's request and response shapes unchanged (the same fields, in the same order, with the same types), so an unmodified OpenAI SDK works as-is once its base_url and api_key point at Recovea. Optimization and metering run underneath and fail open.

from openai import OpenAI

client = OpenAI(
    base_url="https://api.recovea.ai/v1",
    api_key="rcv_live_…",
)

resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a terse assistant."},
        {"role": "user", "content": "Name the capital of France."},
    ],
)
print(resp.choices[0].message.content)

Request

Field	Type	Notes
`model`	string	Required. A bare OpenAI id (e.g. `gpt-4o`) is routed directly to OpenAI; an OpenRouter `vendor/model` id (e.g. `google/gemini-2.5-pro`, exactly as listed by `GET /v1/models`) is routed through your OpenRouter key, which covers the long tail of models. A bare non-OpenAI id returns a `404` that names the `vendor/model` id to use instead. Claude models are also available on the native Anthropic-shaped `/anthropic` surface. Echoed back in the response.
`messages`	array	Required. Conversation so far. Each item has a `role` and `content` (see below).
`temperature`	number	Sampling temperature, `0`–`2`. Default `1`.
`top_p`	number	Nucleus sampling, `0`–`1`. Use this or `temperature`, not both.
`max_tokens`	integer	Cap on tokens generated in the completion.
`n`	integer	Number of choices to return. Default `1`.
`stop`	string \| array	Up to four sequences that halt generation.
`tools`	array	Function/tool definitions the model may call. Passes through unchanged.
`tool_choice`	string \| object	`auto`, `none`, `required`, or a named function.
`response_format`	object	`{"type": "json_object"}` or a JSON schema for structured output.
`stream`	boolean	When `true`, tokens arrive as server-sent events. See Streaming.

This is a subset; any other field the OpenAI Chat Completions API accepts (presence_penalty, frequency_penalty, seed, logprobs, logit_bias, stream_options, user, …) is forwarded verbatim.
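The constraints in the table can be checked before a request leaves your process. The following is a minimal sketch, not part of Recovea or the OpenAI SDK: `build_chat_request` is a hypothetical helper, and it treats "use `top_p` or `temperature`, not both" as a hard client-side rule, which may be stricter than what the API itself enforces.

```python
from typing import Any, Optional, Sequence, Union


def build_chat_request(
    model: str,
    messages: Sequence[dict],
    *,
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    stop: Union[str, Sequence[str], None] = None,
    **extra: Any,
) -> dict:
    """Build a request body, applying the table's constraints client-side."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages is required")
    if temperature is not None and top_p is not None:
        raise ValueError("set temperature or top_p, not both")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if top_p is not None and not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")
    if stop is not None and not isinstance(stop, str) and len(stop) > 4:
        raise ValueError("stop accepts at most four sequences")

    body: dict = {"model": model, "messages": list(messages)}
    # Include only the optional fields the caller set, so provider defaults apply.
    for key, value in (("temperature", temperature), ("top_p", top_p), ("stop", stop)):
        if value is not None:
            body[key] = value
    # Any other OpenAI field (seed, user, ...) is carried along untouched.
    body.update(extra)
    return body


request = build_chat_request(
    "gpt-4o",
    [{"role": "user", "content": "Name the capital of France."}],
    temperature=0.2,
    seed=7,
)
print(request)
```

Because the keys match the SDK's keyword arguments, the resulting dict can be passed as `client.chat.completions.create(**request)`.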

Two surfaces, one account. This OpenAI-shaped /v1 endpoint selects the provider from the model id: a bare OpenAI id is sent directly to OpenAI on your connected key, and an OpenRouter vendor/model id is sent through your OpenRouter key, which covers the long tail of models. Anthropic is served natively on its own Anthropic-shaped /anthropic surface (same rcv_ key, same ledger). Cross-provider cascading (falling back to a cheaper model that still passes the quality gate) is planned but not yet active on the hot path. See How Recovea works.
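The routing rule depends only on the shape of the model id, so it can be mirrored locally for logging or pre-flight checks. This sketch is an illustration under that assumption; `classify_model_id` is a hypothetical helper, and it cannot tell whether a bare id is a real OpenAI model (a bare non-OpenAI id still comes back as a `404` from the endpoint).

```python
def classify_model_id(model_id: str) -> str:
    """Return "openrouter" for a vendor/model id, "bare" for an id with no vendor prefix."""
    if not model_id:
        raise ValueError("model id is empty")
    vendor, sep, name = model_id.partition("/")
    if not sep:
        # No slash: treated by the endpoint as a native OpenAI id.
        return "bare"
    if not vendor or not name:
        raise ValueError(f"malformed vendor/model id: {model_id!r}")
    # vendor/model form: sent through the OpenRouter key.
    return "openrouter"


for model_id in ("gpt-4o", "google/gemini-2.5-pro"):
    print(model_id, "->", classify_model_id(model_id))
```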

Messages and roles

Each message carries a role and content:

system: instructions that steer the assistant.
user: end-user input.
assistant: a prior model turn. May include tool_calls.
tool: the result of a tool call, keyed by tool_call_id.

content is a string, or an array of content parts ({"type": "text", …}, {"type": "image_url", …}) for multimodal input.
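As an illustration of the two content forms, the sketch below builds a plain-string system message and a multimodal user message. The helper name and the image URL are made up for the example; the part shapes follow the OpenAI Chat Completions format that this endpoint forwards.

```python
from typing import List


def user_message_with_image(text: str, image_url: str) -> dict:
    """Build a user message whose content is an array of parts."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            # An image part wraps the URL in an object under the "image_url" key.
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


messages: List[dict] = [
    # Plain-string content is enough for text-only turns.
    {"role": "system", "content": "You are a terse assistant."},
    # Hypothetical URL, for illustration only.
    user_message_with_image("What city is this?", "https://example.com/skyline.jpg"),
]
print(messages)
```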

Response

A non-streamed call returns a chat.completion object that is field-for-field identical to OpenAI's:

{
  "id": "chatcmpl-9x8a7b6c5d",
  "object": "chat.completion",
  "created": 1717000000,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Paris.",
        "refusal": null
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 19,
    "completion_tokens": 2,
    "total_tokens": 21
  }
}

Field	Notes
`id`	Stable `chatcmpl-…` id for the response.
`object`	Always the literal `"chat.completion"`.
`created`	Unix timestamp in seconds.
`model`	The model that served the request: the id you sent, echoed back.
`choices[]`	One entry per `n`, each with `index`, `message`, `logprobs`, `finish_reason`.
`usage`	Token accounting: `prompt_tokens`, `completion_tokens`, `total_tokens`.
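To show how these fields fit together, here is a small sketch that reads the sample response above as a plain dict. `summarize_completion` is a hypothetical helper; with the OpenAI SDK the same values are available as attributes (for example `resp.usage.total_tokens`).

```python
import json

# The sample response from this page, as raw JSON.
SAMPLE = json.loads("""
{
  "id": "chatcmpl-9x8a7b6c5d",
  "object": "chat.completion",
  "created": 1717000000,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Paris.", "refusal": null},
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 19, "completion_tokens": 2, "total_tokens": 21}
}
""")


def summarize_completion(resp: dict) -> dict:
    """Collect the text of each choice plus the token accounting."""
    usage = resp["usage"]
    return {
        "model": resp["model"],
        # One entry per requested choice (n), ordered by index.
        "texts": [c["message"]["content"] for c in sorted(resp["choices"], key=lambda c: c["index"])],
        "prompt_tokens": usage["prompt_tokens"],
        "completion_tokens": usage["completion_tokens"],
        "total_tokens": usage["total_tokens"],
    }


print(summarize_completion(SAMPLE))
```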

finish_reason

The reason the model stopped, preserved exactly:

Value	Meaning
`stop`	Natural end, or hit a `stop` sequence.
`length`	Reached `max_tokens` or the context limit.
`tool_calls`	The model is calling one or more tools.
`content_filter`	Content was flagged and omitted.
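One way to act on these values is a small dispatch table. This is a sketch only: the action names on the right are hypothetical labels for your own application logic, not anything Recovea or OpenAI defines.

```python
def next_action(finish_reason: str) -> str:
    """Map a finish_reason to an application-level action label."""
    actions = {
        "stop": "use_message",          # complete answer, safe to use as-is
        "length": "handle_truncation",  # output was cut off; raise max_tokens or continue
        "tool_calls": "run_tools",      # execute the requested tools and call again
        "content_filter": "handle_filtered",  # content was omitted; surface that to the caller
    }
    # Unknown values are reported rather than guessed at.
    return actions.get(finish_reason, "unknown")


for reason in ("stop", "length", "tool_calls", "content_filter"):
    print(reason, "->", next_action(reason))
```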

Tool / function calling

tools, tool_choice, and the legacy functions / function_call fields pass through unchanged in both directions. When the model decides to call a tool, finish_reason is tool_calls and message.tool_calls is populated:

{
  "message": {
    "role": "assistant",
    "content": null,
    "tool_calls": [
      {
        "id": "call_abc123",
        "type": "function",
        "function": {
          "name": "get_weather",
          "arguments": "{\"city\":\"Paris\"}"
        }
      }
    ]
  },
  "finish_reason": "tool_calls"
}

You then append a tool message keyed by tool_call_id and call again. It's the standard OpenAI loop, with no Recovea-specific changes.
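The step between the two calls can be sketched as follows, using the assistant message shown above. `run_tool_calls` and the `get_weather` stub are hypothetical; the point is that `function.arguments` arrives as a JSON-encoded string and each result goes back as a `tool` message carrying the matching `tool_call_id`.

```python
import json
from typing import Any, Callable, Dict, List


def run_tool_calls(assistant_message: dict, tools: Dict[str, Callable[..., Any]]) -> List[dict]:
    """Execute each requested tool and build the tool messages to append."""
    tool_messages = []
    for call in assistant_message.get("tool_calls") or []:
        fn = tools[call["function"]["name"]]
        # arguments is a JSON string, so decode it before calling the function.
        args = json.loads(call["function"]["arguments"])
        result = fn(**args)
        tool_messages.append(
            {"role": "tool", "tool_call_id": call["id"], "content": json.dumps(result)}
        )
    return tool_messages


def get_weather(city: str) -> dict:
    # Stub with a made-up result, standing in for a real lookup.
    return {"city": city, "forecast": "sunny"}


assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_abc123",
            "type": "function",
            "function": {"name": "get_weather", "arguments": "{\"city\":\"Paris\"}"},
        }
    ],
}

print(run_tool_calls(assistant_message, {"get_weather": get_weather}))
```

For the next request, append the assistant message and then the returned tool messages to `messages`, in that order.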

Headers

Every response carries two Recovea-added ids:

x-request-id:       req_…          (Recovea-generated; also your metering correlation id)
x-recovea-trace-id: trc_3f9a…c21   (correlates the call to your cost ledger)

Both are safe to log or ignore. On an exact-cache hit Recovea also adds x-recovea-cache: hit; the header is absent otherwise. Rate-limit headers depend on who produced the response. On its own throttle responses, Recovea mints the six x-ratelimit-* headers plus Retry-After / retry-after-ms. On a response proxied from your provider, the provider's own rate-limit headers pass through verbatim; Recovea fills these in only when the upstream sent none, and never overwrites them. Beyond these additive headers, the response is byte-for-byte your provider's.
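A minimal sketch of reading these headers from any header mapping follows. The helper names and the sample values are hypothetical, and two details are assumptions rather than documented behavior: that `retry-after-ms` is a number of milliseconds (inferred from its name), and that `Retry-After` is numeric seconds (an HTTP-date value is not handled here).

```python
from typing import Mapping, Optional


def _lowercased(headers: Mapping[str, str]) -> dict:
    # Header names are case-insensitive, so normalize before looking anything up.
    return {name.lower(): value for name, value in headers.items()}


def recovea_metadata(headers: Mapping[str, str]) -> dict:
    """Pull the Recovea-added ids and the cache marker out of response headers."""
    h = _lowercased(headers)
    return {
        "request_id": h.get("x-request-id"),
        "trace_id": h.get("x-recovea-trace-id"),
        # The cache header is only present on an exact-cache hit.
        "cache_hit": h.get("x-recovea-cache") == "hit",
    }


def retry_delay_seconds(headers: Mapping[str, str]) -> Optional[float]:
    """Return a wait time in seconds, preferring the finer-grained header."""
    h = _lowercased(headers)
    for name, scale in (("retry-after-ms", 1000.0), ("retry-after", 1.0)):
        try:
            return float(h[name]) / scale
        except (KeyError, ValueError):
            continue  # missing or non-numeric: try the next header
    return None


# Made-up header values, for illustration only.
sample = {"X-Request-Id": "req_example", "x-recovea-trace-id": "trc_example", "Retry-After": "2"}
print(recovea_metadata(sample), retry_delay_seconds(sample))
```

With the OpenAI Python SDK, response headers are reachable through `client.chat.completions.with_raw_response.create(...)`, whose result exposes `.headers` and a `.parse()` method that returns the usual completion object.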

Streaming and errors

Set stream: true to receive tokens as server-sent events. The chunk format, usage handling, and data: [DONE] terminator are documented in Streaming.
Failures return the standard OpenAI error envelope with matching status codes. See Errors.
