Chat Completions
POST /v1/chat/completions is the core endpoint. It takes a list of messages and returns a model completion. Recovea proxies OpenAI's request and response shapes unchanged (the same fields, in the same order, with the same types), so an unmodified OpenAI SDK works as-is. Optimization and metering happen underneath and fail open.

    from openai import OpenAI

    # Point the unmodified OpenAI SDK at Recovea instead of api.openai.com.
    client = OpenAI(
        base_url="https://api.recovea.ai/v1",
        api_key="rcv_live_…",
    )

    # Same call, same arguments, same response object as with OpenAI directly.
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are a terse assistant."},
            {"role": "user", "content": "Name the capital of France."},
        ],
    )
    print(resp.choices[0].message.content)

Request
| Field | Type | Notes |
|---|---|---|
| model | string | Required. A bare OpenAI id (e.g. gpt-4o) goes native to OpenAI; an OpenRouter vendor/model id (e.g. google/gemini-2.5-pro, exactly as listed by GET /v1/models) rides your OpenRouter key for the long tail. A bare non-OpenAI id returns a 404 naming the vendor/model id to use. For Claude models the native Anthropic-shaped /anthropic surface is also available. Echoed back in the response. |
| messages | array | Required. Conversation so far. Each item has a role and content (see below). |
| temperature | number | Sampling temperature, 0–2. Default 1. |
| top_p | number | Nucleus sampling, 0–1. Use this or temperature, not both. |
| max_tokens | integer | Cap on tokens generated in the completion. |
| n | integer | Number of choices to return. Default 1. |
| stop | string or array | Up to four sequences that halt generation. |
| tools | array | Function/tool definitions the model may call. Passes through unchanged. |
| tool_choice | string or object | auto, none, required, or a named function. |
| response_format | object | {"type": "json_object"} or a JSON schema for structured output. |
| stream | boolean | When true, tokens arrive as server-sent events. See Streaming. |
This is a subset; any other field the OpenAI Chat Completions API accepts (presence_penalty, frequency_penalty, seed, logprobs, logit_bias, stream_options, user, …) is forwarded verbatim.
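
The ranges in the table can be checked on the client before a request is sent. The sketch below is a hypothetical helper (`check_request` is an illustrative name, not part of Recovea or the OpenAI SDK) that encodes only the limits stated above. The server remains the authority on what is valid.

```python
def check_request(params: dict) -> list:
    """Return a list of problems found in a chat-completions request body.

    Hypothetical client-side helper: it encodes only the limits listed in
    the request table and is not part of any SDK.
    """
    problems = []
    if not isinstance(params.get("model"), str) or not params.get("model"):
        problems.append("model is required and must be a string")
    if not isinstance(params.get("messages"), list):
        problems.append("messages is required and must be an array")
    temperature = params.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        problems.append("temperature must be between 0 and 2")
    top_p = params.get("top_p")
    if top_p is not None and not 0 <= top_p <= 1:
        problems.append("top_p must be between 0 and 1")
    stop = params.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        problems.append("stop accepts at most four sequences")
    return problems


request = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Name the capital of France."}],
    "temperature": 0.2,
    "max_tokens": 16,
    "stop": ["\n\n"],
    "seed": 7,  # not in the table; forwarded verbatim like any other OpenAI field
}
print(check_request(request))  # → []
```

Fields the helper does not know about, such as `seed` here, are deliberately left alone, which matches the pass-through behaviour described above.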
Two surfaces, one account. This OpenAI-shaped /v1 endpoint selects the provider from the model id: a bare OpenAI id goes native to OpenAI on your connected key, and an OpenRouter vendor/model id rides your OpenRouter key for the long tail. Anthropic is served natively on its own Anthropic-shaped /anthropic surface (same rcv_ key, same ledger). Cross-provider cascading, meaning falling back to a cheaper model that still passes the quality gate, is planned but not yet active on the hot path. See How Recovea works.
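
The routing rule above can be approximated on the client, for example to label log lines. This is a minimal sketch under one assumption: that an id containing a slash is a vendor/model id and anything else is a bare id. `surface_for` is a hypothetical name, and the real decision is made server-side.

```python
def surface_for(model_id: str) -> str:
    """Guess which upstream a model id selects on the OpenAI-shaped /v1 endpoint.

    Hypothetical helper. It cannot tell whether a bare id is a real OpenAI
    model; per the docs, a bare non-OpenAI id is rejected with a 404.
    """
    vendor, sep, name = model_id.partition("/")
    if sep and vendor and name:
        return "openrouter"  # vendor/model id, e.g. google/gemini-2.5-pro
    return "openai"          # bare id, e.g. gpt-4o


print(surface_for("gpt-4o"))                 # → openai
print(surface_for("google/gemini-2.5-pro"))  # → openrouter
```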
Messages and roles
Each message carries a role and content:
- system: instructions that steer the assistant.
- user: end-user input.
- assistant: a prior model turn. May include tool_calls.
- tool: the result of a tool call, keyed by tool_call_id.
content is a string, or an array of content parts ({"type": "text", …}, {"type": "image_url", …}) for multimodal input.
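
As an illustration of the two content forms, the sketch below builds a user message either as a plain string or as an array of parts. `user_message` is a hypothetical helper and the image URL is a placeholder. The inner fields of each part (`text`, and `image_url.url`) follow OpenAI's content-part format and are not spelled out in the text above.

```python
from typing import Optional


def user_message(text: str, image_url: Optional[str] = None) -> dict:
    """Build a user message; hypothetical helper, not part of any SDK."""
    if image_url is None:
        # Text-only input: content is a plain string.
        return {"role": "user", "content": text}
    # Multimodal input: content is an array of typed parts.
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


messages = [
    {"role": "system", "content": "You are a terse assistant."},
    user_message("What city is this?", "https://example.com/skyline.jpg"),  # placeholder URL
]
print([part["type"] for part in messages[1]["content"]])  # → ['text', 'image_url']
```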
Response
A non-streamed call returns a chat.completion object that is field-for-field identical to OpenAI's:

    {
      "id": "chatcmpl-9x8a7b6c5d",
      "object": "chat.completion",
      "created": 1717000000,
      "model": "gpt-4o",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Paris.",
            "refusal": null
          },
          "logprobs": null,
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "prompt_tokens": 19,
        "completion_tokens": 2,
        "total_tokens": 21
      }
    }

| Field | Notes |
|---|---|
| id | Stable chatcmpl-… id for the response. |
| object | Always the literal "chat.completion". |
| created | Unix timestamp in seconds. |
| model | The model that served the request: the id you sent, echoed back. |
| choices[] | One entry per n, each with index, message, logprobs, finish_reason. |
| usage | Token accounting: prompt_tokens, completion_tokens, total_tokens. |
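
To show how these fields are typically read, the sketch below parses the sample response from this section as raw JSON and pulls out the text, stop reason, and token counts. `summarize` is a hypothetical helper. With the OpenAI SDK the same values are available as attributes, as in `resp.choices[0].message.content`.

```python
import json

# The sample chat.completion object from this section, as raw JSON.
raw = """
{
  "id": "chatcmpl-9x8a7b6c5d",
  "object": "chat.completion",
  "created": 1717000000,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Paris.", "refusal": null},
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 19, "completion_tokens": 2, "total_tokens": 21}
}
"""


def summarize(resp: dict) -> dict:
    """Extract the first choice and the token accounting (hypothetical helper)."""
    choice = resp["choices"][0]
    usage = resp["usage"]
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "prompt_tokens": usage["prompt_tokens"],
        "completion_tokens": usage["completion_tokens"],
        "total_tokens": usage["total_tokens"],
    }


summary = summarize(json.loads(raw))
print(summary["text"], summary["finish_reason"], summary["total_tokens"])  # → Paris. stop 21
```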
finish_reason
The reason the model stopped, preserved exactly:
| Value | Meaning |
|---|---|
| stop | Natural end, or hit a stop sequence. |
| length | Reached max_tokens or the context limit. |
| tool_calls | The model is calling one or more tools. |
| content_filter | Content was flagged and omitted. |
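
Client code usually branches on these four values. The following is one possible sketch; the action labels (`done`, `truncated`, `run_tools`, `filtered`, `unknown`) are made up for illustration and are not part of the API.

```python
def next_step(finish_reason: str) -> str:
    """Map a finish_reason to a client-side action label (labels are hypothetical)."""
    actions = {
        "stop": "done",              # natural end or stop sequence: use the content
        "length": "truncated",       # hit max_tokens or the context limit
        "tool_calls": "run_tools",   # execute the requested tools, then call again
        "content_filter": "filtered" # content was flagged and omitted
    }
    # Any value not in the table is reported rather than guessed at.
    return actions.get(finish_reason, "unknown")


print(next_step("stop"))        # → done
print(next_step("tool_calls"))  # → run_tools
```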
Tool / function calling
tools, tool_choice, and the legacy functions / function_call fields pass through unchanged in both directions. When the model decides to call a tool, finish_reason is tool_calls and message.tool_calls is populated:

    {
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"city\":\"Paris\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }

You then append a tool message keyed by tool_call_id and call again. It's the standard OpenAI loop, with no Recovea-specific changes.
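
The step between the two calls can be sketched without any network access. Below, the assistant message from the example above is turned into the tool messages to append. The `get_weather` body and the `TOOLS` registry are stand-ins invented for illustration; only the message shapes come from the OpenAI format.

```python
import json


def get_weather(city: str) -> str:
    # Stand-in implementation; a real tool would query a weather service.
    return f"Sunny in {city}"


# Hypothetical registry mapping tool names to local Python functions.
TOOLS = {"get_weather": get_weather}


def tool_messages(assistant_message: dict) -> list:
    """Run each requested tool and build the tool messages to append."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        fn = TOOLS[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])  # arguments arrive as a JSON string
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],  # ties the result back to the call
            "content": fn(**args),
        })
    return results


assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {"name": "get_weather", "arguments": "{\"city\":\"Paris\"}"},
    }],
}

# Append the assistant turn and the tool results, then call the endpoint again.
follow_up = [assistant_message, *tool_messages(assistant_message)]
print(follow_up[-1]["content"])  # → Sunny in Paris
```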
Headers
Every response carries two Recovea-added ids:
x-request-id: req_… (Recovea-generated; also your metering correlation id)
x-recovea-trace-id: trc_3f9a…c21 (correlates the call to your cost ledger)
Both are safe to log or ignore. On an exact-cache hit Recovea also adds x-recovea-cache: hit (absent otherwise). For rate limits, Recovea mints the six x-ratelimit-* headers and Retry-After / retry-after-ms on its own throttle responses. On a response proxied from your provider, the provider's own rate-limit headers pass through verbatim; Recovea fills them in only when the upstream sent none and never overwrites them. Beyond these additive headers the response is byte-for-byte your provider's.
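
A minimal sketch of reading these headers from any header mapping follows. The helper names and the sample values (`req_example`, `trc_example`) are invented for illustration. It assumes `retry-after-ms` is a count of milliseconds and handles only the numeric (seconds) form of `Retry-After`, not the HTTP-date form.

```python
from typing import Mapping, Optional


def _lower(headers: Mapping[str, str]) -> dict:
    # HTTP header names are case-insensitive, so normalize before lookup.
    return {k.lower(): v for k, v in headers.items()}


def recovea_ids(headers: Mapping[str, str]) -> dict:
    """Collect the Recovea-added ids and the cache flag (hypothetical helper)."""
    h = _lower(headers)
    return {
        "request_id": h.get("x-request-id"),
        "trace_id": h.get("x-recovea-trace-id"),
        "cache_hit": h.get("x-recovea-cache") == "hit",  # header is absent on a miss
    }


def retry_delay_seconds(headers: Mapping[str, str]) -> Optional[float]:
    """Return how long to wait before retrying, or None if no usable hint is present."""
    h = _lower(headers)
    try:
        if "retry-after-ms" in h:
            return float(h["retry-after-ms"]) / 1000.0  # assumed to be milliseconds
        if "retry-after" in h:
            return float(h["retry-after"])  # numeric form, in seconds
    except ValueError:
        pass  # non-numeric value (e.g. an HTTP date) is not handled in this sketch
    return None


sample = {"X-Request-Id": "req_example", "x-recovea-trace-id": "trc_example"}
print(recovea_ids(sample)["cache_hit"])                  # → False
print(retry_delay_seconds({"retry-after-ms": "1500"}))  # → 1.5
```

With the OpenAI Python SDK, response headers should be reachable through `client.chat.completions.with_raw_response.create(...)`, whose result exposes `.headers` and a `.parse()` method that returns the usual completion object.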