Embeddings

POST /v1/embeddings returns vector representations of text. Recovea proxies the OpenAI embeddings shape unchanged: you send the same request, you get the same response back, and your existing client works without modification.

Recovea meters every embeddings call into your cost ledger using exact token accounting. In v1 there is no semantic cache on this endpoint: embeddings are billed on real, measured usage, not on a cache hit-rate estimate.

Request

from openai import OpenAI

client = OpenAI(
    base_url="https://api.recovea.ai/v1",
    api_key="rcv_live_…",
)

resp = client.embeddings.create(
    model="text-embedding-3-small",
    input="The quick brown fox jumped over the lazy dog.",
)
print(resp.data[0].embedding[:5])
curl https://api.recovea.ai/v1/embeddings \
  -H "Authorization: Bearer rcv_live_…" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-small",
    "input": "The quick brown fox jumped over the lazy dog."
  }'

Parameters

FieldTypeRequiredNotes
modelstringyesEmbedding model id, e.g. text-embedding-3-small, text-embedding-3-large. Echoed back in the response.
inputstring or arrayyesA single string, or an array of strings to embed in one call. Each array element returns its own vector.
dimensionsintegernoTruncate output vectors to this length (supported by text-embedding-3-*).
encoding_formatstringno"float" (default) returns arrays of floats; "base64" returns base64-encoded packed floats.
userstringnoOpaque end-user identifier, passed through unchanged.

To embed several inputs in one request, pass an array. The response data[] preserves order via each entry's index:

resp = client.embeddings.create(
    model="text-embedding-3-small",
    input=["first document", "second document"],
)
for item in resp.data:
    print(item.index, len(item.embedding))

Response

The response object is "list"; data[] holds one embedding per input, each tagged with its index. The shape matches OpenAI field-for-field.

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [-0.0061, 0.0094, -0.0123, "…"]
    }
  ],
  "model": "text-embedding-3-small",
  "usage": {
    "prompt_tokens": 11,
    "total_tokens": 11
  }
}
FieldTypeNotes
objectstringAlways "list".
data[]arrayOne embedding object per input, ordered by index.
data[].objectstringAlways "embedding".
data[].indexintegerPosition of this vector in the input array (0-based).
data[].embeddingarray or stringArray of floats, or a base64 string when encoding_format is "base64".
modelstringThe model that produced the vectors; echoes your request.
usageobjectprompt_tokens and total_tokens only, since embeddings have no completion tokens.

Metering and the ledger

Recovea records usage.prompt_tokens for every embeddings call and prices it against the frozen reference rate for that model, writing the result to your cost ledger. This is exact metering: the count comes from the model's own usage report, not an estimate.

There is no semantic cache on /v1/embeddings in v1. Recovea does not deduplicate or replay near-identical inputs here, so you are never billed against a similarity heuristic. Each input you send is embedded and metered as sent. Caching levers that affect quality apply on the chat surface, behind the non-inferiority gate; embeddings stay pass-through and exact.

Headers and fail-open

Every response carries the standard OpenAI headers plus Recovea's x-recovea-trace-id, which correlates the call to its ledger entry. If Recovea's metering layer fails, the request flows straight through to your provider on your own key: you get the embeddings, savings == 0, never a Recovea-shaped error.

Errors

Errors use the same envelope and status codes as the rest of the API. For example, an unknown model returns 404 with code: "model_not_found", and an input that exceeds the model's context window returns 400 with code: "context_length_exceeded".

{
  "error": {
    "message": "The model `text-embedding-9` does not exist or you do not have access to it.",
    "type": "invalid_request_error",
    "param": null,
    "code": "model_not_found"
  }
}

See Errors for the full status-code and envelope reference.

Next