Developer API¶
The Developer API lets external applications query your notebooks and run RAG-powered conversations using simple API keys — no JWT authentication required. Each key is scoped to a single notebook with configurable rate limits and optional expiry.
Base path: /api/v1/external
Authentication: X-API-Key header
Rate limiting: Per-key, configurable (default 100/hour)
How It Works¶
sequenceDiagram
participant Admin as Admin User
participant UI as Beyond Retrieval UI
participant DB as Database
participant Ext as External App
participant API as External API
Admin->>UI: Create API key for notebook
UI->>DB: Store key_hash (SHA-256)
UI-->>Admin: Show full key ONCE (br_key_...)
Ext->>API: POST /query or /chat + X-API-Key header
API->>DB: Validate key hash, check active/expiry
API->>API: Execute RAG pipeline
API-->>Ext: JSON response with chunks/answer
- Admin creates an API key in the Developer API page (admin-only)
- The full key is shown once; copy it immediately
- External apps send requests with the key in the X-API-Key header
- Each request is validated, rate-limited, and scoped to the key's notebook
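The hashing and validation steps in the diagram can be sketched as follows. This is a minimal illustration, not the server's actual code: the function names, the token length, and the field names `is_active` and `expires_at` as function arguments are assumptions, and only the SHA-256 hashing and the `br_key_` prefix come from the flow above.

```python
import hashlib
import hmac
import secrets
from datetime import datetime, timezone
from typing import Optional

KEY_PREFIX = "br_key_"  # prefix shown in the diagram


def generate_key() -> str:
    """Create a random key; the token length is an assumption."""
    return KEY_PREFIX + secrets.token_urlsafe(32)


def hash_key(api_key: str) -> str:
    """Return the SHA-256 hex digest that is stored instead of the key."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def is_key_valid(presented_key: str, stored_hash: str, is_active: bool,
                 expires_at: Optional[datetime],
                 now: Optional[datetime] = None) -> bool:
    """Check the hash, the active flag, and the optional expiry."""
    now = now or datetime.now(timezone.utc)
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(hash_key(presented_key), stored_hash):
        return False
    if not is_active:
        return False
    return expires_at is None or now < expires_at
```

Because only the digest is stored, the server can verify a presented key but cannot reconstruct it, which is why a lost key has to be replaced rather than recovered.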
Authentication¶
All external API endpoints require the X-API-Key header, set to the full key (br_key_...) that was shown at creation.
Key Security
API keys are shown only once at creation time. They are stored as SHA-256 hashes — there is no way to recover a lost key. If you lose a key, delete it and create a new one.
Error responses¶
| Status | Meaning |
|---|---|
| 401 | Missing, invalid, disabled, or expired API key |
| 403 | External API is disabled server-wide (EXTERNAL_API_ENABLED=false) |
| 429 | Rate limit exceeded for this key |
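A client might attach the header and translate these status codes roughly as shown here. This is a sketch using only the Python standard library; the host name is a placeholder, the helper names are hypothetical, and the `/query` path is assumed to sit directly under the base path as the sequence diagram suggests.

```python
import json
import urllib.error
import urllib.request

BASE_URL = "https://your-host.example"  # placeholder host (assumption)
BASE_PATH = "/api/v1/external"

ERROR_MEANINGS = {
    401: "Missing, invalid, disabled, or expired API key",
    403: "External API is disabled server-wide",
    429: "Rate limit exceeded for this key",
}


def build_request(endpoint: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build a POST request carrying the key in the X-API-Key header."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + BASE_PATH + endpoint,
        data=body,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and raise a readable error for documented statuses."""
    try:
        with urllib.request.urlopen(request, timeout=60) as response:
            return json.load(response)
    except urllib.error.HTTPError as exc:
        meaning = ERROR_MEANINGS.get(exc.code, "Unexpected error")
        raise RuntimeError(f"HTTP {exc.code}: {meaning}") from exc
```

A call would then look like `send(build_request("/query", {"query": "What is the refund policy?"}, api_key))`, with the key read from a secret store rather than hardcoded.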
Endpoints¶
E1: RAG Query¶
Retrieve relevant document chunks from the key's notebook without generating an AI answer.
Request parameters¶
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| query | string | Yes | - | The search query |
| strategy_id | string | No | notebook setting (fallback: "fusion") | Retrieval strategy to use |
| top_k | integer | No | notebook setting (fallback: 10) | Maximum number of chunks to return (1-50) |
Available strategies:
| Strategy ID | Description | Requires LLM |
|---|---|---|
| fusion | Hybrid search with Reciprocal Rank Fusion (recommended) | No |
| semantic | Pure vector similarity search | No |
| full_text | Full-text keyword search | No |
| hybrid | Weighted combination of semantic + full-text | No |
| hyde | Hypothetical Document Embedding: generates a hypothetical answer first | Yes |
| multi_query | Expands query into multiple sub-queries for broader recall | Yes |
| kb_explorer | Agentic exploration with tool-based retrieval | Yes |
Request example¶
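A request body can be assembled from the parameters above. The sketch below is illustrative only: the helper name is hypothetical, the client-side validation mirrors the documented 1-50 range and strategy list, and the `top_k` value of 5 is an arbitrary choice.

```python
import json
from typing import Optional

STRATEGIES = {"fusion", "semantic", "full_text", "hybrid",
              "hyde", "multi_query", "kb_explorer"}


def build_query_payload(query: str, strategy_id: Optional[str] = None,
                        top_k: Optional[int] = None) -> dict:
    """Build the JSON body; omitted fields fall back to notebook settings."""
    if not query:
        raise ValueError("query is required")
    payload = {"query": query}
    if strategy_id is not None:
        if strategy_id not in STRATEGIES:
            raise ValueError(f"unknown strategy_id: {strategy_id}")
        payload["strategy_id"] = strategy_id
    if top_k is not None:
        if not 1 <= top_k <= 50:
            raise ValueError("top_k must be between 1 and 50")
        payload["top_k"] = top_k
    return payload


print(json.dumps(build_query_payload("What is the refund policy?", "fusion", 5), indent=2))
```

Leaving `strategy_id` and `top_k` out of the payload entirely, rather than sending null, keeps the request aligned with the notebook's configured defaults.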
Response¶
{
  "success": true,
  "data": {
    "query": "What is the refund policy?",
    "strategy_id": "fusion",
    "chunks": [
      {
        "content": "Refunds are processed within 14 business days...",
        "source": "company-handbook.pdf",
        "score": 1.0,
        "metadata": {
          "file_id": "a1b2c3d4-...",
          "file_title": "company-handbook.pdf",
          "chunk_index": 12,
          "notebook_id": "...",
          "loc": { "lines": { "from": 45, "to": 62 } }
        }
      }
    ],
    "total_results": 5,
    "execution_time_ms": 1250.3
  },
  "error": null
}
Response fields¶
| Field | Type | Description |
|---|---|---|
| query | string | Echo of the input query |
| strategy_id | string | Strategy that was used |
| chunks | array | Retrieved document chunks |
| chunks[].content | string | The chunk text |
| chunks[].source | string | Source file name |
| chunks[].score | float | Relevance score (0.0 - 1.0, higher = more relevant) |
| chunks[].metadata | object | Full chunk metadata (file_id, loc, etc.) |
| total_results | integer | Number of chunks returned |
| execution_time_ms | float | Server-side execution time in milliseconds |
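Once a response has been decoded, the fields above can be consumed along these lines. The sketch assumes the envelope shape shown in the example response; the helper name and the 0.5 threshold are hypothetical.

```python
def top_chunks(response: dict, min_score: float = 0.5) -> list:
    """Return chunks at or above min_score, most relevant first."""
    if not response.get("success"):
        raise RuntimeError(f"query failed: {response.get('error')}")
    chunks = response["data"]["chunks"]
    kept = [chunk for chunk in chunks if chunk["score"] >= min_score]
    return sorted(kept, key=lambda chunk: chunk["score"], reverse=True)


# Trimmed-down response in the documented shape, for illustration only.
sample = {
    "success": True,
    "data": {
        "chunks": [
            {"content": "Shipping takes 3 days...", "source": "faq.pdf", "score": 0.4},
            {"content": "Refunds are processed...", "source": "company-handbook.pdf", "score": 1.0},
        ]
    },
    "error": None,
}

for chunk in top_chunks(sample):
    print(f"{chunk['score']:.2f}  {chunk['source']}: {chunk['content']}")
```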
E2: Chat¶
Send a message and receive a RAG-powered AI response with citations. Conversations are automatically created if no conversation_id is provided, enabling both one-shot and multi-turn use cases.
Request parameters¶
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| message | string | Yes | - | The user's question or message |
| conversation_id | string | No | null | UUID of an existing conversation for multi-turn chat. If omitted, a new conversation is created automatically. |
| strategy_id | string | No | notebook setting (fallback: "fusion") | Retrieval strategy (same options as query endpoint) |
| persona | string | No | notebook setting (fallback: "professional") | AI response personality |
| language | string | No | notebook setting (fallback: "en") | Response language code |
Available personas:
| Persona | Description |
|---|---|
| professional | Clear, structured, business-appropriate responses |
| funny | Lighthearted and entertaining while staying accurate |
| mentor | Patient, educational, explains concepts step by step |
| storyteller | Narrative-driven answers with engaging flow |
| clear | Maximum clarity, minimal jargon, short sentences |
| custom | Uses the notebook's custom system prompt |
Supported languages:
| Code | Language |
|---|---|
| en | English |
| de | German |
| es | Spanish |
| fr | French |
| it | Italian |
| pt | Portuguese |
| nl | Dutch |
| ru | Russian |
| zh | Chinese |
| ja | Japanese |
Request example — single message¶
{
  "message": "What are the key benefits of the premium plan?",
  "persona": "professional",
  "language": "en"
}
Request example — multi-turn conversation¶
{
  "message": "Can you explain the pricing in more detail?",
  "conversation_id": "34f512ec-7f8a-4e6f-b8a0-38f1c143807d",
  "persona": "professional",
  "language": "en"
}
Response¶
{
  "success": true,
  "data": {
    "conversation_id": "34f512ec-7f8a-4e6f-b8a0-38f1c143807d",
    "answer": "The premium plan includes several key benefits:\n\n1. **Unlimited access** to all partner locations [1]\n2. **Online courses** including live sessions [2]\n3. **20% discount** for the first two months [3]\n\nMembers can manage everything through the mobile app.",
    "citations": [
      {
        "citation_id": 1,
        "content": "In your membership, access to all partner locations is included...",
        "file_name": "membership-guide.pdf",
        "similarity": 1.0
      },
      {
        "citation_id": 2,
        "content": "Online and live courses are included in every membership plan...",
        "file_name": "membership-guide.pdf",
        "similarity": 0.87
      }
    ],
    "suggested_questions": [
      "How do I cancel my membership?",
      "What payment methods are accepted?",
      "Is there a family plan available?"
    ],
    "is_cached": false,
    "execution_time_ms": 12500.0
  },
  "error": null
}
Response fields¶
| Field | Type | Description |
|---|---|---|
| conversation_id | string | UUID of the conversation (save this for follow-up messages) |
| answer | string | AI-generated response with citation references like [1], [2] |
| citations | array | Source chunks that back the answer |
| citations[].citation_id | integer | Matches [n] references in the answer text |
| citations[].content | string | The source chunk text |
| citations[].file_name | string | Origin document name |
| citations[].similarity | float | Relevance score (0.0 - 1.0) |
| suggested_questions | array | AI-suggested follow-up questions |
| is_cached | boolean | true if the response was served from cache |
| execution_time_ms | float | Server-side execution time in milliseconds |
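Since `citation_id` matches the [n] markers in the answer, a client can link the two with a small amount of parsing. The following is a sketch with hypothetical helper names; it assumes markers always appear as a bare integer in square brackets.

```python
import re

MARKER = re.compile(r"\[(\d+)\]")


def referenced_ids(answer: str) -> set:
    """Collect the integer ids of all [n] markers in the answer text."""
    return {int(number) for number in MARKER.findall(answer)}


def cited_sources(answer: str, citations: list) -> list:
    """Return the citations that the answer actually references."""
    ids = referenced_ids(answer)
    return [c for c in citations if c["citation_id"] in ids]


def missing_ids(answer: str, citations: list) -> list:
    """Return marker ids that have no matching citation entry."""
    available = {c["citation_id"] for c in citations}
    return sorted(referenced_ids(answer) - available)


answer = "Access to all partner locations [1], online courses [2], and a discount [3]."
citations = [
    {"citation_id": 1, "file_name": "membership-guide.pdf"},
    {"citation_id": 2, "file_name": "membership-guide.pdf"},
]
print(missing_ids(answer, citations))  # prints [3]
```

Checking for missing ids is useful when rendering footnotes, because an answer may reference a marker whose citation entry was not returned.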
Multi-turn conversations
Save the conversation_id from the first response and pass it in subsequent requests to continue the conversation. The AI will have access to the full conversation history for context.
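The pattern of saving and reusing `conversation_id` can be wrapped in a small session object. This is a sketch, not part of the API: `ChatSession` and the injected `send` callable (payload dict in, decoded response dict out) are hypothetical, which keeps the example independent of any particular HTTP library.

```python
from typing import Callable, Optional


class ChatSession:
    """Tracks conversation_id across calls to the chat endpoint."""

    def __init__(self, send: Callable[[dict], dict]):
        self._send = send  # hypothetical transport function
        self.conversation_id: Optional[str] = None

    def ask(self, message: str, **options: str) -> str:
        """Send one message; options may include persona, language, strategy_id."""
        payload = {"message": message, **options}
        if self.conversation_id is not None:
            # Follow-up turn: continue the existing conversation.
            payload["conversation_id"] = self.conversation_id
        response = self._send(payload)
        if not response.get("success"):
            raise RuntimeError(f"chat failed: {response.get('error')}")
        data = response["data"]
        self.conversation_id = data["conversation_id"]
        return data["answer"]
```

The first call to `ask` omits `conversation_id`, so the server creates a conversation; every later call on the same session object continues it.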
E3: Conversation History¶
Retrieve all messages in a conversation. The conversation must belong to the API key's notebook.
Path parameters¶
| Field | Type | Description |
|---|---|---|
| conversation_id | string | UUID of the conversation |
Response¶
{
  "success": true,
  "data": {
    "conversation_id": "34f512ec-7f8a-4e6f-b8a0-38f1c143807d",
    "messages": [
      {
        "role": "user",
        "content": "What are the key benefits?",
        "created_at": "2026-03-04T09:46:37.125916Z"
      },
      {
        "role": "assistant",
        "content": "The key benefits include...",
        "created_at": "2026-03-04T09:46:47.476795Z"
      }
    ]
  },
  "error": null
}
Response fields¶
| Field | Type | Description |
|---|---|---|
| conversation_id | string | UUID of the conversation |
| messages | array | All messages in chronological order |
| messages[].role | string | "user" or "assistant" |
| messages[].content | string | Message text |
| messages[].created_at | string | ISO 8601 timestamp |
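A decoded history response can be turned into a readable transcript as sketched below. The helper names are hypothetical. The trailing "Z" is rewritten to "+00:00" because `datetime.fromisoformat` only accepts the "Z" suffix from Python 3.11 onward.

```python
from datetime import datetime


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2026-03-04T09:46:37.125916Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def format_transcript(response: dict) -> str:
    """Render messages as one '[HH:MM:SS] role: content' line each."""
    lines = []
    for message in response["data"]["messages"]:
        stamp = parse_timestamp(message["created_at"]).strftime("%H:%M:%S")
        lines.append(f"[{stamp}] {message['role']}: {message['content']}")
    return "\n".join(lines)


history = {
    "success": True,
    "data": {
        "conversation_id": "34f512ec-7f8a-4e6f-b8a0-38f1c143807d",
        "messages": [
            {"role": "user", "content": "What are the key benefits?",
             "created_at": "2026-03-04T09:46:37.125916Z"},
            {"role": "assistant", "content": "The key benefits include...",
             "created_at": "2026-03-04T09:46:47.476795Z"},
        ],
    },
    "error": None,
}
print(format_transcript(history))
```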
Error responses¶
| Status | Meaning |
|---|---|
| 404 | Conversation not found |
| 403 | Conversation belongs to a different notebook than the API key |
Settings Inheritance¶
When optional parameters (strategy_id, top_k, persona, language) are omitted from a request, the API automatically uses the notebook's Intelligence Settings configured by the admin. This means external integrations get the same behavior as the built-in chat UI without needing to specify every parameter.
| Parameter | Notebook setting used | Hardcoded fallback |
|---|---|---|
| strategy_id | strategies_config.strategy_id | "fusion" |
| top_k | strategies_config.match_count | 10 |
| persona | default_persona | "professional" |
| language | default_language | "en" |
Additionally, the query endpoint passes the notebook's retrieval weights (full_text_weight, semantic_weight, rrf_k, rerank_top_k) to the retrieval pipeline automatically.
When a parameter is provided in the request, it overrides the notebook setting. This is fully backward-compatible: existing integrations that send explicit values will continue to work identically.
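The precedence described in this section (request value, then notebook setting, then hardcoded fallback) can be expressed as a short resolution function. This is a sketch of the rule, not the server's implementation: the function name and the nested-dictionary shape of the notebook settings are assumptions, while the setting names and fallbacks come from the table above.

```python
FALLBACKS = {
    "strategy_id": "fusion",
    "top_k": 10,
    "persona": "professional",
    "language": "en",
}


def resolve_settings(request: dict, notebook: dict) -> dict:
    """Pick request value, else notebook setting, else hardcoded fallback."""
    strategies = notebook.get("strategies_config") or {}
    from_notebook = {
        "strategy_id": strategies.get("strategy_id"),
        "top_k": strategies.get("match_count"),
        "persona": notebook.get("default_persona"),
        "language": notebook.get("default_language"),
    }
    resolved = {}
    for name, fallback in FALLBACKS.items():
        if request.get(name) is not None:
            resolved[name] = request[name]
        elif from_notebook[name] is not None:
            resolved[name] = from_notebook[name]
        else:
            resolved[name] = fallback
    return resolved


print(resolve_settings(
    {"persona": "mentor"},
    {"strategies_config": {"match_count": 5}, "default_language": "de"},
))
```

In this call the persona comes from the request, `top_k` and `language` from the notebook, and `strategy_id` from the fallback.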
Admin Key Management¶
These endpoints require JWT authentication (admin-only). They are used by the Beyond Retrieval admin UI to manage API keys.
| Endpoint | Method | Description |
|---|---|---|
| /api/settings/developer-keys | GET | List all API keys |
| /api/settings/developer-keys | POST | Create a new key (returns full key once) |
| /api/settings/developer-keys/{key_id} | PATCH | Update key (name, rate_limit, is_active, expires_at) |
| /api/settings/developer-keys/{key_id} | DELETE | Delete a key permanently |
| /api/settings/developer-keys/{key_id}/snippets | GET | Get code snippets for a key |
Create key request¶
{
  "notebook_id": "505b1a5c-b548-4089-8e37-63da42ee3c84",
  "name": "My Chatbot Key",
  "rate_limit": "100/hour",
  "expires_at": "2026-12-31T00:00:00Z"
}
Rate limit format¶
Rate limits follow the pattern <number>/<period>:
| Example | Meaning |
|---|---|
| 100/hour | 100 requests per hour (default) |
| 10/minute | 10 requests per minute |
| 1000/day | 1000 requests per day |
| 5/second | 5 requests per second |
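A client or admin tool could validate a rate limit string before submitting it, as in the sketch below. The function name is hypothetical, and accepting only the four periods listed in the table is an assumption; the server may recognize others.

```python
import re

PERIOD_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}
PATTERN = re.compile(r"(\d+)/(second|minute|hour|day)")


def parse_rate_limit(value: str) -> tuple:
    """Parse '<number>/<period>' into (request_count, window_in_seconds)."""
    match = PATTERN.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"invalid rate limit: {value!r}")
    count = int(match.group(1))
    if count < 1:
        raise ValueError("rate limit must allow at least one request")
    return count, PERIOD_SECONDS[match.group(2)]


print(parse_rate_limit("100/hour"))  # prints (100, 3600)
```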
Configuration¶
Environment variables that control the Developer API:
| Variable | Default | Description |
|---|---|---|
| EXTERNAL_API_ENABLED | true | Kill switch: set to false to disable all external API endpoints (returns 403) |
| EXTERNAL_API_DEFAULT_RATE_LIMIT | 100/hour | Default rate limit applied per key |
| EXTERNAL_API_BASE_URL | (auto-detected) | Base URL for generated code snippets. Auto-detected from request headers if empty. |
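Reading these variables with their documented defaults might look like the following. This is an illustrative sketch only: the `ExternalApiConfig` class and `load_config` function are hypothetical, and treating only the literal string "false" (case-insensitive) as disabled is an assumption about how the kill switch is parsed.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ExternalApiConfig:
    enabled: bool
    default_rate_limit: str
    base_url: Optional[str]  # None means auto-detect from request headers


def load_config(env: Mapping[str, str] = os.environ) -> ExternalApiConfig:
    """Read the three variables, applying the documented defaults."""
    raw_enabled = env.get("EXTERNAL_API_ENABLED", "true")
    return ExternalApiConfig(
        enabled=raw_enabled.strip().lower() != "false",
        default_rate_limit=env.get("EXTERNAL_API_DEFAULT_RATE_LIMIT", "100/hour"),
        base_url=env.get("EXTERNAL_API_BASE_URL") or None,
    )


print(load_config({"EXTERNAL_API_ENABLED": "false"}))
```

Passing the environment as a mapping makes the loader easy to exercise in tests without modifying the real process environment.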