# n8n Integration
n8n is an open-source workflow automation platform that lets you connect services through a visual node editor. By integrating n8n with Beyond Retrieval v2, you can automate document ingestion, build RAG-powered chatbots, schedule health monitoring, and collect user feedback — all without writing application code.
## Prerequisites
| Requirement | Details |
|---|---|
| n8n instance | Self-hosted (Docker/npm) or n8n Cloud, version 1.0+ |
| Beyond Retrieval API | Running and reachable from your n8n instance |
| Bearer token | A valid JWT token (or any string when BYPASS_AUTH=true in dev mode) |
| Notebook ID | At least one notebook created in Beyond Retrieval |
**Development Mode**
When `BYPASS_AUTH=true`, any non-empty Bearer token works. You can use `dev-token` for testing.
## Authentication Setup
Create a reusable credential in n8n so every HTTP Request node authenticates automatically.
- Open Settings > Credentials in your n8n instance.
- Click Add Credential and select Header Auth.
- Configure the credential:
| Field | Value |
|---|---|
| Name | Beyond Retrieval API |
| Header Name | Authorization |
| Header Value | Bearer <your_token> |
- Click Save. This credential can now be selected in any HTTP Request node.
**Token Rotation**
If your JWT token expires, update the credential in one place and all workflows using it will pick up the new token automatically.
## HTTP Request Node — Base Configuration
Every API call to Beyond Retrieval follows the same base pattern. Configure your HTTP Request nodes with these defaults:
| Setting | Value |
|---|---|
| Authentication | Predefined Credential > Header Auth > Beyond Retrieval API |
| Response Format | JSON |
| Timeout | 60000 (60 seconds — RAG calls can take time) |
| Continue On Fail | true (recommended, so downstream nodes can handle errors) |
## Parsing the Response Envelope
Every Beyond Retrieval endpoint wraps its payload in a standard envelope:
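The shape is roughly as follows (a sketch inferred from the expressions in this section; the contents of `data` vary by endpoint):

```json
{
  "success": true,
  "data": {
    "assistant_message": { "content": "..." }
  },
  "error": null
}
```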
In n8n expressions, access the payload like this:
```
// Get the full data object
{{ $json.data }}

// Get a nested field (e.g., AI response content)
{{ $json.data.assistant_message.content }}

// Check success status
{{ $json.success }}

// Get error message on failure
{{ $json.error }}
```
**Always Check `success`**
Use an IF node after each HTTP Request to branch on {{ $json.success }} before processing the data. This prevents downstream nodes from operating on error responses.
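If you prefer handling this in a Code node rather than an IF node, the same guard can be written in JavaScript (a minimal sketch; `unwrapEnvelope` is an illustrative helper name, and the envelope fields match those shown above):

```javascript
// Unwrap a Beyond Retrieval response envelope, throwing on failure
// so n8n's error handling (or your Error Workflow) takes over.
function unwrapEnvelope(response) {
  if (!response || response.success !== true) {
    const message = (response && response.error) || 'Unknown API error';
    throw new Error(`Beyond Retrieval call failed: ${message}`);
  }
  return response.data;
}

// Example: a successful chat response
const ok = { success: true, data: { assistant_message: { content: 'Hi' } } };
console.log(unwrapEnvelope(ok).assistant_message.content); // 'Hi'
```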
## Example Workflows

### 1. Auto-Ingest Files from Webhook
Receive a file via webhook, upload it to Beyond Retrieval, and trigger ingestion.
```mermaid
graph LR
    A[Webhook Trigger] --> B[HTTP Request: Upload File]
    B --> C[IF: Upload Success?]
    C -->|Yes| D[HTTP Request: Start Ingestion]
    D --> E[Wait 10s]
    E --> F[HTTP Request: Check Stage]
    F --> G[IF: Still Processing?]
    G -->|Yes| E
    G -->|No| H[HTTP Request: Confirm Status]
```

Node 1 — Webhook Trigger
| Setting | Value |
|---|---|
| HTTP Method | POST |
| Path | ingest-file |
| Binary Property | data |
| Response Mode | Last Node |
Node 2 — HTTP Request: Upload File
| Setting | Value |
|---|---|
| Method | POST |
| URL | http://localhost:8000/api/notebooks/{{ $json.query.notebook_id }}/documents/upload |
| Authentication | Header Auth > Beyond Retrieval API |
| Send Binary Data | true |
| Binary Property | data |
| Input Data Field Name | files |
Node 3 — IF: Upload Success?
| Setting | Value |
|---|---|
| Condition | {{ $json.success }} equals true |
Node 4 — HTTP Request: Start Ingestion
| Setting | Value |
|---|---|
| Method | POST |
| URL | http://localhost:8000/api/notebooks/{{ $('Webhook').item.json.query.notebook_id }}/documents/ingest |
| Body Content Type | JSON |
Body:
```json
{
  "files": [
    {
      "file_id": "={{ $('Upload File').item.json.data[0].file_id }}",
      "file_name": "={{ $('Upload File').item.json.data[0].file_name }}",
      "file_path": "={{ $('Upload File').item.json.data[0].storage_path }}"
    }
  ],
  "settings": {
    "parser": "Docling Parser",
    "chunking_strategy": "Recursive Chunking",
    "chunk_size": 1000,
    "chunk_overlap": 200
  }
}
```
Node 5 — Wait
| Setting | Value |
|---|---|
| Amount | 10 |
| Unit | Seconds |
Node 6 — HTTP Request: Check Stage
| Setting | Value |
|---|---|
| Method | GET |
| URL | http://localhost:8000/api/notebooks/{{ $('Webhook').item.json.query.notebook_id }}/documents/{{ $('Upload File').item.json.data[0].file_id }}/stage |
Node 7 — IF: Still Processing?
| Setting | Value |
|---|---|
| Condition | {{ $json.data.status }} equals Processing |
Loop the "Yes" branch back to the Wait node. The "No" branch proceeds to confirmation or notification.
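The poll loop above can be sketched in plain JavaScript to make the termination condition explicit (a sketch; `fetchStage` stands in for the Check Stage HTTP Request node, and the delay and attempt limits are illustrative):

```javascript
// Poll a stage-check function until the document leaves "Processing",
// waiting `delayMs` between attempts and giving up after `maxAttempts`.
async function pollUntilDone(fetchStage, { delayMs = 10000, maxAttempts = 30 } = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const envelope = await fetchStage();
    const status = envelope && envelope.data && envelope.data.status;
    if (status !== 'Processing') return status; // e.g. "Completed" or "Failed"
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  throw new Error('Ingestion did not finish within the polling window');
}
```

In n8n itself, the Wait node plays the role of the delay and the IF node plays the role of the status check.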
### 2. Scheduled Health Monitoring
Run health checks on all notebooks every hour and send alerts for unhealthy ones.
```mermaid
graph LR
    A[Schedule Trigger] --> B[HTTP Request: List Notebooks]
    B --> C[Split In Batches]
    C --> D[HTTP Request: Health Check]
    D --> E[IF: Score < 70?]
    E -->|Yes| F[Slack / Email: Send Alert]
    E -->|No| G[No Op]
```

Node 1 — Schedule Trigger
| Setting | Value |
|---|---|
| Trigger Interval | Every 1 hour |
Node 2 — HTTP Request: List Notebooks
| Setting | Value |
|---|---|
| Method | GET |
| URL | http://localhost:8000/api/notebooks/ |
Node 3 — Split In Batches
| Setting | Value |
|---|---|
| Batch Size | 1 |
| Input | {{ $json.data }} |
Node 4 — HTTP Request: Health Check
| Setting | Value |
|---|---|
| Method | GET |
| URL | http://localhost:8000/api/notebooks/{{ $json.notebook_id }}/health |
Node 5 — IF: Score Below Threshold?
| Setting | Value |
|---|---|
| Condition | {{ $json.data.health_score }} is less than 70 |
Node 6 — Slack (or Email) Alert
Compose a message with the health details:
Notebook: {{ $('Health Check').item.json.data.notebook_id }}
Health Score: {{ $('Health Check').item.json.data.health_score }}/100
Duplicates: {{ $('Health Check').item.json.data.duplicate_count }}
Orphans: {{ $('Health Check').item.json.data.orphaned_count }}
**Auto-Cleanup**
Add an optional branch: if duplicates > 0, call POST /api/notebooks/{id}/health/cleanup to remove them automatically.
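The alert and cleanup branches can be expressed as a single decision function in a Code node (a sketch; `healthActions` is an illustrative helper name, the threshold of 70 matches the IF node above, and the field names follow the health response used in the alert message):

```javascript
// Decide what to do with a notebook's health report:
// alert when the score drops below the threshold,
// clean up when duplicates are present.
function healthActions(health, threshold = 70) {
  return {
    alert: health.health_score < threshold,
    cleanup: health.duplicate_count > 0,
    summary: `Notebook ${health.notebook_id}: score ${health.health_score}/100, ` +
             `${health.duplicate_count} duplicates, ${health.orphaned_count} orphans`,
  };
}
```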
### 3. Chat Bot via Telegram or Slack
Build a RAG-powered chatbot that answers questions from your knowledge base.
```mermaid
graph LR
    A[Telegram / Slack Trigger] --> B[HTTP Request: Find or Create Conversation]
    B --> C[HTTP Request: Send Message to RAG]
    C --> D[IF: Success?]
    D -->|Yes| E[Telegram / Slack: Reply with Answer]
    D -->|No| F[Telegram / Slack: Reply with Error]
```

Node 1 — Telegram Trigger (or Slack Trigger)
| Setting | Value |
|---|---|
| Updates | message |
Node 2 — HTTP Request: Create Conversation
| Setting | Value |
|---|---|
| Method | POST |
| URL | http://localhost:8000/api/notebooks/<NOTEBOOK_ID>/conversations |
| Body Content Type | JSON |
**Conversation Reuse**
For persistent conversations per user, store the conversation_id in an n8n static data variable or an external database keyed by the user's Telegram/Slack ID.
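In a Code node, workflow static data can serve as that per-user cache (a sketch; in n8n the `store` object would come from `$getWorkflowStaticData('global')`, simulated here with a plain object, and `createConversation` stands in for the Create Conversation HTTP call):

```javascript
// Return a cached conversation_id for a chat user, or create one via
// `createConversation` and remember it for subsequent messages.
async function getOrCreateConversation(store, userId, createConversation) {
  store.conversations = store.conversations || {};
  if (!store.conversations[userId]) {
    const envelope = await createConversation();
    store.conversations[userId] = envelope.data.conversation_id;
  }
  return store.conversations[userId];
}
```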
Node 3 — HTTP Request: Send Message to RAG
| Setting | Value |
|---|---|
| Method | POST |
| URL | http://localhost:8000/api/notebooks/<NOTEBOOK_ID>/conversations/{{ $json.data.conversation_id }}/messages |
| Body Content Type | JSON |
| Timeout | 120000 |
```json
{
  "content": "={{ $('Telegram Trigger').item.json.message.text }}",
  "chat_mode": "rag",
  "strategy_id": "fusion",
  "persona": "professional",
  "language": "en"
}
```
Node 4 — IF: Success?
| Setting | Value |
|---|---|
| Condition | {{ $json.success }} equals true |
Node 5 — Telegram: Reply
| Setting | Value |
|---|---|
| Chat ID | {{ $('Telegram Trigger').item.json.message.chat.id }} |
| Text | {{ $json.data.assistant_message.content }} |
**Timeout Settings**
RAG pipeline calls can take 5-30 seconds depending on document size and LLM provider. Set the HTTP Request timeout to at least 120000 ms (2 minutes) to avoid premature failures.
### 4. Auto-Feedback Collection
Collect user feedback from an external system and post it to Beyond Retrieval.
```mermaid
graph LR
    A[Webhook Trigger] --> B[HTTP Request: Post Feedback]
    B --> C[IF: Success?]
    C -->|Yes| D[Respond to Webhook: 200 OK]
    C -->|No| E[Respond to Webhook: 500 Error]
```

Node 1 — Webhook Trigger
| Setting | Value |
|---|---|
| HTTP Method | POST |
| Path | feedback |
| Response Mode | Using Respond to Webhook Node |
Expected webhook body:
```json
{
  "notebook_id": "a1b2c3d4-...",
  "message_id": "msg-5678-...",
  "is_positive": true,
  "feedback_text": "Very helpful answer"
}
```
Node 2 — HTTP Request: Post Feedback
| Setting | Value |
|---|---|
| Method | POST |
| URL | http://localhost:8000/api/notebooks/{{ $json.body.notebook_id }}/messages/{{ $json.body.message_id }}/feedback |
| Body Content Type | JSON |
```json
{
  "is_positive": "={{ $json.body.is_positive }}",
  "feedback_text": "={{ $json.body.feedback_text }}"
}
```
Node 3 — Respond to Webhook
Return the result to the caller with the appropriate status code.
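Before forwarding, the inbound webhook body can be validated in a Code node so malformed requests fail fast (a sketch; `buildFeedbackPayload` is an illustrative helper name, and the field names match the expected webhook body above):

```javascript
// Validate the inbound feedback webhook body and build
// the JSON payload for the feedback endpoint.
function buildFeedbackPayload(body) {
  if (!body || typeof body.notebook_id !== 'string' || typeof body.message_id !== 'string') {
    throw new Error('notebook_id and message_id are required');
  }
  if (typeof body.is_positive !== 'boolean') {
    throw new Error('is_positive must be a boolean');
  }
  return {
    is_positive: body.is_positive,
    feedback_text: body.feedback_text || '',
  };
}
```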
### 5. Batch Document Processing
Watch a folder, upload each file, and ingest them in sequence.
```mermaid
graph LR
    A[Schedule / Manual Trigger] --> B[Read Binary Files]
    B --> C[Split In Batches]
    C --> D[HTTP Request: Upload]
    D --> E[IF: Upload OK?]
    E -->|Yes| F[HTTP Request: Ingest]
    F --> G[Wait 15s]
    G --> H[HTTP Request: Check Stage]
    H --> I[IF: Done?]
    I -->|No| G
    I -->|Yes| J[Next Batch Item]
```

Node 1 — Read Binary Files
| Setting | Value |
|---|---|
| File Path | /data/incoming/*.pdf |
| Property Name | data |
Node 2 — Split In Batches
| Setting | Value |
|---|---|
| Batch Size | 1 |
Process files one at a time to avoid overloading the ingestion pipeline.
Node 3 — HTTP Request: Upload
| Setting | Value |
|---|---|
| Method | POST |
| URL | http://localhost:8000/api/notebooks/<NOTEBOOK_ID>/documents/upload |
| Send Binary Data | true |
| Binary Property | data |
| Input Data Field Name | files |
Node 4 — HTTP Request: Ingest
| Setting | Value |
|---|---|
| Method | POST |
| URL | http://localhost:8000/api/notebooks/<NOTEBOOK_ID>/documents/ingest |
| Body Content Type | JSON |
```json
{
  "files": [
    {
      "file_id": "={{ $('Upload').item.json.data[0].file_id }}",
      "file_name": "={{ $('Upload').item.json.data[0].file_name }}",
      "file_path": "={{ $('Upload').item.json.data[0].storage_path }}"
    }
  ],
  "settings": {
    "parser": "Docling Parser",
    "chunking_strategy": "Recursive Chunking"
  }
}
```
Node 5 — Wait + Poll Loop
Same pattern as Workflow 1: wait 15 seconds, check the stage endpoint, loop while status is Processing.
**Batch Ingest Alternative**
If you upload multiple files at once, you can pass all of them in a single /ingest call instead of looping. The ingestion endpoint accepts an array of files.
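Building that single-call body from the upload response can be done in a Code node (a sketch; `buildIngestBody` is an illustrative helper name, and the field mapping mirrors the single-file ingest body above, including the rename from `storage_path` to `file_path`):

```javascript
// Map every uploaded file in the upload response's `data` array
// into the shape the /ingest endpoint expects.
function buildIngestBody(uploadedFiles, settings) {
  return {
    files: uploadedFiles.map((f) => ({
      file_id: f.file_id,
      file_name: f.file_name,
      file_path: f.storage_path,
    })),
    settings,
  };
}
```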
## Error Handling

### IF Node Pattern
Place an IF node after every HTTP Request to branch on the response:
- True branch: continue normal processing
- False branch: log the error, send a notification, or retry
### Error Trigger Workflow
Create a separate workflow with an Error Trigger node to catch failures globally:
- Create a new workflow named Error Handler.
- Add an Error Trigger node as the start node.
- Add a Slack or Email node to notify on failures.
- In your main workflows, go to Settings > Error Workflow and select Error Handler.
The Error Trigger receives the full error context including the failed node name and error message.
### Retry Logic
For transient errors (network timeouts, 429 rate limits), configure retries on the HTTP Request node:
| Setting | Value |
|---|---|
| Retry On Fail | true |
| Max Retries | 3 |
| Wait Between Retries | 5000 ms |
**Rate Limits**
LLM provider rate limits (HTTP 429) include a Retry-After header. n8n's built-in retry respects this header automatically.
## Tips and Best Practices
| Topic | Recommendation |
|---|---|
| Timeouts | Set HTTP Request timeout to 60-120 seconds for RAG and ingestion calls |
| Binary data | Use Send Binary Data: true with field name files for file uploads |
| Concurrency | Process ingestion sequentially (batch size 1) to avoid overloading |
| Credentials | Use Header Auth credentials instead of hardcoding tokens in URLs |
| Environment | Use n8n environment variables for the base URL: {{ $env.BR_BASE_URL }} |
| Monitoring | Schedule health checks at least once per hour for production notebooks |
| Conversation reuse | Store conversation IDs in n8n's static data or a database for multi-turn chatbots |
| File size | Large files (100+ pages) can take several minutes to ingest — adjust wait/poll loops accordingly |
| Response parsing | Always access data through $json.data — never use the raw response body directly |
| Error workflows | Set up a global Error Trigger workflow for alerting on any workflow failure |
## Quick Reference — Common API Calls
| Action | Method | URL Path | Body |
|---|---|---|---|
| List notebooks | GET | /api/notebooks/ | -- |
| Create notebook | POST | /api/notebooks/ | {notebook_id, notebook_title, user_id} |
| Upload file | POST | /api/notebooks/{id}/documents/upload | multipart files |
| Start ingestion | POST | /api/notebooks/{id}/documents/ingest | {files, settings} |
| Check stage | GET | /api/notebooks/{id}/documents/{fid}/stage | -- |
| List documents | GET | /api/notebooks/{id}/documents/ | -- |
| Health check | GET | /api/notebooks/{id}/health | -- |
| Cleanup duplicates | POST | /api/notebooks/{id}/health/cleanup | -- |
| Create conversation | POST | /api/notebooks/{id}/conversations | {title, chat_mode} |
| Send RAG message | POST | /api/notebooks/{id}/conversations/{cid}/messages | {content, strategy_id} |
| Submit feedback | POST | /api/notebooks/{id}/messages/{mid}/feedback | {is_positive} |
| Get settings | GET | /api/notebooks/{id}/settings | -- |