AI Enhancement

The enhancement pipeline enriches document chunks with AI-generated contextual descriptions. Each chunk moves through a lifecycle: pending -> processing -> success -> embedded (via publish). A chunk that fails during processing is marked failed and can be reset to pending.

Base path: /api/notebooks/{notebook_id}/enhance

Enhancement Lifecycle

pending --> processing --> success --> embedded (publish)
   ^            |
   |            v
   +-(reset)- failed
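
The lifecycle can be read as a small state machine. The sketch below is a client-side illustration only: `ALLOWED_TRANSITIONS` and `can_transition` are hypothetical names, the table is inferred from the diagram rather than exposed by the API, and treating `embedded` as a terminal state is an assumption.

```python
# Hypothetical client-side model of the chunk lifecycle.
# Status names come from the API; the transition table is an assumption
# inferred from the lifecycle diagram.
ALLOWED_TRANSITIONS = {
    "pending": {"processing"},
    "processing": {"success", "failed"},
    "success": {"embedded"},   # via publish
    "failed": {"pending"},     # via reset
    "embedded": set(),         # assumed terminal
}


def can_transition(current: str, target: str) -> bool:
    """Return True if the modelled lifecycle allows current -> target."""
    return target in ALLOWED_TRANSITIONS.get(current, set())


print(can_transition("failed", "pending"))
print(can_transition("pending", "embedded"))
```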

Publish Safety

The publish endpoint aborts if ANY chunk for the file is not in "success" status. All chunks must complete enhancement before publishing.
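
A client can check this rule before calling publish by inspecting the per-file counts. The following is a minimal sketch, not the server's implementation: `publish_blockers` and `ready_to_publish` are hypothetical helpers, and counting `embedded` chunks as blocking is an assumption based on a literal reading of the rule above.

```python
# Statuses other than "success"; per the rule above, any chunk in one of
# these is assumed to block a file-level publish.
NON_SUCCESS = ("pending", "processing", "failed", "embedded")


def publish_blockers(counts: dict) -> dict:
    """Return the non-zero, non-success counts for a file."""
    return {name: counts[name] for name in NON_SUCCESS if counts.get(name, 0) > 0}


def ready_to_publish(counts: dict) -> bool:
    """True when the file has chunks and every one of them is in success."""
    total = counts.get("total", 0)
    return total > 0 and counts.get("success", 0) == total


# Counts shaped like the status endpoint's "data" object.
counts = {"total": 45, "pending": 0, "processing": 5, "success": 38, "failed": 2, "embedded": 0}
print(ready_to_publish(counts), publish_blockers(counts))
```

The server remains the authority; a precheck like this only avoids an avoidable `400`.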

GET /api/notebooks/{notebook_id}/enhance/files

List all files with enhancement status. Auto-populates new files from the documents table on each call.

Auth: Admin

Headers:

Header	Value
`Authorization`	`Bearer <token>`

Status: 200 OK

{
  "success": true,
  "data": [
    {
      "file_id": "f1a2b3c4-...",
      "file_name": "handbook.pdf",
      "total_chunks": 45,
      "pending": 0,
      "processing": 0,
      "success": 45,
      "failed": 0,
      "embedded": 0
    }
  ]
}

Code	Cause
`401`	Invalid or missing token
`403`	Non-admin user

curl http://localhost:8000/api/notebooks/$NOTEBOOK_ID/enhance/files \
  -H "Authorization: Bearer $TOKEN"

import httpx

token = "<token>"  # admin bearer token (placeholder)
notebook_id = "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
response = httpx.get(
    f"http://localhost:8000/api/notebooks/{notebook_id}/enhance/files",
    headers={"Authorization": f"Bearer {token}"},
)
files = response.json()["data"]
for f in files:
    print(f"{f['file_name']}: {f['success']}/{f['total_chunks']} enhanced")
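
Every response documented in this section wraps its result in a `success` / `data` envelope. A small helper can unpack it in one place. This is a sketch under assumptions: `unwrap` is a hypothetical name, and because the shape of a failure body is not documented here, the helper simply raises with the raw payload.

```python
from typing import Any


def unwrap(payload: dict) -> Any:
    """Return payload["data"], or raise if the envelope does not report success."""
    if payload.get("success") is not True:
        # Assumption: anything other than success=true is treated as a failure.
        raise RuntimeError(f"API reported failure: {payload!r}")
    return payload["data"]


# Sample shaped like the files listing response.
sample = {
    "success": True,
    "data": [{"file_name": "handbook.pdf", "success": 45, "total_chunks": 45}],
}
files = unwrap(sample)
print(files[0]["file_name"])
```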

GET /api/notebooks/{notebook_id}/enhance/files/{file_id}/chunks

List the individual chunks of a file, paginated. Chunk text is truncated for display; use the chunk detail endpoint to get the full text.

Auth: Admin

Headers:

Header	Value
`Authorization`	`Bearer <token>`

Query Parameters:

Parameter	Type	Required	Default	Description
`limit`	integer	No	`200`	Results per page (1-1000)
`offset`	integer	No	`0`	Pagination offset

Status: 200 OK

{
  "success": true,
  "data": [
    {
      "chunk_id": "chunk-1234-...",
      "file_id": "f1a2b3c4-...",
      "status": "success",
      "original_chunk": "The refund policy states that...",
      "enhanced_chunk": "# Context\nThis chunk describes...\n\n---\n\n# Content\nThe refund policy states that..."
    }
  ]
}

Code	Cause
`401`	Invalid or missing token
`403`	Non-admin user

curl "http://localhost:8000/api/notebooks/$NOTEBOOK_ID/enhance/files/$FILE_ID/chunks?limit=50&offset=0" \
  -H "Authorization: Bearer $TOKEN"

import httpx

# Assumes token, notebook_id, and file_id are already defined.
response = httpx.get(
    f"http://localhost:8000/api/notebooks/{notebook_id}/enhance/files/{file_id}/chunks",
    headers={"Authorization": f"Bearer {token}"},
    params={"limit": 50, "offset": 0},
)
chunks = response.json()["data"]
print(f"Got {len(chunks)} chunks")
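
To read every chunk of a large file, the `limit` and `offset` parameters can be walked page by page. The sketch below takes the page fetcher as a callable so it runs without a server. `iter_chunks` and `fetch_page` are hypothetical names, and stopping on a page shorter than `limit` is an assumption, since the response carries no total count.

```python
from typing import Callable, Iterator, List


def iter_chunks(fetch_page: Callable[[int, int], List[dict]], limit: int = 200) -> Iterator[dict]:
    """Yield chunks page by page until a page comes back shorter than limit."""
    if not 1 <= limit <= 1000:
        raise ValueError("limit must be between 1 and 1000")
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:
            break  # assumed end of data: short (or empty) page
        offset += limit


# Stand-in for the HTTP call: slices an in-memory list of fake chunks.
fake_chunks = [{"chunk_id": f"chunk-{i}"} for i in range(5)]


def fake_fetch(limit: int, offset: int) -> List[dict]:
    return fake_chunks[offset:offset + limit]


collected = list(iter_chunks(fake_fetch, limit=2))
print(len(collected))
```

In real use, `fetch_page` would issue the GET request shown above with `params={"limit": limit, "offset": offset}` and return the `data` list.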

GET /api/notebooks/{notebook_id}/enhance/chunks/{chunk_id}

Return the full, untruncated detail of a single chunk. Used by the preview panel.

Auth: Admin

Headers:

Header	Value
`Authorization`	`Bearer <token>`

Status: 200 OK

{
  "success": true,
  "data": {
    "chunk_id": "chunk-1234-...",
    "file_id": "f1a2b3c4-...",
    "status": "success",
    "original_chunk": "Full original text...",
    "enhanced_chunk": "# Context\n...\n\n---\n\n# Content\n..."
  }
}

Code	Cause
`401`	Invalid or missing token
`403`	Non-admin user
`404`	Chunk not found

curl http://localhost:8000/api/notebooks/$NOTEBOOK_ID/enhance/chunks/$CHUNK_ID \
  -H "Authorization: Bearer $TOKEN"

import httpx

# Assumes token, notebook_id, and chunk_id are already defined.
response = httpx.get(
    f"http://localhost:8000/api/notebooks/{notebook_id}/enhance/chunks/{chunk_id}",
    headers={"Authorization": f"Bearer {token}"},
)
print(response.json()["data"]["enhanced_chunk"])
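
In the sample responses, `enhanced_chunk` consists of a `# Context` part and a `# Content` part separated by a `---` line. Assuming that layout holds for every untruncated chunk (it is inferred from the samples, not a documented contract), the two parts could be separated as sketched below. `split_enhanced_chunk` is a hypothetical helper and the sample text is made up for illustration.

```python
from typing import Tuple

SEPARATOR = "\n\n---\n\n"  # layout inferred from the sample responses


def split_enhanced_chunk(enhanced: str) -> Tuple[str, str]:
    """Split an enhanced chunk into (context, content)."""
    context_part, sep, content_part = enhanced.partition(SEPARATOR)
    if not sep:
        raise ValueError("separator not found; chunk may be truncated")
    # str.removeprefix requires Python 3.9+
    context = context_part.removeprefix("# Context\n")
    content = content_part.removeprefix("# Content\n")
    return context, content


# Illustrative value only, shaped like the documented sample.
sample = "# Context\nThis chunk describes refunds.\n\n---\n\n# Content\nThe refund policy text."
context, content = split_enhanced_chunk(sample)
print(context)
print(content)
```

The chunk list endpoint truncates text for display, so this kind of parsing is better applied to the detail endpoint's output.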

GET /api/notebooks/{notebook_id}/enhance/count

Return aggregate chunk counts by status, either notebook-wide or for a single file.

Auth: Admin

Headers:

Header	Value
`Authorization`	`Bearer <token>`

Query Parameters:

Parameter	Type	Required	Default	Description
`file_id`	string	No	`null`	Filter to a specific file

Status: 200 OK

{
  "success": true,
  "data": {
    "total": 340,
    "pending": 100,
    "processing": 5,
    "success": 200,
    "failed": 3,
    "embedded": 32
  }
}

Code	Cause
`401`	Invalid or missing token
`403`	Non-admin user

curl "http://localhost:8000/api/notebooks/$NOTEBOOK_ID/enhance/count?file_id=$FILE_ID" \
  -H "Authorization: Bearer $TOKEN"

import httpx

# Assumes token, notebook_id, and file_id are already defined.
response = httpx.get(
    f"http://localhost:8000/api/notebooks/{notebook_id}/enhance/count",
    headers={"Authorization": f"Bearer {token}"},
    params={"file_id": file_id},
)
counts = response.json()["data"]
print(f"Total: {counts['total']}, Success: {counts['success']}")

POST /api/notebooks/{notebook_id}/enhance

Start the enhancement pipeline for files or specific chunks. Validates that enhanceable chunks exist, then kicks off background processing.

Auth: Admin

Headers:

Header	Value
`Authorization`	`Bearer <token>`
`Content-Type`	`application/json`

Body:

{
  "file_ids": ["f1a2b3c4-..."],
  "chunk_ids": null
}

Field	Type	Required	Default	Description
`file_ids`	array	No	`null`	File IDs to enhance (file-level)
`chunk_ids`	array	No	`null`	Specific chunk IDs to enhance (chunk-level)

File vs Chunk Level

Provide file_ids to enhance all pending chunks in those files, or chunk_ids to enhance specific chunks.
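
A request body for either mode could be assembled as sketched below. `build_enhance_body` is a hypothetical helper. Requiring exactly one of the two lists is a client-side convention assumed for this sketch; how the server treats a body with both or neither is not stated here.

```python
from typing import Optional, Sequence


def build_enhance_body(
    file_ids: Optional[Sequence[str]] = None,
    chunk_ids: Optional[Sequence[str]] = None,
) -> dict:
    """Build a body for file-level or chunk-level enhancement."""
    if bool(file_ids) == bool(chunk_ids):
        # Assumption: this sketch insists on exactly one of the two lists.
        raise ValueError("provide either file_ids or chunk_ids, not both or neither")
    return {
        "file_ids": list(file_ids) if file_ids else None,
        "chunk_ids": list(chunk_ids) if chunk_ids else None,
    }


body = build_enhance_body(file_ids=["f1a2b3c4"])
print(body)
```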

Status: 200 OK

{
  "success": true,
  "data": [
    {
      "file_id": "f1a2b3c4-...",
      "file_name": "handbook.pdf",
      "total_chunks": 45,
      "pending": 45,
      "processing": 0,
      "success": 0,
      "failed": 0,
      "embedded": 0
    }
  ]
}

Code	Cause
`401`	Invalid or missing token
`403`	Non-admin user
`404`	No enhanceable chunks found

curl -X POST http://localhost:8000/api/notebooks/$NOTEBOOK_ID/enhance \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"file_ids": ["f1a2b3c4"]}'

import httpx

# Assumes token and notebook_id are already defined.
response = httpx.post(
    f"http://localhost:8000/api/notebooks/{notebook_id}/enhance",
    headers={"Authorization": f"Bearer {token}"},
    json={"file_ids": ["f1a2b3c4"]},
)
print(response.json()["data"])

GET /api/notebooks/{notebook_id}/enhance/status

Poll enhancement progress for a single file. Used by the frontend for progress tracking.

Auth: Admin

Headers:

Header	Value
`Authorization`	`Bearer <token>`

Query Parameters:

Parameter	Type	Required	Description
`file_id`	string	Yes	File to check status for

Status: 200 OK

{
  "success": true,
  "data": {
    "total": 45,
    "pending": 0,
    "processing": 5,
    "success": 38,
    "failed": 2,
    "embedded": 0,
    "progress_pct": 84.4,
    "all_terminated": false
  }
}
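
The sample values are consistent with `progress_pct` being the share of chunks in `success` (38 of 45) and with `all_terminated` being false while chunks are still pending or processing. The sketch below encodes that reading. It is an inference from this one sample, not the documented server logic, and `derive_progress` is a hypothetical helper.

```python
def derive_progress(counts: dict) -> dict:
    """Recompute progress fields from raw counts under the assumptions above."""
    total = counts["total"]
    # Assumed formula: success / total, rounded to one decimal place.
    pct = round(100 * counts["success"] / total, 1) if total else 0.0
    # Assumed rule: nothing left in pending or processing.
    in_flight = counts["pending"] + counts["processing"]
    return {"progress_pct": pct, "all_terminated": in_flight == 0}


sample = {"total": 45, "pending": 0, "processing": 5, "success": 38, "failed": 2, "embedded": 0}
derived = derive_progress(sample)
print(derived)
```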

Code	Cause
`401`	Invalid or missing token
`403`	Non-admin user

curl "http://localhost:8000/api/notebooks/$NOTEBOOK_ID/enhance/status?file_id=$FILE_ID" \
  -H "Authorization: Bearer $TOKEN"

import time

import httpx

# Assumes token, notebook_id, and file_id are already defined.
# Poll until the server reports all_terminated.
while True:
    response = httpx.get(
        f"http://localhost:8000/api/notebooks/{notebook_id}/enhance/status",
        headers={"Authorization": f"Bearer {token}"},
        params={"file_id": file_id},
    )
    status = response.json()["data"]
    print(f"Progress: {status['progress_pct']:.1f}%")

    if status["all_terminated"]:
        break
    time.sleep(4)
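
A polling loop like the one above runs indefinitely if enhancement stalls. One way to bound it is a deadline, sketched here with the status fetcher passed in as a callable so the loop can be exercised without a server. `wait_for_enhancement` and its parameters are hypothetical, and the default timeout is an arbitrary choice.

```python
import time
from typing import Callable


def wait_for_enhancement(
    fetch_status: Callable[[], dict],
    interval: float = 4.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll fetch_status until all_terminated is true or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status["all_terminated"]:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("enhancement did not finish before the timeout")
        sleep(interval)


# Stand-in for the status endpoint: two canned "data" objects.
_samples = iter([
    {"progress_pct": 50.0, "all_terminated": False},
    {"progress_pct": 100.0, "all_terminated": True},
])
final = wait_for_enhancement(lambda: next(_samples), sleep=lambda _seconds: None)
print(final["progress_pct"])
```

In real use, `fetch_status` would perform the GET request shown above and return the `data` object.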

POST /api/notebooks/{notebook_id}/enhance/publish

Publish enhanced chunks to the vector store (file-level). Deletes old vectors, embeds enhanced chunks, and inserts into the documents table. Aborts if any chunk is not in "success" status.

Auth: Admin

Headers:

Header	Value
`Authorization`	`Bearer <token>`
`Content-Type`	`application/json`

Body:

{
  "file_id": "f1a2b3c4-...",
  "job_id": "j1a2b3c4-...",
  "file_name": "handbook.pdf",
  "notebook_title": "Customer Support KB"
}

Field	Type	Required	Default	Description
`file_id`	string	Yes	--	File to publish
`job_id`	string	No	`null`	Associated job ID
`file_name`	string	Yes	--	File name for metadata
`notebook_title`	string	Yes	--	Notebook title for metadata

Status: 200 OK

{
  "success": true,
  "data": {
    "success": true,
    "message": "Published 45 enhanced chunks",
    "published_count": 45
  }
}

Code	Cause
`400`	Not all chunks in `"success"` status
`401`	Invalid or missing token
`403`	Non-admin user

curl -X POST http://localhost:8000/api/notebooks/$NOTEBOOK_ID/enhance/publish \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "file_id": "f1a2b3c4",
    "file_name": "handbook.pdf",
    "notebook_title": "Customer Support KB"
  }'

import httpx

# Assumes token and notebook_id are already defined.
response = httpx.post(
    f"http://localhost:8000/api/notebooks/{notebook_id}/enhance/publish",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "file_id": "f1a2b3c4",
        "file_name": "handbook.pdf",
        "notebook_title": "Customer Support KB",
    },
)
result = response.json()["data"]
print(f"Published: {result['published_count']} chunks")

POST /api/notebooks/{notebook_id}/enhance/publish-chunks

Publish specific enhanced chunks to the vector store (chunk-level).

Auth: Admin

Headers:

Header	Value
`Authorization`	`Bearer <token>`
`Content-Type`	`application/json`

Body:

{
  "chunk_ids": ["chunk-1234-...", "chunk-5678-..."],
  "file_id": "f1a2b3c4-...",
  "file_name": "handbook.pdf",
  "notebook_title": "Customer Support KB"
}

Field	Type	Required	Default	Description
`chunk_ids`	array	Yes	--	Chunk IDs to publish
`file_id`	string	Yes	--	Parent file ID
`file_name`	string	Yes	--	File name for metadata
`notebook_title`	string	Yes	--	Notebook title for metadata

Status: 200 OK

{
  "success": true,
  "data": {
    "success": true,
    "message": "Published 2 enhanced chunks",
    "published_count": 2
  }
}

Code	Cause
`400`	Not all specified chunks in `"success"` status
`401`	Invalid or missing token
`403`	Non-admin user

curl -X POST http://localhost:8000/api/notebooks/$NOTEBOOK_ID/enhance/publish-chunks \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "chunk_ids": ["chunk-1234", "chunk-5678"],
    "file_id": "f1a2b3c4",
    "file_name": "handbook.pdf",
    "notebook_title": "Customer Support KB"
  }'

import httpx

# Assumes token and notebook_id are already defined.
response = httpx.post(
    f"http://localhost:8000/api/notebooks/{notebook_id}/enhance/publish-chunks",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "chunk_ids": ["chunk-1234", "chunk-5678"],
        "file_id": "f1a2b3c4",
        "file_name": "handbook.pdf",
        "notebook_title": "Customer Support KB",
    },
)
print(response.json()["data"])

POST /api/notebooks/{notebook_id}/enhance/reset

Reset failed chunks back to "pending" status. Optionally re-triggers the enhancement pipeline.

Auth: Admin

Headers:

Header	Value
`Authorization`	`Bearer <token>`
`Content-Type`	`application/json`

Body:

{
  "file_id": "f1a2b3c4-...",
  "chunk_ids": null,
  "trigger_enhancement": true
}

Field	Type	Required	Default	Description
`file_id`	string	Yes	--	File containing failed chunks
`chunk_ids`	array	No	`null`	Specific chunks to reset (all failed if null)
`trigger_enhancement`	boolean	No	`false`	Re-trigger enhancement after reset

Status: 200 OK

{
  "success": true,
  "data": {
    "reset_count": 3,
    "enhancement_triggered": true
  }
}

Code	Cause
`401`	Invalid or missing token
`403`	Non-admin user

curl -X POST http://localhost:8000/api/notebooks/$NOTEBOOK_ID/enhance/reset \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"file_id": "f1a2b3c4", "trigger_enhancement": true}'

import httpx

# Assumes token and notebook_id are already defined.
response = httpx.post(
    f"http://localhost:8000/api/notebooks/{notebook_id}/enhance/reset",
    headers={"Authorization": f"Bearer {token}"},
    json={"file_id": "f1a2b3c4", "trigger_enhancement": True},
)
result = response.json()["data"]
print(f"Reset {result['reset_count']} chunks")

POST /api/notebooks/{notebook_id}/enhance/populate

Backfill the contextual_retrieval_table from already-ingested documents. Use this when documents were ingested without context augmentation and you want to make them available for AI enhancement.

Auth: Admin

Headers:

Header	Value
`Authorization`	`Bearer <token>`

Status: 200 OK

{
  "success": true,
  "data": {
    "populated": 45,
    "skipped": 0
  }
}

Code	Cause
`401`	Invalid or missing token
`403`	Non-admin user

curl -X POST http://localhost:8000/api/notebooks/$NOTEBOOK_ID/enhance/populate \
  -H "Authorization: Bearer $TOKEN"

import httpx

# Assumes token and notebook_id are already defined.
response = httpx.post(
    f"http://localhost:8000/api/notebooks/{notebook_id}/enhance/populate",
    headers={"Authorization": f"Bearer {token}"},
)
print(response.json()["data"])

POST /api/notebooks/{notebook_id}/enhance/repair-metadata

Recompute original_metadata for chunks where it is empty or missing. Uses file_content and original_chunk to compute real line positions.

Auth: Admin

Headers:

Header	Value
`Authorization`	`Bearer <token>`

Query Parameters:

Parameter	Type	Required	Default	Description
`file_id`	string	No	`null`	Repair specific file, or all files if omitted

Status: 200 OK

{
  "success": true,
  "data": {
    "repaired": 12,
    "total_checked": 45
  }
}

Code	Cause
`401`	Invalid or missing token
`403`	Non-admin user

curl -X POST "http://localhost:8000/api/notebooks/$NOTEBOOK_ID/enhance/repair-metadata?file_id=$FILE_ID" \
  -H "Authorization: Bearer $TOKEN"

import httpx

# Assumes token, notebook_id, and file_id are already defined.
response = httpx.post(
    f"http://localhost:8000/api/notebooks/{notebook_id}/enhance/repair-metadata",
    headers={"Authorization": f"Bearer {token}"},
    params={"file_id": file_id},
)
result = response.json()["data"]
print(f"Repaired {result['repaired']} of {result['total_checked']} chunks")