Health Monitor

The health monitor assesses data quality within a notebook by detecting duplicate chunks and orphaned embeddings and by tracking enhancement progress. It combines these measurements into a composite health score from 0 to 100.

Base path: /api/notebooks/{notebook_id}/health

Health score formula:

score = 100
      - (duplicates / total) * 40
      - (orphans / total) * 30
      + (enhanced / total) * 10

The result is clamped to the range 0 to 100. Duplicates and orphans lower the score; enhanced chunks raise it.
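
The formula above can be sketched as a small Python function. This is only an illustration, not the service's implementation: the function name `health_score`, the float return type, and the handling of an empty notebook (`total == 0`) are assumptions, and the source does not say whether the real score is rounded.

```python
def health_score(total: int, duplicates: int, orphans: int, enhanced: int) -> float:
    """Composite health score per the documented formula, clamped to [0, 100]."""
    if total <= 0:
        # Assumption: an empty notebook has nothing to penalize, so report 100.
        return 100.0
    raw = (
        100.0
        - (duplicates / total) * 40  # duplicate penalty, up to 40 points
        - (orphans / total) * 30     # orphan penalty, up to 30 points
        + (enhanced / total) * 10    # enhancement bonus, up to 10 points
    )
    # Clamp so the bonus cannot push the score above 100.
    return max(0.0, min(100.0, raw))


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the formula.
    print(health_score(total=200, duplicates=50, orphans=50, enhanced=100))  # 87.5
```

With these hypothetical counts the penalties are 10 and 7.5 points and the bonus is 5 points, giving 87.5. A notebook with no duplicates or orphans stays at 100 however many chunks are enhanced, because the bonus is clamped away.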


GET /api/notebooks/{notebook_id}/health

Run a full health check on a notebook's data. Detects duplicates, orphaned embeddings, enhancement status, and calculates a health score.

Auth: Admin

Headers:

Header         Value
Authorization  Bearer <token>

Status: 200 OK

{
  "success": true,
  "data": {
    "health_score": 85,
    "total_chunks": 340,
    "duplicate_count": 5,
    "orphaned_count": 2,
    "enhanced_count": 200,
    "duplicate_groups": [
      {
        "content_hash": "abc123...",
        "count": 3,
        "chunk_ids": ["chunk-1", "chunk-2", "chunk-3"]
      }
    ],
    "orphaned_ids": ["chunk-99", "chunk-100"]
  }
}

Errors:

Code  Cause
401   Invalid or missing token
403   Non-admin user

Example (curl):

curl http://localhost:8000/api/notebooks/$NOTEBOOK_ID/health \
  -H "Authorization: Bearer $TOKEN"

Example (Python):

import os

import httpx

# Read the bearer token from the environment, as the curl example does.
token = os.environ["TOKEN"]
notebook_id = "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
response = httpx.get(
    f"http://localhost:8000/api/notebooks/{notebook_id}/health",
    headers={"Authorization": f"Bearer {token}"},
)
response.raise_for_status()  # Surface 401/403 responses as exceptions
health = response.json()["data"]
print(f"Health Score: {health['health_score']}/100")
print(f"Duplicates: {health['duplicate_count']}")
print(f"Orphans: {health['orphaned_count']}")
print(f"Enhanced: {health['enhanced_count']}/{health['total_chunks']}")
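
Each entry in `duplicate_groups` pairs a `content_hash` with the chunks that share it. A minimal sketch of how such groups could be built is shown below. It assumes duplicates are chunks with byte-identical text and that the hash is SHA-256 over the UTF-8 text; the source documents neither, and `find_duplicate_groups` and its input shape are hypothetical.

```python
import hashlib
from collections import defaultdict


def find_duplicate_groups(chunks: dict[str, str]) -> list[dict]:
    """Group chunk IDs whose text is identical.

    `chunks` maps chunk ID -> chunk text (a hypothetical input shape).
    """
    by_hash: dict[str, list[str]] = defaultdict(list)
    for chunk_id, text in chunks.items():
        # Assumption: SHA-256 of the UTF-8 text; the real hash is not documented.
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        by_hash[digest].append(chunk_id)

    # Only hashes shared by two or more chunks form a duplicate group.
    return [
        {"content_hash": digest, "count": len(ids), "chunk_ids": ids}
        for digest, ids in by_hash.items()
        if len(ids) > 1
    ]


if __name__ == "__main__":
    sample = {"chunk-1": "hello", "chunk-2": "hello", "chunk-3": "world"}
    for group in find_duplicate_groups(sample):
        print(group["count"], group["chunk_ids"])
```

The output dictionaries mirror the field names in the response above, so the result can be compared directly against what the endpoint returns.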

POST /api/notebooks/{notebook_id}/health/cleanup

Remove duplicate chunks from a notebook. Keeps the first (oldest) chunk in each duplicate group and deletes the rest. Returns the count of removed chunks and the new health score.

Auth: Admin

Headers:

Header         Value
Authorization  Bearer <token>

Status: 200 OK

{
  "success": true,
  "data": {
    "removed_count": 5,
    "new_health_score": 92,
    "new_total_chunks": 335
  }
}

Errors:

Code  Cause
401   Invalid or missing token
403   Non-admin user

Example (curl):

curl -X POST http://localhost:8000/api/notebooks/$NOTEBOOK_ID/health/cleanup \
  -H "Authorization: Bearer $TOKEN"

Example (Python):

import os

import httpx

# Read the bearer token from the environment, as the curl example does.
token = os.environ["TOKEN"]
notebook_id = "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
response = httpx.post(
    f"http://localhost:8000/api/notebooks/{notebook_id}/health/cleanup",
    headers={"Authorization": f"Bearer {token}"},
)
response.raise_for_status()  # Surface 401/403 responses as exceptions
result = response.json()["data"]
print(f"Removed {result['removed_count']} duplicates")
print(f"New score: {result['new_health_score']}/100")
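
The keep-first rule that the cleanup endpoint describes can be previewed client-side from a health check's `duplicate_groups`. The sketch below is illustrative only: `chunks_to_remove` is a hypothetical helper, and it assumes `chunk_ids` is ordered oldest first, which the source does not state.

```python
def chunks_to_remove(duplicate_groups: list[dict]) -> list[str]:
    """Return the chunk IDs a keep-first cleanup would delete."""
    to_remove: list[str] = []
    for group in duplicate_groups:
        # Assumption: chunk_ids is ordered oldest first, so index 0 survives.
        to_remove.extend(group["chunk_ids"][1:])
    return to_remove


if __name__ == "__main__":
    groups = [
        {
            "content_hash": "abc123...",
            "count": 3,
            "chunk_ids": ["chunk-1", "chunk-2", "chunk-3"],
        }
    ]
    print(chunks_to_remove(groups))  # ['chunk-2', 'chunk-3']
```

Running this before calling the cleanup endpoint gives a rough preview of what would be deleted, under the ordering assumption above.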