Backend Architecture¶
The backend is a FastAPI application (Python 3.12) following a strict three-layer architecture: Router → Service → Database.
Directory Structure¶
```
backend/
    config.py                # pydantic-settings configuration (40+ settings)
    main.py                  # FastAPI app entry point with lifespan
    dependencies.py          # Dependency injection (Supabase clients, access control)
    middleware/
        auth.py              # Clerk JWT verification (JWKS + bypass mode)
        error_handler.py     # Global exception handler
    routers/                 # 13 router modules, 83+ endpoints
        admin.py             # Admin endpoints (API keys, storage, db_type)
        chat.py              # Chat conversations and messages
        documents.py         # Document CRUD + ingestion triggers
        enhancement.py       # AI enhancement pipeline
        health.py            # Health checks + cleanup
        models.py            # LLM model listings (OpenRouter proxy)
        notebooks.py         # Notebook CRUD + settings
        onedrive.py          # OneDrive OAuth2 + file import
        retrieval.py         # Search and retrieval endpoints
        settings.py          # Notebook settings management
        sharing.py           # Notebook access sharing + invites
        users.py             # User management (admin)
        viewer.py            # Document content viewer
    services/                # 17 service modules
    schemas/                 # 13 Pydantic schema modules
    utils/                   # Shared utilities
    tests/                   # 1200+ unit tests
```
Three-Layer Architecture¶
- **Routers** (`routers/`) — Handle HTTP concerns: request validation, dependency injection, response formatting. No business logic.
- **Services** (`services/`) — Contain all business logic. Receive a Supabase client as a parameter. Never import FastAPI types.
- **Database** — Accessed exclusively through the Supabase Python client. No raw SQL in application code.
```python
# routers/notebooks.py — HTTP layer
@router.post("/", status_code=201)
async def create_notebook(
    data: NotebookCreate,
    user: dict = Depends(get_current_user),
):
    await require_admin(user)
    supabase = get_client_for_db_type(data.db_type)
    notebook = await notebook_service.create_notebook(supabase, data)
    return APIResponse(data=notebook.model_dump())
```

```python
# services/notebook_service.py — Business logic
class NotebookService:
    async def create_notebook(
        self, supabase: Client, data: NotebookCreate
    ) -> NotebookResponse:
        row = {"notebook_id": data.notebook_id, ...}
        result = supabase.table("notebook").insert(row).execute()
        return NotebookResponse(**result.data[0])
```
Dependency Injection¶
The `dependencies.py` module provides Supabase client resolvers:
| Dependency | Purpose |
|---|---|
| `get_supabase()` | Active client (cloud or local, based on global `db_type`) |
| `get_default_supabase()` | Always cloud — for global resources (`user_api_keys`) |
| `get_supabase_for_notebook()` | Per-notebook routing with cache (cloud/local per notebook) |
| `get_local_client()` | Local Supabase only (returns `None` if not configured) |
Access control dependencies:
| Dependency | Behavior |
|---|---|
| `check_notebook_access()` | Returns `"admin"` or `"chat_only"`, raises 403 on deny |
| `require_admin_for_notebook()` | Requires admin access to a specific notebook |
| `require_admin()` | Requires global admin role (60s TTL cache) |
APIResponse Envelope¶
Every endpoint must return `APIResponse(data=...)`. The frontend `apiClient.js` unwraps `json.data`:

```python
from schemas.common import APIResponse

# Correct — frontend receives the notebook object
return APIResponse(data=notebook.model_dump())

# WRONG — frontend receives undefined
return notebook
```
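The envelope itself is presumably a thin Pydantic model. A sketch, assuming a generic `data` field; the `error` field and everything else about the real `schemas/common.py` model are assumptions.

```python
from typing import Any, Optional
from pydantic import BaseModel

class APIResponse(BaseModel):
    """Assumed shape of the envelope in schemas/common.py."""
    data: Any = None
    error: Optional[str] = None  # assumed field; the frontend reads json.data

resp = APIResponse(data={"notebook_id": "nb-1"})
```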
Service Patterns¶
Singleton — for stateless services (most services):
```python
class NotebookService:
    async def create_notebook(self, supabase, data): ...

notebook_service = NotebookService()  # module-level singleton
```
Per-request — for services holding notebook-specific state:
```python
class ChatService:
    def __init__(self, supabase: Client):
        self.supabase = supabase

# In a router:
svc = ChatService(supabase)
result = await svc.send_message(...)
```
LLM Provider Factory¶
The centralized client factory in `utils/openai_client.py` supports three providers:
```mermaid
graph LR
    subgraph Provider Factory
        Factory["get_llm_client(provider)"]
    end
    subgraph Providers
        OR["OpenRouter<br/>base: openrouter.ai/api/v1"]
        OA["OpenAI Direct<br/>base: api.openai.com/v1"]
        OL["Ollama<br/>base: ollama:11434/v1"]
    end
    Factory -->|"'openrouter'"| OR
    Factory -->|"'openai'"| OA
    Factory -->|"'ollama'"| OL
```

All three providers use the OpenAI SDK (`AsyncOpenAI`) with different base URLs and API keys.
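A minimal sketch of such a factory, under stated assumptions: the `PROVIDER_BASES` mapping, the `resolve_client_config` name, and the `http://` scheme for Ollama are illustrative; only the three provider names and their base URLs come from the diagram. In real code the returned kwargs would feed `AsyncOpenAI(**cfg)`.

```python
# Illustrative sketch; the real utils/openai_client.py factory may differ.
PROVIDER_BASES = {
    "openrouter": "https://openrouter.ai/api/v1",
    "openai": "https://api.openai.com/v1",
    "ollama": "http://ollama:11434/v1",  # scheme is an assumption
}

def resolve_client_config(provider: str, api_key: str = "") -> dict:
    """Return the base_url/api_key pair an AsyncOpenAI client would use."""
    if provider not in PROVIDER_BASES:
        raise ValueError(f"unknown LLM provider: {provider!r}")
    # Ollama needs no real key, but the OpenAI SDK requires a non-empty string.
    key = api_key or ("ollama" if provider == "ollama" else api_key)
    return {"base_url": PROVIDER_BASES[provider], "api_key": key}
```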
Embedding Client Routing¶
```mermaid
graph TD
    Input["get_embedding_client(model_id)"]
    HasSlash{"model_id contains '/'?"}
    HasNomic{"contains 'nomic'/'ollama'?"}
    HasORKey{"OpenRouter key available?"}
    OpenRouter["OpenRouter Client"]
    DirectOpenAI["Direct OpenAI Client"]
    OllamaEmbed["Returns None → use embed_ollama()"]
    LegacyOR["OpenRouter + adds 'openai/' prefix"]
    Input --> HasSlash
    HasSlash -->|"Yes"| OpenRouter
    HasSlash -->|"No"| HasNomic
    HasNomic -->|"Yes"| OllamaEmbed
    HasNomic -->|"No"| HasORKey
    HasORKey -->|"Yes"| LegacyOR
    HasORKey -->|"No"| DirectOpenAI
```

Dynamic API Key Resolution¶
```mermaid
graph TD
    GetKey["get_effective_key(provider, user_id)"]
    UserDB{"User has key in DB?"}
    EnvKey{"Server .env key?"}
    Empty["Empty → 400 error for cloud providers"]
    GetKey --> UserDB
    UserDB -->|"Yes"| UserKey["Use user's DB key"]
    UserDB -->|"No"| EnvKey
    EnvKey -->|"Yes"| ServerKey["Use server .env key"]
    EnvKey -->|"No"| Empty
```

Background Tasks¶
The backend uses FastAPI's built-in `BackgroundTasks` (not Celery) for:
- Document ingestion — parsing, chunking, embedding after file upload
- Re-ingestion — cleanup + re-pipeline
- AI Enhancement — LLM chunk enhancement
- LLM Judge — quality evaluation after every RAG response
Lifespan Management¶
On startup, the FastAPI lifespan handler marks stale ingestion jobs (stuck in Processing/reprocessing for longer than `stale_job_timeout_minutes`) as `error`, preventing zombie jobs from blocking subsequent runs.