
Backend Architecture

The backend is a FastAPI application (Python 3.12) following a strict three-layer architecture: Router → Service → Database.


Directory Structure

backend/
  config.py                  # pydantic-settings configuration (40+ settings)
  main.py                    # FastAPI app entry point with lifespan
  dependencies.py            # Dependency injection (Supabase clients, access control)
  middleware/
    auth.py                  # Clerk JWT verification (JWKS + bypass mode)
    error_handler.py         # Global exception handler
  routers/                   # 13 router modules, 83+ endpoints
    admin.py                 # Admin endpoints (API keys, storage, db_type)
    chat.py                  # Chat conversations and messages
    documents.py             # Document CRUD + ingestion triggers
    enhancement.py           # AI enhancement pipeline
    health.py                # Health checks + cleanup
    models.py                # LLM model listings (OpenRouter proxy)
    notebooks.py             # Notebook CRUD + settings
    onedrive.py              # OneDrive OAuth2 + file import
    retrieval.py             # Search and retrieval endpoints
    settings.py              # Notebook settings management
    sharing.py               # Notebook access sharing + invites
    users.py                 # User management (admin)
    viewer.py                # Document content viewer
  services/                  # 17 service modules
  schemas/                   # 13 Pydantic schema modules
  utils/                     # Shared utilities
  tests/                     # 1200+ unit tests

Three-Layer Architecture

  • Routers (routers/) — Handle HTTP concerns: request validation, dependency injection, response formatting. No business logic.
  • Services (services/) — Contain all business logic. Receive a Supabase client as a parameter. Never import FastAPI types.
  • Database — Accessed exclusively through the Supabase Python client. No raw SQL in application code.

# routers/notebooks.py — HTTP layer
@router.post("/", status_code=201)
async def create_notebook(
    data: NotebookCreate,
    user: dict = Depends(get_current_user),
):
    await require_admin(user)
    supabase = get_client_for_db_type(data.db_type)
    notebook = await notebook_service.create_notebook(supabase, data)
    return APIResponse(data=notebook.model_dump())

# services/notebook_service.py — Business logic
class NotebookService:
    async def create_notebook(
        self, supabase: Client, data: NotebookCreate
    ) -> NotebookResponse:
        row = {"notebook_id": data.notebook_id, ...}
        result = supabase.table("notebook").insert(row).execute()
        return NotebookResponse(**result.data[0])

Dependency Injection

The dependencies.py module provides Supabase client resolvers:

Dependency                    Purpose
get_supabase()                Active client (cloud or local, based on the global db_type)
get_default_supabase()        Always cloud — for global resources (user_api_keys)
get_supabase_for_notebook()   Per-notebook routing with a cache (cloud/local per notebook)
get_local_client()            Local Supabase only (returns None if not configured)
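
The per-notebook resolver above can be sketched as a lookup with a simple cache. This is a minimal sketch: the cache shape, the `lookup_db_type` callable, and the string stand-ins for clients are assumptions, not the actual implementation.

```python
# Sketch of per-notebook client routing with a cache.
# The real resolver returns configured Supabase clients; strings stand in here.
_notebook_client_cache: dict[str, str] = {}

def get_supabase_for_notebook(notebook_id: str, lookup_db_type) -> str:
    """Resolve 'cloud' or 'local' for a notebook, caching the answer."""
    if notebook_id not in _notebook_client_cache:
        _notebook_client_cache[notebook_id] = lookup_db_type(notebook_id)
    return _notebook_client_cache[notebook_id]
```

The cache avoids re-querying the notebook's db_type on every request to that notebook.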

Access control dependencies:

Dependency                      Behavior
check_notebook_access()         Returns "admin" or "chat_only"; raises 403 on deny
require_admin_for_notebook()    Requires admin access to a specific notebook
require_admin()                 Requires the global admin role (60s TTL cache)
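
The 60-second TTL cache behind require_admin() can be sketched as follows. The `fetch_role` callable and the cache layout are hypothetical stand-ins for the actual role query.

```python
import time

# Sketch of a 60-second TTL cache for global admin checks.
# fetch_role is a stand-in for the real DB lookup.
_admin_cache: dict[str, tuple[bool, float]] = {}
ADMIN_TTL_SECONDS = 60.0

def is_admin(user_id: str, fetch_role, now=time.monotonic) -> bool:
    cached = _admin_cache.get(user_id)
    if cached is not None and now() - cached[1] < ADMIN_TTL_SECONDS:
        return cached[0]  # fresh cache hit: skip the DB round-trip
    result = fetch_role(user_id) == "admin"
    _admin_cache[user_id] = (result, now())
    return result
```

The TTL bounds staleness: a revoked admin role takes effect within a minute without a DB query on every request.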

APIResponse Envelope

Every endpoint must return APIResponse(data=...). The frontend apiClient.js unwraps json.data:

from schemas.common import APIResponse

# Correct — frontend receives the notebook object
return APIResponse(data=notebook.model_dump())

# WRONG — frontend receives undefined
return notebook
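
The envelope contract can be shown with plain dicts (the real APIResponse is a Pydantic model; this sketch only demonstrates why the bare return fails):

```python
# Sketch of the envelope contract. The frontend always reads json.data,
# so a response without a "data" key yields undefined on the client.
def envelope(payload: dict) -> dict:
    return {"data": payload}

def frontend_unwrap(json_body: dict):
    # Mirrors what apiClient.js does with the parsed response.
    return json_body.get("data")
```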

Service Patterns

Singleton — for stateless services (most services):

class NotebookService:
    async def create_notebook(self, supabase, data): ...

notebook_service = NotebookService()  # module-level singleton

Per-request — for services holding notebook-specific state:

class ChatService:
    def __init__(self, supabase: Client):
        self.supabase = supabase

# In router:
svc = ChatService(supabase)
result = await svc.send_message(...)

LLM Provider Factory

The centralized client factory in utils/openai_client.py supports three providers:

graph LR
    subgraph Provider Factory
        Factory["get_llm_client(provider)"]
    end

    subgraph Providers
        OR["OpenRouter<br/>base: openrouter.ai/api/v1"]
        OA["OpenAI Direct<br/>base: api.openai.com/v1"]
        OL["Ollama<br/>base: ollama:11434/v1"]
    end

    Factory -->|"'openrouter'"| OR
    Factory -->|"'openai'"| OA
    Factory -->|"'ollama'"| OL

All three use the OpenAI SDK (AsyncOpenAI) with different base URLs and API keys.
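
The factory's provider-to-base-URL mapping can be sketched as below. This returns a config dict rather than an AsyncOpenAI instance, and the Ollama host and placeholder key are assumptions.

```python
# Sketch of the provider → base URL mapping in the client factory.
# The real factory constructs AsyncOpenAI(base_url=..., api_key=...).
PROVIDER_BASES = {
    "openrouter": "https://openrouter.ai/api/v1",
    "openai": "https://api.openai.com/v1",
    "ollama": "http://ollama:11434/v1",
}

def get_llm_client_config(provider: str, api_key: str = "") -> dict:
    if provider not in PROVIDER_BASES:
        raise ValueError(f"unknown provider: {provider}")
    # Ollama ignores the key, but the OpenAI SDK requires a non-empty one.
    return {"base_url": PROVIDER_BASES[provider], "api_key": api_key or "ollama"}
```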

Embedding Client Routing

graph TD
    Input["get_embedding_client(model_id)"]
    HasSlash{"model_id contains '/'?"}
    HasNomic{"contains 'nomic'/'ollama'?"}
    HasORKey{"OpenRouter key available?"}

    OpenRouter["OpenRouter Client"]
    DirectOpenAI["Direct OpenAI Client"]
    OllamaEmbed["Returns None → use embed_ollama()"]
    LegacyOR["OpenRouter + adds 'openai/' prefix"]

    Input --> HasSlash
    HasSlash -->|"Yes"| OpenRouter
    HasSlash -->|"No"| HasNomic
    HasNomic -->|"Yes"| OllamaEmbed
    HasNomic -->|"No"| HasORKey
    HasORKey -->|"Yes"| LegacyOR
    HasORKey -->|"No"| DirectOpenAI
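
The decision tree above translates into a small pure function. This is a sketch returning routing labels, not clients; "ollama" stands for the case where the factory returns None and embed_ollama() is used instead.

```python
# Pure-function translation of the embedding routing diagram.
def route_embedding(model_id: str, has_openrouter_key: bool) -> tuple[str, str]:
    if "/" in model_id:
        return ("openrouter", model_id)          # already provider-prefixed
    if "nomic" in model_id or "ollama" in model_id:
        return ("ollama", model_id)              # local embedding path
    if has_openrouter_key:
        # Legacy bare OpenAI model id: route via OpenRouter with a prefix.
        return ("openrouter", f"openai/{model_id}")
    return ("openai", model_id)                  # direct OpenAI client
```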

Dynamic API Key Resolution

graph TD
    GetKey["get_effective_key(provider, user_id)"]
    UserDB{"User has key in DB?"}
    EnvKey{"Server .env key?"}
    Empty["Empty → 400 error for cloud providers"]

    GetKey --> UserDB
    UserDB -->|"Yes"| UserKey["Use user's DB key"]
    UserDB -->|"No"| EnvKey
    EnvKey -->|"Yes"| ServerKey["Use server .env key"]
    EnvKey -->|"No"| Empty

Background Tasks

The backend uses FastAPI's built-in BackgroundTasks (not Celery) for:

  • Document ingestion — parsing, chunking, embedding after file upload
  • Re-ingestion — cleanup + re-pipeline
  • AI Enhancement — LLM chunk enhancement
  • LLM Judge — quality evaluation after every RAG response
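
The queue-then-run pattern these tasks follow can be sketched with a minimal stand-in for BackgroundTasks (FastAPI's real class behaves the same way: tasks queued during the request run after the response is sent; the handler below is hypothetical).

```python
# Minimal stand-in for fastapi.BackgroundTasks, showing the pattern used for
# ingestion and enhancement jobs.
class BackgroundTasks:
    def __init__(self):
        self._tasks = []

    def add_task(self, func, *args, **kwargs):
        self._tasks.append((func, args, kwargs))

    def run_all(self):  # FastAPI runs queued tasks after sending the response
        for func, args, kwargs in self._tasks:
            func(*args, **kwargs)

def upload_document(doc_id: str, background: BackgroundTasks, ingest) -> dict:
    # Hypothetical handler: respond immediately, run ingestion afterwards.
    background.add_task(ingest, doc_id)
    return {"data": {"document_id": doc_id, "status": "queued"}}
```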

Lifespan Management

On startup, the FastAPI lifespan handler marks stale ingestion jobs (stuck in Processing/reprocessing for longer than stale_job_timeout_minutes) as error, preventing zombie jobs from blocking subsequent runs.
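
The stale-job sweep can be sketched as a pure selection function. The status names and job shape are assumptions; the real handler runs inside the lifespan context and updates rows through the Supabase client.

```python
from datetime import datetime, timedelta, timezone

# Sketch of the startup stale-job sweep: select jobs stuck in a processing
# state past the configured timeout so they can be marked as error.
def find_stale_jobs(jobs: list[dict], now: datetime, timeout_minutes: int) -> list:
    cutoff = now - timedelta(minutes=timeout_minutes)
    return [
        j["id"]
        for j in jobs
        if j["status"] in ("processing", "reprocessing") and j["started_at"] < cutoff
    ]
```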