Database Schema¶

Beyond Retrieval v2 uses PostgreSQL 15 via Supabase with three extensions: pgvector, pg_trgm, and pg_net.

Entity Relationship Diagram¶

erDiagram
    notebook {
        UUID notebook_id PK
        TEXT notebook_title
        TEXT notebook_description
        TEXT icon
        TEXT db_type
        TEXT storage_provider
        TEXT user_id
        INTEGER number_of_documents
        TIMESTAMPTZ created_at
        TIMESTAMPTZ updated_at
    }

    notebook_settings {
        UUID notebook_id PK_FK
        TEXT system_prompt_retrieval
        VARCHAR embedding_model
        VARCHAR inference_provider
        VARCHAR inference_model
        DECIMAL inference_temperature
        VARCHAR active_strategy_id
        JSONB strategies_config
        BOOLEAN enable_multimodal_processing
        TEXT language_mode
        INTEGER enhancement_concurrency
        JSONB ingestion_settings
        BOOLEAN judge_enabled
        TIMESTAMPTZ created_at
    }

    notebook_file_jobs {
        UUID job_id PK
        UUID notebook_id FK
        TEXT file_id
        TEXT file_name
        TEXT status
        TEXT error_description
        JSONB ingestion_settings
        TIMESTAMPTZ created_at
    }

    documents {
        BIGSERIAL id PK
        TEXT content
        JSONB metadata
        VECTOR embedding
        TSVECTOR fts
    }

    document_records {
        UUID id PK
        TEXT file_id
        TEXT file_name
        UUID notebook_id FK
        TIMESTAMPTZ created_at
    }

    contextual_retrieval_table {
        TEXT chunk_id PK
        UUID job_id FK
        TEXT file_id
        TEXT original_chunk
        TEXT enhanced_chunk
        TEXT status
        UUID notebook_id FK
        JSONB original_metadata
    }

    chat_conversations {
        UUID conversation_id PK
        UUID notebook_id FK
        TEXT user_id
        TEXT title
        BOOLEAN is_pinned
        INTEGER message_count
        TIMESTAMPTZ last_message_at
    }

    chat_history {
        UUID id PK
        UUID conversation_id FK
        UUID notebook_id FK
        VARCHAR role
        TEXT content
        JSONB citations
        JSONB run_metadata
    }

    query_cache {
        UUID id PK
        UUID notebook_id FK
        TEXT query_text
        TEXT query_hash
        TEXT response
        VECTOR embedding
        TIMESTAMPTZ created_at
    }

    notebook_access {
        UUID id PK
        UUID notebook_id FK
        TEXT user_id
        TEXT access_type
    }

    notebook_invites {
        UUID id PK
        UUID notebook_id FK
        TEXT invite_code
        TIMESTAMPTZ expires_at
    }

    user_api_keys {
        UUID id PK
        TEXT user_id
        TEXT key_name
        TEXT key_value
    }

    notebook ||--|| notebook_settings : "1:1 config"
    notebook ||--o{ notebook_file_jobs : "has jobs"
    notebook ||--o{ document_records : "has files"
    notebook ||--o{ contextual_retrieval_table : "has chunks"
    notebook ||--o{ chat_conversations : "has conversations"
    notebook ||--o{ query_cache : "has cache"
    notebook ||--o{ notebook_access : "has access grants"
    notebook ||--o{ notebook_invites : "has invites"
    chat_conversations ||--o{ chat_history : "has messages"
    notebook_file_jobs ||--o{ contextual_retrieval_table : "produces chunks"

PostgreSQL Extensions¶

Extension	Purpose
`pgvector`	VECTOR type, HNSW index for semantic similarity search
`pg_trgm`	Trigram-based fuzzy text matching
`pg_net`	Asynchronous HTTP requests from within PostgreSQL

Key Indexes¶

HNSW vector index on query_cache.embedding (vector_cosine_ops) for semantic cache lookups
GIN index on documents.fts for full-text search
GIN index on documents.metadata for JSONB containment queries
Composite unique on query_cache(notebook_id, query_hash) for exact cache dedup
Composite unique on notebook_access(notebook_id, user_id)
Composite unique on user_api_keys(user_id, key_name)

Key Database Functions (30+)¶

Function	Purpose
`hybrid_search_v2`	FTS + semantic + RRF fusion search
`match_documents`	Pure semantic vector similarity search
`match_documents_ollama`	Semantic search for Ollama embeddings
`create_conversation`	Atomic conversation creation with validation
`delete_conversation`	Cascade delete with message count reporting
`rename_conversation`	Notebook-isolated conversation rename
`toggle_conversation_pin`	Pin/unpin conversation
`get_user_conversations`	Paginated conversation listing
`update_conversation_stats`	Trigger: auto-update message_count on INSERT
`delete_notebook`	Cascade delete all notebook data
`delete_file_from_notebook`	Clean up file across 6 tables
`submit_feedback`	Upsert feedback for a message
`cleanup_expired_cache`	TTL-based cache eviction

Schema Migrations¶

The primary schema is at db/migrations/001_initial_schema.sql. Additional migrations:

File	Purpose
`001_initial_schema.sql`	Core schema (17+ tables, 30+ functions, 3 extensions)
`002_notebook_access.sql`	Sharing and access control tables
`009_create_user_api_keys_table.sql`	Per-user API key storage
`010_search_playground_history.sql`	Search history

Local mode: start_services.py copies the schema to docker/supabase/init-scripts/99-schema.sql for auto-execution on first container startup.

Cloud mode: Apply migrations manually via the Supabase SQL Editor.