# Installation
Beyond Retrieval v2 can be deployed using Docker (recommended) or run directly on your machine for development.
## Prerequisites
| Tool | Version | Notes |
|---|---|---|
| Python | 3.12+ | Required for type \| None syntax |
| Node.js | 22+ | Includes npm; used for the React frontend |
| Docker Desktop | Latest | Or Docker Engine + Compose v2 on Linux |
| Git | 2.x+ | Standard version control |
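As a quick sanity check before installing, a short shell loop can confirm each tool is on your `PATH` and print its version (this snippet is illustrative, not part of the repository):

```bash
# Print the version of each prerequisite, or MISSING if it is not installed.
# Compare the output against the minimum versions in the table above.
for tool in python3 node docker git; do
  if command -v "$tool" >/dev/null 2>&1; then
    printf '%-8s %s\n' "$tool" "$("$tool" --version 2>&1 | head -n 1)"
  else
    printf '%-8s MISSING\n' "$tool"
  fi
done
```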
Optional:
- Ollama — local LLM inference (pulled automatically in Docker mode)
- Clerk — authentication provider (bypassed by default in dev)
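Note that models are only pulled automatically in Docker mode; for bare-metal use with Ollama installed, you would fetch a model yourself. The model name below is only an example:

```bash
ollama pull llama3.1
```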
## Option 1: Docker (Recommended)
Docker is the fastest way to get everything running — backend, frontend, Caddy reverse proxy, Ollama, Docling, and optionally a local Supabase instance.
### 1. Clone the repository
```bash
git clone https://github.com/your-org/beyond-retrieval.git
cd beyond-retrieval/beyond-retrieval-pythonv
```
### 2. Configure environment
The default `.env.example` runs everything locally with zero cloud dependencies; local Supabase starts by default.
For cloud Supabase, start from `.env.cloud.example` instead: fill in your Supabase URL and key, then add `--no-supabase` to the start command.
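One way to switch templates, assuming `.env` at the repository root is the file the services read, is simply:

```bash
cp .env.cloud.example .env
```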
**API Keys:** LLM provider keys (OpenRouter, OpenAI, Mistral) are configured from the Global Settings page in the UI, not in `.env`.
### 3. Start services
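Using the same launcher shown in the optional-flag examples, a typical first run with the defaults would be:

```bash
python start_services.py --profile cpu --build
```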
The --profile flag sets the GPU mode for Ollama:
| Profile | Description |
|---|---|
| cpu | CPU-only Ollama (default) |
| nvidia | NVIDIA GPU with CUDA |
| amd | AMD GPU with ROCm |
Optional flags:

```bash
python start_services.py --profile nvidia --build                       # NVIDIA GPU for Ollama
python start_services.py --profile cpu --no-docling --build             # Skip Docling sidecar (~3GB)
python start_services.py --profile cpu --no-supabase --build            # Cloud-only (no local DB)
python start_services.py --no-ollama --no-docling --no-supabase --build # Minimal (backend + frontend only)
python start_services.py --dev --build                                  # Dev mode (source mount + hot-reload)
```
### 4. Open the app
- App: http://localhost:3000
- Supabase Studio (local mode): http://localhost:54321
- FastAPI Docs: http://localhost:8000/docs
### Management commands
```bash
python start_services.py --stop             # Stop all services
python start_services.py --logs backend     # Tail logs for a service
python start_services.py --status           # Show service status
```
## Option 2: Bare-Metal Development
Run the backend and frontend directly on your machine without Docker.
### Backend
```bash
cd beyond-retrieval-pythonv/backend
python -m venv venv

# Activate the virtual environment
source venv/bin/activate   # Linux / macOS
venv\Scripts\activate      # Windows

pip install -r requirements.txt

# Create your environment file
cp ../.env.example ../.env
# Edit ../.env with your credentials

# Start the server
uvicorn main:app --reload --port 8000
```
The backend serves at http://localhost:8000, with OpenAPI docs at http://localhost:8000/docs.
**Windows Hot-Reload**
On Windows, --reload does NOT detect new files created after the watcher started. When adding new router, service, or schema files, kill and restart uvicorn.
### Frontend
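The frontend commands are not listed here; assuming a standard Vite + npm setup with the frontend directory sitting next to `backend/` (the directory name below is an assumption), the steps would resemble:

```bash
cd beyond-retrieval-pythonv/frontend   # directory name is an assumption
npm install
npm run dev
```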
The dev server opens at http://localhost:5173; Vite proxies `/api` requests to http://localhost:8000.
## Verify Installation
After starting the services, check that the endpoints listed above respond.
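For example, in Docker mode a couple of `curl` probes against the ports from step 4 would confirm the backend and frontend are up (these checks are illustrative, not part of the repository):

```bash
curl -fsS http://localhost:8000/docs >/dev/null && echo "backend OK"
curl -fsS http://localhost:3000 >/dev/null && echo "frontend OK"
```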
## Next Steps
- Quick Start — Create your first notebook and ask a question
- Configuration — Fine-tune environment variables and deployment options