Deployment modes
DocumentForge is a library — there's no daemon to install. You deploy it inside your application. There are four common shapes:
| Mode | Typical scenario | Replication |
|---|---|---|
| Embedded | One process, one file | None needed |
| API tier | Multiple app servers → one DB process | Optional followers |
| Leader + followers | High availability / read scale | Required |
| Multi-region | Global deployment | Cross-region async |
Embedded
The simplest mode. Your app opens the database file directly. Perfect for desktop apps, CLI tools, single-node services, tests, and demos.
using var db = DocumentForgeDb.OpenOrCreate("app.dfdb");
No network, no auth, no config. The file moves with your app. One process holds the write lock; other processes see a FileShare.Read lock and should either open their own file or connect via a network API layer.
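If a second process tries to open the same file while another holds the write lock, the open fails. A minimal guard for that case is sketched below; the exact exception type thrown on a sharing violation is an assumption here (IOException is shown, adjust to whatever your build actually throws):

```csharp
using System;
using System.IO;

try
{
    // Only one process at a time can hold the write lock on app.dfdb.
    using var db = DocumentForgeDb.OpenOrCreate("app.dfdb");
    // ... normal work ...
}
catch (IOException ex)
{
    // Assumption: the sharing violation surfaces as an IOException.
    Console.Error.WriteLine($"app.dfdb is already locked: {ex.Message}");
    Console.Error.WriteLine("Open a separate file for this process, or go through a network API layer.");
}
```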
API tier
When multiple application servers need to share data, run a dfdb serve process in front of the database. Each node reads a small node.json:
{
"nodeName": "prod-1",
"port": 5000,
"dataDir": "/var/lib/documentforge"
}
dfdb serve --config node.json

# or with CLI flags
dfdb serve --node-name prod-1 --port 5000 --data-dir /var/lib/documentforge

# or with env vars
DFDB_NODE_NAME=prod-1 DFDB_PORT=5000 DFDB_DATA_DIR=/var/lib/documentforge dfdb serve
App servers send queries over HTTP:
POST /query {"sql": "SELECT * FROM orders WHERE pnr = 'ABC123'"}
A single machine with a decent CPU handles tens of thousands of QPS comfortably.
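A minimal C# client for that endpoint might look like the sketch below. The host and port are illustrative (taken from the node.json example above), and the response is returned as a raw JSON string because the response schema isn't shown here; if you've set DFDB_API_KEY, add an Authorization: Bearer header as well.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

class DfdbClient
{
    // Base address is illustrative: the node named prod-1 from node.json, on port 5000.
    private static readonly HttpClient Http = new() { BaseAddress = new Uri("http://prod-1:5000") };

    // POST a SQL string to /query and return the raw JSON response body.
    static async Task<string> QueryAsync(string sql)
    {
        var payload = JsonSerializer.Serialize(new { sql });
        var body = new StringContent(payload, Encoding.UTF8, "application/json");
        var response = await Http.PostAsync("/query", body);
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync();
    }

    static async Task Main()
    {
        Console.WriteLine(await QueryAsync("SELECT * FROM orders WHERE pnr = 'ABC123'"));
    }
}
```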
Local multi-node dev cluster
To experiment with sharding on one box, the repo ships with launch scripts that start three nodes on ports 5001–5003 with separate data folders:
# Windows
.\scripts\start-cluster.ps1

# macOS / Linux
./scripts/start-cluster.sh

# Check health
dfdb health scripts/sample-cluster/cluster.json
Leader + followers
For read scaling and standby capability, run one leader and N followers. Writes go to the leader; reads can go to any node.
┌──────────────────┐
│  Leader (writes) │
│  app.dfdb        │
└─────────┬────────┘
          │  logical replication
          ▼
┌──────────────────┐
│  Follower (read) │
│  replica.dfdb    │
└──────────────────┘
Setup (each process):
// On the leader host
using var db = DocumentForgeDb.OpenOrCreate("app.dfdb");
db.StartLogicalReplicationServer(5500);

// On each follower host
using var db = DocumentForgeDb.OpenOrCreate("replica.dfdb");
db.StartLogicalReplicationFollower("leader-host.internal", 5500);
Point your read-heavy queries at any follower. Writes always go to the leader.
Multi-region
For a global deployment — e.g. writes in Dubai, reads served locally in Singapore, Frankfurt, and New York — run followers in each region. Logical replication is asynchronous and handles cross-continental latency gracefully.
                    ┌────────────┐
                    │   Dubai    │
                    │   LEADER   │
                    └─────┬──────┘
          ┌───────────────┼───────────────┐
          │               │               │
   ┌──────▼──────┐ ┌──────▼──────┐ ┌──────▼──────┐
   │  Singapore  │ │  Frankfurt  │ │  New York   │
   │  Follower   │ │  Follower   │ │  Follower   │
   └─────────────┘ └─────────────┘ └─────────────┘
Read requests are routed to the nearest follower by your load balancer or client logic. Writes hit the Dubai leader; followers catch up in seconds.
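Client-side routing can stay simple: send writes to the leader's endpoint and reads to the closest follower's endpoint. A sketch under the assumption that each node runs dfdb serve and exposes the /query endpoint shown earlier; the hostnames and the NEAREST_FOLLOWER variable are illustrative, not part of DocumentForge.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

class RegionalRouter
{
    private static readonly HttpClient Http = new();

    // Writes always target the leader; reads target whichever follower is closest.
    private const string Leader = "http://dubai-leader.internal:5000";
    private static readonly string NearestFollower =
        Environment.GetEnvironmentVariable("NEAREST_FOLLOWER") ?? "http://singapore-follower.internal:5000";

    public static Task<string> ReadAsync(string sql)  => SendAsync(NearestFollower, sql);
    public static Task<string> WriteAsync(string sql) => SendAsync(Leader, sql);

    private static async Task<string> SendAsync(string baseUrl, string sql)
    {
        var body = new StringContent(JsonSerializer.Serialize(new { sql }), Encoding.UTF8, "application/json");
        var response = await Http.PostAsync($"{baseUrl}/query", body);
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync();
    }
}
```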
To migrate the leader role between regions (e.g. rotating for planned maintenance), see the datacenter move runbook.
Files on disk
A DocumentForge database is up to four files. Keep them together.
| File | Purpose | Safe to delete? |
|---|---|---|
| app.dfdb | The database itself | Never (it's your data) |
| app.dfdb.wal | Write-ahead log | Only when DB is cleanly closed |
| app.dfdb.recovery | Crash recovery log | Only when DB is cleanly closed |
| app.dfdb.followerseq | Follower's last-applied seq | Yes, but triggers full catchup |
Configuration
Most tuning is via DatabaseOptions:
var options = new DatabaseOptions
{
    CacheSizeInPages = 10_000,   // 10K pages × 8KB = 80MB cache
    EnableWal = true             // WAL + recovery + replication
};
using var db = DocumentForgeDb.OpenOrCreate("app.dfdb", options);
Cache sizing rule of thumb
- Target ~20% of your data size in cache for a well-performing system (a quick sizing helper follows this list)
- 1M docs ~ 500MB data → 10K pages (80MB) is fine
- 10M docs ~ 5GB data → 100K pages (800MB) recommended
- 100M docs ~ 50GB data → you want 200K–500K pages and a machine with > 32GB RAM
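As a back-of-the-envelope helper, the 20% target and the 8 KB page size from the DatabaseOptions comment above translate directly into a CacheSizeInPages value. The snippet below is a sketch; the 8 KB page size is taken from that comment and should be treated as an assumption if your build uses a different page size.

```csharp
using System;

// Rough cache sizing: ~20% of on-disk data size, expressed in 8 KB pages.
static int SuggestCachePages(long dataSizeBytes, int pageSizeBytes = 8 * 1024, double targetFraction = 0.20)
{
    var cacheBytes = (long)(dataSizeBytes * targetFraction);
    return (int)Math.Max(1_000, cacheBytes / pageSizeBytes);   // keep a small floor
}

// 5 GB of data -> ~131K pages (~1 GB of cache), the same ballpark as the 100K-page recommendation above.
Console.WriteLine(SuggestCachePages(5L * 1024 * 1024 * 1024));
```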
Docker
A production-ready Dockerfile ships in the repo root. It's a multi-stage build that publishes a self-contained single-file binary on top of runtime-deps:9.0-bookworm-slim — no .NET runtime needed in the final image. Final image size lands around 90–120 MB depending on compression.
Build and run locally
# Build the image
docker build -t dfdb .

# Run with a named volume so data survives restarts
docker run --rm -p 5000:5000 -v dfdb-data:/data dfdb

# In another terminal
curl http://localhost:5000/health
Production run (API key + replication secret)
docker run -d --name dfdb \
-p 5000:5000 \
-v dfdb-data:/data \
-e DFDB_NODE_NAME=prod-1 \
-e DFDB_API_KEY=sk_prod_abc123 \
-e DFDB_REPLICATION_SECRET=repl_shared_xyz \
dfdb
As a replication leader
docker run -d --name dfdb-leader \
-p 5000:5000 -p 5500:5500 \
-v dfdb-leader-data:/data \
-e DFDB_REPLICATION_SECRET=repl_shared_xyz \
dfdb serve --bind-all \
--replication-role leader --replication-port 5500
As a follower with auto-failover
docker run -d --name dfdb-follower \
-p 5010:5000 \
-v dfdb-follower-data:/data \
-e DFDB_REPLICATION_SECRET=repl_shared_xyz \
dfdb serve --bind-all \
--replication-role follower \
--leader-host dfdb-leader --leader-port 5500 \
--auto-failover-seconds 10
The image runs as a dedicated dfdb user, writes only to /data, and exposes ports 5000 (HTTP) and 5500 (replication). The health check hits /health every 30 s, and signals are handled via tini, so docker stop flushes cleanly.
Docker Compose (leader + follower, one file)
services:
  leader:
    image: dfdb
    build: .
    ports: ["5000:5000", "5500:5500"]
    volumes: ["leader-data:/data"]
    environment:
      DFDB_NODE_NAME: leader
      DFDB_REPLICATION_SECRET: repl_shared_xyz
    command: >
      serve --bind-all
      --replication-role leader --replication-port 5500
  follower:
    image: dfdb
    build: .
    ports: ["5010:5000"]
    volumes: ["follower-data:/data"]
    environment:
      DFDB_NODE_NAME: follower
      DFDB_REPLICATION_SECRET: repl_shared_xyz
    depends_on: [leader]
    command: >
      serve --bind-all
      --replication-role follower
      --leader-host leader --leader-port 5500
      --auto-failover-seconds 10
volumes:
  leader-data:
  follower-data:
docker compose up → you have a leader, a follower replicating from it, and auto-failover wired in one command.
Deploying to Render.com
A ready-to-use render.yaml blueprint ships in the repo root. Render reads it, provisions a web service + a persistent disk, and wires up a health check automatically.
One-click deploy
- Push this repo to GitHub or GitLab.
- In the Render dashboard: New + → Blueprint → point it at the repo.
- Render reads render.yaml and creates:
  - A web service named documentforge, built from the Dockerfile at the repo root.
  - A 10 GB persistent disk mounted at /data.
  - A health check on /health.
- Fill in the two "set on dashboard" env vars: DFDB_API_KEY and DFDB_REPLICATION_SECRET.
- Deploy. The first build takes ~3–4 minutes; subsequent deploys reuse the Docker layer cache.
What the blueprint looks like
services:
  - type: web
    name: documentforge
    runtime: docker
    plan: starter
    region: oregon
    dockerfilePath: ./Dockerfile
    healthCheckPath: /health
    disk:
      name: dfdb-data
      mountPath: /data
      sizeGB: 10
    envVars:
      - key: DFDB_NODE_NAME
        value: render-1
      - key: DFDB_API_KEY
        sync: false  # set in dashboard, not in git
      - key: DFDB_REPLICATION_SECRET
        sync: false
How Render's $PORT is handled
Render injects a PORT env var at start time. The container's entrypoint (docker/entrypoint.sh) maps PORT → DFDB_PORT before starting the binary, so the service listens on whichever port Render assigned; the translation is a no-op when PORT isn't set. You don't need to hard-code anything.
TLS, custom domain, auth
- TLS: Render terminates TLS at its edge and proxies plain HTTP to your container. Don't configure TLS in node.json for Render — let the edge handle it.
- Custom domain: add it in the dashboard; cert provisioning is automatic.
- Auth: because the edge is publicly reachable, always set DFDB_API_KEY. Every client then sends Authorization: Bearer <key>.
- Replication: Render's ingress routes HTTP only. For cross-service replication on Render, add a private service port or use a TCP proxy — for most use cases a single Render instance plus regular backups is plenty.
Scaling on Render
- starter = 512 MB RAM — fine for <1M docs. Bump to standard (2 GB), pro (4 GB), or higher for real workloads — see the Performance page for guidance.
- Disks can be resized without downtime; bump sizeGB and redeploy.
- For sharding, deploy the same blueprint multiple times (one service per shard) with different name: and DFDB_NODE_NAME values. Drive them from a local cluster.json via the dfdb CLI.
Backups
Because the database is a file, backup is file copy. But don't just cp a live database — you might get a torn page mid-write.
Cold backup (simplest, downtime required)
dotnet my-app.dll --shutdown   # stop the process
cp app.dfdb /backup/app.$(date).dfdb
cp app.dfdb.wal /backup/app.$(date).dfdb.wal
dotnet my-app.dll              # start again
Warm backup (no downtime, uses a follower)
A much better approach for production: spin up a follower dedicated to backups. Let it catch up, pause replication, copy the follower's file, then resume.
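A sketch of that flow using the replication calls shown earlier. Since no pause/resume call is documented here, this version simply closes the follower once it has caught up (a cleanly closed file is safe to copy) and lets it catch up again from its .followerseq position when it reopens; the fixed sleep and the backup path are stand-ins for your own lag check and storage location.

```csharp
using System;
using System.IO;
using System.Threading;

// Warm backup via a dedicated follower process.
const string replicaFile = "backup-replica.dfdb";

using (var follower = DocumentForgeDb.OpenOrCreate(replicaFile))
{
    follower.StartLogicalReplicationFollower("leader-host.internal", 5500);

    // Give the follower time to catch up. In production, compare
    // follower.FollowerLastSeq against the leader's LeaderCurrentSeq instead of sleeping.
    Thread.Sleep(TimeSpan.FromSeconds(30));
}   // disposing closes the database cleanly, so the file is safe to copy

var stamp = DateTime.UtcNow.ToString("yyyyMMdd-HHmmss");
File.Copy(replicaFile, $"/backup/app.{stamp}.dfdb", overwrite: true);
// The next run reopens the replica and catches up from where it left off.
```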
Streaming backup (future)
A future version will include an online backup command that walks the data file page-by-page with a stable snapshot, without blocking writes. Until then, the follower-as-backup pattern is the right approach.
Monitoring
DocumentForge exposes its state through the DatabaseStatistics object. Log these at regular intervals.
var s = db.GetStatistics();
Console.WriteLine($"File: {s.FileSize / 1024.0 / 1024.0:F2} MB");
Console.WriteLine($"Pages: {s.PageCount:N0}");
Console.WriteLine($"Cache: {s.CachedPages:N0} pages ({s.DirtyPages} dirty)");
foreach (var coll in s.Collections)
    Console.WriteLine($"{coll.Name}: {coll.DocumentCount} docs, {coll.IndexCount} indexes");
For replication:
// Leader metrics
leader.LeaderCurrentSeq              // how many ops have been issued
leader.GetLogicalFollowerCount()     // connected followers

// Follower metrics
follower.FollowerLastSeq             // how far caught up
follower.LogicallyReplicatedOps      // running total
follower.GapsDetected                // > 0 means lost ops — alert!
Replication lag is leader.LeaderCurrentSeq - follower.FollowerLastSeq. Healthy systems keep this near zero during normal traffic.
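Wired into a periodic check, that might look like the sketch below. Alert and the lag threshold are placeholders for your own monitoring integration, and in a real deployment the leader and follower metrics come from different processes; they're combined in one method here only to keep the example short.

```csharp
using System;

// Periodic replication health check using the metrics above.
void CheckReplication(DocumentForgeDb leader, DocumentForgeDb follower, long maxAcceptableLag = 1_000)
{
    var lag = leader.LeaderCurrentSeq - follower.FollowerLastSeq;

    if (follower.GapsDetected > 0)
        Alert($"Replication gaps detected ({follower.GapsDetected}): ops were lost, investigate immediately.");

    if (lag > maxAcceptableLag)
        Alert($"Replication lag is {lag} ops (threshold {maxAcceptableLag}).");
}

void Alert(string message) => Console.Error.WriteLine($"[ALERT] {message}");
```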
Production checklist
- ✓ Leader and followers on trusted network (no auth in protocol yet)
- ✓ Monitor replication lag — alert on sustained growth
- ✓ Monitor GapsDetected — should always be 0
- ✓ Cache size tuned to your data size
- ✓ Backups automated via a dedicated follower
- ✓ Clients know which host is the current leader
- ✓ Client retry logic for transient "read-only mode" errors during planned handover
- ✓ Disk space alerting on the data file's volume
- ✓ Tested a full handover in staging before needing one in prod