Deployment

DocumentForge is a library — there is no daemon to install. You deploy it inside your application, and it ships as one self-contained binary with no .NET runtime required on the target.

Deployment modes

There are four common shapes:

Mode | Who uses it | Replication
Embedded | One process, one file | None needed
API tier | Multiple app servers, one DB process | Optional followers
Leader + followers | High availability / read scale | Required
Multi-region | Global deployment | Cross-region async

Embedded

The simplest mode. Your app opens the database file directly — perfect for desktop apps, CLI tools, single-node services, tests, and demos.

using var db = DocumentForgeDb.OpenOrCreate("app.dfdb");

No network, no auth, no config. The file moves with your app. One process holds the write lock; other processes see a FileShare.Read lock and should either open their own file or connect via a network API layer.

API tier

When multiple application servers need to share data, run a dfdb serve process in front of the database. Each node reads a small node.json:

{ "nodeName": "prod-1", "port": 5000, "dataDir": "/var/lib/documentforge" }
dfdb serve --config node.json
# or with CLI flags
dfdb serve --node-name prod-1 --port 5000 --data-dir /var/lib/documentforge
# or with env vars
DFDB_NODE_NAME=prod-1 DFDB_PORT=5000 DFDB_DATA_DIR=/var/lib/documentforge dfdb serve

App servers send queries over HTTP — POST /query with a body of {"sql": "SELECT * FROM orders WHERE pnr = 'ABC123'"}. A single machine with a decent CPU handles tens of thousands of QPS comfortably.
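As a rough illustration, the request above can be sent from a shell with curl. This is a sketch under assumptions: the helper names build_query_body and dfdb_query, the DFDB_URL variable, the localhost:5000 default, and the JSON Content-Type header are all hypothetical and not confirmed by this guide.

```shell
#!/bin/sh
# Hypothetical helper: wrap a SQL string in the {"sql": ...} body used by POST /query.
# Escapes backslashes and double quotes; newlines and control characters are not handled.
build_query_body() {
  escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"sql": "%s"}' "$escaped"
}

# Hypothetical helper: POST the query to a dfdb serve node.
# Assumption: the server accepts a JSON content type and DFDB_URL points at it.
dfdb_query() {
  curl --silent --show-error --fail \
    --header 'Content-Type: application/json' \
    --data "$(build_query_body "$1")" \
    "${DFDB_URL:-http://localhost:5000}/query"
}
```

A call would then look like dfdb_query "SELECT * FROM orders WHERE pnr = 'ABC123'". The --fail flag makes curl exit non-zero on HTTP errors, which lets scripts react to them.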

Local multi-node dev cluster

To experiment with sharding on one box, the repo ships launch scripts that start three nodes on ports 5001–5003 with separate data folders:

# Windows
.\scripts\start-cluster.ps1
# macOS / Linux
./scripts/start-cluster.sh
# Check health
dfdb health scripts/sample-cluster/cluster.json

Leader + follower(s)

For read scaling and standby capability, run one leader and N followers. Writes go to the leader; reads can go to any node.

// On the leader host
using var db = DocumentForgeDb.OpenOrCreate("app.dfdb");
db.StartLogicalReplicationServer(5500);
// On each follower host
using var db = DocumentForgeDb.OpenOrCreate("replica.dfdb");
db.StartLogicalReplicationFollower("leader-host.internal", 5500);

Point your read-heavy queries at any follower. Writes always go to the leader.

Replication uses a raw TCP connection. Open firewall rules for the replication port between your nodes. There is no auth or encryption at the protocol level yet — run it on a trusted internal network or tunnel over VPN / WireGuard / service mesh mTLS.

Multi-region

For a global deployment — e.g. writes in Dubai, reads in Singapore and London — run followers in each region. Logical replication is asynchronous and handles cross-continental latency gracefully. Read requests are routed to the nearest follower by your load balancer or client logic; writes hit the leader and followers catch up in seconds.
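If routing lives in client logic, the decision could be as small as the sketch below. It assumes two hypothetical variables, LEADER_URL and NEAREST_FOLLOWER_URL, set by your deployment tooling, and it classifies a statement only by its first word, which is a deliberate simplification.

```shell
#!/bin/sh
# Hypothetical router: print the address a statement should be sent to.
# Statements whose first word is SELECT go to the nearest follower;
# everything else, including empty input, falls back to the leader.
endpoint_for() {
  verb=$(printf '%s\n' "$1" | awk 'NF { print toupper($1); exit }')
  case $verb in
    SELECT) printf '%s\n' "$NEAREST_FOLLOWER_URL" ;;
    *)      printf '%s\n' "$LEADER_URL" ;;
  esac
}
```

Because replication is asynchronous, a read that must see a write made moments earlier may still need to go to the leader. A real router would likely carry an explicit override for that case.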

To migrate the leader role between regions (e.g. rotating for planned maintenance), see the replication guide.

Files on disk

A DocumentForge database is up to four files. Keep them together.

File | Purpose | Safe to delete?
app.dfdb | The database itself | Never (it is your data)
app.dfdb.wal | Write-ahead log | Only when DB is cleanly closed
app.dfdb.recovery | Crash recovery log | Only when DB is cleanly closed
app.dfdb.followerseq | Follower’s last-applied seq | Yes, but triggers full catchup

Configuration

Most tuning is via DatabaseOptions:

var options = new DatabaseOptions
{
    CacheSizeInPages = 10_000, // 10K pages × 8KB = 80MB cache
    EnableWal = true           // WAL + recovery + replication
};
using var db = DocumentForgeDb.OpenOrCreate("app.dfdb", options);

Cache sizing rule of thumb

  • Target ~20% of your data size in cache for a well-performing system.
  • 1M docs ~ 500MB data, so 10K pages (80MB) is fine.
  • 10M docs ~ 5GB data, so 100K pages (800MB) is recommended.
  • 100M docs ~ 50GB data, so you want 200K–500K pages (1.6–4GB, well under the 20% target) and a machine with more than 32GB RAM.
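The 20% rule can be turned into a page count with a little arithmetic. The sketch below assumes 8KB pages (as in the DatabaseOptions comment) and 1MB = 1024KB. The function name cache_pages_for_mb is hypothetical.

```shell
#!/bin/sh
# Suggest a CacheSizeInPages value for a data size given in MB.
# Assumptions: 8 KB pages and 1 MB = 1024 KB.
cache_pages_for_mb() {
  data_mb=$1
  percent=${2:-20}   # share of the data to keep cached; 20 follows the rule of thumb
  echo $(( data_mb * 1024 * percent / 100 / 8 ))
}
```

For example, cache_pages_for_mb 500 prints 12800, in the same range as the 10K pages suggested for 1M docs. Pass a second argument to try a different percentage when RAM is the limiting factor.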

Publishing and running

Publish a self-contained single-file binary, ship it to the server, and run it. No .NET runtime needs to be installed on the target.

# Build a self-contained binary
./scripts/publish-dfdb.sh   # or publish-dfdb.ps1 on Windows
# Ship dist/<rid>/dfdb to the server and run it
./dfdb serve --port 5000 --data-dir ./data

Other platforms: the same container image (started through docker/entrypoint.sh) runs unchanged on Fly.io, Railway, Koyeb, Azure Container Apps, Google Cloud Run (with a persistent volume), AWS App Runner, and any Kubernetes cluster. The only platform-specific bit is the $PORT env var translation in docker/entrypoint.sh, which is a no-op when $PORT is not set.
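To make the $PORT translation concrete, here is one way such a step could be written. It is a sketch, not the contents of the repo's docker/entrypoint.sh. It assumes the platform's $PORT should be mapped onto the DFDB_PORT variable shown earlier, and the function name translate_port is hypothetical.

```shell
#!/bin/sh
# Sketch of a $PORT translation step; the real entrypoint may differ.
# If the platform injects PORT, copy it into DFDB_PORT and export it.
# If PORT is unset or empty, nothing changes (the no-op case).
translate_port() {
  if [ -n "${PORT:-}" ]; then
    DFDB_PORT=$PORT
    export DFDB_PORT
  fi
}
```

An entrypoint built this way would call translate_port and then start dfdb serve, so that local runs without $PORT keep whatever port was already configured.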

Backups

Because the database is a file, backup is file copy. But do not just cp a live database — you might get a torn page mid-write.

Cold backup (simplest, downtime required)

dotnet my-app.dll --shutdown            # stop the process
ts=$(date +%Y%m%d-%H%M%S)               # one timestamp, safe in file names
cp app.dfdb "/backup/app.$ts.dfdb"
cp app.dfdb.wal "/backup/app.$ts.dfdb.wal"
dotnet my-app.dll                       # start again

Warm backup (no downtime, uses a follower)

A much better approach for production: spin up a follower for the purpose of backup, let it catch up, pause replication, copy the follower’s file, then resume.
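The copy step of this pattern might look like the sketch below. Pausing and resuming replication is deliberately left out, because this guide does not show the API for it. The function name copy_db_files and the timestamped folder layout are hypothetical.

```shell
#!/bin/sh
# Copy a database file and whichever companion files exist into a
# timestamped folder under the backup root, then print that folder.
# Run this only while the follower is paused or cleanly closed.
copy_db_files() {
  db_file=$1
  dest_root=$2
  [ -f "$db_file" ] || return 1          # refuse to back up a missing database
  stamp=$(date +%Y%m%d-%H%M%S)
  dest="$dest_root/$stamp"
  mkdir -p "$dest" || return 1
  for f in "$db_file" "$db_file.wal" "$db_file.recovery" "$db_file.followerseq"; do
    if [ -f "$f" ]; then
      cp "$f" "$dest/" || return 1
    fi
  done
  printf '%s\n' "$dest"
}
```

A call such as copy_db_files /var/lib/documentforge/replica.dfdb /backup keeps the files of one backup together in one folder, in line with the advice to keep a database's files together.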

Streaming backup (future)

A future version will include an online backup command that walks the data file page-by-page with a stable snapshot, without blocking writes. Until then, the follower-as-backup pattern is the right approach.

Monitoring

DocumentForge exposes its state through the DatabaseStatistics object. Log these at regular intervals.

var s = db.GetStatistics();
Console.WriteLine($"File: {s.FileSize / 1024.0 / 1024.0:F2} MB");
Console.WriteLine($"Pages: {s.PageCount:N0}");
Console.WriteLine($"Cache: {s.CachedPages:N0} pages ({s.DirtyPages} dirty)");
foreach (var coll in s.Collections)
    Console.WriteLine($"{coll.Name}: {coll.DocumentCount} docs, {coll.IndexCount} indexes");

For replication:

// Leader metrics
leader.LeaderCurrentSeq           // how many ops have been issued
leader.GetLogicalFollowerCount()  // connected followers
// Follower metrics
follower.FollowerLastSeq          // how far caught up
follower.LogicallyReplicatedOps   // running total
follower.GapsDetected             // > 0 means lost ops: alert!

Replication lag is leader.LeaderCurrentSeq - follower.FollowerLastSeq. Healthy systems keep this near zero during normal traffic.
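If the two sequence numbers are exported to a monitoring script, the lag formula and a simple "sustained growth" check could be sketched as follows. How the values reach the script is not covered by this guide. The function names replication_lag and lag_is_growing are hypothetical.

```shell
#!/bin/sh
# Lag = leader's current sequence minus follower's last applied sequence.
replication_lag() {
  echo $(( $1 - $2 ))
}

# Succeeds (exit 0) only if every lag sample is larger than the one before it.
# Needs at least two samples, passed oldest first.
lag_is_growing() {
  [ "$#" -ge 2 ] || return 1
  prev=$1
  shift
  for sample in "$@"; do
    [ "$sample" -gt "$prev" ] || return 1
    prev=$sample
  done
  return 0
}
```

Feeding the last few lag samples to lag_is_growing separates a brief spike from a follower that keeps falling behind. The number of samples to require is a tuning choice.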

Production checklist

  • Leader and followers on a trusted network (no auth in the protocol yet).
  • Monitor replication lag — alert on sustained growth.
  • Monitor GapsDetected — should always be 0.
  • Cache size tuned to your data size.
  • Backups automated via a dedicated follower.
  • Clients know which host is the current leader.
  • Client retry logic for transient “read-only mode” errors during planned handover.
  • Disk space alerting on the data file’s volume.
  • Tested a full handover in staging before needing one in prod.
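For the client retry item, a generic retry loop with exponential backoff is one option. This sketch retries on any failure, because the guide does not specify how a "read-only mode" error is reported. The names retry_with_backoff and RETRY_BASE_DELAY are hypothetical.

```shell
#!/bin/sh
# Run a command up to max_attempts times, doubling the pause after each failure.
# RETRY_BASE_DELAY is the first pause in seconds and defaults to 1.
# Usage: retry_with_backoff 5 some_command arg1 arg2
retry_with_backoff() {
  max_attempts=$1
  shift
  delay=${RETRY_BASE_DELAY:-1}
  attempt=1
  while :; do
    if "$@"; then
      return 0                      # command succeeded
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1                      # out of attempts
    fi
    sleep "$delay"
    delay=$(( delay * 2 ))
    attempt=$(( attempt + 1 ))
  done
}
```

Because every failure is retried, the wrapped operation should be safe to repeat. A production client would more likely retry only on the specific error it recognizes as transient.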