.NET SDK
The embedded surface: DocumentForgeDb, the typed LINQ collection API, and the
interactive REPL. DocumentForge is a .NET 9 library with zero external
dependencies.
Install
Add it to your solution by cloning the repository and referencing the
DocumentForge.Engine project, or install the NuGet package once published.
git clone https://github.com/aerotoysio/documentforge.git
cd documentforge
dotnet build
# Add a reference from your project
dotnet add reference path/to/src/DocumentForge.Engine
Requirements
- .NET 9 SDK or later.
- Windows, macOS, or Linux.
- Write access to the directory where your .dfdb file lives.
Opening a database
A DocumentForge database is a single file on disk. Opening one is a single method call.
using DocumentForge.Engine;
using var db = DocumentForgeDb.OpenOrCreate("airline.dfdb");
If airline.dfdb exists it is opened; if not, it is created. OpenOrCreate is
the usual entry point; use Create when you want to require a fresh file. The
using declaration ensures the file is flushed and closed on dispose.
Files created alongside your database: airline.dfdb (data),
airline.dfdb.wal (write-ahead log), and airline.dfdb.recovery (crash
recovery log). Keep them together when moving or backing up.
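One way to keep the three files together during a backup is a small copy helper. The sketch below uses only System.IO; the DfdbBackup class and CopyDatabaseFiles method are hypothetical names, not part of DocumentForge. It assumes the database is closed while the copy runs, and it skips any companion file that is absent, since this page does not say whether the .wal and .recovery files always exist.

```csharp
using System;
using System.IO;

public static class DfdbBackup
{
    // Suffixes of the data file and the two companion files named above.
    private static readonly string[] Suffixes = { "", ".wal", ".recovery" };

    // Copies <dbPath>, <dbPath>.wal and <dbPath>.recovery into targetDir
    // and returns how many files were copied.
    // Assumption: no process has the database open while this runs.
    public static int CopyDatabaseFiles(string dbPath, string targetDir)
    {
        Directory.CreateDirectory(targetDir);
        int copied = 0;
        foreach (var suffix in Suffixes)
        {
            string source = dbPath + suffix;
            if (!File.Exists(source)) continue; // skip companion files that are absent
            string dest = Path.Combine(targetDir, Path.GetFileName(source));
            File.Copy(source, dest, overwrite: true);
            copied++;
        }
        return copied;
    }
}
```

Calling DfdbBackup.CopyDatabaseFiles("airline.dfdb", "backup") after the db object has been disposed copies whichever of the three files are present.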
Insert documents
Documents are JSON. You can pass a JSON string, a BsonDocument, or an
anonymous C# object (via BsonDocument.FromJson(JsonSerializer.Serialize(obj))).
// JSON string (simplest)
db.Insert("orders", @"{
    ""pnr"": ""ABC123"",
    ""status"": ""CONFIRMED"",
    ""passenger"": {
        ""firstName"": ""John"",
        ""lastName"": ""Smith""
    },
    ""flights"": [
        { ""flightNumber"": ""AA100"", ""departureAirport"": ""JFK"" }
    ]
}");
Collections are created automatically on first insert. The document’s _id
field is auto-generated (sequential, time-ordered) if you do not supply one.
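The anonymous-object route mentioned above can be sketched with System.Text.Json alone. The OrderJson helper below is a hypothetical name, not part of DocumentForge. It declares the property names in camelCase because JsonSerializer emits names exactly as declared by default, which keeps them consistent with the hand-written JSON example.

```csharp
using System.Text.Json;

public static class OrderJson
{
    // Builds the JSON text for one order from an anonymous object.
    public static string Build(string pnr, string firstName, string lastName)
    {
        // Names are camelCase on purpose: with default options,
        // System.Text.Json writes property names as declared.
        var order = new
        {
            pnr,
            status = "CONFIRMED",
            passenger = new { firstName, lastName }
        };
        return JsonSerializer.Serialize(order);
    }
}
```

The returned string can be passed straight to db.Insert("orders", json), or wrapped with BsonDocument.FromJson(json) first when a BsonDocument is wanted.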
Bulk insert
For loading large datasets, BulkInsert acquires the write lock once and defers
index updates until the batch completes. Expect 50,000–70,000 docs/sec on a
modern laptop.
var batch = new List<BsonDocument>();
for (int i = 0; i < 10_000; i++)
{
    batch.Add(BsonDocument.FromJson($@"{{ ""pnr"": ""ORD{i:D6}"" }}"));
}
db.BulkInsert("orders", batch);
Query
SQL-like queries hit a single Execute method. Dot notation navigates nested
objects; bracket notation indexes arrays.
// Point lookup
var r1 = db.Execute("SELECT * FROM orders WHERE pnr = 'ABC123'");
// Nested field
var r2 = db.Execute("SELECT * FROM orders WHERE passenger.lastName = 'Smith'");
// Array element
var r3 = db.Execute("SELECT * FROM orders WHERE flights[0].departureAirport = 'JFK'");
// Range, ordering, limit
var r4 = db.Execute(@"
    SELECT pnr, passenger.lastName
    FROM orders
    WHERE flights[0].fareAmount > 500
    ORDER BY flights[0].fareAmount DESC
    LIMIT 10
");
// Iterate results
foreach (var doc in r4.Documents)
    Console.WriteLine(doc.ToJson());
Every query returns a QueryResult containing the documents, the plan used
(INDEX_SCAN, COLLECTION_SCAN, etc.), and the execution time in milliseconds.
Typed LINQ surface
Work with strongly-typed objects via db.Collection<T>(...). The LINQ surface
serialises with camelCase JSON, so typed inserts and typed Where clauses agree
on field names.
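The Order type used with db.Collection<Order> is not shown on this page. The sketch below gives one plausible shape for it and illustrates the camelCase mapping with System.Text.Json. The Passenger and Order classes and the OrderMapping helper are hypothetical, and the assumption is that the typed surface applies a naming policy equivalent to JsonNamingPolicy.CamelCase.

```csharp
using System.Text.Json;

// Hypothetical typed model for the "orders" collection.
public sealed class Passenger
{
    public string FirstName { get; set; } = "";
    public string LastName { get; set; } = "";
}

public sealed class Order
{
    public string Pnr { get; set; } = "";
    public string Status { get; set; } = "";
    public Passenger Passenger { get; set; } = new();
}

public static class OrderMapping
{
    // Assumption: a camelCase policy like this one mirrors the typed surface.
    public static readonly JsonSerializerOptions CamelCase = new()
    {
        PropertyNamingPolicy = JsonNamingPolicy.CamelCase
    };

    // Pnr is written as "pnr", Passenger.LastName as "passenger.lastName".
    public static string ToJson(Order order) =>
        JsonSerializer.Serialize(order, CamelCase);

    // Reads camelCase JSON back into the PascalCase properties.
    public static Order? FromJson(string json) =>
        JsonSerializer.Deserialize<Order>(json, CamelCase);
}
```

Under that assumption, a C# property such as o.Pnr lines up with the pnr field used in the SQL examples, so documents inserted as raw JSON and documents inserted as typed objects stay queryable through either surface.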
var order = db.Collection<Order>("orders")
    .Where(o => o.Pnr == "ABC123")
    .FirstOrDefault();
Create indexes
Indexes are the difference between a 1 ms lookup and a 1-second scan. Create one for any JSON path you query frequently.
// Single-field index
db.CreateIndex("orders", "pnr", "idx_pnr", unique: true);
// Nested path
db.CreateIndex("orders", "passenger.lastName", "idx_lastname");
// Composite — good for multi-field WHERE clauses
db.Execute("CREATE INDEX idx_status_date ON orders (status, createdAt)");
Indexes are persistent: they survive a database restart without being rebuilt. They are also maintained incrementally on every insert, update, and delete, so queries always see fresh results.
Interactive REPL
Included in the repo is an interactive console for experimenting with SQL queries against your database.
dfdb repl ./data/data.dfdb
dfdb> SELECT * FROM orders WHERE pnr = 'ABC123'
dfdb> SELECT status, COUNT(*) FROM orders GROUP BY status
dfdb> stats
REST API
For testing from Postman or wiring other services in, use dfdb serve:
dfdb serve --port 5000 --data-dir ./data
# Then from another terminal
curl -X POST http://localhost:5000/query \
  -H "Content-Type: application/json" \
  -d '{"sql": "SELECT * FROM orders LIMIT 5"}'
The same endpoints power the admin UI, so pointing it at any dfdb serve
instance just works. See the REST reference for the full
endpoint list.
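The same request can be issued from C# with HttpClient. The sketch below reuses only the /query path and the {"sql": ...} body from the curl example. The DfdbRestClient class is a hypothetical name, no authentication header is sent, and the response is returned as raw text because its schema is not described on this page.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

public static class DfdbRestClient
{
    // Builds POST {baseUrl}/query with a {"sql": "..."} JSON body.
    public static HttpRequestMessage BuildQueryRequest(string baseUrl, string sql)
    {
        string body = JsonSerializer.Serialize(new { sql });
        return new HttpRequestMessage(HttpMethod.Post, baseUrl.TrimEnd('/') + "/query")
        {
            Content = new StringContent(body, Encoding.UTF8, "application/json")
        };
    }

    // Sends the query and returns the raw response body.
    // Throws HttpRequestException on a non-success status code.
    public static async Task<string> QueryAsync(HttpClient http, string baseUrl, string sql)
    {
        using var request = BuildQueryRequest(baseUrl, sql);
        using var response = await http.SendAsync(request);
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync();
    }
}
```

With dfdb serve running on port 5000, awaiting DfdbRestClient.QueryAsync(new HttpClient(), "http://localhost:5000", "SELECT * FROM orders LIMIT 5") would send the same request as the curl command.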
Next steps
- SQL reference — all the query features supported.
- Data modeling — when to embed, when to reference.
- Replication — read scaling and zero-downtime handover.
- Deployment — running across multiple machines and datacenters.
- Security — API keys, replication secrets, TLS.