fernDB is an experimental toy relational database written in Go, built to explore how a SQL query engine works from the parser down to the execution layer. It supports ACID transactions and includes the pieces you would expect in a real database system: SQL parsing, query planning, a cost-based optimizer, secondary indexes, join algorithms, query execution pipelines, a server, and an interactive CLI.
fernDB uses frostfire as its storage engine. frostfire is a transactional key-value engine, while fernDB focuses on the relational layer: SQL parsing, planning, optimization, execution, and mapping tables/indexes onto frostfire KV operations.
Status: fernDB is a learning and experimental project. It is useful for understanding how a query engine fits together, but it is not intended for production workloads.
- Full SQL frontend that parses, validates, normalizes, and plans statements before execution.
- Server mode with a RESP-inspired wire protocol, plus a small CLI that shows how client applications can be built on top.
- ACID transactions for individual statements and explicit
BEGIN/COMMIT/ROLLBACKworkflows. - Core SQL support for schema changes, reads, writes, indexes, table introspection, plan inspection, and statistics collection.
- Typed storage for
INT,FLOAT,TEXT,BOOL,BLOB, andNULLvalues. - Relational constraints and indexing with primary keys,
AUTOINCREMENT,NOT NULL, and unique secondary indexes. - Query support for projection, filtering, expressions, multi-column ordering, limits, and inner joins.
- Statistics-aware optimizer that uses collected table and column statistics to choose efficient access paths, join strategies, and sort plans.
fernDB requires Go 1.25 or newer.
Build the server and CLI:
go build -o ferndb-server ./cmd/server
go build -o ferndb-cli ./cmd/cliStart the server:
./ferndb-serverBy default the server listens on 127.0.0.1:5454 and stores data in
ferndb.db. These and the per-connection timeouts can be tuned via flags:
./ferndb-server -addr 127.0.0.1:5454 -db my.db \
-idle-timeout 5m -write-timeout 30s| Flag | Default | Description |
|---|---|---|
-addr |
127.0.0.1:5454 |
TCP listen address. |
-db |
ferndb.db |
Database file path. |
-idle-timeout |
5m |
Max time a client may stay idle between requests before the connection is closed. Set to 0 to disable. |
-write-timeout |
30s |
Max time allowed to write a single response. Set to 0 to disable. |
Connect with the CLI:
./ferndb-cli
./ferndb-cli -addr 127.0.0.1:5454Try a small session:
CREATE TABLE users (
id INT PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL,
score FLOAT,
active BOOL
);
INSERT INTO users (name, score, active) VALUES
('alice', 92.5, TRUE),
('bob', 71.0, FALSE),
('carol', 88.2, TRUE);
CREATE INDEX by_active_score ON users (active, score);
ANALYZE users;
SELECT id, name, score
FROM users
WHERE active = TRUE
ORDER BY score DESC
LIMIT 2;Example output:
╭────┬───────┬───────╮
│ id │ name │ score │
├────┼───────┼───────┤
│ 1 │ alice │ 92.5 │
│ 3 │ carol │ 88.2 │
╰────┴───────┴───────╯
(2 rows)
The CLI is an interactive client for connecting to a running fernDB server and executing SQL statements.
Meta commands:
\h, \help show help
\q, exit disconnect and quit
Transaction commands:
BEGIN; -- start a write transaction
BEGIN READ; -- start a read-only transaction
COMMIT; -- commit the current write transaction
ROLLBACK; -- abort the current transaction (read or write)SQL statements are terminated with ;.
Outside an explicit transaction, each write statement runs in its own
auto-committed write transaction. COMMIT is only valid for a write
transaction; ending a read transaction with ROLLBACK releases it.
fernDB has a small fixed type system:
| Type | Notes |
|---|---|
INT |
64-bit signed integer, e.g. 42 |
FLOAT |
64-bit floating point number, e.g. 92.5 |
TEXT |
UTF-8 string, e.g. 'alice' |
BOOL |
Boolean value: TRUE or FALSE |
BLOB |
Byte string, written as a hex literal such as X'48656c6c6f' |
Columns are nullable by default. Use NOT NULL to reject NULL values.
Expressions can appear in projections, WHERE clauses, JOIN ... ON
conditions, and UPDATE ... SET assignments.
- Column references:
name,users.name. - Wildcards in projections:
*,users.*. - Literals:
42,92.5,'alice',TRUE,FALSE,NULL,X'48656c6c6f'. - Arithmetic:
+,-,*,/. - Comparisons:
=,!=,<,<=,>,>=. - Boolean logic:
AND,OR,NOT. - Null checks:
IS NULL,IS NOT NULL. - Parenthesized grouping:
(score + bonus) >= 100.
Example:
SELECT id, name, score + 5
FROM users
WHERE (active = TRUE AND score >= 80) OR name = 'alice';Every table must declare exactly one PRIMARY KEY column. AUTOINCREMENT is
supported on integer primary keys. NOT NULL is supported per column.
CREATE TABLE users (
id INT PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL,
score FLOAT,
active BOOL
);
DROP TABLE users;Multi-row VALUES and explicit column lists are supported. For autoincrement
primary keys, omit the primary key column; supplying an explicit value is
rejected.
INSERT INTO users (name, score, active) VALUES
('alice', 92.5, TRUE),
('bob', 71.0, FALSE),
('carol', 88.2, TRUE);SELECT supports projection, WHERE, joins, multi-column ORDER BY, and
LIMIT.
SELECT id, name, score
FROM users
WHERE active = TRUE
ORDER BY score DESC
LIMIT 2;Joins let queries combine rows from multiple tables using JOIN ... ON
conditions. fernDB supports chained inner joins, so a query can join two or more
tables in one statement.
Use qualified column names such as users.id when a query reads from multiple
tables. The optimizer uses join predicates and statistics to choose both join
order and join algorithm.
SELECT users.name, products.name
FROM users
JOIN orders ON users.id = orders.user_id
JOIN products ON orders.product_id = products.id;SET expressions can reference existing column values.
UPDATE users
SET score = score + 5
WHERE name = 'bob';Primary key columns cannot be updated. Secondary indexes are maintained when indexed columns change.
DELETE FROM users WHERE active = FALSE;Omitting WHERE deletes every row in the table.
Indexes give the optimizer more ways to avoid full table scans and satisfy ordered queries. fernDB supports single-column and composite secondary indexes, with optional uniqueness enforcement.
CREATE INDEX by_name ON users (name);
CREATE UNIQUE INDEX by_user_product ON orders (user_id, product_id);
DROP INDEX by_name ON users;The optimizer can use indexes to speed up filters when they match the query shape:
CREATE INDEX by_active_score ON users (active, score);
SELECT id FROM users WHERE active = TRUE;Use transactions when a group of statements should commit or roll back together.
The CLI and wire protocol expose session-level BEGIN, BEGIN READ, COMMIT,
and ROLLBACK commands:
BEGIN;
INSERT INTO users (name, score, active) VALUES ('dave', 60.0, TRUE);
SELECT id, name FROM users WHERE name = 'dave';
ROLLBACK;BEGIN READ opens a read-only transaction. Read transactions reject write
statements and see a consistent snapshot of the database for as long as they
are open. Use ROLLBACK to release a read transaction:
BEGIN READ;
SELECT id, name FROM users;
SELECT id, user_id FROM orders;
ROLLBACK;Outside an explicit transaction, each write statement is committed
automatically. Embedded Go applications can use Begin() and BeginWrite() for
manual transaction control.
DESCRIBE is a quick way to inspect a table from the CLI or a client
application. It returns rows for columns, indexes, and any autoincrement
sequence:
DESCRIBE users;Example output:
╭──────────┬─────────┬────────────────────────────────────────╮
│ section │ name │ detail │
├──────────┼─────────┼────────────────────────────────────────┤
│ column │ id │ int NOT NULL PRIMARY KEY AUTOINCREMENT │
│ column │ name │ string NOT NULL │
│ column │ score │ float │
│ column │ active │ bool │
│ index │ by_name │ (name) │
│ sequence │ users │ 3 │
╰──────────┴─────────┴────────────────────────────────────────╯
(6 rows)
EXPLAIN shows the physical plan chosen for a SELECT. Use it to see whether
the optimizer picked a table scan or index seek, which join algorithm it chose,
and where sorts, limits, and projections appear in the plan.
EXPLAIN
SELECT users.name, products.name
FROM users
JOIN orders ON users.id = orders.user_id
JOIN products ON orders.product_id = products.id
WHERE users.name = 'alice'
ORDER BY products.name
LIMIT 5;Output:
╭───────────────────────────────────────────────────────────────────────────────╮
│ QUERY PLAN │
├───────────────────────────────────────────────────────────────────────────────┤
│ Project: users.name, products.name (cost=3.17 rows=2 rowSize=9) │
│ └─ Limit: 5 (cost=3.16 rows=2) │
│ └─ Sort: products.name ASC (cost=3.14 rows=2) │
│ └─ MergeJoin: orders.product_id=products.id (cost=3.12 rows=2) │
│ ├─ Sort: orders.product_id ASC (cost=2.09 rows=2) │
│ │ └─ NLJoin: (users.id = orders.user_id) (cost=2.06 rows=2) │
│ │ ├─ Filter: (users.name = "alice") (cost=1.02 rows=1) │
│ │ │ └─ SeqScan: users (cost=1.02 rows=2) │
│ │ └─ SeqScan: orders (cost=1.03 rows=3) │
│ └─ SeqScan: products (cost=1.02 rows=2) │
╰───────────────────────────────────────────────────────────────────────────────╯
(1 rows)
Plans include estimated costs and row counts. The exact operators may change as statistics change.
ANALYZE refreshes the table and column statistics used by the optimizer:
ANALYZE users;Those statistics include row counts, average row sizes, distinct-value counts, null fractions, min/max values, most-common values, and histograms.
The optimizer uses these statistics to estimate how many rows a filter or join will produce, how expensive each access path is likely to be, and which join order is cheaper. Those estimates drive the final execution plan: index seeks when a predicate is selective, merge or hash joins when they are cheaper than nested loops, and sorts only where the required ordering is not already available.
BLOB stores raw bytes. In SQL, byte values are written as hex literals using
the X'...' form:
CREATE TABLE blobs (id INT PRIMARY KEY, payload BLOB);
INSERT INTO blobs VALUES (1, X'48656c6c6f');
SELECT id, payload FROM blobs;fernDB can also be embedded directly in a Go program. This path bypasses the
server and wire protocol: your application opens a database file and executes
SQL through the ferndb package.
Install it like any Go module dependency:
go get github.com/chaitanya-uike/ferndbUse Open to create or open a database file, and Close when the application
is done with it:
package main
import (
"log"
"github.com/chaitanya-uike/ferndb"
)
func main() {
db, err := ferndb.Open("app.db")
if err != nil {
log.Fatal(err)
}
defer db.Close()
}db.Exec parses, analyses, plans, and executes one SQL statement. Read
statements run in a read transaction. Write statements run in their own
auto-committed write transaction.
if _, err := db.Exec("CREATE TABLE users (id INT PRIMARY KEY, name TEXT NOT NULL)"); err != nil {
log.Fatal(err)
}
if _, err := db.Exec("INSERT INTO users VALUES (1, 'alice'), (2, 'bob')"); err != nil {
log.Fatal(err)
}For write statements, inspect Result.Affected:
result, err := db.Exec("UPDATE users SET name = 'robert' WHERE id = 2")
if err != nil {
log.Fatal(err)
}
log.Printf("%d rows changed", result.Affected)SELECT, DESCRIBE, and EXPLAIN return column names and rows. Values are
returned as types.Value. Call Value() to get the native Go value (int64,
float64, string, bool, []byte, or nil for SQL NULL).
result, err := db.Exec("SELECT id, name FROM users ORDER BY id")
if err != nil {
log.Fatal(err)
}
for _, row := range result.Rows {
id := row[0].Value().(int64)
name := row[1].Value().(string)
log.Printf("%d: %s", id, name)
}Use BeginWrite when multiple writes should commit or roll back together:
tx := db.BeginWrite()
if _, err := tx.Exec("INSERT INTO users VALUES (3, 'carol')"); err != nil {
tx.Abort()
log.Fatal(err)
}
if _, err := tx.Exec("INSERT INTO users VALUES (4, 'dave')"); err != nil {
tx.Abort()
log.Fatal(err)
}
if err := tx.Commit(); err != nil {
log.Fatal(err)
}Use Begin for an explicit read transaction:
tx := db.Begin()
defer tx.Abort()
result, err := tx.Exec("SELECT id, name FROM users")
if err != nil {
log.Fatal(err)
}
_ = resultRead transactions reject write statements with ferndb.ErrReadOnly.
Putting the basics together:
package main
import (
"log"
"github.com/chaitanya-uike/ferndb"
)
func main() {
db, err := ferndb.Open("app.db")
if err != nil {
log.Fatal(err)
}
defer db.Close()
if _, err := db.Exec("CREATE TABLE users (id INT PRIMARY KEY, name TEXT NOT NULL)"); err != nil {
log.Fatal(err)
}
if _, err := db.Exec("INSERT INTO users VALUES (1, 'alice')"); err != nil {
log.Fatal(err)
}
result, err := db.Exec("SELECT id, name FROM users")
if err != nil {
log.Fatal(err)
}
_ = result
}fernDB speaks a small RESP-inspired protocol over TCP. Request frames are plain bytes with short type tags, decimal lengths,
and \r\n delimiters.
The bundled CLI is just one client implementation. Applications can talk to the server from any language that can open a TCP socket and read/write these frames.
Every scalar line is terminated by CRLF, the two-byte delimiter \r\n
(carriage return 0x0d, line feed 0x0a). In the examples below, \r\n is
shown escaped for readability; on the wire it must be sent as those two bytes,
not as four printable characters. For length-prefixed bulk strings, the payload
is followed by another CRLF:
$5\r\n
hello\r\n
That frame means: bulk string, 5 payload bytes, payload hello, then CRLF.
The length counts only payload bytes; it does not include either CRLF.
Requests are arrays whose first element is a bulk-string command and whose
remaining elements are bulk-string arguments. This request runs
SELECT * FROM users:
*2\r\n
$4\r\n
EXEC\r\n
$19\r\n
SELECT * FROM users\r\n
As a single byte stream, the same request is:
*2\r\n$4\r\nEXEC\r\n$19\r\nSELECT * FROM users\r\n
The supported commands are:
| Command | Arguments | Description |
|---|---|---|
PING |
none | Health check; returns PONG |
EXEC |
SQL string | Executes one SQL statement |
BEGIN |
none | Starts a session write transaction |
BEGIN_READ |
none | Starts a session read-only transaction |
COMMIT |
none | Commits the current write transaction |
ROLLBACK |
none | Aborts the current transaction (read or write) |
Successful EXEC responses are a three-element array:
[columns, rows, affected]
columnsis an array of column-name bulk strings.rowsis an array of row arrays.affectedis an integer count for write statements.
Values use RESP-style tags: simple strings (+), errors (-), integers (:),
doubles (,), booleans (#t / #f), bulk strings ($), arrays (*), and
nulls (_). Errors are returned as error frames whose text begins with ERR.
Frame formats:
| Type | Format | Example |
|---|---|---|
| Simple string | +<text>\r\n |
+PONG\r\n |
| Error | -<text>\r\n |
-ERR table not found\r\n |
| Integer | :<digits>\r\n |
:3\r\n |
| Double | ,<digits>\r\n |
,92.5\r\n |
| Boolean | #t\r\n or #f\r\n |
#t\r\n |
| Null | _\r\n |
_\r\n |
| Bulk string | $<length>\r\n<payload>\r\n |
$5\r\nalice\r\n |
| Null bulk string | $-1\r\n |
$-1\r\n |
| Array | *<count>\r\n<frame>... |
*2\r\n$4\r\nEXEC\r\n$8\r\nSELECT 1\r\n |
| Null array | *-1\r\n |
*-1\r\n |
Arrays contain exactly <count> nested frames. Bulk strings can contain binary
payloads, so clients should read exactly <length> bytes before consuming the
trailing CRLF. The current server rejects bulk strings and array lengths larger
than 16 MiB.
For a SELECT id, name FROM users result, an EXEC response may look like:
*3\r\n
*2\r\n
$2\r\n
id\r\n
$4\r\n
name\r\n
*1\r\n
*2\r\n
:1\r\n
$5\r\n
alice\r\n
:0\r\n
Decoded, that is:
columns = ["id", "name"]
rows = [[1, "alice"]]
affected = 0
For writes, columns and rows are empty arrays and affected reports the
number of changed rows. For BEGIN, BEGIN_READ, COMMIT, ROLLBACK, and
PING, successful responses are simple strings such as +OK\r\n or
+PONG\r\n.
For example, a client library can send EXEC with SQL, decode the result array,
and expose it as native rows for its host language. The Go CLI under
cmd/cli is intentionally small so it can be used as a reference when building
clients in any TCP-capable language.
The sections below go deeper into fernDB's internals: transaction behavior, query planning, and the package layout of the engine.
fernDB supports ACID transactions across SQL statements and explicit
BEGIN / COMMIT blocks. A committed transaction is applied as a unit; a
rolled-back or failed transaction leaves no partial changes behind.
Under the hood, fernDB maps SQL read and write transactions onto frostfire transactions. Tables, indexes, catalog entries, sequences, and statistics are all updated through the same transactional storage layer, so commits and rollbacks apply to the whole database state fernDB manages.
Outside an explicit BEGIN / COMMIT block, fernDB wraps each write statement
in its own write transaction and commits it automatically. Read-only statements
(SELECT, DESCRIBE, EXPLAIN) run in a read transaction. Inside an explicit
transaction, COMMIT makes all changes visible together and ROLLBACK discards
them.
frostfire — and therefore fernDB — uses a single-writer, multi-reader concurrency model:
- Any number of read transactions can be open at the same time, and they run concurrently with each other and with the active writer.
- Only one write transaction can be active at a time. A second call to
BeginWritefrom another goroutine or connection blocks until the in-flight writer commits or aborts. Within a single session, issuingBEGINwhile a transaction is already open returns an error instead of opening a nested transaction.
Readers never block writers and writers never block readers. The trade-off is that write throughput is serial: concurrent write workloads queue at the storage engine instead of interleaving.
Readers see a stable snapshot of the database without taking locks. frostfire provides this with MVCC over a copy-on-write B+tree: a read transaction captures the current version of the database when it begins, and writers produce a new version without disturbing the pages older readers are still using.
The practical effect for SQL is snapshot isolation: a long-running SELECT
sees a consistent view of every table, index, and catalog entry as of the
moment it began, even if another connection inserts, updates, or drops tables
in parallel. There are no dirty reads, non-repeatable reads, or phantom reads
within a single read transaction.
A write transaction's changes are buffered in memory until Commit, which
flushes them to disk and fsyncs the file. If the process or machine crashes
mid-commit, the database reopens at the last successfully committed state.
Because every commit pays a fixed cost (page writes and an fsync),
batching related writes into one transaction is the main lever for write
throughput. The CLI and embedded API both expose this directly:
BEGIN;
INSERT INTO users (name, score, active) VALUES ('a', 1.0, TRUE);
INSERT INTO users (name, score, active) VALUES ('b', 2.0, TRUE);
INSERT INTO users (name, score, active) VALUES ('c', 3.0, TRUE);
COMMIT;The three inserts above pay one commit cost, not three. The same applies to
bulk loaders driving fernDB from a Go program — group logically related writes
under a single BeginWrite / Commit.
A few habits follow from the model above:
- Keep write transactions short. While a write transaction is open,
other writers wait. Avoid network calls, or unrelated CPU work
between
BeginWriteandCommit. - Abort on early return. In embedded Go code, always
Aborta write transaction on an error path before returning, otherwise the next writer will block forever. - One transaction per goroutine. A transaction handle is not safe to share across goroutines.
- Long-lived readers delay page reuse. Pages freed by a writer cannot be recycled until every reader that could still see them finishes. Long-running read transactions are cheap for concurrency but can grow the on-disk file if they overlap with heavy write activity. Close them when you are done.
Query planning is the part of fernDB that decides how a SQL statement should run. The parser understands the text, the planner builds and optimizes a plan, and the executor runs the selected plan.
fernDB keeps those responsibilities separate so each stage can work at the right level of abstraction. Parsing produces an abstract syntax tree (AST), analysis turns it into a typed logical plan, optimization searches through equivalent logical plans and physical implementations, and execution runs the selected physical plan as a pipeline of operators.
SQL text
-> lexer / parser
-> analyser
-> logical plan
-> normaliser
-> cost-based optimizer
-> physical plan
-> pipeline operators
The lexer and parser turn SQL into AST nodes for statements and expressions. The analyser then resolves table and column names against the catalog, expands wildcards, checks expression types, validates statement-specific rules, and assigns attributes that later stages can refer to unambiguously.
This is where invalid queries fail early: unknown columns, ambiguous names,
wrong assignment types, non-boolean WHERE clauses, invalid ORDER BY
expressions, bad LIMIT values, and invalid DDL are rejected before planning.
A logical plan describes the relational work required to answer a query without choosing concrete algorithms yet. It is about what needs to happen: read this table, keep these columns, apply this predicate, join these relations, return rows in this order.
After analysis, a SELECT is represented as logical operators such as:
Scanfor reading a table.Filterfor applying predicates.Projectfor choosing output attributes.Joinfor combining relations.- physical ordering requirements for
ORDER BY. Limitfor row limits.
Logical plans do not say whether a table should be read with a sequential scan or an index, or whether a join should use nested loops, hashing, or sorted inputs. They leave those choices open so the optimizer can compare alternatives.
For a query like:
SELECT users.name
FROM users
JOIN orders ON users.id = orders.user_id
WHERE users.active = TRUE;a simplified logical shape is:
Project(users.name)
└─ Filter(users.active = TRUE)
└─ Join(users.id = orders.user_id)
├─ Scan(users)
└─ Scan(orders)
That tree says what relational operations are needed, but it does not yet choose an index, join algorithm, or sort strategy.
The normalizer rewrites expressions inside the logical plan into forms that are
easier for the optimizer to reason about. It applies boolean simplifications,
constant folding, De Morgan rewrites, comparison negation, flattening of nested
AND / OR expressions, and CNF-style rewrites where useful.
For example, predicates are flattened into conjuncts so later stages can push filters down, extract index bounds, and separate join predicates from residual filters.
fernDB uses a Volcano-style cost-based optimizer. Instead of committing to the first valid plan it can build, the optimizer explores a search space of equivalent relational expressions and chooses the cheapest physical plan it can find.
The optimizer has two main phases: an exploration/search phase and an implementation/optimization phase.
During the exploration/search phase, the optimizer applies transformation
rules to generate equivalent logical expressions. These alternatives all compute
the same result, but they expose different shapes for the later optimization
phase. For example, one expression may join users before orders, while
another may join orders first.
Transformation rules define the logical rewrites the optimizer is allowed to apply, such as:
- Join commutativity and associativity, which let the optimizer explore join order.
- Predicate pushdown through joins, which moves filters closer to the tables they reference.
- Projection pushdown, which avoids carrying columns that later operators do not need.
- Filter merging, which combines stacked filters into a simpler predicate.
Equivalent expressions are grouped by the logical result they produce. Each group can contain several ways to compute the same result, which avoids treating every rewrite as a separate tree and lets the optimizer share common subproblems.
During the implementation/optimization phase, implementation rules turn logical expressions into candidate physical plans. The optimizer recursively searches for the cheapest physical plan that satisfies the required physical properties, such as output ordering. Candidate plans are costed and compared until the cheapest valid plan wins.
A physical plan is a concrete implementation of a logical plan. It contains the actual operators fernDB will execute, with costs, estimated row counts, and any physical properties the operator provides, such as output ordering.
One logical plan can have many possible physical plans. A logical Join might
be implemented as NLJoin, HashJoin, or MergeJoin. A logical table read
might become a SeqScan, an IndexSeek, or an IndexSeek followed by an
IndexLookup if the index does not contain every column the query needs.
For the logical shape above, one possible physical plan might be:
Project(users.name)
└─ HashJoin(users.id = orders.user_id)
├─ Filter(users.active = TRUE)
│ └─ SeqScan(users)
└─ SeqScan(orders)
With a useful index on users(active), another candidate might use IndexSeek
for the users side. The optimizer costs those candidates and keeps the
cheapest valid one.
Physical implementation rules produce operators such as:
SeqScanfor reading a table directly.IndexSeekfor using a secondary index with equality or range bounds.IndexLookupwhen an index identifies rows but the query still needs columns from the base table.Filter,Project,Limit, andSort.NLJoin(nested-loop join),HashJoin, andMergeJoin.
The optimizer compares all valid candidates using the cost model. Join
predicates determine which join algorithms are legal; collected statistics
estimate how many rows each operator will produce; required ordering determines
whether a plan already delivers rows in the requested order or needs an
explicit Sort.
ANALYZE stores table and column statistics used during planning. The cost model
combines those estimates with CPU, scan, join, sort, hash, memory, row-size, and
page-size costs.
Those numbers are not meant to be perfect. They are good enough to let fernDB make meaningful choices between alternatives, such as:
- A full table scan versus an index seek.
- One join order versus another.
- Nested-loop, hash, or merge join.
- An index-only plan versus an index lookup back into the table.
- Reusing an existing order versus adding an explicit sort.
When real statistics are not available — a table that has never been
ANALYZEd, or a predicate that the available histograms cannot help with —
the optimizer falls back to built-in defaults: an assumed row count and row
size for the table, plus textbook selectivity guesses for filters (roughly
1% for equality, 33% for range comparisons, 5% for IS NULL). Plans are
still picked, but the estimates are coarse. Running ANALYZE after loading
data is the easiest way to replace these defaults with numbers that reflect
the actual data.
Some queries require output properties beyond just rows. ORDER BY requires a
particular ordering. Merge joins require both inputs to be ordered by the join
keys. The optimizer tracks delivered and required physical properties while
building plans.
If a child plan already satisfies a required order, fernDB can reuse it. If not,
the physical planner inserts an enforcing Sort. This is why EXPLAIN is useful
for understanding not just which access path was chosen, but also where the
engine had to add operators to satisfy the query.
The final physical plan is converted into pipeline operators. Each operator implements the executor interface and produces rows for its parent operator: scans read from frostfire buckets, filters evaluate predicates, joins combine rows, sorts materialize ordered runs when needed, and projections shape the final result.
EXPLAIN prints the selected physical plan, including estimated cost, estimated
rows, and the operator tree. It is the best way to inspect the optimizer's
choices for a specific query.