Skip to content

chaitanya-uike/ferndb

Repository files navigation

fernDB logo

fernDB

fernDB is an experimental toy relational database written in Go, built to explore how a SQL query engine works from the parser down to the execution layer. It supports ACID transactions and includes the pieces you would expect in a real database system: SQL parsing, query planning, a cost-based optimizer, secondary indexes, join algorithms, query execution pipelines, a server, and an interactive CLI.

fernDB uses frostfire as its storage engine. frostfire is a transactional key-value engine, while fernDB focuses on the relational layer: SQL parsing, planning, optimization, execution, and mapping tables/indexes onto frostfire KV operations.

Status: fernDB is a learning and experimental project. It is useful for understanding how a query engine fits together, but it is not intended for production workloads.

Contents

Features

  • Full SQL frontend that parses, validates, normalizes, and plans statements before execution.
  • Server mode with a RESP-inspired wire protocol, plus a small CLI that shows how client applications can be built on top.
  • ACID transactions for individual statements and explicit BEGIN / COMMIT / ROLLBACK workflows.
  • Core SQL support for schema changes, reads, writes, indexes, table introspection, plan inspection, and statistics collection.
  • Typed storage for INT, FLOAT, TEXT, BOOL, BLOB, and NULL values.
  • Relational constraints and indexing with primary keys, AUTOINCREMENT, NOT NULL, and unique secondary indexes.
  • Query support for projection, filtering, expressions, multi-column ordering, limits, and inner joins.
  • Statistics-aware optimizer that uses collected table and column statistics to choose efficient access paths, join strategies, and sort plans.

Quick start

fernDB requires Go 1.25 or newer.

Build the server and CLI:

go build -o ferndb-server ./cmd/server
go build -o ferndb-cli ./cmd/cli

Start the server:

./ferndb-server

By default the server listens on 127.0.0.1:5454 and stores data in ferndb.db. These and the per-connection timeouts can be tuned via flags:

./ferndb-server -addr 127.0.0.1:5454 -db my.db \
    -idle-timeout 5m -write-timeout 30s
Flag Default Description
-addr 127.0.0.1:5454 TCP listen address.
-db ferndb.db Database file path.
-idle-timeout 5m Max time a client may stay idle between requests before the connection is closed. Set to 0 to disable.
-write-timeout 30s Max time allowed to write a single response. Set to 0 to disable.

Connect with the CLI:

./ferndb-cli
./ferndb-cli -addr 127.0.0.1:5454

Try a small session:

CREATE TABLE users (
    id     INT PRIMARY KEY AUTOINCREMENT,
    name   TEXT NOT NULL,
    score  FLOAT,
    active BOOL
);

INSERT INTO users (name, score, active) VALUES
    ('alice', 92.5, TRUE),
    ('bob', 71.0, FALSE),
    ('carol', 88.2, TRUE);

CREATE INDEX by_active_score ON users (active, score);
ANALYZE users;

SELECT id, name, score
FROM users
WHERE active = TRUE
ORDER BY score DESC
LIMIT 2;

Example output:

╭────┬───────┬───────╮
│ id │ name  │ score │
├────┼───────┼───────┤
│ 1  │ alice │ 92.5  │
│ 3  │ carol │ 88.2  │
╰────┴───────┴───────╯
(2 rows)

CLI

The CLI is an interactive client for connecting to a running fernDB server and executing SQL statements.

Meta commands:

\h, \help     show help
\q, exit      disconnect and quit

Transaction commands:

BEGIN;        -- start a write transaction
BEGIN READ;   -- start a read-only transaction
COMMIT;       -- commit the current write transaction
ROLLBACK;     -- abort the current transaction (read or write)

SQL statements are terminated with ;.

Outside an explicit transaction, each write statement runs in its own auto-committed write transaction. COMMIT is only valid for a write transaction; ending a read transaction with ROLLBACK releases it.

SQL reference

Data types

fernDB has a small fixed type system:

Type Notes
INT 64-bit signed integer, e.g. 42
FLOAT 64-bit floating point number, e.g. 92.5
TEXT UTF-8 string, e.g. 'alice'
BOOL Boolean value: TRUE or FALSE
BLOB Byte string, written as a hex literal such as X'48656c6c6f'

Columns are nullable by default. Use NOT NULL to reject NULL values.

Expressions

Expressions can appear in projections, WHERE clauses, JOIN ... ON conditions, and UPDATE ... SET assignments.

  • Column references: name, users.name.
  • Wildcards in projections: *, users.*.
  • Literals: 42, 92.5, 'alice', TRUE, FALSE, NULL, X'48656c6c6f'.
  • Arithmetic: +, -, *, /.
  • Comparisons: =, !=, <, <=, >, >=.
  • Boolean logic: AND, OR, NOT.
  • Null checks: IS NULL, IS NOT NULL.
  • Parenthesized grouping: (score + bonus) >= 100.

Example:

SELECT id, name, score + 5
FROM users
WHERE (active = TRUE AND score >= 80) OR name = 'alice';

CREATE TABLE and DROP TABLE

Every table must declare exactly one PRIMARY KEY column. AUTOINCREMENT is supported on integer primary keys. NOT NULL is supported per column.

CREATE TABLE users (
    id     INT PRIMARY KEY AUTOINCREMENT,
    name   TEXT NOT NULL,
    score  FLOAT,
    active BOOL
);

DROP TABLE users;

INSERT

Multi-row VALUES and explicit column lists are supported. For autoincrement primary keys, omit the primary key column; supplying an explicit value is rejected.

INSERT INTO users (name, score, active) VALUES
    ('alice', 92.5, TRUE),
    ('bob', 71.0, FALSE),
    ('carol', 88.2, TRUE);

SELECT

SELECT supports projection, WHERE, joins, multi-column ORDER BY, and LIMIT.

SELECT id, name, score
FROM users
WHERE active = TRUE
ORDER BY score DESC
LIMIT 2;

JOIN

Joins let queries combine rows from multiple tables using JOIN ... ON conditions. fernDB supports chained inner joins, so a query can join two or more tables in one statement.

Use qualified column names such as users.id when a query reads from multiple tables. The optimizer uses join predicates and statistics to choose both join order and join algorithm.

SELECT users.name, products.name
FROM users
JOIN orders ON users.id = orders.user_id
JOIN products ON orders.product_id = products.id;

UPDATE

SET expressions can reference existing column values.

UPDATE users
SET score = score + 5
WHERE name = 'bob';

Primary key columns cannot be updated. Secondary indexes are maintained when indexed columns change.

DELETE

DELETE FROM users WHERE active = FALSE;

Omitting WHERE deletes every row in the table.

Indexes

Indexes give the optimizer more ways to avoid full table scans and satisfy ordered queries. fernDB supports single-column and composite secondary indexes, with optional uniqueness enforcement.

CREATE INDEX by_name ON users (name);
CREATE UNIQUE INDEX by_user_product ON orders (user_id, product_id);
DROP INDEX by_name ON users;

The optimizer can use indexes to speed up filters when they match the query shape:

CREATE INDEX by_active_score ON users (active, score);

SELECT id FROM users WHERE active = TRUE;

Transactions

Use transactions when a group of statements should commit or roll back together. The CLI and wire protocol expose session-level BEGIN, BEGIN READ, COMMIT, and ROLLBACK commands:

BEGIN;
INSERT INTO users (name, score, active) VALUES ('dave', 60.0, TRUE);
SELECT id, name FROM users WHERE name = 'dave';
ROLLBACK;

BEGIN READ opens a read-only transaction. Read transactions reject write statements and see a consistent snapshot of the database for as long as they are open. Use ROLLBACK to release a read transaction:

BEGIN READ;
SELECT id, name FROM users;
SELECT id, user_id FROM orders;
ROLLBACK;

Outside an explicit transaction, each write statement is committed automatically. Embedded Go applications can use Begin() and BeginWrite() for manual transaction control.

DESCRIBE

DESCRIBE is a quick way to inspect a table from the CLI or a client application. It returns rows for columns, indexes, and any autoincrement sequence:

DESCRIBE users;

Example output:

╭──────────┬─────────┬────────────────────────────────────────╮
│ section  │ name    │ detail                                 │
├──────────┼─────────┼────────────────────────────────────────┤
│ column   │ id      │ int NOT NULL PRIMARY KEY AUTOINCREMENT │
│ column   │ name    │ string NOT NULL                        │
│ column   │ score   │ float                                  │
│ column   │ active  │ bool                                   │
│ index    │ by_name │ (name)                                 │
│ sequence │ users   │ 3                                      │
╰──────────┴─────────┴────────────────────────────────────────╯
(6 rows)

EXPLAIN

EXPLAIN shows the physical plan chosen for a SELECT. Use it to see whether the optimizer picked a table scan or index seek, which join algorithm it chose, and where sorts, limits, and projections appear in the plan.

EXPLAIN
SELECT users.name, products.name
FROM users
JOIN orders ON users.id = orders.user_id
JOIN products ON orders.product_id = products.id
WHERE users.name = 'alice'
ORDER BY products.name
LIMIT 5;

Output:

╭───────────────────────────────────────────────────────────────────────────────╮
│ QUERY PLAN                                                                    │
├───────────────────────────────────────────────────────────────────────────────┤
│ Project: users.name, products.name  (cost=3.17 rows=2 rowSize=9)              │
│ └─ Limit: 5  (cost=3.16 rows=2)                                               │
│    └─ Sort: products.name ASC  (cost=3.14 rows=2)                             │
│       └─ MergeJoin: orders.product_id=products.id  (cost=3.12 rows=2)         │
│          ├─ Sort: orders.product_id ASC  (cost=2.09 rows=2)                   │
│          │  └─ NLJoin: (users.id = orders.user_id)  (cost=2.06 rows=2)        │
│          │     ├─ Filter: (users.name = "alice")  (cost=1.02 rows=1)          │
│          │     │  └─ SeqScan: users  (cost=1.02 rows=2)                       │
│          │     └─ SeqScan: orders  (cost=1.03 rows=3)                         │
│          └─ SeqScan: products  (cost=1.02 rows=2)                             │
╰───────────────────────────────────────────────────────────────────────────────╯
(1 rows)

Plans include estimated costs and row counts. The exact operators may change as statistics change.

ANALYZE

ANALYZE refreshes the table and column statistics used by the optimizer:

ANALYZE users;

Those statistics include row counts, average row sizes, distinct-value counts, null fractions, min/max values, most-common values, and histograms.

The optimizer uses these statistics to estimate how many rows a filter or join will produce, how expensive each access path is likely to be, and which join order is cheaper. Those estimates drive the final execution plan: index seeks when a predicate is selective, merge or hash joins when they are cheaper than nested loops, and sorts only where the required ordering is not already available.

BLOB

BLOB stores raw bytes. In SQL, byte values are written as hex literals using the X'...' form:

CREATE TABLE blobs (id INT PRIMARY KEY, payload BLOB);
INSERT INTO blobs VALUES (1, X'48656c6c6f');
SELECT id, payload FROM blobs;

Using fernDB as a Go package

fernDB can also be embedded directly in a Go program. This path bypasses the server and wire protocol: your application opens a database file and executes SQL through the ferndb package.

Install it like any Go module dependency:

go get github.com/chaitanya-uike/ferndb

Open a database

Use Open to create or open a database file, and Close when the application is done with it:

package main

import (
    "log"

    "github.com/chaitanya-uike/ferndb"
)

func main() {
    db, err := ferndb.Open("app.db")
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()
}

Execute SQL

db.Exec parses, analyses, plans, and executes one SQL statement. Read statements run in a read transaction. Write statements run in their own auto-committed write transaction.

if _, err := db.Exec("CREATE TABLE users (id INT PRIMARY KEY, name TEXT NOT NULL)"); err != nil {
    log.Fatal(err)
}

if _, err := db.Exec("INSERT INTO users VALUES (1, 'alice'), (2, 'bob')"); err != nil {
    log.Fatal(err)
}

For write statements, inspect Result.Affected:

result, err := db.Exec("UPDATE users SET name = 'robert' WHERE id = 2")
if err != nil {
    log.Fatal(err)
}
log.Printf("%d rows changed", result.Affected)

Read results

SELECT, DESCRIBE, and EXPLAIN return column names and rows. Values are returned as types.Value. Call Value() to get the native Go value (int64, float64, string, bool, []byte, or nil for SQL NULL).

result, err := db.Exec("SELECT id, name FROM users ORDER BY id")
if err != nil {
    log.Fatal(err)
}

for _, row := range result.Rows {
    id := row[0].Value().(int64)
    name := row[1].Value().(string)
    log.Printf("%d: %s", id, name)
}

Use explicit transactions

Use BeginWrite when multiple writes should commit or roll back together:

tx := db.BeginWrite()

if _, err := tx.Exec("INSERT INTO users VALUES (3, 'carol')"); err != nil {
    tx.Abort()
    log.Fatal(err)
}
if _, err := tx.Exec("INSERT INTO users VALUES (4, 'dave')"); err != nil {
    tx.Abort()
    log.Fatal(err)
}

if err := tx.Commit(); err != nil {
    log.Fatal(err)
}

Use Begin for an explicit read transaction:

tx := db.Begin()
defer tx.Abort()

result, err := tx.Exec("SELECT id, name FROM users")
if err != nil {
    log.Fatal(err)
}
_ = result

Read transactions reject write statements with ferndb.ErrReadOnly.

Minimal example

Putting the basics together:

package main

import (
    "log"

    "github.com/chaitanya-uike/ferndb"
)

func main() {
    db, err := ferndb.Open("app.db")
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    if _, err := db.Exec("CREATE TABLE users (id INT PRIMARY KEY, name TEXT NOT NULL)"); err != nil {
        log.Fatal(err)
    }
    if _, err := db.Exec("INSERT INTO users VALUES (1, 'alice')"); err != nil {
        log.Fatal(err)
    }

    result, err := db.Exec("SELECT id, name FROM users")
    if err != nil {
        log.Fatal(err)
    }
    _ = result
}

Wire protocol

fernDB speaks a small RESP-inspired protocol over TCP. Request frames are plain bytes with short type tags, decimal lengths, and \r\n delimiters.

The bundled CLI is just one client implementation. Applications can talk to the server from any language that can open a TCP socket and read/write these frames.

Every scalar line is terminated by CRLF, the two-byte delimiter \r\n (carriage return 0x0d, line feed 0x0a). In the examples below, \r\n is shown escaped for readability; on the wire it must be sent as those two bytes, not as four printable characters. For length-prefixed bulk strings, the payload is followed by another CRLF:

$5\r\n
hello\r\n

That frame means: bulk string, 5 payload bytes, payload hello, then CRLF. The length counts only payload bytes; it does not include either CRLF.

Requests are arrays whose first element is a bulk-string command and whose remaining elements are bulk-string arguments. This request runs SELECT * FROM users:

*2\r\n
$4\r\n
EXEC\r\n
$19\r\n
SELECT * FROM users\r\n

As a single byte stream, the same request is:

*2\r\n$4\r\nEXEC\r\n$19\r\nSELECT * FROM users\r\n

The supported commands are:

Command Arguments Description
PING none Health check; returns PONG
EXEC SQL string Executes one SQL statement
BEGIN none Starts a session write transaction
BEGIN_READ none Starts a session read-only transaction
COMMIT none Commits the current write transaction
ROLLBACK none Aborts the current transaction (read or write)

Successful EXEC responses are a three-element array:

[columns, rows, affected]
  • columns is an array of column-name bulk strings.
  • rows is an array of row arrays.
  • affected is an integer count for write statements.

Values use RESP-style tags: simple strings (+), errors (-), integers (:), doubles (,), booleans (#t / #f), bulk strings ($), arrays (*), and nulls (_). Errors are returned as error frames whose text begins with ERR.

Frame formats:

Type Format Example
Simple string +<text>\r\n +PONG\r\n
Error -<text>\r\n -ERR table not found\r\n
Integer :<digits>\r\n :3\r\n
Double ,<digits>\r\n ,92.5\r\n
Boolean #t\r\n or #f\r\n #t\r\n
Null _\r\n _\r\n
Bulk string $<length>\r\n<payload>\r\n $5\r\nalice\r\n
Null bulk string $-1\r\n $-1\r\n
Array *<count>\r\n<frame>... *2\r\n$4\r\nEXEC\r\n$8\r\nSELECT 1\r\n
Null array *-1\r\n *-1\r\n

Arrays contain exactly <count> nested frames. Bulk strings can contain binary payloads, so clients should read exactly <length> bytes before consuming the trailing CRLF. The current server rejects bulk strings and array lengths larger than 16 MiB.

For a SELECT id, name FROM users result, an EXEC response may look like:

*3\r\n
*2\r\n
$2\r\n
id\r\n
$4\r\n
name\r\n
*1\r\n
*2\r\n
:1\r\n
$5\r\n
alice\r\n
:0\r\n

Decoded, that is:

columns = ["id", "name"]
rows    = [[1, "alice"]]
affected = 0

For writes, columns and rows are empty arrays and affected reports the number of changed rows. For BEGIN, BEGIN_READ, COMMIT, ROLLBACK, and PING, successful responses are simple strings such as +OK\r\n or +PONG\r\n.

For example, a client library can send EXEC with SQL, decode the result array, and expose it as native rows for its host language. The Go CLI under cmd/cli is intentionally small so it can be used as a reference when building clients in any TCP-capable language.

How it works

The sections below go deeper into fernDB's internals: transaction behavior, query planning, and the package layout of the engine.

Transactions

fernDB supports ACID transactions across SQL statements and explicit BEGIN / COMMIT blocks. A committed transaction is applied as a unit; a rolled-back or failed transaction leaves no partial changes behind.

Under the hood, fernDB maps SQL read and write transactions onto frostfire transactions. Tables, indexes, catalog entries, sequences, and statistics are all updated through the same transactional storage layer, so commits and rollbacks apply to the whole database state fernDB manages.

Outside an explicit BEGIN / COMMIT block, fernDB wraps each write statement in its own write transaction and commits it automatically. Read-only statements (SELECT, DESCRIBE, EXPLAIN) run in a read transaction. Inside an explicit transaction, COMMIT makes all changes visible together and ROLLBACK discards them.

Concurrency model: single writer, multiple readers

frostfire — and therefore fernDB — uses a single-writer, multi-reader concurrency model:

  • Any number of read transactions can be open at the same time, and they run concurrently with each other and with the active writer.
  • Only one write transaction can be active at a time. A second call to BeginWrite from another goroutine or connection blocks until the in-flight writer commits or aborts. Within a single session, issuing BEGIN while a transaction is already open returns an error instead of opening a nested transaction.

Readers never block writers and writers never block readers. The trade-off is that write throughput is serial: concurrent write workloads queue at the storage engine instead of interleaving.

Snapshot isolation via copy-on-write

Readers see a stable snapshot of the database without taking locks. frostfire provides this with MVCC over a copy-on-write B+tree: a read transaction captures the current version of the database when it begins, and writers produce a new version without disturbing the pages older readers are still using.

The practical effect for SQL is snapshot isolation: a long-running SELECT sees a consistent view of every table, index, and catalog entry as of the moment it began, even if another connection inserts, updates, or drops tables in parallel. There are no dirty reads, non-repeatable reads, or phantom reads within a single read transaction.

Commit, durability, and batching

A write transaction's changes are buffered in memory until Commit, which flushes them to disk and fsyncs the file. If the process or machine crashes mid-commit, the database reopens at the last successfully committed state.

Because every commit pays a fixed cost (page writes and an fsync), batching related writes into one transaction is the main lever for write throughput. The CLI and embedded API both expose this directly:

BEGIN;
INSERT INTO users (name, score, active) VALUES ('a', 1.0, TRUE);
INSERT INTO users (name, score, active) VALUES ('b', 2.0, TRUE);
INSERT INTO users (name, score, active) VALUES ('c', 3.0, TRUE);
COMMIT;

The three inserts above pay one commit cost, not three. The same applies to bulk loaders driving fernDB from a Go program — group logically related writes under a single BeginWrite / Commit.

Practical guidance

A few habits follow from the model above:

  • Keep write transactions short. While a write transaction is open, other writers wait. Avoid network calls, or unrelated CPU work between BeginWrite and Commit.
  • Abort on early return. In embedded Go code, always Abort a write transaction on an error path before returning, otherwise the next writer will block forever.
  • One transaction per goroutine. A transaction handle is not safe to share across goroutines.
  • Long-lived readers delay page reuse. Pages freed by a writer cannot be recycled until every reader that could still see them finishes. Long-running read transactions are cheap for concurrency but can grow the on-disk file if they overlap with heavy write activity. Close them when you are done.

Query planning

Query planning is the part of fernDB that decides how a SQL statement should run. The parser understands the text, the planner builds and optimizes a plan, and the executor runs the selected plan.

fernDB keeps those responsibilities separate so each stage can work at the right level of abstraction. Parsing produces an abstract syntax tree (AST), analysis turns it into a typed logical plan, optimization searches through equivalent logical plans and physical implementations, and execution runs the selected physical plan as a pipeline of operators.

SQL text
  -> lexer / parser
  -> analyser
  -> logical plan
  -> normaliser
  -> cost-based optimizer
  -> physical plan
  -> pipeline operators

Parsing and analysis

The lexer and parser turn SQL into AST nodes for statements and expressions. The analyser then resolves table and column names against the catalog, expands wildcards, checks expression types, validates statement-specific rules, and assigns attributes that later stages can refer to unambiguously.

This is where invalid queries fail early: unknown columns, ambiguous names, wrong assignment types, non-boolean WHERE clauses, invalid ORDER BY expressions, bad LIMIT values, and invalid DDL are rejected before planning.

Logical plans

A logical plan describes the relational work required to answer a query without choosing concrete algorithms yet. It is about what needs to happen: read this table, keep these columns, apply this predicate, join these relations, return rows in this order.

After analysis, a SELECT is represented as logical operators such as:

  • Scan for reading a table.
  • Filter for applying predicates.
  • Project for choosing output attributes.
  • Join for combining relations.
  • physical ordering requirements for ORDER BY.
  • Limit for row limits.

Logical plans do not say whether a table should be read with a sequential scan or an index, or whether a join should use nested loops, hashing, or sorted inputs. They leave those choices open so the optimizer can compare alternatives.

For a query like:

SELECT users.name
FROM users
JOIN orders ON users.id = orders.user_id
WHERE users.active = TRUE;

a simplified logical shape is:

Project(users.name)
└─ Filter(users.active = TRUE)
   └─ Join(users.id = orders.user_id)
      ├─ Scan(users)
      └─ Scan(orders)

That tree says what relational operations are needed, but it does not yet choose an index, join algorithm, or sort strategy.

Normalization

The normalizer rewrites expressions inside the logical plan into forms that are easier for the optimizer to reason about. It applies boolean simplifications, constant folding, De Morgan rewrites, comparison negation, flattening of nested AND / OR expressions, and CNF-style rewrites where useful.

For example, predicates are flattened into conjuncts so later stages can push filters down, extract index bounds, and separate join predicates from residual filters.

Cost-based optimization

fernDB uses a Volcano-style cost-based optimizer. Instead of committing to the first valid plan it can build, the optimizer explores a search space of equivalent relational expressions and chooses the cheapest physical plan it can find.

The optimizer has two main phases: an exploration/search phase and an implementation/optimization phase.

During the exploration/search phase, the optimizer applies transformation rules to generate equivalent logical expressions. These alternatives all compute the same result, but they expose different shapes for the later optimization phase. For example, one expression may join users before orders, while another may join orders first.

Transformation rules define the logical rewrites the optimizer is allowed to apply, such as:

  • Join commutativity and associativity, which let the optimizer explore join order.
  • Predicate pushdown through joins, which moves filters closer to the tables they reference.
  • Projection pushdown, which avoids carrying columns that later operators do not need.
  • Filter merging, which combines stacked filters into a simpler predicate.

Equivalent expressions are grouped by the logical result they produce. Each group can contain several ways to compute the same result, which avoids treating every rewrite as a separate tree and lets the optimizer share common subproblems.

During the implementation/optimization phase, implementation rules turn logical expressions into candidate physical plans. The optimizer recursively searches for the cheapest physical plan that satisfies the required physical properties, such as output ordering. Candidate plans are costed and compared until the cheapest valid plan wins.

Physical plans

A physical plan is a concrete implementation of a logical plan. It contains the actual operators fernDB will execute, with costs, estimated row counts, and any physical properties the operator provides, such as output ordering.

One logical plan can have many possible physical plans. A logical Join might be implemented as NLJoin, HashJoin, or MergeJoin. A logical table read might become a SeqScan, an IndexSeek, or an IndexSeek followed by an IndexLookup if the index does not contain every column the query needs.

For the logical shape above, one possible physical plan might be:

Project(users.name)
└─ HashJoin(users.id = orders.user_id)
   ├─ Filter(users.active = TRUE)
   │  └─ SeqScan(users)
   └─ SeqScan(orders)

With a useful index on users(active), another candidate might use IndexSeek for the users side. The optimizer costs those candidates and keeps the cheapest valid one.

Physical implementation rules produce operators such as:

  • SeqScan for reading a table directly.
  • IndexSeek for using a secondary index with equality or range bounds.
  • IndexLookup when an index identifies rows but the query still needs columns from the base table.
  • Filter, Project, Limit, and Sort.
  • NLJoin (nested-loop join), HashJoin, and MergeJoin.

The optimizer compares all valid candidates using the cost model. Join predicates determine which join algorithms are legal; collected statistics estimate how many rows each operator will produce; required ordering determines whether a plan already delivers rows in the requested order or needs an explicit Sort.

Statistics and costs

ANALYZE stores table and column statistics used during planning. The cost model combines those estimates with CPU, scan, join, sort, hash, memory, row-size, and page-size costs.

Those numbers are not meant to be perfect. They are good enough to let fernDB make meaningful choices between alternatives, such as:

  • A full table scan versus an index seek.
  • One join order versus another.
  • Nested-loop, hash, or merge join.
  • An index-only plan versus an index lookup back into the table.
  • Reusing an existing order versus adding an explicit sort.

When real statistics are not available — a table that has never been ANALYZEd, or a predicate that the available histograms cannot help with — the optimizer falls back to built-in defaults: an assumed row count and row size for the table, plus textbook selectivity guesses for filters (roughly 1% for equality, 33% for range comparisons, 5% for IS NULL). Plans are still picked, but the estimates are coarse. Running ANALYZE after loading data is the easiest way to replace these defaults with numbers that reflect the actual data.

Physical properties

Some queries require output properties beyond just rows. ORDER BY requires a particular ordering. Merge joins require both inputs to be ordered by the join keys. The optimizer tracks delivered and required physical properties while building plans.

If a child plan already satisfies a required order, fernDB can reuse it. If not, the physical planner inserts an enforcing Sort. This is why EXPLAIN is useful for understanding not just which access path was chosen, but also where the engine had to add operators to satisfy the query.

Execution

The final physical plan is converted into pipeline operators. Each operator implements the executor interface and produces rows for its parent operator: scans read from frostfire buckets, filters evaluate predicates, joins combine rows, sorts materialize ordered runs when needed, and projections shape the final result.

EXPLAIN prints the selected physical plan, including estimated cost, estimated rows, and the operator tree. It is the best way to inspect the optimizer's choices for a specific query.

About

A tiny relational database written in Go

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages