81 changes: 81 additions & 0 deletions AGENTS.md
@@ -0,0 +1,81 @@
# Agent Instructions for pragmaticDB

These rules are **mandatory** for any AI coding assistant working on this project.
Violations of these rules are unacceptable under any circumstance.

---

## Rule 1: Zero Deletions

**Do NOT delete a single line of existing code.**

- No existing function, method, struct, class, enum, variable, comment, or include may be removed.
- No existing function signature may be changed in a way that breaks callers.
- If a function needs new parameters, use **default parameters** so all existing call sites compile unchanged.
- If a struct/class needs new fields, **append** them — never reorder, rename, or remove existing fields.
- New default values on appended fields must ensure the old behavior when the field is not explicitly set.
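
A minimal sketch of the two techniques above: a defaulted parameter and an appended field whose default preserves the old behavior. `ScanOptions`, `ScanTable`, and `limit` are hypothetical names chosen for illustration and are not part of pragmaticDB.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical options struct. `table_name` stands in for a pre-existing
// field; `limit` is appended after it, and its default (0 = no limit)
// keeps the old behavior for callers that never set it.
struct ScanOptions {
    std::string table_name;  // existing field: position and name unchanged
    std::size_t limit = 0;   // appended field: 0 means "return everything"
};

// Hypothetical function that originally took only `rows`. The added
// parameter has a default value, so old call sites compile unchanged.
inline std::vector<int> ScanTable(const std::vector<int>& rows,
                                  const ScanOptions& opts = ScanOptions{}) {
    if (opts.limit == 0 || opts.limit >= rows.size()) {
        return rows;  // old behavior: full scan
    }
    return std::vector<int>(
        rows.begin(),
        rows.begin() + static_cast<std::ptrdiff_t>(opts.limit));
}
```

An old call such as `ScanTable(rows)` still returns every row; only callers that explicitly set `limit` see the new behavior.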

## Rule 2: Existing Features Must Not Change Behavior

**Every existing feature must continue to work exactly as it did before your changes.**

This includes but is not limited to:
- `CREATE TABLE`
- `INSERT INTO`
- `SELECT * FROM <table>;` (single-table select)
- `DELETE FROM <table>;` and `DELETE FROM <table> WHERE col = val;`
- `COMMIT;`
- `exit` / `quit`
- Persistence (catalog.db, table_N.db files)
- TCP server behavior

If you add a new code path (e.g., JOIN), it must be gated behind a condition check
so that existing queries never enter the new path. Example pattern:

```cpp
if (new_feature_is_active) {
    return NewFeaturePath();
}
// ... entire existing code below, untouched ...
```

## Rule 3: Additive-Only Architecture

All new features must be implemented as **additions**, not modifications:

- **New files** are always preferred over modifying existing files.
- When modifying an existing file is unavoidable, only **append** new code (new methods, new includes, new fields).
- The only acceptable in-body change to an existing function is inserting a short dispatch/guard at the **top** that routes to a new function, leaving the rest of the function body untouched.
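
One way such a guard could look for the JOIN case is sketched below. The types are simplified stand-ins (the real `SelectStatement` and executor carry more state), and the string return values are hypothetical placeholders for real query results.

```cpp
#include <string>

// Simplified stand-in for the statement type; only the fields needed here.
struct SelectStatement {
    std::string table_name;
    std::string join_table_name;  // empty for a plain single-table SELECT
};

// New code path, kept in its own function.
inline std::string ExecuteJoin(const SelectStatement& stmt) {
    return "join:" + stmt.table_name + "+" + stmt.join_table_name;
}

inline std::string ExecuteSelect(const SelectStatement& stmt) {
    // Guard at the top: the only in-body change to the existing function.
    if (!stmt.join_table_name.empty()) {
        return ExecuteJoin(stmt);
    }
    // Existing body below, untouched.
    return "scan:" + stmt.table_name;
}
```

Because `join_table_name` defaults to an empty string, a statement built by the old parser path never satisfies the guard and falls through to the original body.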

## Rule 4: Test Preservation

- **Never modify or delete existing test logic.** Avoid modifying existing test files. If the project uses a central test runner/registry (e.g., `tests/test_main.cpp`, `include/tests.h`), appending a new test invocation or declaration there is allowed when necessary to wire new tests.
- All existing tests must continue to compile and pass after your changes.
- New tests go in **new test files** (e.g., `tests/test_join.cpp`).
- After making changes, verify that `make test` passes all existing tests.

## Rule 5: The Expression AST Is Join-Only

The Expression AST (`Expression`, `ComparisonExpression`, `LogicalExpression`, etc.)
and the expression evaluator (`EvaluateExpression`) are **exclusively** for JOIN condition evaluation.

- Do NOT refactor existing WHERE clause handling to use the AST.
- Do NOT refactor existing DELETE WHERE to use the AST.
- These existing features use their own simple string-comparison logic and must continue to do so.
- If a future feature needs expressions (e.g., SELECT WHERE with complex conditions),
that is a separate, deliberate decision — not something to do as a side effect.

---

## Summary

| Action | Allowed? |
|---|---|
| Adding new files | ✅ Always |
| Appending new methods/fields to existing files | ✅ Yes |
| Adding a 3-line dispatch guard at the top of an existing function | ✅ Yes |
| Deleting any existing line of code | ❌ Never |
| Renaming any existing function, variable, or file | ❌ Never |
| Changing an existing function's behavior | ❌ Never |
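| Modifying existing test logic (appending a new test declaration or invocation to a central runner/registry is allowed, see Rule 4) | ❌ Never |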
| Using the Expression AST outside of JOIN | ❌ Never |
5 changes: 5 additions & 0 deletions include/catalog/schema.h
@@ -19,6 +19,11 @@ class Schema {
    uint32_t GetLength() const;
    uint32_t GetColumnCount() const;

    static Schema Merge(
        const Schema& left, const std::string& left_table,
        const Schema& right, const std::string& right_table
    );

private:
    uint32_t length_;
    std::vector<Column> columns_;
5 changes: 5 additions & 0 deletions include/ds/statement.h
@@ -2,7 +2,9 @@

#include <string>
#include <vector>
#include <memory>
#include "catalog/column.h"
#include "query/expression.h"

// ── Statement type tag ────────────────────────────────────────────────────────
enum class StatementType {
@@ -40,6 +42,9 @@ struct InsertStatement : public Statement {
struct SelectStatement : public Statement {
    std::string table_name;

    std::string join_table_name;
    std::unique_ptr<Expression> join_condition;

    SelectStatement() : Statement(StatementType::SELECT) {}
};

14 changes: 13 additions & 1 deletion include/query/executor.h
@@ -4,6 +4,8 @@
#include "../ds/statement.h"
#include "../ds/query_result.h"
#include "catalog/schema.h"
#include "query/optimizer.h"
#include "query/index_provider.h"
/**
 * @brief Executes a parsed Statement against the database Catalog.
 *
@@ -13,7 +15,8 @@
 */
class Executor {
public:
    explicit Executor(Catalog& catalog) : catalog_(catalog) {}
    explicit Executor(Catalog& catalog, const IndexProvider& idx = kDefaultIndexProvider)
        : catalog_(catalog), index_provider_(idx) {}

    /**
     * @brief Execute the given statement and return a QueryResult.
@@ -28,5 +31,14 @@ class Executor {
    QueryResult ExecuteCommit();
    QueryResult ExecuteDelete(const DeleteStatement& stmt);

    QueryResult ExecuteJoin(const SelectStatement& stmt);
    std::vector<std::pair<RecordId, RecordId>> ExecuteBranch(
        const BranchPlan& branch,
        TableInfo* left_tbl, TableInfo* right_tbl,
        const Schema& left_schema, const Schema& right_schema,
        const Schema& merged_schema
    );

    Catalog& catalog_;
    const IndexProvider& index_provider_;
};
60 changes: 60 additions & 0 deletions include/query/expression.h
@@ -0,0 +1,60 @@
#pragma once
#include <string>
#include <memory>
#include <utility>  // std::move

// Expression type tags
enum class ExpressionType {
    COLUMN_REF,   // e.g. "users.id"
    CONSTANT,     // e.g. "42", "true"
    COMPARISON,   // =, >, <, >=, <=, !=
    LOGICAL       // AND, OR
};

enum class ComparisonType { EQ, NEQ, LT, GT, LTE, GTE };
enum class LogicType { AND, OR };

// Base class — all expression nodes inherit from this
struct Expression {
    ExpressionType expr_type;
    virtual ~Expression() = default;

protected:
    Expression(ExpressionType type) : expr_type(type) {}
};

// Leaf: references a column like "users.id"
struct ColumnRefExpression : Expression {
    std::string table_name;  // "users" (from "users.id")
    std::string col_name;    // "id"

    ColumnRefExpression(std::string table, std::string col)
        : Expression(ExpressionType::COLUMN_REF), table_name(std::move(table)), col_name(std::move(col)) {}
};

// Leaf: a literal value like 42 or true
struct ConstantExpression : Expression {
    std::string raw_value;  // stored as string, converted at eval time

    ConstantExpression(std::string val)
        : Expression(ExpressionType::CONSTANT), raw_value(std::move(val)) {}
};

// Internal node: left <op> right
struct ComparisonExpression : Expression {
    ComparisonType comp_type;
    std::unique_ptr<Expression> left;
    std::unique_ptr<Expression> right;

    ComparisonExpression(ComparisonType comp, std::unique_ptr<Expression> l, std::unique_ptr<Expression> r)
        : Expression(ExpressionType::COMPARISON), comp_type(comp), left(std::move(l)), right(std::move(r)) {}
};

// Internal node: left AND/OR right
struct LogicalExpression : Expression {
    LogicType logic_type;
    std::unique_ptr<Expression> left;
    std::unique_ptr<Expression> right;

    LogicalExpression(LogicType logic, std::unique_ptr<Expression> l, std::unique_ptr<Expression> r)
        : Expression(ExpressionType::LOGICAL), logic_type(logic), left(std::move(l)), right(std::move(r)) {}
};
12 changes: 12 additions & 0 deletions include/query/expression_eval.h
@@ -0,0 +1,12 @@
#pragma once
#include "query/expression.h"
#include "type/tuple.h"
#include "catalog/schema.h"

// Evaluates any expression tree against a single (possibly merged) tuple.
// Returns true if the condition holds.
bool EvaluateExpression(
    const Expression* expr,
    const Tuple& tuple,
    const Schema& schema
);
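
The body of `EvaluateExpression` is not part of this section of the diff. As a rough sketch of how a recursive evaluator of this shape could work, the example below uses simplified stand-ins: `Node`, `Row`, `Leaf`, `Binary`, `Resolve`, and `Evaluate` are all hypothetical and replace the project's `Expression`, `Tuple`, and `Schema`. It compares values as plain strings and handles only equality, AND, and OR.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Stand-in for a merged tuple: qualified column name -> value as text.
using Row = std::map<std::string, std::string>;

enum class Kind { COLUMN, CONSTANT, EQ, AND, OR };

struct Node {
    Kind kind;
    std::string text;            // column name or literal (leaf nodes only)
    std::unique_ptr<Node> left;  // children (EQ / AND / OR only)
    std::unique_ptr<Node> right;
};

inline std::unique_ptr<Node> Leaf(Kind kind, std::string text) {
    auto node = std::make_unique<Node>();
    node->kind = kind;
    node->text = std::move(text);
    return node;
}

inline std::unique_ptr<Node> Binary(Kind kind, std::unique_ptr<Node> l,
                                    std::unique_ptr<Node> r) {
    auto node = std::make_unique<Node>();
    node->kind = kind;
    node->left = std::move(l);
    node->right = std::move(r);
    return node;
}

// Resolve a leaf to its textual value. In this sketch an unknown column
// resolves to an empty string; a real evaluator would likely report an error.
inline std::string Resolve(const Node& node, const Row& row) {
    if (node.kind == Kind::COLUMN) {
        auto it = row.find(node.text);
        return it == row.end() ? std::string{} : it->second;
    }
    return node.text;
}

// Recursively evaluate a condition tree against one row.
inline bool Evaluate(const Node& node, const Row& row) {
    switch (node.kind) {
        case Kind::EQ:
            return Resolve(*node.left, row) == Resolve(*node.right, row);
        case Kind::AND:
            return Evaluate(*node.left, row) && Evaluate(*node.right, row);
        case Kind::OR:
            return Evaluate(*node.left, row) || Evaluate(*node.right, row);
        default:
            return false;  // a bare leaf is not a condition
    }
}
```

The row keys use the `table.column` form so that a column reference such as `users.id` can be looked up directly in the merged row.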
27 changes: 27 additions & 0 deletions include/query/index_provider.h
@@ -0,0 +1,27 @@
#pragma once

#include <vector>
#include <string>
#include "../ds/record_id.h"

class IndexProvider {
public:
    virtual ~IndexProvider() = default;

    // Given a table name, column name, and a value, return all RecordIds matching col=val
    virtual std::vector<RecordId> LookupEquals(
        const std::string& table_name,
        const std::string& col_name,
        const std::string& value) const = 0;
};

class NullIndexProvider : public IndexProvider {
public:
    std::vector<RecordId> LookupEquals(
        const std::string&, const std::string&, const std::string&) const override {
        return {};  // always returns empty
    }
};

// Default instance with static storage duration for use as a default reference.
// Declared `inline` (C++17) so every translation unit shares one object
// instead of each getting its own copy from this header.
inline NullIndexProvider kDefaultIndexProvider{};
31 changes: 31 additions & 0 deletions include/query/optimizer.h
@@ -0,0 +1,31 @@
#pragma once
#include <string>
#include <vector>
#include <memory>
#include "query/expression.h"
#include "query/index_provider.h"

struct EquiCondition {
    std::string left_col;
    std::string right_col;
};

struct IndexCondition {
    std::string table;
    std::string col;
    std::string val;
};

struct BranchPlan {
    std::vector<IndexCondition> index_conditions;
    std::vector<EquiCondition> equi_conditions;
    std::unique_ptr<Expression> theta_filter;  // anything not captured above
};

struct JoinPlan {
    std::vector<BranchPlan> branches;
};

class Optimizer {
public:
    static JoinPlan PlanJoin(const std::unique_ptr<Expression>& condition, const IndexProvider& idx_provider);
};
5 changes: 5 additions & 0 deletions include/query/parser.h
@@ -27,4 +27,9 @@ class Parser {
    std::unique_ptr<Statement> ParseInsert(std::istringstream& ss);
    std::unique_ptr<Statement> ParseSelect(std::istringstream& ss);
    std::unique_ptr<Statement> ParseDelete(std::istringstream& ss);

    std::unique_ptr<Expression> ParseExpression(std::istringstream& ss);
    std::unique_ptr<Expression> ParseAndExpression(std::istringstream& ss);
    std::unique_ptr<Expression> ParseComparison(std::istringstream& ss);
    std::unique_ptr<Expression> ParseAtom(std::istringstream& ss);
};
1 change: 1 addition & 0 deletions include/tests.h
@@ -11,4 +11,5 @@ class test {
    void TestCatalogClass();
    void TestTableIteratorClass();
    void TestQueryEngineClass();
    void TestJoin();
};
2 changes: 2 additions & 0 deletions include/type/tuple.h
@@ -31,6 +31,8 @@ class Tuple {
    const char* GetData() const;
    uint32_t GetLength() const;

    static Tuple Merge(const Tuple& left, const Tuple& right);

private:
    std::vector<char> data_;
};
9 changes: 9 additions & 0 deletions include/type/value.h
@@ -17,6 +17,15 @@ class Value {
        data_ = std::make_any<T>(val);
    }
    void test();

    TypeId GetTypeId() const { return type_id_; }

    bool CompareEquals(const Value& other) const;
    bool CompareLessThan(const Value& other) const;
    bool CompareGreaterThan(const Value& other) const;
    bool CompareLessThanOrEqual(const Value& other) const;
    bool CompareGreaterThanOrEqual(const Value& other) const;
    bool CompareNotEqual(const Value& other) const;

private:
    TypeId type_id_;
    std::any data_;
14 changes: 14 additions & 0 deletions src/catalog/schema.cpp
@@ -23,3 +23,17 @@ uint32_t Schema::GetLength() const { return length_; }
uint32_t Schema::GetColumnCount() const {
    return static_cast<uint32_t>(columns_.size());
}

Schema Schema::Merge(
    const Schema& left, const std::string& left_table,
    const Schema& right, const std::string& right_table
) {
    std::vector<Column> merged_cols;
    for (const auto& col : left.GetColumns()) {
        merged_cols.emplace_back(left_table + "." + col.GetName(), col.GetType());
    }
    for (const auto& col : right.GetColumns()) {
        merged_cols.emplace_back(right_table + "." + col.GetName(), col.GetType());
    }
    return Schema(merged_cols);
}
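
To illustrate the naming scheme `Schema::Merge` applies, the stand-in below performs the same prefixing on plain column-name vectors. `MergeNames` is a hypothetical helper written for this example only; it ignores column types and does not use the project's `Schema` or `Column` classes.

```cpp
#include <string>
#include <vector>

// Qualify every column with its table name, left columns first, so that
// identically named columns from the two tables stay distinguishable.
inline std::vector<std::string> MergeNames(
    const std::vector<std::string>& left, const std::string& left_table,
    const std::vector<std::string>& right, const std::string& right_table) {
    std::vector<std::string> merged;
    merged.reserve(left.size() + right.size());
    for (const auto& name : left) {
        merged.push_back(left_table + "." + name);
    }
    for (const auto& name : right) {
        merged.push_back(right_table + "." + name);
    }
    return merged;
}
```

With `users(id, name)` and `orders(id, user_id)`, the two `id` columns become `users.id` and `orders.id`, which is the form a join condition such as `users.id = orders.user_id` refers to.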