EagleDB

An embeddable, in-process T-SQL query engine. EagleDB lets you run T-SQL queries against local DataFrames using Apache DataFusion — no SQL Server connection required at query time.

Why EagleDB?

SQL Server is great for production, but iterating on analytical queries against a live database is slow and contends with other workloads. EagleDB copies the tables you need into memory once, then executes your T-SQL locally — hundreds of times faster for complex joins and CTEs.

Features

T-SQL compatibility — ISNULL, LEN, CHARINDEX, STUFF...FOR XML PATH, GETDATE, DATEADD, DATEDIFF, CAST, CTEs, subqueries, CASE, bitwise operators, and more
Automatic transpilation — Parses T-SQL, rewrites SQL Server idioms, and transpiles to DataFusion SQL
In-process execution — No server process; runs inside your Python application via PyO3
ODBC table loading — Point at a SQL Server instance, name the tables you need, and EagleDB loads them as in-memory Arrow tables
Batch execution — Supports GO-separated batches, DECLARE/SET variables, temp tables, and multi-statement scripts
Pandas integration — Results returned as pandas DataFrames

Architecture

┌─────────────────────────────────────────────────┐
│                  Python App                      │
│                                                  │
│   import eagledb                                 │
│   session = eagledb.Session()                    │
│   session.load_table(conn, "schema.table")       │
│   df = session.execute("SELECT ...")             │
│                                                  │
├─────────────────────────────────────────────────┤
│              Python Wrapper Layer                 │
│         (session.py, loader.py)                   │
│    Table loading via pyodbc, DataFrame I/O        │
├─────────────────────────────────────────────────┤
│              Rust Core (PyO3)                     │
│                                                  │
│  ┌──────────┐  ┌───────────┐  ┌──────────────┐  │
│  │  Parser   │  │  Planner  │  │   Engine     │  │
│  │ (T-SQL    │→│ (rewrite  │→│ (DataFusion  │  │
│  │  lexer)   │  │  & map)   │  │  execution)  │  │
│  └──────────┘  └───────────┘  └──────────────┘  │
│                                                  │
│  ┌──────────┐  ┌───────────┐  ┌──────────────┐  │
│  │ Functions │  │  Dialect  │  │   Catalog    │  │
│  │ (T-SQL    │  │ (syntax   │  │ (table       │  │
│  │  → DF)    │  │  compat)  │  │  registry)   │  │
│  └──────────┘  └───────────┘  └──────────────┘  │
└─────────────────────────────────────────────────┘

Conversion Pipeline

Parse — State-machine lexer splits SQL into statements, respecting quotes, comments, brackets, and semicolons
Normalize — Strip comments, collapse whitespace, remove WITH (NOLOCK) hints
Rewrite — Convert STUFF...FOR XML PATH to STRING_AGG, handle correlated subqueries with bitwise ops, strip three-part names
Map functions — ISNULL → COALESCE, LEN → LENGTH, CHARINDEX(a,b) → STRPOS(b,a), GETDATE() → CURRENT_TIMESTAMP, etc.
Execute — Run on DataFusion with full Arrow columnar performance

Installation

From source (requires Rust toolchain and maturin)

git clone https://github.com/atomicmilkshake/eagledb.git
cd eagledb
pip install maturin
maturin develop --release

Dependencies

Python 3.8+
Rust 1.80+ (for building)
ODBC Driver 17 for SQL Server (for loading tables from SQL Server)

Quick Start

import eagledb

# Create an in-memory session
session = eagledb.Session()

# Load tables from SQL Server (one-time)
conn_str = "DRIVER={ODBC Driver 17 for SQL Server};SERVER=myserver;DATABASE=mydb;Trusted_Connection=yes"
session.load_table(conn_str, "dbo.Customers")
session.load_table(conn_str, "dbo.Orders")

# Execute T-SQL locally — no server round-trip
df = session.execute("""
    SELECT c.Name, COUNT(o.OrderID) AS OrderCount
    FROM dbo.Customers c
    JOIN dbo.Orders o ON c.CustomerID = o.CustomerID
    WHERE o.OrderDate >= DATEADD(YEAR, -1, GETDATE())
    GROUP BY c.Name
    ORDER BY OrderCount DESC
""")

print(df)

Batch execution with variables

result = session.execute_batch("""
    DECLARE @cutoff DATE = '2024-01-01';

    SELECT *
    INTO #recent
    FROM dbo.Orders
    WHERE OrderDate >= @cutoff;

    SELECT CustomerID, COUNT(*) AS cnt
    FROM #recent
    GROUP BY CustomerID
    ORDER BY cnt DESC;
""")

Benchmarks

Measured on a desktop workstation (Ryzen 7, 32 GB RAM) against a typical enterprise GIS database with tables ranging from 10K to 500K rows.

Query Type	SQL Server	EagleDB	Speedup
Simple join + filter	1.2 s	4 ms	~300x
Multi-CTE analytical	3.6 s	10 ms	~360x
13-statement stress script	>4 hours	25 s	—

First run includes table loading time (typically 2-5 seconds per table over the network). Subsequent runs hit the in-memory cache and execute at full speed.

Supported T-SQL Features

Category	Examples
Functions	`ISNULL`, `LEN`, `CHARINDEX`, `STUFF`, `REVERSE`, `GETDATE`, `DATEADD`, `DATEDIFF`, `CAST`, `CONVERT`
Aggregates	`STRING_AGG` (via `STUFF...FOR XML PATH`), `COUNT`, `SUM`, `AVG`, `VAR`
Syntax	CTEs, subqueries, `CASE`, `BETWEEN`, `IN`, `LIKE`, `TOP`, `DISTINCT`, `UNION`
Operators	Bitwise `&`, `
Batches	`GO` separators, `DECLARE`/`SET` variables, temp tables (`#temp`)
Hints	`WITH (NOLOCK)` — silently stripped
SET options	`SET NOCOUNT ON`, `SET TRANSACTION ISOLATION LEVEL`, etc. — silently ignored

Project Structure

eagledb/
├── src/
│   ├── lib.rs              # PyO3 module entry point
│   ├── engine.rs           # Core DataFusion session
│   ├── catalog.rs          # Table registry
│   ├── types.rs            # Type mappings (SQL Server → Arrow)
│   └── tsql/
│       ├── parser.rs       # T-SQL statement parser
│       ├── planner.rs      # Query rewriter
│       ├── functions.rs    # Function mappings
│       └── dialect.rs      # Syntax compatibility
├── python/eagledb/
│   ├── __init__.py         # Public API
│   ├── session.py          # High-level Session class
│   └── loader.py           # ODBC table loader
├── tests/
│   ├── integration/        # Integration tests
│   └── smoke_test.py       # Quick smoke test
├── examples/
│   └── arcfm_blaster.py    # Example application
├── Cargo.toml
└── pyproject.toml

License

This project is licensed under the GNU General Public License v3.0 — see LICENSE for details.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

EagleDB

Why EagleDB?

Features

Architecture

Conversion Pipeline

Installation

From source (requires Rust toolchain and maturin)

Dependencies

Quick Start

Batch execution with variables

Benchmarks

Supported T-SQL Features

Project Structure

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
docs		docs
examples		examples
python/eagledb		python/eagledb
src		src
tests		tests
.gitignore		.gitignore
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
EAGLEDB_PROJECT_PLAN.md		EAGLEDB_PROJECT_PLAN.md
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml
run.bat		run.bat

Folders and files

Latest commit

History

Repository files navigation

EagleDB

Why EagleDB?

Features

Architecture

Conversion Pipeline

Installation

From source (requires Rust toolchain and maturin)

Dependencies

Quick Start

Batch execution with variables

Benchmarks

Supported T-SQL Features

Project Structure

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages