11 changes: 11 additions & 0 deletions README.md
@@ -102,6 +102,17 @@ cloudflared tunnel --url http://localhost:9332

See [self-hosting guide](docs/self-hosting.md) for full production setup.

## Research Export

For offline fee-forecasting work, the repo now includes a local JSONL export path that emits the same merged row shape used by the companion benchmark importer.

```powershell
$env:PYTHONPATH='src'
python scripts/export_fee_forecast_benchmark.py data/fee-forecast-benchmark.jsonl --hours 168 --interval-minutes 10
```

The export joins local `fee_history` observations to the next `1-6` confirmed blocks from the research tables in `data/bitcoin_api.db`. After migration `012_add_research_tables.sql`, the background fee collector also fills `block_confirmations` on each detected new block and logs fee estimates every cycle, so a normal local API run can build this export without extra manual seeding. Very recent observations without six future block outcomes are skipped automatically.
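As a sketch of the row shape each JSONL line carries (the four top-level field names come from the operations docs; every concrete value and feature key below is invented for illustration):

```python
import json

# One hypothetical exported row. The top-level keys match the documented
# export shape; the feature keys, bin labels, and values are made up.
row = {
    "observation_id": "obs-000123",
    "observed_at": "2026-04-24T12:10:00Z",
    "features": {"core_est_6": 4.2, "mempool_vsize": 18_500_000},
    "clearing_fee_bin_by_horizon": {str(h): "3-5" for h in range(1, 7)},
}

line = json.dumps(row)  # the exporter writes one such line per usable observation
decoded = json.loads(line)
print(sorted(decoded["clearing_fee_bin_by_horizon"]))  # horizons "1" through "6"
```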

## Contributing

Issues and PRs welcome. Run the test suite before submitting:
22 changes: 18 additions & 4 deletions docs/OPERATIONS.md
@@ -148,6 +148,22 @@ Requires the Fee Observatory to be collecting data (`bitcoin-fee-observatory` re

**Dashboard:** `GET /fee-observatory` — branded page with iframe to Streamlit dashboard (port 8505).

### Export fee forecast benchmark rows

Use the local benchmark export when you want importer-compatible JSONL for offline forecasting work or the companion `bitcoin-fee-forecast-bench` repo.

```powershell
$env:PYTHONPATH='src'
python scripts/export_fee_forecast_benchmark.py data/fee-forecast-benchmark.jsonl --hours 168 --interval-minutes 10
```

Notes:
- Reads `fee_history` observations from the main API DB and joins them to the next `1-6` `block_confirmations`
- Emits one JSONL row per usable observation with `observation_id`, `observed_at`, `features`, and `clearing_fee_bin_by_horizon`
- Skips observations that do not yet have six future confirmed blocks
- The research tables come from migration `012_add_research_tables.sql`
- On a normal local API run, the background fee collector fills those research tables automatically as new blocks arrive

### x402 Stablecoin Micropayments (optional)

Enables pay-per-call via the x402 protocol (USDC on Base). Requires the `bitcoin-api-x402` package.
@@ -388,11 +404,9 @@ Replace `YOUR_KEY` with the value from your `.env` `ADMIN_API_KEY`.

The background fee collector thread automatically prunes old data once per 24 hours:
- Usage logs older than 90 days are deleted
- Fee history older than 30 days is downsampled to hourly averages
- Fee history older than 365 days is deleted
- Research data (block_confirmations, fee_estimates_log) older than 365 days is deleted
- Fee history older than 30 days is deleted

The fee collector also logs multi-source fee estimates every 5 minutes (Core 8 targets, mempool.space 4 targets, local mempool 1 target) and captures block confirmation feerate percentiles on each new block.
The fee collector also logs Core fee estimates for targets `1`, `6`, and `144` every 5 minutes, adds mempool.space estimates for `1`, `3`, `6`, and `144` when that public API is reachable, and captures block confirmation feerate percentiles on each detected new block after the collector has seen a prior tip.
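One logging cycle as described above reduces to a batch of `(source, target, feerate)` tuples, the shape `record_fee_estimates_batch()` in `db.py` accepts. A sketch, with made-up feerates and assumed source labels:

```python
# Illustrative single cycle: Core targets 1/6/144 always; mempool.space
# targets 1/3/6/144 only when the public API answered. All values invented.
core = {1: 5.1, 6: 3.2, 144: 1.1}                    # sat/vB from estimatesmartfee
mempool_space = {1: 5.0, 3: 4.0, 6: 3.0, 144: 1.0}   # empty dict if unreachable

entries = [("core", target, feerate) for target, feerate in core.items()]
entries += [("mempool_space", target, feerate) for target, feerate in mempool_space.items()]

# entries now has the list[tuple[str, int, float]] shape that
# record_fee_estimates_batch() inserts in a single executemany() call
assert len(entries) == 7
```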

Check API logs for `Auto-prune:` messages to confirm it's running.

25 changes: 19 additions & 6 deletions docs/SCOPE_OF_WORK.md
@@ -1,7 +1,7 @@
# Satoshi API -- Scope of Work

**Version:** 0.3.4
**Date:** 2026-03-08
**Date:** 2026-04-24
**Author:** Bortlesboat
**Status:** Live -- https://bitcoinsapi.com

@@ -50,19 +50,19 @@ Bitcoin Core RPC (port 8332, localhost only)
| `main.py` | App creation, lifespan, router registration (~177 lines) | Composition root |
| `middleware.py` | Security headers, CORS, auth + rate limiting middleware, gzip compression | Middleware chain |
| `exceptions.py` | RPC, validation, HTTP, and generic exception handlers; RFC 7807 `type` URIs | Exception handler registry |
| `jobs.py` | Background fee collector thread lifecycle | Background worker |
| `jobs.py` | Background fee collector thread lifecycle, fee estimate logging, and block confirmation capture for research tables | Background worker |
| `static_routes.py` | Landing page, robots.txt, sitemap, decision pages | Static file serving |
| `usage_buffer.py` | Batch usage logging (flush at 50 rows or 30s) | Write-behind buffer |
| `migrations/` | SQL migration files + runner, tracked in `schema_migrations` | Sequential migrations |
| `auth.py` | API key validation, tier resolution | Strategy (tier-based) |
| `rate_limit.py` | Per-minute sliding window (in-memory or Upstash Redis) + daily limits | Token bucket / sliding window |
| `notifications.py` | Transactional email (Resend) + analytics events (PostHog) | Fire-and-forget side effects |
| `cache.py` | TTL caching with reorg-safe depth awareness, stale fallback for graceful degradation, `get_cached_node_info()` helper for non-RPC contexts | Cache-aside with lock-per-cache + stale-while-error |
| `db.py` | SQLite (WAL mode), usage logging, key storage | Repository pattern |
| `db.py` | SQLite (WAL mode), fee history, self-populating fee research tables, usage logging, key storage | Repository pattern |
| `config.py` | 12-factor env var config via Pydantic | Settings singleton |
| `dependencies.py` | Lazy singleton RPC connection | Dependency injection |
| `models.py` | Response envelope, typed data models | DTO / envelope pattern |
| `services/` | Business logic: fee analysis, tx broadcast, exchange comparison, serializers | Service layer (pure functions) |
| `services/` | Business logic: fee analysis, benchmark export, tx broadcast, exchange comparison, serializers | Service layer (pure functions) |
| `routers/` | 28 thin HTTP routers (25 core + 3 indexer) — parameter validation, auth, response envelope | RESTful resource routing |

### 2.3 Design Principles Applied
Expand Down Expand Up @@ -427,6 +427,9 @@ Errors follow the same structure:
39. **Pro checkout dead end** -- "Upgrade to Pro" button returned 503; changed to "Contact for Pro" mailto link
40. **Watchdog stale code** -- `API_DIR` resolved relative to script location (broke when Task Scheduler ran old release copy); now uses `releases/bitcoin-api-current` symlink

**Benchmark Export Self-Sufficiency (Apr 24):**
41. **Clean exporter branch could not bootstrap its own research data** -- The background fee collector now writes `block_confirmations` on detected new blocks and logs fee estimates into `fee_estimates_log`, so fresh installs can produce real benchmark export rows after migration `012_add_research_tables.sql`.

### 5.3 Known Limitations (Acceptable for v0.1)

| Limitation | Impact | When to Address |
@@ -439,6 +442,7 @@ Errors follow the same structure:
| ~~No webhook support~~ | ~~Clients must poll~~ | **RESOLVED** -- WebSocket `/api/v1/ws` with pub/sub |
| No address transaction history | Cannot provide `/address/{addr}/txs` | Deliberate -- Bitcoin Core RPC has no `getaddresshistory`. Requires external indexer (Electrs, Fulcrum). We offer `scantxoutset` via POST `/address/utxos` for UTXO lookup by address. Adding Electrs increases deployment complexity significantly. |
| Email delivery depends on Resend | Welcome email fails silently if Resend is down | Graceful degradation -- registration succeeds regardless, key always returned in response |
| Fee benchmark export needs six future confirmed blocks per observation | Very recent fee-history rows are skipped until enough blocks confirm | Acceptable for offline research export; full `1-6` block outcomes matter more than max recency |

---

@@ -499,7 +503,7 @@ Errors follow the same structure:
- `src/bitcoin_api/indexer/routers/` -- indexed_address, indexed_tx, indexer_status
- `src/bitcoin_api/indexer/migrations/` -- 001_initial_schema.sql

**Tests (23 test files + 2 support files):**
**Tests (current repo test files + support files):**
- `tests/test_health.py` -- 11 tests (health, root, status, healthz, docs, visualizer)
- `tests/test_blocks.py` -- 18 tests (block-related endpoints)
- `tests/test_fees.py` -- 45 tests (fee endpoints + fee research infrastructure)
@@ -525,6 +529,8 @@ Errors follow the same structure:
- `tests/test_indexer_services.py` -- 12 tests (address balance/history, transaction detail)
- `tests/test_price_service.py` -- 13 tests (price service provider fallback, caching, error handling)
- `tests/test_observatory.py` -- 13 tests (Fee Observatory endpoints: scoreboard, block-stats, estimates, 503 fallback, static page)
- `tests/test_fee_benchmark_export.py` -- 2 tests (benchmark export row builder + CLI writer)
- `tests/test_jobs.py` -- 2 tests (single-iteration fee collector coverage for research table population)
- `tests/test_x402_stats.py` -- 6 tests (x402 payment analytics)
- `tests/test_e2e.py` -- 21 e2e tests (against live node)
- `tests/locustfile.py` -- Load test (8 weighted endpoints)
@@ -548,8 +554,9 @@ Errors follow the same structure:
**Project config (1 file):**
- `CLAUDE.md` -- Project instructions for AI-assisted development

**Scripts (14 files):**
**Scripts (15 files):**
- `scripts/create_api_key.py`, `scripts/seed_db.py`
- `scripts/export_fee_forecast_benchmark.py` (writes benchmark-ready JSONL from local fee research data)
- `scripts/security_check.sh` (requires `SATOSHI_API_KEY` env var for POST tests)
- `scripts/security_audit.py` (10 automated security checks)
- `scripts/staging-check.sh` (pre-deploy validation: starts staging server, checks CSP/headers/docs/endpoints)
@@ -564,6 +571,12 @@
- `scripts/smoke-test-api.sh` (5-point health check for cron monitoring; supports --quiet)
- `scripts/doc_consistency.py` (CI-enforced doc consistency checks)

**Research export surfaces (5 files):**
- `src/bitcoin_api/services/benchmark_export.py` (joins fee history to future block outcomes for offline benchmark export)
- `src/bitcoin_api/benchmark_export_cli.py` (CLI entrypoint for benchmark-ready JSONL export)
- `src/bitcoin_api/migrations/012_add_research_tables.sql` (fee research tables for block confirmations and estimate logs)
- `src/bitcoin_api/jobs.py` + `src/bitcoin_api/db.py` (background collector now populates those research tables during normal API operation)

**Legal (3 files):**
- `static/terms.html` -- Terms of Service (FL governing law, liability limitation, acceptable use)
- `static/privacy.html` -- Privacy Policy (data collection, retention, third-party services)
7 changes: 7 additions & 0 deletions scripts/export_fee_forecast_benchmark.py
@@ -0,0 +1,7 @@
"""Local wrapper for exporting benchmark-ready fee forecast rows."""

from bitcoin_api.benchmark_export_cli import main


if __name__ == "__main__":
    raise SystemExit(main())
50 changes: 50 additions & 0 deletions src/bitcoin_api/benchmark_export_cli.py
@@ -0,0 +1,50 @@
"""CLI for exporting benchmark-ready fee forecast datasets."""

from __future__ import annotations

import argparse
from pathlib import Path

from .services.benchmark_export import write_fee_forecast_benchmark_export


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Export fee research tables into benchmark-ready JSONL rows.",
    )
    parser.add_argument("output_path", type=Path, help="Destination JSONL path")
    parser.add_argument(
        "--hours",
        type=int,
        default=168,
        help="How many recent hours of fee history to inspect (default: 168)",
    )
    parser.add_argument(
        "--interval-minutes",
        type=int,
        default=10,
        help="Fee history downsampling interval in minutes (default: 10)",
    )
    parser.add_argument(
        "--limit",
        type=int,
        default=None,
        help="Optional cap on exported examples (keeps the most recent rows)",
    )
    return parser


def main(argv: list[str] | None = None) -> int:
    parser = build_parser()
    args = parser.parse_args(argv)
    write_fee_forecast_benchmark_export(
        args.output_path,
        hours=args.hours,
        interval_minutes=args.interval_minutes,
        limit=args.limit,
    )
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
62 changes: 62 additions & 0 deletions src/bitcoin_api/db.py
@@ -110,6 +110,68 @@ def record_fee_snapshot(
conn.commit()


def record_block_confirmation(
    block_height: int,
    block_hash: str,
    block_time: str,
    tx_count: int,
    total_fees_sat: int,
    min_feerate: float,
    max_feerate: float,
    p10_feerate: float,
    p25_feerate: float,
    p50_feerate: float,
    p75_feerate: float,
    p90_feerate: float,
    core_est_1: float | None = None,
    core_est_6: float | None = None,
    core_est_144: float | None = None,
    mempool_local_est: float | None = None,
    mempool_space_est: float | None = None,
) -> None:
    conn = get_db()
    conn.execute(
        "INSERT OR REPLACE INTO block_confirmations "
        "(block_height, block_hash, block_time, tx_count, total_fees_sat, "
        "min_feerate, max_feerate, p10_feerate, p25_feerate, p50_feerate, "
        "p75_feerate, p90_feerate, core_est_1, core_est_6, core_est_144, "
        "mempool_local_est, mempool_space_est) "
        "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
        (
            block_height,
            block_hash,
            block_time,
            tx_count,
            total_fees_sat,
            min_feerate,
            max_feerate,
            p10_feerate,
            p25_feerate,
            p50_feerate,
            p75_feerate,
            p90_feerate,
            core_est_1,
            core_est_6,
            core_est_144,
            mempool_local_est,
            mempool_space_est,
        ),
    )
    conn.commit()


def record_fee_estimates_batch(entries: list[tuple[str, int, float]]) -> None:
    if not entries:
        return

    conn = get_db()
    conn.executemany(
        "INSERT INTO fee_estimates_log (source, target, feerate) VALUES (?, ?, ?)",
        entries,
    )
    conn.commit()


def get_fee_history(hours: int = 24, interval_minutes: int = 10) -> list[dict]:
    conn = get_db()
    rows = conn.execute(