@frrist frrist commented Jan 28, 2026

Closes #414

Summary

Adds PostgreSQL as an optional database backend while keeping SQLite as the default. Operators who have been happily using SQLite will notice nothing. Operators approaching SQLite's limits now have somewhere to go.
This implements the database modularity described in Piri Operations Scaling, enabling a more scalable alternative to SQLite.

Changes

PostgreSQL Backend Support

  • Job Queue System (lib/jobqueue/): Added PostgreSQL dialect support with schema isolation. Each logical queue (replicator, aggregator, egress_tracker) uses a separate PostgreSQL schema to avoid table conflicts while sharing a single database connection.
  • Task Engine/Scheduler: Extended GORM-based task scheduler to support PostgreSQL via the scheduler schema.
  • SQL Dialect Abstraction (lib/jobqueue/dialect/): New dialect package handles SQL differences between SQLite and PostgreSQL (parameter placeholders ? vs $1, and similar indignities).
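A minimal sketch of what a dialect layer of this kind might look like, for readers unfamiliar with the pattern. The InsertIgnore signature follows the one in this PR; its body, the Rebind helper, the constant names, and the jobs table are assumptions made for illustration, not the code in lib/jobqueue/dialect/.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Dialect identifies the SQL flavor. The constant names are hypothetical.
type Dialect int

const (
	SQLite Dialect = iota
	Postgres
)

// Rebind rewrites `?` placeholders to `$1, $2, ...` for PostgreSQL and
// returns SQLite queries unchanged. For brevity it does not skip `?`
// characters that appear inside string literals or comments.
func (d Dialect) Rebind(query string) string {
	if d != Postgres {
		return query
	}
	var b strings.Builder
	n := 0
	for _, r := range query {
		if r == '?' {
			n++
			b.WriteByte('$')
			b.WriteString(strconv.Itoa(n))
			continue
		}
		b.WriteRune(r)
	}
	return b.String()
}

// InsertIgnore builds an INSERT that silently skips conflicting rows.
// The body is an assumption; only the signature comes from the PR.
func (d Dialect) InsertIgnore(table, columns, placeholders string) string {
	if d == Postgres {
		return fmt.Sprintf("INSERT INTO %s (%s) VALUES (%s) ON CONFLICT DO NOTHING",
			table, columns, placeholders)
	}
	return fmt.Sprintf("INSERT OR IGNORE INTO %s (%s) VALUES (%s)",
		table, columns, placeholders)
}

func main() {
	// "jobs" and its columns are hypothetical.
	q := Postgres.Rebind(Postgres.InsertIgnore("jobs", "id, body", "?, ?"))
	fmt.Println(q)
	// prints: INSERT INTO jobs (id, body) VALUES ($1, $2) ON CONFLICT DO NOTHING
}
```

Callers can then write each query once with `?` placeholders and let the dialect produce the backend-specific form.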

Architecture Improvements

  • Moved SQLite path derivation from config to database providers. Config now holds values; providers handle initialization. This follows the single-responsibility principle and, more importantly, stops creating SQLite directories when PostgreSQL is configured.

Configuration

[repo.database]
  type = "postgres"
  url = "postgres://user:pass@localhost:5432/piri"
  
  # Optional pool settings (defaults shown)
  max_open_conns = 5        # 4 pools × 5 = 20 total connections
  max_idle_conns = 5        # Equal to max to avoid connection churn
  conn_max_lifetime = "30m"

Multi-Node Shared Database: Multiple Piri nodes can share a single PostgreSQL server by using separate databases:

# Node 1
url = "postgres://user:pass@localhost:5432/piri_node1"

# Node 2  
url = "postgres://user:pass@localhost:5432/piri_node2"

Each node creates its own schemas within its database. No conflicts, no coordination required.
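As a rough illustration of the per-node layout, the sketch below generates the schema DDL a node might run on startup and shows how a schema-qualified table name keeps queues apart. The schema names come from this PR; quoteIdent, schemaStatements, qualified, and the jobs table are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdent quotes a PostgreSQL identifier, doubling embedded quotes.
func quoteIdent(s string) string {
	return `"` + strings.ReplaceAll(s, `"`, `""`) + `"`
}

// schemaStatements returns one idempotent CREATE SCHEMA statement per
// logical component.
func schemaStatements(schemas []string) []string {
	out := make([]string, 0, len(schemas))
	for _, s := range schemas {
		out = append(out, "CREATE SCHEMA IF NOT EXISTS "+quoteIdent(s))
	}
	return out
}

// qualified returns a schema-qualified table name, so identically named
// tables in different schemas never collide.
func qualified(schema, table string) string {
	return quoteIdent(schema) + "." + quoteIdent(table)
}

func main() {
	schemas := []string{"replicator", "aggregator", "egress_tracker", "scheduler"}
	for _, stmt := range schemaStatements(schemas) {
		fmt.Println(stmt)
	}
	// "jobs" is a hypothetical table name.
	fmt.Println(qualified("replicator", "jobs")) // prints "replicator"."jobs"
}
```

Because every node runs the same statements inside its own database (piri_node1, piri_node2), the schema names can be identical across nodes.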

Testing

All job queue and worker tests now run against both SQLite and PostgreSQL backends using testcontainers-go, following the pattern established for S3-compatible storage testing via MinIO.

Migration Notes

  • SQLite users: No action required. SQLite remains the default.
  • PostgreSQL users: Requires PostgreSQL 13+ for gen_random_uuid() support.
  • Existing data: This PR adds PostgreSQL support; it does not provide SQLite-to-PostgreSQL migration tooling. That's future work if operators need it.

@frrist frrist requested a review from alanshaw as a code owner January 28, 2026 23:28
@frrist frrist linked an issue Jan 28, 2026 that may be closed by this pull request
@frrist frrist self-assigned this Jan 28, 2026
@frrist frrist changed the title from "Support PostgreSQL" to "Add PostgreSQL as Optional Database Backend" Jan 28, 2026
	postgres.WithPassword("test"),
	testcontainers.WithWaitStrategy(
		wait.ForLog("database system is ready to accept connections").
			WithOccurrence(2)),

Member

WithOccurrence(2) seems odd...

Member Author

Claude told me this is due to a quirk of the Postgres container: PostgreSQL logs "database system is ready to accept connections" twice during startup:

  1. After initial database initialization
  2. After PostgreSQL restarts and is truly ready

This tells the container to wait until this log has been seen twice before considering the DB ready. And fwiw this matches my observation of the container:

$ docker run \
    --name postgres \
    -e POSTGRES_USER=myuser \
    -e POSTGRES_PASSWORD=mypassword \
    -e POSTGRES_DB=mydb \
    -p 5432:5432 \
    postgres:16
<omitted>
2026-01-29 19:18:19.577 UTC [48] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2026-01-29 19:18:19.587 UTC [51] LOG:  database system was shut down at 2026-01-29 19:18:19 UTC
**FIRST TIME:** 2026-01-29 19:18:19.593 UTC [48] LOG:  database system is ready to accept connections
 done
server started
CREATE DATABASE


/usr/local/bin/docker-entrypoint.sh: ignoring /docker-entrypoint-initdb.d/*

waiting for server to shut down....2026-01-29 19:18:19.765 UTC [48] LOG:  received fast shutdown request
2026-01-29 19:18:19.768 UTC [48] LOG:  aborting any active transactions
2026-01-29 19:18:19.770 UTC [48] LOG:  background worker "logical replication launcher" (PID 54) exited with exit code 1
2026-01-29 19:18:19.771 UTC [49] LOG:  shutting down
2026-01-29 19:18:19.776 UTC [49] LOG:  checkpoint starting: shutdown immediate
2026-01-29 19:18:19.850 UTC [49] LOG:  checkpoint complete: wrote 926 buffers (5.7%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.033 s, sync=0.022 s, total=0.080 s; sync files=301, longest=0.003 s, average=0.001 s; distance=4273 kB, estimate=4273 kB; lsn=0/191F0D0, redo lsn=0/191F0D0
2026-01-29 19:18:19.865 UTC [48] LOG:  database system is shut down
 done
server stopped

PostgreSQL init process complete; ready for start up.

2026-01-29 19:18:19.903 UTC [1] LOG:  starting PostgreSQL 16.11 (Debian 16.11-1.pgdg13+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 14.2.0-19) 14.2.0, 64-bit
2026-01-29 19:18:19.903 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2026-01-29 19:18:19.903 UTC [1] LOG:  listening on IPv6 address "::", port 5432
2026-01-29 19:18:19.909 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2026-01-29 19:18:19.917 UTC [64] LOG:  database system was shut down at 2026-01-29 19:18:19 UTC
**SECOND TIME**: 2026-01-29 19:18:19.924 UTC [1] LOG:  database system is ready to accept connections

Member

Would you mind just adding a short comment for that?

)

// InsertIgnore returns the appropriate INSERT IGNORE/ON CONFLICT DO NOTHING syntax.
func (d Dialect) InsertIgnore(table, columns, placeholders string) string {
Member

Hmm, isn't this what the ORM is for? I appreciate that's a bigger change. It may be worth switching to GORM's DB methods in the future so we don't have to do this kind of thing.

Member Author

The thought crossed my mind, but the tradeoff didn't seem worth it, at least at this point in time.

Most of this jobqueue code was forked from https://github.com/maragudk/goqite, which uses raw SQL, and refactoring to GORM wasn't worth the effort/risk for this PR imo (though the implementation has since diverged significantly).

Alright if we punt this to the future?

Peeja commented Feb 3, 2026

FWIW, in storacha/guppy#319, I ended up keeping a single schema and translating it between dialects. Actually, there's no schema file anymore, because we were using Goose, so I made the schema a migration for consistency, but same idea. Don't know if that would work or be helpful here.
