Conversation

@hdgarrood (Contributor) commented Dec 3, 2025

Fixes #1611. With these changes, we can migrate an arbitrary number of entities with just 4 queries.

Before submitting your PR, check that you've:

  • Documented new APIs with Haddock markup
  • Added @since declarations to the Haddock
  • Run fourmolu on any changed files (Restyled will do this for you, so
    accept the suggested changes if it makes any)
  • Adhered to the code style (see the .editorconfig and fourmolu.yaml files for details)

After submitting your PR:

  • Update the Changelog.md file with a link to your PR
  • Bump the version number if there isn't an (unreleased) section in the Changelog
  • Check that CI passes (or, if it fails, that the failures are unrelated to your change, like CI timeouts)

import qualified Database.PostgreSQL.Simple.Types as PG

import qualified Blaze.ByteString.Builder.Char8 as BBB
import Control.Arrow
@hdgarrood (Contributor, Author) commented:

This file was getting pretty big, so I wanted to move all of the migrations stuff into a separate one. A lot of the code here is unchanged: in essence, all I've done is extract the querying part so that it happens first and pulls all the data we need at once (with 4 queries), rather than doing N+1 queries per entity.
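
To make the shape of the change concrete, here is a minimal sketch of the "collect once, then diff purely" structure; every name in it is hypothetical rather than taken from this PR:

-- Hypothetical sketch only: read the database once up front, then compute the
-- per-entity alterations purely, so adding entities doesn't add queries.
migrationSketch
    :: IO schemaState                          -- runs the fixed set of queries
    -> (schemaState -> entityDef -> [alter])   -- pure per-entity diff
    -> [entityDef]
    -> IO [alter]
migrationSketch collect diff entityDefs = do
    schemaState <- collect                     -- all querying happens here, once
    pure (concatMap (diff schemaState) entityDefs)  -- no queries inside this loop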

backend <- ask
pure (SqlBackend.connPrepare backend)

-- NB: we do not perform these migrations in main.hs
@hdgarrood (Contributor, Author) commented:

The test suite is kind of cursed in that it leaves the schema behind in the test DB, which means that if there's a bug in the code that determines which migrations need to be applied, you only see test failures if you run the tests twice in a row (starting from a fresh DB).

I think I've made things slightly worse here by doing this... but I also think it's important to directly exercise the migrator like this.
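
Not something this PR does, but for the record: one way the test setup could avoid that statefulness is to reset the schema before exercising the migrator. A sketch using persistent's rawExecute, assuming the test database is disposable:

{-# LANGUAGE OverloadedStrings #-}

-- Sketch only (not in this PR): drop and recreate the public schema so the
-- migrator always starts from an empty database.
import Control.Monad.IO.Class (MonadIO)
import Database.Persist.Sql (SqlPersistT, rawExecute)

resetPublicSchema :: MonadIO m => SqlPersistT m ()
resetPublicSchema = do
    rawExecute "DROP SCHEMA IF EXISTS public CASCADE" []
    rawExecute "CREATE SCHEMA public" []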


let
    expected =
        SchemaState
@hdgarrood (Contributor, Author) commented:

This almost acts like a golden test. If you want to change the expected output, the easiest thing to do is just to rerun the test, eyeball the "but got: ..." output to make sure it makes sense, and then copy it back into the code here.
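
Concretely, the check boils down to something like this; a paraphrase assuming an hspec-style test, not the literal test code:

-- Paraphrase: compare the collected schema state against the hard-coded
-- expected value; on failure, hspec prints the "but got: ..." line mentioned above.
it "collects the expected schema state" $ do
    actual <- collectSchemaState getStmt entityNames
    actual `shouldBe` Right expected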

collectSchemaState
    :: (Text -> IO Statement) -> [EntityNameDB] -> IO (Either Text SchemaState)
collectSchemaState getStmt entityNames = runExceptT $ do
    existence <- getTableExistence getStmt entityNames
@hdgarrood (Contributor, Author) commented:

Each of these four functions performs exactly one query, and these are all of the queries the migrator now needs to perform.
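
Roughly, they compose like this inside collectSchemaState. Only collectSchemaState and getTableExistence appear in the hunk above; the other three helper names and the constructor shape are paraphrased here, not necessarily what the code calls them:

-- Paraphrase: one query per helper, each run once for the whole entity list.
collectSchemaState getStmt entityNames = runExceptT $ do
    existence   <- getTableExistence   getStmt entityNames   -- query 1
    columns     <- getTableColumns     getStmt entityNames   -- query 2 (name paraphrased)
    constraints <- getTableConstraints getStmt entityNames   -- query 3 (name paraphrased)
    sequences   <- getSequenceInfo     getStmt entityNames   -- query 4 (name paraphrased)
    pure (SchemaState existence columns constraints sequences)  -- constructor shape paraphrased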

        (errs, _) -> throwError (T.intercalate "\n" errs)
  where
    getTableExistenceSql =
        "SELECT tablename FROM pg_catalog.pg_tables WHERE schemaname != 'pg_catalog'"
@hdgarrood (Contributor, Author) commented:

These are all basically the same queries as before, but instead of doing `WHERE tablename = ?`, I'm doing `WHERE tablename = ANY (?)` and substituting in the full list of table names.
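
As a standalone illustration of the trick (using postgresql-simple directly rather than persistent's Statement machinery, so this is not the PR's code, but the SQL shape is the same), one round trip answers the existence question for every table at once:

{-# LANGUAGE OverloadedStrings #-}

-- Standalone sketch of batching with "= ANY (?)": one query for any number of
-- table names, instead of one query per table.
import Data.Text (Text)
import Database.PostgreSQL.Simple (Connection, Only (..), query)
import Database.PostgreSQL.Simple.Types (PGArray (..))

tablesThatExist :: Connection -> [Text] -> IO [Text]
tablesThatExist conn tableNames =
    map fromOnly <$> query conn
        "SELECT tablename FROM pg_catalog.pg_tables \
        \WHERE schemaname != 'pg_catalog' AND tablename = ANY (?)"
        (Only (PGArray tableNames))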

        ([], xs) -> pure $ Map.unionsWith Map.union xs
        (errs, _) -> throwError (T.intercalate "\n" errs)
  where
    -- TODO: should this filter by schema?
@hdgarrood (Contributor, Author) commented:

I noticed that the existing query doesn't filter by schema, and I feel like it maybe should? But that's one for later, I think.
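
If that ever gets done, one plausible shape, assuming the lookup goes through information_schema, would be adding a predicate on the current schema. This is a guess for illustration, not the query from the diff:

{-# LANGUAGE OverloadedStrings #-}

-- Hypothetical only: restrict the lookup to the current schema (the first
-- schema on the search_path) in addition to the table-name filter.
import Data.Text (Text)

getColumnsSqlSchemaFiltered :: Text
getColumnsSqlSchemaFiltered =
    "SELECT table_name, column_name FROM information_schema.columns \
    \WHERE table_schema = current_schema() AND table_name = ANY (?)"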

    allDefs
    entity
    (newcols, udspair)
    (map dubiouslyRemoveReferences essColumns, Map.toList essConstraints)
@hdgarrood (Contributor, Author) commented:

This function is basically the same as before except for this one line: in the previous version we were relying on `getColumn` to implicitly remove these references (or rather, to not fetch them in the first place); now we're doing it explicitly here.
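
For readers without the full diff handy, the helper is presumably along these lines; a sketch assuming persistent's Column record with a cReference field, not necessarily the exact definition used here:

-- Sketch: strip the foreign-key reference from a column read back from the
-- database, matching what the old getColumn-based path produced (it never
-- fetched references for existing columns).
import Database.Persist.Sql (Column (..))

dubiouslyRemoveReferences :: Column -> Column
dubiouslyRemoveReferences col = col { cReference = Nothing }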

-- otherwise no-op, `getAlters` will handle dropping this for us.
oldCol

-- | Indicates whether a Postgres Column is safe to drop.
@hdgarrood (Contributor, Author) commented:

Everything from here downwards is exactly the same as it was before

@hdgarrood marked this pull request as ready for review on December 4, 2025, 17:41.
@hdgarrood changed the title from "wip: avoid N+1 in postgresql migrations" to "Avoid N+1 in postgresql migrations" on Dec 4, 2025.
@parsonsmatt (Collaborator) left a comment:

nice!

Development: Successfully merging this pull request may close these issues: Batch migration performance.