Missing tests - recovery tests #1766

Description

@steve-chavez

Currently, recovery tests are done manually; it'd be great to have them as automated tests.

These are the main scenarios:

(the connection recovery worker is referred to as just "worker")

1. postgrest started with a pg connection, then pg becomes unavailable

  • worker starts only after a request to postgrest; postgrest should respond with {"details":"no connection to the server\n","message":"Database client error. Retrying the connection."}
    • if db-channel-enabled=true, worker starts immediately, not necessary to prove this in tests though.
  • pg becomes available, postgrest succeeds reconnecting, reloads the schema cache and responds with 200
  • if db-load-guc-config=true, it should also re-read the in-db config.
    • test with ALTER ROLE postgrest_test_authenticator SET pgrst.db_schemas = 'public'; then try GET /public_consumers, which should give a 404 if the in-db config isn't re-read.
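
A minimal sketch of how scenario 1 could be automated, assuming the test harness supplies three hypothetical callables: `get(path)` returning `(status, json_body)` from postgrest, and `stop_pg()` / `start_pg()` to control the database. None of these names exist in postgrest; they stand in for whatever the real harness provides. Reconnection is asynchronous, so the sketch polls instead of asserting on the first response after pg returns.

```python
import time
from typing import Any, Callable, Optional, Tuple

Response = Tuple[int, Any]  # (HTTP status, decoded JSON body)

# Error message quoted in the scenario description.
CLIENT_ERROR = "Database client error. Retrying the connection."


def wait_for_status(get: Callable[[str], Response], path: str, expected: int,
                    timeout: float = 10.0, interval: float = 0.1) -> Response:
    """Poll get(path) until it returns the expected status or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        status, body = get(path)
        if status == expected:
            return status, body
        if time.monotonic() >= deadline:
            raise TimeoutError(f"last status was {status}, wanted {expected}")
        time.sleep(interval)


def check_recovery(get: Callable[[str], Response],
                   stop_pg: Callable[[], None],
                   start_pg: Callable[[], None],
                   guc_path: Optional[str] = None,
                   timeout: float = 10.0) -> None:
    """Scenario 1: pg goes away while postgrest is running, then comes back."""
    stop_pg()
    # The first request after the outage is what starts the worker.
    _, body = get("/")
    assert body["message"] == CLIENT_ERROR, body

    start_pg()
    # Reconnecting and reloading the schema cache take some time, so poll.
    wait_for_status(get, "/", 200, timeout=timeout)

    if guc_path is not None:
        # Only meaningful with db-load-guc-config=true: a 404 here would mean
        # the in-db config was not re-read after recovery.
        status, _ = get(guc_path)
        assert status != 404, "in-db config was not re-read"
```

With db-load-guc-config=true, the harness would run the ALTER ROLE statement before `start_pg()` and pass `guc_path="/public_consumers"`.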

2. postgrest started while pg is unavailable

  • worker starts immediately, postgrest should respond with a 503 {"message":"Database connection lost. Retrying the connection."}
    • Bug: if db-channel-enabled=true, postgrest doesn't reply and curl gives Connection refused. This must be because of the mvarConnectionStatus MVar; it doesn't happen in scenario 1 though.
  • pg becomes available, postgrest succeeds reconnecting, reloads the schema cache and responds with 200
  • if db-load-guc-config=true, it should also re-read the in-db config.
    • Same test as in scenario 1

3. SIGUSR1 - NOTIFY reload schema

  • when these are done, running requests that use pg connections must not be interrupted
  • when postgrest has a pg connection, both SIGUSR1 and NOTIFY will reload the schema cache
    • if db-load-guc-config=true, it should also re-read the in-db config.
    • ensure SIGUSR1 starts the worker when db-channel-enabled=true (it locked up before and the worker was not starting, so this must be ensured)
  • when postgrest loses the connection, and db-channel-enabled=false (only SIGUSR1)
    • SIGUSR1 starts the worker; only one can run at a time. This is ensured by refIsWorkerOn and can be confirmed by sending several SIGUSR1 signals and noting just one "Attempting to reconnect to the database in 1 seconds..." message. If refIsWorkerOn is removed, there will be several "Attempting to reconnect to the database in 1 seconds..." messages.
      • Not sure how to test this, maybe count the number of threads?
    • pg becomes available, postgrest succeeds reconnecting, reloads the schema cache and responds with 200
  • when postgrest loses the connection, and db-channel-enabled=true
    • ensure the listener recovers, e.g. doing a NOTIFY 'reload cache/load config' should work after recovery.
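
One possible way to test the "only one worker" guarantee without counting threads is to count the first-attempt log message instead, as the manual check above does. The sketch below assumes the test harness can capture postgrest's log output as lines of text and knows its process id; the helper names are hypothetical. Standard Unix signals are not queued, so signals sent back to back may be merged by the kernel, which is why the sketch pauses between sends.

```python
import os
import signal
import time
from typing import Iterable

# Log message quoted in the scenario description.
FIRST_ATTEMPT = "Attempting to reconnect to the database in 1 seconds..."


def send_sigusr1(pid: int, times: int, pause: float = 0.05) -> None:
    """Send SIGUSR1 to the process several times (Unix only).

    The pause gives the process a chance to handle each signal separately.
    """
    for _ in range(times):
        os.kill(pid, signal.SIGUSR1)
        time.sleep(pause)


def count_first_attempts(log_lines: Iterable[str]) -> int:
    """Count log lines that announce a first reconnection attempt."""
    return sum(1 for line in log_lines if FIRST_ATTEMPT in line)


def only_one_worker_started(log_lines: Iterable[str]) -> bool:
    """True when exactly one worker announced its first attempt."""
    return count_first_attempts(log_lines) == 1
```

A test would stop pg, call `send_sigusr1(pid, 5)`, collect the log output for a short window, and assert `only_one_worker_started(lines)`. The window should be kept short, on the assumption that a long-running worker could log the same message again later.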

    Labels

    ci (Related to CI setup), tests (Related to tests)