Skip to content

Conversation

@amotl
Copy link
Member

@amotl amotl commented Nov 20, 2025

About

Adding a missing item from the backlog: AlterTableRecreateStatements is needed when Fivetran propagates changes to primary key columns, which CrateDB can't handle with ALTER TABLE operations, but needs to re-create the tables instead.

Details

The Fivetran SDK tester now runs to completion using the vanilla input.json file.

Review

This PR includes unresolved conversations with CodeRabbit (FYI) that mostly refer to tickets about subsequent iterations to improve relevant spots. Please collapse/resolve them progressively as you process them. Thanks!

The patch yielded a few backlog items, mostly coming from conversations with CodeRabbit: I've collapsed relevant threads to reduce noise, but please consider them when you find the same code smells. We will tackle them with subsequent iterations.

Backlog

References

@coderabbitai
Copy link

coderabbitai bot commented Nov 20, 2025

Warning

Rate limit exceeded

@amotl has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 11 minutes and 44 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between df2894b and 175050e.

📒 Files selected for processing (3)
  • docs/handbook.md (1 hunks)
  • src/cratedb_fivetran_destination/engine.py (2 hunks)
  • tests/conftest.py (1 hunks)

Walkthrough

Adds a recreate-via-temporary-table alter flow (AlterTableRecreateStatements and TableAddress), refactors table creation/reflection into helpers, updates type/field mappings and test fixtures, adjusts test expectations and startup cleanup, and exposes a WriteHistoryBatch stub.

Changes

Cohort / File(s) Summary
Engine / Migration logic
src/cratedb_fivetran_destination/engine.py
Added AlterTableRecreateStatements and imported TableAddress. New class holds address_effective, address_temporary, columns_old, columns_new and emits a SqlBag sequence to set read-only, copy mapped columns into a temp table, unset read-only, swap/rename tables, and drop the temp table.
Main server & workflow
src/cratedb_fivetran_destination/main.py
Factored table creation into _create_table and _table_info_from_fivetran; use fresh sa.MetaData() for reflection; detect primary-key changes and create *_alter_tmp then use AlterTableRecreateStatements (fallback to inplace alter); execute generated SQL conditionally; added WriteHistoryBatch stub; engine created with echo=False.
Model / mappings
src/cratedb_fivetran_destination/model.py
Added TableAddress data container (duplicated definition present); extended FieldMap with _fivetran_start/_end/_active__fivetran_*; changed TypeMap mappings: NAIVE_DATEsa.TIMESTAMP() and DECIMALsa.DECIMAL(6, 3).
Test data / canonical fixtures
tests/data/fivetran_canonical/input_fivetran.json
Expanded schemas and ops: many new numeric/text/time/binary/json columns, enabled history_mode for transaction, added composite_table with composite PK, and richer upsert/truncate/update/delete sequences.
Tests & fixtures behavior
tests/test_adapter.py, tests/test_integration.py, tests/conftest.py
Tests updated to use substring log assertions and expect successful alter outcomes; removed some drop cleanup and __fivetran_deleted; conftest now resets read-only (best-effort) and drops testdrive.foo/testdrive.foo_alter_tmp during setup (errors swallowed).
Docs / changelog / dev notes
CHANGES.md, DEVELOP.md, docs/handbook.md
Documented AlterTableRecreateStatements and caveats about temp-table recreate behavior, added DEVELOP.md notes on testing, and handbook caveats describing write-blocking and data-copy risks.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Client
    participant Server as main.py
    participant Engine as engine.py
    participant DB as CrateDB

    Client->>Server: Request AlterTable
    Server->>Server: Reflect table (fresh MetaData)
    alt Primary key changed
        Server->>Server: _create_table(temp: name_alter_tmp)
        Server->>Engine: Build AlterTableRecreateStatements(address_effective, address_temporary, columns_old, columns_new)
        Engine->>Engine: to_sql() -> SqlBag (set read-only → INSERT mapped cols into temp → unset read-only → swap/rename → DROP temp)
        Server->>DB: Execute SqlBag steps
        DB-->>Server: Success
        Server-->>Client: AlterTable succeeded (recreated)
    else Primary key unchanged
        Server->>Engine: Build/execute AlterTableInplaceStatements
        Server->>DB: Execute ALTER statements
        DB-->>Server: Success
        Server-->>Client: AlterTable succeeded (in-place)
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Areas to focus:
    • src/cratedb_fivetran_destination/engine.py — verify SQL generation: identifier quoting, correctness and tightening of old→new column mapping, handling of missing/new columns and data types.
    • src/cratedb_fivetran_destination/main.py — temp table lifecycle, conditional execution, error handling, and reflection with fresh MetaData.
    • src/cratedb_fivetran_destination/model.py — duplicate TableAddress definitions and TypeMap/FieldMap changes impacting existing migrations/tests.
    • tests/conftest.py and updated tests — ensure setup cleanup and relaxed assertions don't mask regressions or flakiness.

Possibly related issues

Poem

🐇 I nibbled schema edges and dug a tidy way,

built a cozy temp-table where old rows hopped to stay,
with a gentle swap and whisk, the new layout takes its place,
no crumbs, no muddle left—just carrots and good grace,
hooray for tidy migrations and a neat, hopped-up space.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 69.23% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: implementing AlterTableRecreateStatements for handling primary key column changes, which is the central feature of this pull request.
Description check ✅ Passed The description clearly relates to the changeset, explaining the purpose of AlterTableRecreateStatements for handling primary key changes that CrateDB cannot process via ALTER TABLE, and noting that the Fivetran SDK tester now completes successfully.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@amotl amotl marked this pull request as ready for review November 20, 2025 08:24
coderabbitai[bot]

This comment was marked as resolved.

coderabbitai[bot]

This comment was marked as resolved.

Copy link
Member

@seut seut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it really a good idea to automate the support of PK changes by recreating the table?
This may end up in data loss, when writes are continue to happen on the source table while the INSERT INTO into the target table is already ongoing/done.
There should be at least a big warning somewhere.

@amotl
Copy link
Member Author

amotl commented Nov 20, 2025

Hi @seut. Thanks for the review.

Is it really a good idea to automate the support of PK changes by recreating the table?

The Fivetran SDK tester provides the trajectory of the API surface to implement within our adapter, so we don't really have any options not implementing it. There will be other even more exciting features ahead, for example GH-86 and GH-89. We are looking at typical (advanced) data warehouse operations here, even if we don't like them.

This may end up in data loss, when writes are continue to happen on the source table while the INSERT INTO into the target table is already ongoing/done. There should be at least a big warning somewhere.

I hear you, let's do it.

@amotl amotl requested a review from seut November 20, 2025 09:34
@seut
Copy link
Member

seut commented Nov 20, 2025

Hi @seut. Thanks for the review.

Is it really a good idea to automate the support of PK changes by recreating the table?

The Fivetran SDK tester provides the trajectory of the API surface to implement within our adapter, so we don't really have any options not implementing it. There will be other even more exciting features ahead, for example GH-86 and GH-89. We are looking at typical (advanced) data warehouse operations here, even if we don't like them.

But we should not implement such things to make the API/testers happy but then it may result in data loss. Noone will be aware of this and scream later on.

This may end up in data loss, when writes are continue to happen on the source table while the INSERT INTO into the target table is already ongoing/done. There should be at least a big warning somewhere.

I hear you, let's do it.

Didn't see any changes in this regard, do I miss some?

Copy link
Member

@seut seut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One solution to avoid any data loss by this automation is to put the source table into read-only at the beginning of the operations.
ALTER TABLE ... SET ("blocks.write"=true);

@amotl
Copy link
Member Author

amotl commented Nov 20, 2025

I hear you, let's add corresponding warnings.

Didn't see any changes in this regard, do I miss some?

No, I haven't added anything yet, but will also use your "blocks.write"=true on the next iteration. Thanks.

coderabbitai[bot]

This comment was marked as spam.

coderabbitai[bot]

This comment was marked as spam.

Copy link
Member

@seut seut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some more minor suggestions.

coderabbitai[bot]

This comment was marked as spam.

@amotl
Copy link
Member Author

amotl commented Nov 28, 2025

Added some more minor suggestions.

Those are very valuable suggestions, thanks a stack. I've wrapped them into 7da3ca7 and 175050e.

@amotl amotl requested a review from seut November 28, 2025 11:13
coderabbitai[bot]

This comment was marked as resolved.

@amotl amotl merged commit 7bc3da1 into main Nov 28, 2025
12 checks passed
@amotl amotl deleted the make-it-work branch November 28, 2025 11:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants