fix: (low code)run state migrations for concurrent streams #316

Conversation

darynaishchenko
Contributor

@darynaishchenko darynaishchenko commented Feb 5, 2025

What

Related to migrating the Klaviyo campaigns stream to low-code: airbytehq/airbyte#51551.
The concurrent framework gets stream_state from the state manager rather than from the DeclarativeStream, which is where state migrations are called, so migrations were not run for concurrent streams.

How

Added state_migration.migrate(stream_state) to the concurrent declarative source. Added unit test.
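
To illustrate the change described above, here is a minimal sketch, not the CDK code. `SplitByTypeMigration` and `migrate_state` are hypothetical stand-ins written for this example; only the `should_migrate`/`migrate` method names and the shape of the migrated state are taken from this PR's tests.

```python
from typing import Any, Dict, List, Mapping


class SplitByTypeMigration:
    """Hypothetical migration modeled on the test class used in this PR."""

    def should_migrate(self, stream_state: Mapping[str, Any]) -> bool:
        return True

    def migrate(self, stream_state: Mapping[str, Any]) -> Mapping[str, Any]:
        updated_at = stream_state["updated_at"]
        return {
            "states": [
                {"partition": {"type": "type_1"}, "cursor": {"updated_at": updated_at}},
                {"partition": {"type": "type_2"}, "cursor": {"updated_at": updated_at}},
            ]
        }


def migrate_state(migrations: List[Any], stream_state: Mapping[str, Any]) -> Dict[str, Any]:
    """Run each applicable migration, in order, on a copy of the given state."""
    state: Dict[str, Any] = dict(stream_state)
    for migration in migrations:
        if migration.should_migrate(state):
            state = dict(migration.migrate(state))
    return state


# State in the legacy shape, as it might come back from a state manager
# without ever passing through the stream's migrations.
raw_state = {"updated_at": "2024-08-21"}
migrated = migrate_state([SplitByTypeMigration()], raw_state)
```

The sketch copies the input before migrating, so the state object handed in by the caller is left untouched.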

Summary by CodeRabbit

  • New Features

    • Enhanced stream state management with automated migration to ensure consistent and up-to-date processing.
    • Introduced a new class for custom state migration, providing structured handling of stream states.
    • Added support for applying state migrations during cursor creation for improved flexibility.
  • Tests

    • Added new tests to validate the state migration process, ensuring reliable data handling and improved system stability.
    • Introduced additional tests for state migrations during cursor creation from both datetime-based and per-partition streams.

@darynaishchenko darynaishchenko self-assigned this Feb 5, 2025
@darynaishchenko darynaishchenko changed the title from "fix(low code)run state migrations for concurrent streams" to "fix: (low code)run state migrations for concurrent streams" Feb 5, 2025
@darynaishchenko
Contributor Author

darynaishchenko commented Feb 5, 2025

/autofix

Auto-Fix Job Info

This job attempts to auto-fix any linting or formatting issues. If any fixes are made,
those changes will be automatically committed and pushed back to the PR.

Note: This job can only be run by maintainers. On PRs from forks, this command requires
that the PR author has enabled the Allow edits from maintainers option.

PR auto-fix job started... Check job output.

🟦 Job completed successfully (no changes).

Contributor

coderabbitai bot commented Feb 5, 2025

📝 Walkthrough

Walkthrough

The changes introduce a new static method _migrate_state in the ConcurrentDeclarativeSource class, which manages state migrations for declarative streams by checking if the current stream_state should be migrated and applying necessary updates. A new class CustomStateMigration is also created to facilitate state migration with specific attributes and methods. Corresponding test functions are added to validate the behavior of the ConcurrentDeclarativeSource and cursor creation methods when state migrations are applied.

Changes

| File Path | Change Summary |
| --- | --- |
| airbyte_cdk/sources/.../concurrent_declarative_source.py | Adds a static method _migrate_state for handling state migrations and modifies _group_streams to call this method for datetime incremental and per-partition streams. |
| unit_tests/sources/.../custom_state_migration.py | Introduces the CustomStateMigration class with attributes for declarative_stream and config, and methods should_migrate (always true) and migrate for managing state migration. |
| unit_tests/sources/.../test_concurrent_declarative_source.py | Adds a test function test_concurrent_declarative_source_runs_state_migrations_provided_in_manifest to validate that the ConcurrentDeclarativeSource correctly applies state migrations during processing. |
| airbyte_cdk/sources/.../model_to_component_factory.py | Introduces a new static method apply_stream_state_migrations for applying state migrations and updates method signatures in ModelToComponentFactory to include optional stream_state_migrations parameters. |
| unit_tests/sources/.../test_model_to_component_factory.py | Introduces two test functions to validate state migrations within the context of concurrent cursors created from datetime-based and per-partition cursors. |

Possibly related PRs

  • airbytehq/airbyte-python-cdk#135: The changes in the main PR, specifically the addition of the _migrate_state method and modifications to the _group_streams method in the ConcurrentDeclarativeSource class, are related to the adjustments made to the same class in the retrieved PR, which also modifies the _group_streams method and enhances stream handling.
  • airbytehq/airbyte-python-cdk#228: The changes in the main PR, specifically the addition of the _migrate_state method and modifications to the _group_streams method in the ConcurrentDeclarativeSource class, are directly related to the changes in the retrieved PR, which also modifies the _group_streams method and the state management within the same class.
  • airbytehq/airbyte-python-cdk#267: The changes in the main PR, specifically the addition of the _migrate_state method in the ConcurrentDeclarativeSource class, are related to the state migration logic introduced in the retrieved PR's _migrate_child_state_to_parent_state method in the SubstreamPartitionRouter class, as both involve migrating stream states.

Suggested labels

enhancement

Suggested reviewers

  • maxi297
  • brianjlai
  • tolik0

Wdyt? Do these updates look good to you? Let me know if there's anything you'd like to adjust!


Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (2)
unit_tests/sources/declarative/custom_state_migration.py (2)

13-25: Consider adding docstrings to improve code documentation.

The class and its methods would benefit from docstrings explaining their purpose and behavior. For example:

 class CustomStateMigration(StateMigration):
+    """Handles state migration for declarative streams with partitioned states.
+    
+    This class evaluates cursor fields against the provided configuration and
+    creates a new migrated state with specific partition types.
+    """
     declarative_stream: DeclarativeStream
     config: Config

26-28: Consider making should_migrate more selective.

The method always returns True, which means migration will be attempted for all states. Should we add some validation to determine when migration is actually needed? wdyt?
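
One possible shape for a more selective check is sketched below. This is an illustration only: `SelectiveStateMigration` is a hypothetical class, and the rule it uses (migrate only non-empty states that do not already have a "states" key) is an assumption about what "already migrated" means, based on the per-partition state shape asserted in this PR's tests.

```python
from typing import Any, Mapping


class SelectiveStateMigration:
    """Hypothetical migration that skips empty or already-migrated states."""

    def should_migrate(self, stream_state: Mapping[str, Any]) -> bool:
        # Assumption: a "states" key marks the per-partition (migrated) layout.
        return bool(stream_state) and "states" not in stream_state

    def migrate(self, stream_state: Mapping[str, Any]) -> Mapping[str, Any]:
        # Wrap the legacy flat cursor in a single-partition layout.
        return {"states": [{"partition": {}, "cursor": dict(stream_state)}]}


migration = SelectiveStateMigration()
legacy_state = {"updated_at": "2024-08-21"}
migrated_state = migration.migrate(legacy_state)
```

With a check like this, running the migration against a state that was already migrated becomes a no-op rather than a second transformation.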

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ca68c5c and 327f5fc.

📒 Files selected for processing (3)
  • airbyte_cdk/sources/declarative/concurrent_declarative_source.py (2 hunks)
  • unit_tests/sources/declarative/custom_state_migration.py (1 hunks)
  • unit_tests/sources/declarative/test_concurrent_declarative_source.py (1 hunks)
🧰 Additional context used
🪛 GitHub Actions: Linters
airbyte_cdk/sources/declarative/concurrent_declarative_source.py

[error] 229-229: Incompatible types in assignment (expression has type "Mapping[str, Any]", variable has type "MutableMapping[str, Any]")


[error] 339-339: Incompatible types in assignment (expression has type "Mapping[str, Any]", variable has type "MutableMapping[str, Any]")

⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: Validate PR title
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-the-guardian-api' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Analyze (python)
🔇 Additional comments (2)
unit_tests/sources/declarative/custom_state_migration.py (1)

29-47: Verify the migrated state structure.

The migration creates a fixed state with two partitions of type "type_1" and "type_2". Let's ensure this structure aligns with the expected state format.
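
A structural check along these lines could be sketched as follows. `is_per_partition_state` is a hypothetical helper, and the expected layout (a "states" list whose entries each carry a "partition" and a "cursor" mapping) is an assumption inferred from the assertions in this PR's tests, not a statement of what the CDK requires.

```python
from collections.abc import Mapping
from typing import Any


def is_per_partition_state(state: Mapping[str, Any]) -> bool:
    """Return True if the state looks like a list of partition/cursor entries."""
    states = state.get("states")
    if not isinstance(states, list):
        return False
    return all(
        isinstance(entry, Mapping)
        and isinstance(entry.get("partition"), Mapping)
        and isinstance(entry.get("cursor"), Mapping)
        for entry in states
    )


migrated = {
    "states": [
        {"cursor": {"updated_at": "2024-08-21"}, "partition": {"type": "type_1"}},
        {"cursor": {"updated_at": "2024-08-21"}, "partition": {"type": "type_2"}},
    ]
}
legacy = {"updated_at": "2024-08-21"}
```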

unit_tests/sources/declarative/test_concurrent_declarative_source.py (1)

1234-1461: LGTM! Comprehensive test coverage for state migration.

The test thoroughly validates:

  • State migration configuration
  • HTTP request mocking
  • State transformation verification

@github-actions github-actions bot added the bug (Something isn't working) and security labels Feb 5, 2025
Contributor

@maxi297 maxi297 left a comment

I think there is something missing in my explanation during our sync. This PR updates the state that is used on the retriever (which seems to cover this hack) but not the one on the cursor. I'm not even sure the hack needs to be maintained at that point. @brianjlai what do you think?

@brianjlai
Contributor

brianjlai commented Feb 5, 2025

> I think there is something missing in my explanation during our sync. This PR updates the state that is used on the retriever (which seems to cover this hack) but not the one on the cursor. I'm not even sure the hack needs to be maintained at that point. @brianjlai what do you think?

@maxi297 Yeah that sounds correct. I think the missing piece here is that we need to also run a state migration after we read in state from ModelToComponentFactory.create_concurrent_cursor_from_datetime_based_cursor().

That will also require that we pass the migrations in to the model factory. Or an idea that I kind of like: when we parse the manifest into the DatetimeBasedCursorModel, we also append the state migrations to the pydantic model, so we have them available to run from create_concurrent_cursor_from_datetime_based_cursor().

edit: actually we don't want to modify the schema to change the model so I'm being dumb. maybe we just pass it in as a parameter to create_concurrent_cursor_from_datetime_based_cursor(). i think that can work
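
The parameter-passing idea can be sketched roughly as below. This is a simplified illustration, not the factory's real signature: `create_cursor` stands in for create_concurrent_cursor_from_datetime_based_cursor(), it returns a plain dict instead of a cursor object, and `RenameCursorFieldMigration` is a hypothetical migration. The point is only that the migrations travel as an optional argument while the component definition stays untouched.

```python
from typing import Any, Dict, List, Mapping, Optional


class RenameCursorFieldMigration:
    """Hypothetical migration: renames a legacy cursor key."""

    def should_migrate(self, stream_state: Mapping[str, Any]) -> bool:
        return "last_updated" in stream_state

    def migrate(self, stream_state: Mapping[str, Any]) -> Mapping[str, Any]:
        migrated = dict(stream_state)
        migrated["updated_at"] = migrated.pop("last_updated")
        return migrated


def create_cursor(
    component_definition: Mapping[str, Any],
    stream_state: Mapping[str, Any],
    stream_state_migrations: Optional[List[Any]] = None,
) -> Dict[str, Any]:
    """Stand-in factory: migrate the state first, then build the 'cursor'."""
    state: Dict[str, Any] = dict(stream_state)
    for migration in stream_state_migrations or []:
        if migration.should_migrate(state):
            state = dict(migration.migrate(state))
    return {"cursor_field": component_definition["cursor_field"], "state": state}


definition = {"type": "DatetimeBasedCursor", "cursor_field": "updated_at"}
cursor = create_cursor(
    definition,
    {"last_updated": "2024-08-21"},
    stream_state_migrations=[RenameCursorFieldMigration()],
)
```

Callers that have no migrations simply omit the argument and get the state back unchanged.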

@darynaishchenko
Contributor Author

darynaishchenko commented Feb 5, 2025

/autofix

✅ Changes applied successfully.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
airbyte_cdk/sources/declarative/concurrent_declarative_source.py (1)

6-6: Consider organizing imports alphabetically?

The pipeline indicates that the import block is unsorted. Would you like me to help organize them alphabetically, wdyt?

-from typing import Any, Generic, Iterator, List, Mapping, Optional, Tuple, MutableMapping
+from typing import (
+    Any,
+    Generic,
+    Iterator,
+    List,
+    Mapping,
+    MutableMapping,
+    Optional,
+    Tuple,
+)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 08198ac and e61916c.

📒 Files selected for processing (2)
  • airbyte_cdk/sources/declarative/concurrent_declarative_source.py (4 hunks)
  • unit_tests/sources/declarative/test_concurrent_declarative_source.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • unit_tests/sources/declarative/test_concurrent_declarative_source.py
🧰 Additional context used
🪛 GitHub Actions: Linters
airbyte_cdk/sources/declarative/concurrent_declarative_source.py

[warning] 5-5: Import block is un-sorted or un-formatted. Organize imports.

🔇 Additional comments (2)
airbyte_cdk/sources/declarative/concurrent_declarative_source.py (2)

528-537: LGTM! Clean implementation of state migration.

The static method effectively handles state migrations and properly converts the immutable mapping to a mutable one using dict(). Nice job implementing the feedback from previous reviews!
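
The `dict()` conversion relates to the linter error reported earlier in this thread ("expression has type Mapping[str, Any], variable has type MutableMapping[str, Any]"). A minimal sketch of the typing issue, with a hypothetical `migrate` function standing in for a state migration:

```python
from typing import Any, Mapping, MutableMapping


def migrate(stream_state: Mapping[str, Any]) -> Mapping[str, Any]:
    """Hypothetical migration that returns a read-only Mapping type."""
    return {**stream_state, "migrated": True}


stream_state: MutableMapping[str, Any] = {"updated_at": "2024-08-21"}

# A type checker rejects the direct assignment, because Mapping does not
# promise item assignment:
#     stream_state = migrate(stream_state)

# Copying into a dict yields a MutableMapping and satisfies the checker.
stream_state = dict(migrate(stream_state))
stream_state["extra"] = 1  # mutation is allowed again
```

Note that `dict()` makes a shallow copy, so nested values are still shared with the mapping returned by the migration.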


227-227: LGTM! Well-placed integration points.

The state migration is correctly applied at both points where stream state is retrieved, ensuring proper state handling for both datetime incremental streams and streams with concurrent partition processing.

Also applies to: 335-335

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🔭 Outside diff range comments (1)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)

1197-1262: Should we apply stream_state_migrations to stream_state here as well?

In create_concurrent_cursor_from_perpartition_cursor, we pass stream_state_migrations to the cursor factory, but it seems we are not applying the migrations to stream_state before using it. Unlike create_concurrent_cursor_from_datetime_based_cursor, we're not applying the migrations directly in this method. Should we apply the migrations to stream_state here to ensure consistent state handling across methods? Wdyt?

🧹 Nitpick comments (6)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (2)

946-946: Should we specify a more precise type for stream_state_migrations?

Currently, stream_state_migrations is typed as Optional[List[Any]]. Specifying a more precise type, such as Optional[List[StateMigration]], could improve type checking and readability. Wdyt?


957-961: Can we refactor the state migration logic into a helper method?

Noticing that the logic for applying stream_state_migrations to stream_state is (or should be) similar in both create_concurrent_cursor_from_datetime_based_cursor and create_concurrent_cursor_from_perpartition_cursor. Would it make sense to extract this into a helper method to avoid duplication and ensure consistency? Wdyt?

Also applies to: 1197-1199

unit_tests/sources/declarative/test_concurrent_declarative_source.py (3)

1234-1383: Consider adding more test cases for state migration scenarios.

The test verifies basic state migration functionality, but we could make it more comprehensive. What do you think about adding test cases for:

  1. Multiple state migrations
  2. Failed state migrations
  3. Empty state migrations

Also, would you like me to help generate these additional test cases?
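
As a rough sketch of what the three scenarios listed above could look like, under stated assumptions: `apply_migrations` is a hypothetical helper mirroring the loop in this PR, the three migration classes are invented for illustration, and the expectation that a failing migration propagates its exception is an assumption rather than documented CDK behavior.

```python
from typing import Any, Dict, List, Mapping, Optional


def apply_migrations(
    migrations: Optional[List[Any]], stream_state: Mapping[str, Any]
) -> Dict[str, Any]:
    state: Dict[str, Any] = dict(stream_state)
    for migration in migrations or []:
        if migration.should_migrate(state):
            state = dict(migration.migrate(state))
    return state


class RenameCursorField:
    def should_migrate(self, state: Mapping[str, Any]) -> bool:
        return "last_updated" in state

    def migrate(self, state: Mapping[str, Any]) -> Mapping[str, Any]:
        return {"updated_at": state["last_updated"]}


class WrapInPartitions:
    def should_migrate(self, state: Mapping[str, Any]) -> bool:
        return "updated_at" in state

    def migrate(self, state: Mapping[str, Any]) -> Mapping[str, Any]:
        return {"states": [{"partition": {}, "cursor": {"updated_at": state["updated_at"]}}]}


class FailingMigration:
    def should_migrate(self, state: Mapping[str, Any]) -> bool:
        return True

    def migrate(self, state: Mapping[str, Any]) -> Mapping[str, Any]:
        raise ValueError("cannot migrate")


def test_multiple_migrations_run_in_order() -> None:
    # The second migration only applies to the output of the first.
    result = apply_migrations(
        [RenameCursorField(), WrapInPartitions()], {"last_updated": "2024-08-21"}
    )
    assert result == {"states": [{"partition": {}, "cursor": {"updated_at": "2024-08-21"}}]}


def test_empty_migrations_leave_state_unchanged() -> None:
    assert apply_migrations([], {"updated_at": "2024-08-21"}) == {"updated_at": "2024-08-21"}
    assert apply_migrations(None, {"updated_at": "2024-08-21"}) == {"updated_at": "2024-08-21"}


def test_failing_migration_propagates_error() -> None:
    try:
        apply_migrations([FailingMigration()], {"updated_at": "2024-08-21"})
    except ValueError:
        return
    raise AssertionError("expected the migration error to propagate")
```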


1362-1371: Consider using a more descriptive variable name for clarity.

The variable state_blob could be more descriptive. What do you think about renaming it to initial_state_blob or unmigrated_state_blob to better reflect its purpose? wdyt?


1376-1382: Consider adding descriptive error messages for assertions.

The assertions could benefit from more descriptive error messages. What do you think about:

-    assert (
-        concurrent_streams[0].cursor.state.get("state") != state_blob.__dict__
-    ), "State was not migrated."
-    assert concurrent_streams[0].cursor.state.get("states") == [
-        {"cursor": {"updated_at": "2024-08-21"}, "partition": {"type": "type_1"}},
-        {"cursor": {"updated_at": "2024-08-21"}, "partition": {"type": "type_2"}},
-    ], "State was migrated, but actual state don't match expected"
+    assert (
+        concurrent_streams[0].cursor.state.get("state") != state_blob.__dict__
+    ), "Expected state to be migrated but it remained unchanged"
+    assert concurrent_streams[0].cursor.state.get("states") == [
+        {"cursor": {"updated_at": "2024-08-21"}, "partition": {"type": "type_1"}},
+        {"cursor": {"updated_at": "2024-08-21"}, "partition": {"type": "type_2"}},
+    ], "Expected migrated state to have two partitions with updated_at='2024-08-21' but got different state"
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py (1)

3284-3340: The test looks good but could be more comprehensive!

The test effectively validates the basic state migration functionality. However, we could make it more robust by adding a few more test cases. What do you think about:

  1. Testing when should_migrate returns False?
  2. Testing with multiple state migrations to ensure they're applied in order?
  3. Adding assertions to verify the input state is not modified?

Here's a suggested enhancement to make the test more comprehensive:

 def test_create_concurrent_cursor_from_datetime_based_cursor_runs_state_migrations():
     class DummyStateMigration:
         def should_migrate(self, stream_state: Mapping[str, Any]) -> bool:
             return True

         def migrate(self, stream_state: Mapping[str, Any]) -> Mapping[str, Any]:
             updated_at = stream_state["updated_at"]
             return {
                 "states": [
                     {
                         "partition": {"type": "type_1"},
                         "cursor": {"updated_at": updated_at},
                     },
                     {
                         "partition": {"type": "type_2"},
                         "cursor": {"updated_at": updated_at},
                     },
                 ]
             }

+    class NoOpStateMigration:
+        def should_migrate(self, stream_state: Mapping[str, Any]) -> bool:
+            return False
+
+        def migrate(self, stream_state: Mapping[str, Any]) -> Mapping[str, Any]:
+            return stream_state
+
     stream_name = "test"
     config = {
         "start_time": "2024-08-01T00:00:00.000000Z",
         "end_time": "2024-09-01T00:00:00.000000Z",
     }
     stream_state = {"updated_at": "2025-01-01T00:00:00.000000Z"}
+    original_state = stream_state.copy()
     connector_builder_factory = ModelToComponentFactory(emit_connector_builder_messages=True)
     connector_state_manager = ConnectorStateManager()
     cursor_component_definition = {
         "type": "DatetimeBasedCursor",
         "cursor_field": "updated_at",
         "datetime_format": "%Y-%m-%dT%H:%M:%S.%fZ",
         "start_datetime": "{{ config['start_time'] }}",
         "end_datetime": "{{ config['end_time'] }}",
         "partition_field_start": "custom_start",
         "partition_field_end": "custom_end",
         "step": "P10D",
         "cursor_granularity": "PT0.000001S",
         "lookback_window": "P3D",
     }
     concurrent_cursor = (
         connector_builder_factory.create_concurrent_cursor_from_datetime_based_cursor(
             state_manager=connector_state_manager,
             model_type=DatetimeBasedCursorModel,
             component_definition=cursor_component_definition,
             stream_name=stream_name,
             stream_namespace=None,
             config=config,
             stream_state=stream_state,
-            stream_state_migrations=[DummyStateMigration()],
+            stream_state_migrations=[NoOpStateMigration(), DummyStateMigration()],
         )
     )
     assert concurrent_cursor.state["states"] == [
         {"cursor": {"updated_at": stream_state["updated_at"]}, "partition": {"type": "type_1"}},
         {"cursor": {"updated_at": stream_state["updated_at"]}, "partition": {"type": "type_2"}},
     ]
+    assert stream_state == original_state, "Input state should not be modified"
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4452468 and e1b1626.

📒 Files selected for processing (3)
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (6 hunks)
  • unit_tests/sources/declarative/parsers/test_model_to_component_factory.py (1 hunks)
  • unit_tests/sources/declarative/test_concurrent_declarative_source.py (1 hunks)
🔇 Additional comments (1)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)

957-961: Applying state migrations correctly

The logic for applying state migrations to stream_state looks good. This ensures that any necessary migrations are applied before processing. Nice work!

@darynaishchenko
Contributor Author

@maxi297 @brianjlai
Updated the PR with the changes to create_concurrent_cursor_from_datetime_based_cursor. Could you please take a look?

Contributor

@maxi297 maxi297 left a comment

One comment on the scope of the migration for per-partition. I think we can make it even more useful.

Contributor

@maxi297 maxi297 left a comment

Just small nits but I'm good with this one. Thanks Daryna!

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)

937-946: Would you consider adding type hints and docstring? wdyt?

The new static method could benefit from:

  1. Type hints for the Any type to be more specific (e.g., StateMigration)
  2. A docstring explaining the method's purpose, parameters, and return value
    @staticmethod
-   def apply_stream_state_migrations(
-       stream_state_migrations: List[Any] | None, stream_state: MutableMapping[str, Any]
-   ) -> MutableMapping[str, Any]:
+   def apply_stream_state_migrations(
+       stream_state_migrations: List[StateMigration] | None,
+       stream_state: MutableMapping[str, Any]
+   ) -> MutableMapping[str, Any]:
+       """Apply a list of state migrations to the given stream state.
+
+       Args:
+           stream_state_migrations: List of state migrations to apply
+           stream_state: The current stream state to migrate
+
+       Returns:
+           The migrated stream state
+       """
        if stream_state_migrations:
            for state_migration in stream_state_migrations:
                if state_migration.should_migrate(stream_state):
                    stream_state = dict(state_migration.migrate(stream_state))
        return stream_state
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1aae4b7 and b5c98c4.

📒 Files selected for processing (1)
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (6 hunks)
🔇 Additional comments (3)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (3)

948-969: LGTM! The state migration integration looks good.

The addition of the stream_state_migrations parameter and its application to the stream state before processing is well-placed and follows the existing pattern of optional parameters.


1194-1257: Could you verify if passing migrations twice is intentional? wdyt?

The migrations are applied in two places:

  1. Passed to the cursor factory (line 1253)
  2. Applied directly to the stream state (line 1256)

Is this double application necessary, or should we only apply the migrations once? If it's intentional, consider adding a comment explaining why both are needed.
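
Whether the merged code actually migrates the same state twice is not settled in this thread. The sketch below only illustrates why the question matters: a migration whose should_migrate always returns True (as in this PR's test class) is not safe to run on its own output. `AlwaysMigrate` is a hypothetical class modeled on that test migration.

```python
from typing import Any, Mapping


class AlwaysMigrate:
    """Hypothetical migration with an unconditional should_migrate."""

    def should_migrate(self, stream_state: Mapping[str, Any]) -> bool:
        return True

    def migrate(self, stream_state: Mapping[str, Any]) -> Mapping[str, Any]:
        updated_at = stream_state["updated_at"]  # only present in the legacy shape
        return {"states": [{"partition": {"type": "type_1"}, "cursor": {"updated_at": updated_at}}]}


migration = AlwaysMigrate()
once = migration.migrate({"updated_at": "2024-08-21"})

# A second pass receives the already-migrated shape, which has no
# top-level "updated_at" key, so this particular migration fails.
try:
    migration.migrate(once)
    second_pass_failed = False
except KeyError:
    second_pass_failed = True
```

Applying migrations in exactly one place, or giving should_migrate a guard against already-migrated state, would avoid this failure mode.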


1765-1765: LGTM! The state migrations are correctly passed to the cursor creation.

The addition of stream_state_migrations parameter is consistent with the changes in other methods.

@darynaishchenko darynaishchenko merged commit 74631d8 into main Feb 10, 2025
23 checks passed
@darynaishchenko darynaishchenko deleted the daryna/low-code/run-state-migrations-for-concurrent-streams branch February 10, 2025 11:26