pedrorfdez

Feature or Bugfix

  • Feature

Detail

  • Implemented full support for Iceberg MERGE INTO operations in _write_iceberg.py/to_iceberg
  • Added new parameters for:
    • merge_on_clause: Custom ON expression for MERGE INTO ... USING ... ON [custom_expression], which allows the <, <=, > and >= operators. Until now, only column equality was supported. The docstrings warn about the risk of matching more than one row in the target table.
    • merge_condition: Added new accepted value conditional_merge
    • merge_conditional_clauses: List of dictionaries specifying custom conditional clauses for the MERGE INTO statement.
      Each dictionary should have:
      - 'when': One of ['MATCHED', 'NOT MATCHED', 'NOT MATCHED BY SOURCE']
      - 'action': One of ['UPDATE', 'DELETE', 'INSERT']
      - 'condition': (optional) Additional SQL condition for the clause
      - 'columns': (optional) List of columns to update or insert
      Used only when merge_condition is 'conditional_merge'.
  • Added argument validation for the mutually exclusive and required parameters: merge_cols, merge_on_clause, merge_match_nulls, merge_condition and merge_conditional_clauses.
  • Added and updated unit tests to cover new validation logic and merge scenarios.
  • Updated docstrings for new parameters and behaviors.
  • Maintained backward compatibility with existing calls.
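
To make the shape of merge_conditional_clauses concrete, here is a minimal sketch of how such a list could be turned into the WHEN clauses of a MERGE INTO statement. It is not the code from this PR: the helper name render_merge_clauses, the all_columns argument, and the target/source aliases are assumptions made for illustration only.

```python
from typing import Any, Dict, List

# Accepted values, taken from the parameter description above.
VALID_WHEN = {"MATCHED", "NOT MATCHED", "NOT MATCHED BY SOURCE"}
VALID_ACTION = {"UPDATE", "DELETE", "INSERT"}


def render_merge_clauses(clauses: List[Dict[str, Any]], all_columns: List[str]) -> str:
    """Hypothetical helper: render the WHEN clauses of a MERGE INTO statement.

    Assumes the source relation is aliased as ``source``.
    """
    rendered = []
    for clause in clauses:
        when = clause["when"]
        action = clause["action"]
        if when not in VALID_WHEN:
            raise ValueError(f"Invalid 'when' value: {when!r}")
        if action not in VALID_ACTION:
            raise ValueError(f"Invalid 'action' value: {action!r}")

        # Fall back to every column when 'columns' is omitted.
        columns = clause.get("columns") or all_columns

        head = f"WHEN {when}"
        if clause.get("condition"):
            head += f" AND {clause['condition']}"

        if action == "UPDATE":
            assignments = ", ".join(f'"{c}" = source."{c}"' for c in columns)
            body = f"UPDATE SET {assignments}"
        elif action == "INSERT":
            names = ", ".join(f'"{c}"' for c in columns)
            values = ", ".join(f'source."{c}"' for c in columns)
            body = f"INSERT ({names}) VALUES ({values})"
        else:
            body = "DELETE"

        rendered.append(f"{head} THEN {body}")
    return "\n".join(rendered)


example_clauses = [
    {
        "when": "MATCHED",
        "action": "UPDATE",
        "condition": "source.ts > target.ts",
        "columns": ["val", "ts"],
    },
    {"when": "NOT MATCHED", "action": "INSERT"},
]
print(render_merge_clauses(example_clauses, ["id", "val", "ts"]))
```

The sketch only checks that 'when' and 'action' hold accepted values. Which combinations of the two the query engine accepts is left to the actual implementation.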

Relates

This is the first draft of the implementation. Feel free to suggest changes to the approach; I am open to suggestions.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@pedrorfdez changed the title from "Add full-featured Iceberg MERGE INTO conditional merges support and argument validation" to "feat: Add full-featured Iceberg MERGE INTO conditional merges support and argument validation" on Sep 10, 2025
@jaidisido
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubCodeBuild8756EF16-4rfo0GHQ0u9a
  • Commit ID: eb25d2a
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@jaidisido
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubCodeBuild8756EF16-4rfo0GHQ0u9a
  • Commit ID: 262bfc6
  • Result: FAILED
  • Build Logs (available for 30 days)


@pedrorfdez
Author

Currently failing 1 Athena test:

FAILED tests/unit/test_athena_iceberg.py::test_to_iceberg_conditional_merge_happy_path - AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="id") are different

Attribute "dtype" are different
[left]: int64
[right]: Int64

I plan to fix this in the next commit.
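
For reference, the failure above can be reproduced outside the test suite. The snippet below is a minimal sketch with made-up data (the frames expected and actual are assumptions, not the fixtures used in test_athena_iceberg.py). It shows that pandas treats NumPy int64 and nullable Int64 as different dtypes, and that casting one side is one possible way to align them.

```python
import pandas as pd

# Plain NumPy-backed integer column.
expected = pd.DataFrame({"id": pd.Series([1, 2, 3], dtype="int64")})
# Nullable extension integer column (capital "I").
actual = pd.DataFrame({"id": pd.array([1, 2, 3], dtype="Int64")})

# With the default check_dtype=True, the comparison fails on the dtype
# attribute even though the values are identical.
try:
    pd.testing.assert_frame_equal(expected, actual)
    mismatch_detected = False
except AssertionError:
    mismatch_detected = True

# One possible fix: cast the expected frame to the nullable dtype first.
expected_nullable = expected.astype({"id": "Int64"})
pd.testing.assert_frame_equal(expected_nullable, actual)
```

Whether the cast belongs on the expected frame or on the frame read back from Athena depends on which dtype the test is meant to guarantee.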

@jaidisido
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubCodeBuild8756EF16-4rfo0GHQ0u9a
  • Commit ID: 3f7ca50
  • Result: FAILED
  • Build Logs (available for 30 days)


@jaidisido
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubCodeBuild8756EF16-4rfo0GHQ0u9a
  • Commit ID: 8b694c5
  • Result: FAILED
  • Build Logs (available for 30 days)


@jaidisido
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubCodeBuild8756EF16-4rfo0GHQ0u9a
  • Commit ID: f33af54
  • Result: FAILED
  • Build Logs (available for 30 days)


@jaidisido
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubDistributedCodeBuild6-jWcl5DLmvupS
  • Commit ID: eb25d2a
  • Result: FAILED
  • Build Logs (available for 30 days)


@jaidisido
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubDistributedCodeBuild6-jWcl5DLmvupS
  • Commit ID: 262bfc6
  • Result: FAILED
  • Build Logs (available for 30 days)


@jaidisido
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubDistributedCodeBuild6-jWcl5DLmvupS
  • Commit ID: 3f7ca50
  • Result: FAILED
  • Build Logs (available for 30 days)


@pedrorfdez marked this pull request as ready for review on September 11, 2025, 17:18
@jaidisido
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubCodeBuild8756EF16-4rfo0GHQ0u9a
  • Commit ID: 5c65f85
  • Result: FAILED
  • Build Logs (available for 30 days)


@jaidisido
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubDistributedCodeBuild6-jWcl5DLmvupS
  • Commit ID: 5c65f85
  • Result: FAILED
  • Build Logs (available for 30 days)

