[Feature][Connector-V2] Support databend source/sink connector #9331

Merged
merged 18 commits into from
Jun 17, 2025

Conversation

hantmac
Contributor

@hantmac hantmac commented May 18, 2025

Purpose of this pull request

Add the Databend connector. Fixes #9315.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Add UT and e2e tests.

Check list

@hantmac hantmac marked this pull request as draft May 18, 2025 06:31
@nielifeng nielifeng requested a review from Copilot May 19, 2025 03:47
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

Add a new Databend connector module, providing both source and sink implementations for JDBC-based reads and writes to Databend.

  • Introduces DatabendSource and DatabendSink with factories, reader/writer, state, and utility classes
  • Supplies example HOCON configs and updates plugin_config, plugin-mapping.properties, and distribution POM
  • Adds documentation (README.md) and service registration for auto-discovery

Reviewed Changes

Copilot reviewed 25 out of 25 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| databend_source_example.conf | Example HOCON for source usage |
| databend_sink_example.conf | Example HOCON for sink usage |
| META-INF/services/.../Factory | Registers Databend source and sink factories |
| DatabendUtil.java | JDBC connection, type conversion, SQL generation |
| DatabendSourceFactory.java / DatabendSinkFactory.java | Factory rules and plugin identifiers |
| DatabendSourceReader.java / DatabendSinkWriter.java | Runtime reader and writer logic |
| README.md | Connector documentation with usage examples |
| config/plugin_config | Adds connector-databend to plugin config |

@github-actions github-actions bot added CI&CD core SeaTunnel core module labels May 20, 2025
@hantmac hantmac requested a review from Copilot May 22, 2025 01:47
Copilot

This comment was marked as outdated.

@hantmac hantmac force-pushed the feat/databend-connector branch from e81aa26 to 7fdc2d2 Compare May 26, 2025 08:53
@hantmac hantmac changed the title feat: support databend connector feat: support databend source/sink connector May 26, 2025
@hantmac hantmac marked this pull request as ready for review May 26, 2025 12:35
@hantmac hantmac force-pushed the feat/databend-connector branch from 0680416 to c4eda00 Compare May 26, 2025 14:37
@hailin0
Member

hailin0 commented May 29, 2025

Thanks @hantmac

Please update docs

@hantmac
Contributor Author

hantmac commented Jun 3, 2025

> Thanks @hantmac
>
> Please update docs

@hailin0 Thanks! I have added the docs. PTAL.

@hantmac hantmac requested a review from Hisoka-X June 3, 2025 07:07
@Hisoka-X Hisoka-X changed the title feat: support databend source/sink connector [Feature][Connector-V2] Support databend source/sink connector Jun 4, 2025
Member

@Hisoka-X Hisoka-X left a comment

I noticed that Databend reads and writes data over JDBC. Why not add a new dialect to connector-jdbc instead?

@hantmac
Contributor Author

hantmac commented Jun 4, 2025

> I noticed that Databend reads and writes data over JDBC. Why not add a new dialect to connector-jdbc instead?

@Hisoka-X We want to keep the connector extensible: we plan to add streaming upload as a data import method for Databend later. We can also use REPLACE INTO and MERGE INTO to implement CDC in a later PR, which would make the Databend sink much more capable.
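
For readers unfamiliar with the upsert approach mentioned here, the following is a minimal Java sketch of how a sink could render a write as a REPLACE INTO statement. It is not taken from this PR. The class `ReplaceIntoSketch`, its method, and the table and column names are hypothetical, and the statement shape `REPLACE INTO <table> (<cols>) ON (<conflict keys>) VALUES (...)` is assumed from Databend's documented syntax and should be checked against the Databend version in use. Whether `?` placeholders can be bound for this statement through the JDBC driver is also an assumption.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not part of the connector in this PR.
final class ReplaceIntoSketch {

    // Builds a parameterized upsert statement, assuming the form
    // REPLACE INTO <table> (<cols>) ON (<keys>) VALUES (?, ...).
    // Identifier quoting and escaping are deliberately left out of this sketch.
    static String buildReplaceInto(String table, List<String> columns, List<String> conflictKeys) {
        if (columns.isEmpty() || conflictKeys.isEmpty()) {
            throw new IllegalArgumentException("columns and conflictKeys must not be empty");
        }
        if (!columns.containsAll(conflictKeys)) {
            throw new IllegalArgumentException("conflict keys must be among the written columns");
        }
        String cols = String.join(", ", columns);
        String keys = String.join(", ", conflictKeys);
        // One placeholder per written column.
        String placeholders = columns.stream().map(c -> "?").collect(Collectors.joining(", "));
        return "REPLACE INTO " + table + " (" + cols + ") ON (" + keys + ") VALUES (" + placeholders + ")";
    }

    public static void main(String[] args) {
        System.out.println(buildReplaceInto("orders", List.of("id", "amount"), List.of("id")));
    }
}
```

A dedicated connector can own statement generation like this, whereas a JDBC dialect would have to fit it into the shared connector-jdbc write path.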

@Hisoka-X
Member

Hisoka-X commented Jun 4, 2025

Please follow the guide to enable CI on your fork repository. https://github.com/apache/seatunnel/pull/9331/checks?check_run_id=43433559569

@hantmac hantmac force-pushed the feat/databend-connector branch from e7dd58f to 773aa96 Compare June 4, 2025 03:47
@hantmac
Contributor Author

hantmac commented Jun 6, 2025

Hi @Hisoka-X, CI reports errors in some other modules, and it seems that the tests for the Databend connector module are not running?

@Hisoka-X
Member

Hisoka-X commented Jun 6, 2025

Hi @hantmac. The Databend e2e job has already started in https://github.com/hantmac/seatunnel/actions/runs/15437221086/job/43446415225. Please fix the e2e test error.

Some unit tests also failed.
image

@Hisoka-X
Member

Hisoka-X commented Jun 6, 2025

Other errors may be caused by instability and can be resolved by retrying.

@hantmac
Contributor Author

hantmac commented Jun 6, 2025

Hi @Hisoka-X, sorry to bother you. I ran the Databend e2e tests and unit tests locally, and they all pass.
image

image

Why do I always get a condition timeout error in CI? https://github.com/hantmac/seatunnel/actions/runs/15482805214/job/43591670655#step:5:31657

@Hisoka-X
Member

Hisoka-X commented Jun 6, 2025

We have a thread leak check for connectors, which looks for threads that the connector has not released. It seems that some Databend threads are slow to release, which causes this error. That's okay; we can adjust the timeout to 120 seconds, just like you did at the beginning. https://github.com/apache/seatunnel/pull/9331/files#diff-828bb95593582c2297e60ef9a6ced4733f251eef77389235d8c3b506dd0beb47R343
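
To make the timeout discussion concrete, here is a rough Java sketch of how a thread leak check with a deadline might work. It is not SeaTunnel's implementation: the class `ThreadLeakWaitSketch`, the method name, the polling interval, and the thread-name prefix are all hypothetical. The point is only that a thread which finishes late still passes if the timeout is long enough, which is why raising the limit resolves a "condition timeout" error for slow-to-release threads.

```java
import java.util.function.Predicate;

// Hypothetical illustration of a thread leak check; not SeaTunnel's implementation.
final class ThreadLeakWaitSketch {

    // Polls until no live thread matches isConnectorThread, or the timeout elapses.
    // Returns true if all matching threads finished in time, false otherwise.
    static boolean awaitNoLeakedThreads(Predicate<Thread> isConnectorThread, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            boolean leaked = Thread.getAllStackTraces().keySet().stream()
                    .anyMatch(t -> t.isAlive() && isConnectorThread.test(t));
            if (!leaked) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(50); // polling interval chosen arbitrarily for the sketch
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // A worker that releases late, standing in for a slow-to-close connector thread.
        Thread slow = new Thread(() -> {
            try {
                Thread.sleep(300);
            } catch (InterruptedException ignored) {
                // exit quietly
            }
        }, "demo-connector-worker");
        slow.start();
        boolean clean = awaitNoLeakedThreads(t -> t.getName().startsWith("demo-connector-"), 5_000);
        System.out.println("threads released in time: " + clean);
    }
}
```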

@hantmac
Contributor Author

hantmac commented Jun 6, 2025

> adjust the timeout

Got it. Thanks!

@hantmac
Contributor Author

hantmac commented Jun 8, 2025

Hi @Hisoka-X, I checked the CI test logs, and it seems that the Databend tests may have passed.
image
What do I need to do next?

import java.util.stream.Stream;

@Slf4j
public class TestDatabendCase extends TestSuiteBase implements TestResource {
Member

Could you merge TestDatabendCase into DatabendIT? We use the class name to decide whether a class is a unit test: a class whose name starts with Test is treated as a unit test, so the unit test job is failing now. https://github.com/hantmac/seatunnel/actions/runs/15484736824/job/43596998748
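
As an illustration of name-based test selection, the Java sketch below classifies class names using the default include patterns of Maven Surefire (`Test*`, `*Test`, `*Tests`, `*TestCase`) and Maven Failsafe (`IT*`, `*IT`, `*ITCase`). This is an assumption for the sake of the example: SeaTunnel's actual build configuration may use different patterns, and the class `TestNameSketch` is hypothetical. Under these patterns, TestDatabendCase is picked up as a unit test, while DatabendIT is picked up as an integration test.

```java
// Hypothetical illustration of name-based test selection. The patterns follow the
// default Maven Surefire/Failsafe includes, which may differ from SeaTunnel's build.
final class TestNameSketch {

    enum Kind { UNIT_TEST, INTEGRATION_TEST, NOT_A_TEST }

    static Kind classify(String simpleClassName) {
        // Integration patterns are checked first in this sketch (a simplification).
        if (simpleClassName.startsWith("IT")
                || simpleClassName.endsWith("IT")
                || simpleClassName.endsWith("ITCase")) {
            return Kind.INTEGRATION_TEST;
        }
        if (simpleClassName.startsWith("Test")
                || simpleClassName.endsWith("Test")
                || simpleClassName.endsWith("Tests")
                || simpleClassName.endsWith("TestCase")) {
            return Kind.UNIT_TEST;
        }
        return Kind.NOT_A_TEST;
    }

    public static void main(String[] args) {
        for (String name : new String[] {"TestDatabendCase", "DatabendIT", "DatabendUtil"}) {
            System.out.println(name + " -> " + classify(name));
        }
    }
}
```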

Contributor Author

Thanks! I have merged TestDatabendCase into DatabendIT, and the e2e tests pass locally.

@hantmac hantmac requested review from Hisoka-X and Copilot June 10, 2025 05:35
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This pull request adds a new Databend connector for both source and sink, addressing issue #9315. The changes include the implementation of connector classes, configuration options, exception handling, schema management, catalog factory integration, documentation updates, and necessary CI/pom modifications.

Reviewed Changes

Copilot reviewed 50 out of 50 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| connector-databend/src/main/java/org/apache/seatunnel/connectors/seatunnel/databend/sink/DatabendSink.java | Implements the Databend sink with type conversion and writer creation. |
| connector-databend/src/main/java/org/apache/seatunnel/connectors/seatunnel/databend/schema/SchemaChangeManager.java | Provides schema evolution handling for Databend tables. |
| connector-databend/src/main/java/org/apache/seatunnel/connectors/seatunnel/databend/config/* | Adds configuration and options classes for source and sink. |
| Other files (exceptions, catalog, pom.xml, plugin-mapping.properties, docs, workflows) | Update ancillary configurations, documentation and CI definitions to support the new connector. |

@Hisoka-X
Member

Waiting for the test cases to pass.

@hantmac
Contributor Author

hantmac commented Jun 11, 2025

> Waiting for the test cases to pass.

@Hisoka-X Are any test cases still failing?

@Hisoka-X
Member

> > Waiting for the test cases to pass.
>
> @Hisoka-X Are any test cases still failing?

All test cases need to pass before we can merge. If other test cases fail, please retrigger the failed CI jobs.

@hantmac
Contributor Author

hantmac commented Jun 17, 2025

> > > Waiting for the test cases to pass.
> >
> > @Hisoka-X Are any test cases still failing?
>
> All test cases need to pass before we can merge. If other test cases fail, please retrigger the failed CI jobs.

Hi @Hisoka-X, I finally got all the tests to pass. What do I need to do next?

Member

@Hisoka-X Hisoka-X left a comment

Thanks @hantmac

@hailin0 hailin0 merged commit 2f96f2e into apache:dev Jun 17, 2025
6 checks passed
chncaesar pushed a commit to chncaesar/seatunnel that referenced this pull request Jun 30, 2025
dybyte pushed a commit to dybyte/seatunnel that referenced this pull request Jul 23, 2025
Development

Successfully merging this pull request may close these issues.

[Feature][Databend] Support Databend connector source/sink
3 participants