ZorinAnton
Contributor

@ZorinAnton ZorinAnton commented Jul 1, 2025

DDL:

For the SQL statements named in the title (create table as, create view), the following relations are created:

  • NamedWrite for create table as
  • NamedDdl for create view

All other types of SQL statements are handled as before.
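
The mapping above can be sketched roughly as follows. This is only an illustration: `DdlKind`, `DdlRelationMapping`, and `relationFor` are hypothetical names, and the strings stand in for the Substrait relation classes that the PR actually builds. Only the relation names NamedWrite and NamedDdl come from the description.

```java
import java.util.Optional;

// Hypothetical stand-in for the kinds of statements the converter sees.
enum DdlKind { CREATE_TABLE_AS, CREATE_VIEW, OTHER }

final class DdlRelationMapping {
  // Returns the name of the relation assigned to each DDL statement;
  // an empty result means the statement is handled as before.
  static Optional<String> relationFor(DdlKind kind) {
    switch (kind) {
      case CREATE_TABLE_AS:
        return Optional.of("NamedWrite");
      case CREATE_VIEW:
        return Optional.of("NamedDdl");
      default:
        return Optional.empty();
    }
  }
}
```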

DML:

The class SubstraitRelVisitor now supports the following operations for the TableModify node:

  • insert
  • update
  • delete
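
As a rough sketch of what dispatching on the modify operation could look like: `ModifyOp` and `TableModifyDispatch` are hypothetical names that mirror the operations listed above, and rejecting any other operation (for example MERGE) is an assumption, since the description does not say how the PR treats it.

```java
// Hypothetical mirror of the operations a table-modify node can carry.
enum ModifyOp { INSERT, UPDATE, DELETE, MERGE }

final class TableModifyDispatch {
  // Returns a label for each supported operation and fails loudly for
  // anything outside the three operations listed in the description.
  static String describe(ModifyOp op) {
    switch (op) {
      case INSERT:
        return "insert";
      case UPDATE:
        return "update";
      case DELETE:
        return "delete";
      default:
        // Assumption: unsupported operations are rejected, not ignored.
        throw new UnsupportedOperationException("Unsupported table modification: " + op);
    }
  }
}
```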

@ZorinAnton ZorinAnton force-pushed the zor-sql-to-ddl branch 3 times, most recently from fe86d02 to 9ce46e3 July 3, 2025 12:31
@ZorinAnton ZorinAnton changed the title from "Add support of ddl statements create table as and create view to SqlToSubstrait" to "Add support of ddl and dml sql statements to SqlToSubstrait" Jul 3, 2025
@ZorinAnton ZorinAnton marked this pull request as ready for review July 3, 2025 12:54
Member

@vbarua vbarua left a comment

This PR motivated me to continue some ongoing work for #362, specifically #430.

Effectively, the plan is to break SqlToSubstrait down into a conversion of SQL into Calcite, and then Calcite into Substrait. The DdlRelBuilder you've introduced goes against this work because it couples the SQL to the Substrait translation. It also seems somewhat duplicative, because I think we should be able to use the Calcite machinery to convert DDL SqlNodes into RelNodes, and then all we would need would be your SubstraitRelVisitor updates.
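
The two-stage split described here can be sketched as a composition of two independent functions. This is an illustration only: `TwoStageConverter` and its type parameters are hypothetical and do not reflect the API proposed in #430. In the real code the stages would produce Calcite RelNodes and Substrait plans.

```java
import java.util.function.Function;

// S = SQL input, R = intermediate relational form, P = final plan.
final class TwoStageConverter<S, R, P> {
  private final Function<S, R> sqlToRel;
  private final Function<R, P> relToPlan;

  TwoStageConverter(Function<S, R> sqlToRel, Function<R, P> relToPlan) {
    this.sqlToRel = sqlToRel;
    this.relToPlan = relToPlan;
  }

  // Runs the first stage, then feeds its result to the second stage.
  // Neither stage knows about the other, so each can be replaced alone.
  P convert(S sql) {
    return sqlToRel.andThen(relToPlan).apply(sql);
  }
}
```

Keeping the stages separate means a caller that already has the intermediate form can skip the first stage entirely.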

var parsedList = parser.parseStmtList();
SqlToRelConverter converter = createSqlToRelConverter(validator, catalogReader);
// IMPORTANT: parsedList gets filtered in the call below
List<io.substrait.plan.Plan.Root> ddlRelRoots = ddlSqlToRootNodes(parsedList, converter);

Do we need to process DDL statements separately? Would it be possible to just parse all statements, and then just use the conversion code in SubstraitRelVisitor?

@ZorinAnton
Contributor Author

I think it's a good idea to break down SqlToSubstrait into Sql->Calcite and Calcite->Substrait, as it will give better flexibility.
However, DDL statements will probably require special processing. Calcite can parse them, but it doesn't translate them to relational algebra, which is why I created DdlRelBuilder, which translates SqlNode directly to Substrait classes. SqlValidator also doesn't support DDL statements, and I wasn't sure if it makes sense to add such support.
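
If DDL statements do need their own path, one option is to partition the parsed statements up front instead of filtering the list as a side effect. The sketch below is an assumption about how that could look, not the PR's code: `StatementSplitter`, `Split`, and the predicate are hypothetical, and in practice the elements would be Calcite SqlNode instances.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

final class StatementSplitter {
  // Holds the two groups of statements; both views are read-only.
  static final class Split<T> {
    final List<T> ddl;
    final List<T> other;

    Split(List<T> ddl, List<T> other) {
      this.ddl = Collections.unmodifiableList(ddl);
      this.other = Collections.unmodifiableList(other);
    }
  }

  // Partitions statements by the predicate without modifying the input
  // list; relative order within each group is preserved.
  static <T> Split<T> split(List<T> statements, Predicate<T> isDdl) {
    List<T> ddl = new ArrayList<>();
    List<T> other = new ArrayList<>();
    for (T statement : statements) {
      (isDdl.test(statement) ? ddl : other).add(statement);
    }
    return new Split<>(ddl, other);
  }
}
```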

@vbarua
Member

vbarua commented Jul 7, 2025

DDL statements will probably require special processing. Calcite can parse them, but it doesn't translate them to relational algebra

Ah, TIL. Poking around your PR, the DML statements INSERT, UPDATE and DELETE get turned into LogicalTableModify relations, but the DDL statements don't have Calcite equivalents. I guess that's because Calcite expects/wants tables and views to be available in the catalog, so it doesn't add them.

I want to think more about the DDL API, and honestly kind of want to land it after #430.

That being said, would you be open to splitting the DML and DDL processing into separate PRs? The DML work is pretty straightforward and should be easier to review on its own.

@ZorinAnton
Contributor Author

ZorinAnton commented Jul 8, 2025

That being said, would you be open to splitting the DML and DDL processing into separate PRs? The DML work is pretty straightforward and should be easier to review on its own.

sure, will do.

#431
#432
