Skip to content

Allow for strongly-typed query properties #100

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

fredvollmer
Copy link

@fredvollmer fredvollmer commented Oct 30, 2024

Hopefully there wasn't a simpler way of resolving this.

Note: this PR includes a breaking change.

Currently, it isn't possible to render a non-string property into the query sent to kSSQL--the JSON serializer will wrap all string values in quotes (as expected). Thus, when setting a Boolean property such as ksql.query.pull.table.scan.enabled, the query string includes the parameter with a stringified value (ksql.query.pull.table.scan.enabled = "true"). This causes kSQL to throw an error, since it is expecting a Boolean value for this parameter.

To work around this, we altered the Properties dictionary to hold System.Text.Json.JsonValue's instead of strings, and added Get<T> and Set<T> methods to allow for setting properties of any type. When the properties are serialized, the Serializer will respect the underlying type: strings will be quoted, Booleans will be unquoted, etc.

As part of this change, we also marked the ProcessingGuarantee and AutoOffsetReset enums to be serialized using the built-in snake case JSON enum serializer. This meant the helper methods for manually marshaling between the enum and their corresponding string representation were no longer needed.

Before:

{
  "sql": "SELECT * FROM MY_TABLE;",
  "properties": {
    "ksql.query.pull.table.scan.enabled": "true"
  }
}

After:

{
  "sql": "SELECT * FROM MY_TABLE;",
  "properties": {
    "ksql.query.pull.table.scan.enabled": true
  }
}

AI-generated summary of changes

This pull request includes several changes to improve the handling of QueryStreamParameters and QueryParameters in the test files. The most important changes include replacing dictionary-style access with the Set and Get methods for better type safety and readability, and adding a new test for the JsonSnakeCaseStringEnumConverterAttribute.

Improvements to Parameter Handling:

Enhancements to Test Assertions:

New Test Addition:

Context Options Handling:

Options Builder Enhancements:

Copy link
Owner

@tomasfabian tomasfabian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @fredvollmer,
thank you for your PR so far. Could you please evaluate my suggestions on the breaking changes before I proceed with a full review?


public interface IQueryOptions
{
Dictionary<string, string> Properties { get; }
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it help avoid a breaking change to keep the dictionary but change the value type to an object?

public interface IQueryOptions
{
  Dictionary<string, object> Properties { get; }
}

Have you encountered any obstacles with this approach?

Copy link
Author

@fredvollmer fredvollmer Nov 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it would still be breaking, as the dictionary is a public field. I've encountered a lot of code which directly references the dictionary and expects the value to be a string. Because of this, I don't believe an object would be any better than JsonValue as far as breaking changes are concerned. Am I missing something?

Copy link
Owner

@tomasfabian tomasfabian Nov 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For string values, the API remains the same:

["auto.offset.reset"] = "latest"
opt["auto.offset.reset"] = "latest"

Existing code would still compile as before:

opt[KSqlDbConfigs.KsqlQueryPullTableScanEnabled] = "true";

For booleans and similar types, only the assigned value changes if someone consciously decides to fix it:

opt[KSqlDbConfigs.KsqlQueryPullTableScanEnabled] = true;

I couldn't reproduce the runtime error, so I suggested adding an integration test. It's probably me who is missing something. Thank you again!

Copy link
Owner

@tomasfabian tomasfabian Nov 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to understand how this issue could have gone unnoticed for so long. You mentioned that you

encountered a lot of code which directly references the dictionary

Was this code also assigning boolean values as strings?
I'm trying to reproduce the issue using KSQLDB_VERSION=0.29.0.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had the same thought as far as how it could have gone unnoticed. I did add an integration test. Using the latest stable release of this library, we observed the following error on kSQL 0.28.2 when setting ksql.query.pull.table.scan.enabled to true:

java.lang.ClassCastException: class java.lang.String cannot be cast to class java.lang.Boolean (java.lang.String and java.lang.Boolean are in module java.base of loader 'bootstrap')
	at io.confluent.ksql.rest.server.resources.streaming.PullQueryConfigPlannerOptions.getTableScansEnabled(PullQueryConfigPlannerOptions.java:41)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For string values, the API remains the same:

["auto.offset.reset"] = "latest"
opt["auto.offset.reset"] = "latest"

Existing code would still compile as before:

opt[KSqlDbConfigs.KsqlQueryPullTableScanEnabled] = "true";

For booleans and similar types, only the assigned value changes if someone consciously decides to fix it:

opt[KSqlDbConfigs.KsqlQueryPullTableScanEnabled] = true;

I couldn't reproduce the runtime error, so I suggested adding an integration test. It's probably me who is missing something. Thank you again!

Ahh, you're right! I think this is the way to go.

One small thing--this will maintain the same API for writes, but what about reads? This is obviously a much smaller use case, but is it not still a breaking change if opt[KSqlDbConfigs.KsqlQueryPullTableScanEnabled] returns an object rather than a string? I still think object is preferrable over JsonValue if possible, just not sure if it gets us entirely out of the breaking change issue.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I downgraded the ksqldb server to 0.28.2 and reran the integration test you provided, using a stringified boolean, and it passed successfully. I didn’t encounter the ClassCastException, even in the ksqldb server logs, which I would have expected based on the information you shared.

Nevertheless, it’s great that you’re addressing the issue, @fredvollmer. If you don’t mind, I’d also like to get a third opinion on the API changes. @mrt181, do you have any thoughts on this PR?

Copy link
Owner

@tomasfabian tomasfabian Nov 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, sorry—I didn’t notice the new message from you before I published my previous answer.

That’s a great point: receiving values from the dictionary will still result in a breaking change. Please go ahead and increase the package’s major version with a note in the changelog.md, or I can handle it later if needed.
Users will need to downcast to a concrete type, which I think is an acceptable trade-off, as the value will more often be set than read.
I hope the documentation remains intact as well. Or even better, examples like the following could be changed to:

options[KSqlDbConfigs.KsqlQueryPullTableScanEnabled] = true;

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @fredvollmer, I have a pending release with a bumped major version. How do we want to proceed with your PR? Do you have time to update the public API as per our discussion, or should I take over the task?

/// </summary>
/// <param name="key">The key of the property.</param>
/// <returns>The value of the property.</returns>
string this[string key] { get; set; }
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To prevent a breaking change, would it be better to keep the indexer but change its return type to an object?

object this[string key] { get; set; }

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested changing the string value to an object type, and both test cases below passed. I'd like to proceed with these less disruptive changes unless you identify any critical downsides.

[Test]
public void SetupPullQuery_BooleanOptionWasConfigured()
{
  //Arrange
  var setupParameters = ClassUnderTest.UseKSqlDb(TestParameters.KsqlDbUrl);

  //Act
  var parameters = setupParameters.SetupPullQuery(opt =>
  {
    opt[KSqlDbConfigs.KsqlQueryPullTableScanEnabled] = true;
  }).Options.PullQueryParameters;


  var json = JsonSerializer.Serialize(parameters);

  //Assert
  json.Should().BeEquivalentTo("{\"Sql\":null,\"Properties\":{\"ksql.query.pull.table.scan.enabled\":true}}");
}

[Test]
public void SetupPullQuery_ProcessingGuaranteeEnumOptionWasConfigured()
{
  //Arrange
  var setupParameters = ClassUnderTest.UseKSqlDb(TestParameters.KsqlDbUrl);

  //Act
  var parameters = setupParameters.SetupPushQuery(opt =>
  {
    opt[KSqlDbConfigs.ProcessingGuarantee] = ProcessingGuarantee.AtLeastOnce;
  }).Options.QueryStreamParameters;


  var json = JsonSerializer.Serialize(parameters);

  //Assert
  json.Should().BeEquivalentTo("{\"Sql\":null,\"Properties\":{\"auto.offset.reset\":\"earliest\",\"processing.guarantee\":\"at_least_once\"}}");
}

The JsonSnakeCaseStringEnumConverter has been applied in this case as well.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer less churn on the public API if possible and would prefer the change to object

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants