-
Notifications
You must be signed in to change notification settings - Fork 162
feat(csharp/src/Drivers/Databricks): Add Activity-based logging to DatabricksStatement #3617
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| protected override Schema GetSchemaFromMetadata(TGetResultSetMetadataResp metadata) | ||
| { | ||
| // Log schema parsing decision | ||
| Activity.Current?.SetTag("statement.schema.has_arrow_schema", metadata.__isset.arrowSchema); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this output logs correctly? Why not using this.TraceActivity?
@birschick-bq What should be the recommended way for trace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was concerned that Activity.Current might not be set to the activity in your current block - i.e., it might have been set for a child call. But in thinking about it, I don't think that would be the case, as long as the child call disposed the Activity in their scope.
However, I think the best usage pattern for Activity.Current is where you don't start a new activity and you don't pass an existing activity as a parameter. That is, a situation in which you might be called from a method that may or may not have a current activity started.
So I think Activity.Current is used appropriately in the case shown in line 93, above.
Use this.TraceActivity when you want a new activity (line) with its own events and tags. Typically something "major" or structural in your code. If the new activity (this.TraceActivity) is a child call, the ParentId will be set to indicate it is a child activity of the parent activity.
Summary
Adds comprehensive Activity-based logging to
DatabricksStatement.csfor improved observability and debugging of Databricks ADBC operations.Changes
Methods with Logging Added
SetStatementProperties: Logs configuration (Arrow native types, CloudFetch settings, async mode)GetSchemaFromMetadata: Logs schema parsing decisions (Arrow vs Thrift, column count)GetCatalogsAsync: Logs catalog queries with feature flags and row countsGetSchemasAsync: Logs schema queries with catalog filteringGetTablesAsync: Logs table queries with SPARK catalog handlingGetColumnsAsync: Logs column queries with search criteriaGetPrimaryKeysAsync: Logs primary key queriesGetCrossReferenceAsync: Logs foreign key queriesGetColumnsExtendedAsync: Logs DESC TABLE EXTENDED queriesSetOption: Logs configuration changesWhat Gets Logged
Tags:
enable_multiple_catalog_support,pk_fk_enabled)db.response.returned_rows)Events:
statement.<operation>.start/statement.<operation>.completecalling_base_implementation,returning_empty_result,fallback_to_base)using_arrow_schema,fallback_to_thrift)Example Log Output
{ "OperationName": "GetCatalogs", "Duration": "00:00:00.582", "TagObjects": { "statement.feature.enable_multiple_catalog_support": true, "db.response.returned_rows": 28 }, "Events": [ { "Name": "statement.get_catalogs.start" }, { "Name": "statement.get_catalogs.calling_base_implementation" }, { "Name": "statement.get_catalogs.complete" } ] }Testing
Tested locally by enabling logging with:
Verified that all tags, events, and distributed tracing context are captured correctly in trace files for all implemented methods (GetCatalogs, GetSchemas, GetTables, GetColumns, and statement configuration).
PR generated by Cursor.