add "ann" as reserved keyword #2005
base: 4.x
Good catch @Hazel-Datastax! We actually had to address something very similar to this for dsbulk. Should've occurred to me this part of the Java driver might have an issue as well.
So, there's definitely something weird going on here. In Apache Cassandra 5.x "ann" is very definitely an unreserved keyword. The CQL docs in the Cassandra repo talk about the distinction a bit: reserved keywords can never be used as an identifier, while unreserved keywords can in some situations... but those situations aren't specified. If an unreserved keyword is used in a spot that might introduce a conflict it presumably would have to be quoted... but it's not clear how the driver can identify such a situation.
The dsbulk change I referenced above doesn't need to worry about this distinction. It includes its own ANTLR-derived parser (a subset of what's actually used in Cassandra) so it can identify these keyword cases using (essentially) the same grammar Apache Cassandra uses.
I also note that the set "ann" is added to in this PR is explicitly for reserved keywords; each member of that set is a reserved keyword (as defined in the CQL docs above) and no unreserved keywords are included. Presumably that's because the code can always quote reserved keywords when generating CQL strings... but unreserved keywords are a bit trickier.
To make it even worse: I note the following against Apache Cassandra 5.0.0:
The string "ann" works just fine as a table name there. But when I try something similar on Astra I get results similar to what I think you're describing:
So we've clearly got inconsistencies in the behaviour here between Astra and Apache Cassandra. But to make matters worse Astra is internally inconsistent: some unreserved keywords (such as "filtering" and "function") are just fine to use as table names while I can't get "ann" to be used as a table name whether I quote it or not.
@adutra @aratno @tolbertam I'm curious about what you guys think of this. Short version:
My current thinking is that there isn't really much we can do here. Without better guidance as to when unreserved keywords should or shouldn't be quoted, the Java driver can't really step in, so it's up to the user to quote unreserved keywords when appropriate. If you have a full-blown CQL parser you could do better (see the referenced dsbulk issue above) but short of that you're kind of limited. Thoughts?
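For reference, user-side quoting could look roughly like the sketch below. It's plain Java with no driver dependency; the `ManualQuoting` class, the `quote` helper and the `ks` keyspace are made up for illustration. It relies only on the CQL rule that a quoted identifier is wrapped in double quotes with any embedded double quote doubled. Keep in mind that quoted identifiers are case-sensitive, and that (per the Astra results above) quoting may still not be enough for "ann" there.

```java
// Hypothetical helper, not part of the Java driver.
final class ManualQuoting {

    // Wrap an identifier in double quotes, doubling any embedded double quote.
    static String quote(String identifier) {
        return "\"" + identifier.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        // "ks" is a made-up keyspace name for the example.
        System.out.println("SELECT * FROM ks." + quote("ann"));
    }
}
```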
The token I agree that unreserved keywords lack a clear, well-defined meaning, but in any case, they can be table identifiers since the
So, I agree with @absurdfarce and I don't think it's correct to add About Astra vs C* 5.0 observed differences:
But in any case, and until we get more insights, the Astra behavior does not invalidate the fact that
I found a corner case when using Data API (stargate/data-api#1806). I cannot use `ann` as my table name, but I can use it in CQL:

The reason is that the Java driver keeps a set containing all the reserved keywords. When the query builder builds the create table query, it calls `tableName.asCql(true)`. Inside the `asCql(true)` method, it checks whether the string is in the reserved keywords set and double-quotes it if so. Unfortunately, the set doesn't contain `ann`.

I guess `ann` was introduced later and the keywords set hasn't been updated accordingly.
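
To illustrate the mechanism described above, here is a minimal, self-contained sketch. It is not the driver's actual source: the class name, the tiny `RESERVED` set and the regex are assumptions made for the example. It only shows the general idea that an identifier is emitted bare unless it fails a "safe unquoted" check or appears in the reserved keywords set.

```java
import java.util.Locale;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical sketch of the quoting decision; not the Java driver's code.
final class ReservedKeywordQuoting {

    // Illustrative subset only; the driver's real set is much longer.
    static final Set<String> RESERVED = Set.of("select", "from", "where", "table", "create");

    // Assumed rule: lowercase alphanumerics/underscores starting with a letter are safe unquoted.
    private static final Pattern UNQUOTED_SAFE = Pattern.compile("[a-z][a-z0-9_]*");

    static boolean needsDoubleQuotes(String internalName, Set<String> reserved) {
        return !UNQUOTED_SAFE.matcher(internalName).matches()
                || reserved.contains(internalName.toLowerCase(Locale.ROOT));
    }

    // Emit the identifier bare when possible, otherwise double-quoted.
    static String asCql(String internalName, Set<String> reserved) {
        if (!needsDoubleQuotes(internalName, reserved)) {
            return internalName;
        }
        return "\"" + internalName.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        // "ann" is not in the set, so it is emitted unquoted.
        System.out.println("CREATE TABLE " + asCql("ann", RESERVED) + " (id int PRIMARY KEY)");
        // "table" is in the set, so it is quoted.
        System.out.println("CREATE TABLE " + asCql("table", RESERVED) + " (id int PRIMARY KEY)");
    }
}
```

With a set like this, `ann` is passed through unquoted while `table` comes out as `"table"`, which matches the behavior reported here.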