Contributor

@ericraio ericraio commented Oct 3, 2025

Motivation:

The Cassandra client previously supported scalar types and arrays but lacked support for maps, a fundamental Cassandra collection type. Maps are commonly used to store key-value pairs in Cassandra schemas (e.g., user preferences, metadata, configuration settings). Without map support, users had to work around the limitation or could not use certain Cassandra features effectively.

Modifications:

  • Created Statement+Maps.swift with map binding logic for all 63 type combinations (7 key types × 9 value types)
  • Added all 63 map enum cases to Statement.Value enum
  • Implemented bindMap<K,V> helper using CASS_COLLECTION_TYPE_MAP
  • Created Data+Maps.swift with map reading logic for all 63 combinations
  • Implemented toMap<K,V> using cass_iterator_from_map(), cass_iterator_get_map_key(), and cass_iterator_get_map_value()
  • Added 63 map properties on Column and 126 convenience methods on Row
  • Extended testMapTypes to comprehensively test all 63 map combinations
  • Organized map-related code in separate extension files to keep the codebase clean

Supported key types: Int8, Int16, Int32, Int64, String, UUID, TimeBasedUUID
Supported value types: Int8, Int16, Int32, Int64, Float32, Double, Bool, String, UUID

Result:

Users can now bind and read the 63 supported map type combinations with full type safety. The implementation follows the same pattern as arrays, providing consistent API ergonomics. All 63 combinations are tested and verified to work correctly with round-trip insert/read operations.

case doubleArray([Double])
case stringArray([String])

case int8Int8Map([Int8: Int8])
Member

This is an API-breaking change, which is okay since we are pre-1.0.0, but we should probably come up with a way to introduce new types without breaking the API. This doesn't need to be solved in this PR, though.
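
One possible way to add value types without breaking callers is to hide the enum behind a struct with static factory functions. The sketch below is only an illustration of that idea: `BindValue`, `Storage`, and the factory names are hypothetical and do not come from this PR or from the library.

```swift
// Hypothetical sketch: none of these names exist in the library.
public struct BindValue {
    // Internal storage. New cases can be added here freely because
    // callers outside the module can never switch over it.
    enum Storage {
        case int32(Int32)
        case string(String)
        case stringInt32Map([String: Int32])
    }

    let storage: Storage

    public static func int32(_ value: Int32) -> BindValue {
        BindValue(storage: .int32(value))
    }

    public static func string(_ value: String) -> BindValue {
        BindValue(storage: .string(value))
    }

    // Adding another factory later is an additive change for callers.
    public static func map(_ value: [String: Int32]) -> BindValue {
        BindValue(storage: .stringInt32Map(value))
    }
}

// Call sites look the same as with enum cases.
let values: [BindValue] = [.int32(7), .string("a"), .map(["k": 1])]
```

The trade-off is that users can no longer pattern-match on the value, which is acceptable when the type is only ever passed into the driver.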

Contributor Author

@ericraio ericraio Oct 6, 2025

Yeah, personally I'm not a fan of the current approach, given how tedious it was to add all of the permutations. But when all is said and done, it's okay for a user of the library to be able to just be explicit.

Went with this approach given that it follows the existing collection pattern.

Collaborator

For reference, this is how the Java driver handles compound codecs: https://github.com/apache/cassandra-java-driver/blob/4.x/core/src/main/java/com/datastax/oss/driver/internal/core/type/codec/MapCodec.java

I'm not as familiar with Swift, but I'd recommend following a similar approach here: a MapCodec that accepts generic key and value type parameters, each constrained to be a codec as well.
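
As a rough illustration of that suggestion, a generic map codec in Swift could look like the sketch below. Everything here is an assumption made for the example: the `Codec` protocol, `Int32Codec`, `TextCodec`, `MapCodec`, and the simple length-prefixed byte framing are hypothetical and are not the library's API or the Java driver's implementation.

```swift
import Foundation

struct DecodeError: Error { let message: String }

// Hypothetical protocol: converts between a Swift type and raw bytes.
protocol Codec {
    associatedtype Value
    func encode(_ value: Value) -> [UInt8]
    func decode(_ bytes: [UInt8]) throws -> Value
}

struct Int32Codec: Codec {
    func encode(_ value: Int32) -> [UInt8] {
        let bits = UInt32(bitPattern: value)
        let shifts: [UInt32] = [24, 16, 8, 0]
        return shifts.map { UInt8(truncatingIfNeeded: bits >> $0) }  // big-endian
    }
    func decode(_ bytes: [UInt8]) throws -> Int32 {
        guard bytes.count == 4 else {
            throw DecodeError(message: "expected 4 bytes, got \(bytes.count)")
        }
        var bits: UInt32 = 0
        for byte in bytes { bits = (bits << 8) | UInt32(byte) }
        return Int32(bitPattern: bits)
    }
}

struct TextCodec: Codec {
    func encode(_ value: String) -> [UInt8] { Array(value.utf8) }
    func decode(_ bytes: [UInt8]) throws -> String {
        guard let text = String(bytes: bytes, encoding: .utf8) else {
            throw DecodeError(message: "invalid UTF-8")
        }
        return text
    }
}

// Keys and values are themselves codecs, so MapCodec nests arbitrarily.
struct MapCodec<K: Codec, V: Codec>: Codec where K.Value: Hashable {
    let keyCodec: K
    let valueCodec: V

    // Illustrative framing: entry count, then length-prefixed key and value.
    func encode(_ map: [K.Value: V.Value]) -> [UInt8] {
        var out = Int32Codec().encode(Int32(map.count))
        for (key, value) in map {
            for part in [keyCodec.encode(key), valueCodec.encode(value)] {
                out += Int32Codec().encode(Int32(part.count))
                out += part
            }
        }
        return out
    }

    func decode(_ bytes: [UInt8]) throws -> [K.Value: V.Value] {
        var offset = 0
        // Reads the next n bytes or throws if the buffer is too short.
        func take(_ n: Int) throws -> [UInt8] {
            guard n >= 0, offset + n <= bytes.count else {
                throw DecodeError(message: "truncated map payload")
            }
            defer { offset += n }
            return Array(bytes[offset..<offset + n])
        }
        // Reads one length-prefixed element.
        func element() throws -> [UInt8] {
            let size = try Int32Codec().decode(take(4))
            return try take(Int(size))
        }
        let count = try Int32Codec().decode(take(4))
        guard count >= 0 else { throw DecodeError(message: "negative entry count") }
        var result: [K.Value: V.Value] = [:]
        for _ in 0..<count {
            let key = try keyCodec.decode(element())
            result[key] = try valueCodec.decode(element())
        }
        return result
    }
}

// A nested map<text, map<int, int>> needs no new enum case or property.
let nested = MapCodec(keyCodec: TextCodec(),
                      valueCodec: MapCodec(keyCodec: Int32Codec(), valueCodec: Int32Codec()))
let original: [String: [Int32: Int32]] = ["a": [1: 10], "b": [2: 20, 3: 30]]
let roundTripped = try? nested.decode(nested.encode(original))
```

With this shape, supporting a new element type means writing one codec rather than one case per key/value combination.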

Contributor Author

@aratno yeah, I would probably opt for a more generic approach so that you can create more complex structures, which is what the Java implementation is set up to do.

@yifan-c yifan-c self-requested a review October 9, 2025 18:56
Collaborator

@yifan-c yifan-c left a comment

Overall looks good. As you mentioned, the patch is consistent with the existing implementation for arrays.
I have left some questions. Also, some CI tasks failed, seemingly for unrelated reasons. Please double-check.

Comment on lines 281 to 285
guard let key = self.extractValue(from: keyPointer, as: K.self),
      let value = self.extractValue(from: valuePointer, as: V.self)
else {
    continue
}
Collaborator

It ignores extraction failures, which can produce surprising results. Should it log the failed extractions or throw an exception?
That said, I noticed that the same silent handling also exists in Data.swift for the array case, so the code here follows the existing pattern.

Member

That's a good question. We should definitely come up with a strategy. What happens in other language drivers?

Collaborator

We can use the Java driver as an example too. Its codec throws an exception on deserialization failure: https://github.com/apache/cassandra-java-driver/blob/4aee8648430dd7dcfccd8b6fe599cad32a1201e3/driver-core/src/main/java/com/datastax/driver/core/TypeCodec.java#L1798-L1812
I am not familiar with the C++ driver that this project leverages. It looks like it logs an error and stops: https://github.com/datastax/cpp-driver/blob/d9ae6b96a8938c75b38ae218f056661eef681c5d/src/decoder.hpp#L481-L493

    if (remaining_ - sizeof(int32_t) <= 0) {
      notify_error("decimal value", remaining_ - sizeof(int32_t));
      return false;
    }

Collaborator

The Java driver throws an exception, which would propagate to the caller that attempted to serialize it: https://github.com/apache/cassandra-java-driver/blob/4.x/core/src/main/java/com/datastax/oss/driver/internal/core/type/codec/IntCodec.java#L69

If decoding fails, a user shouldn't receive a filtered or truncated result; they should receive an exception.

Member

I was also expecting an error to be thrown here instead of just silently ignoring it.
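
To make the difference discussed in this thread concrete, here is a small self-contained sketch contrasting the skip-on-failure loop with a throwing variant. It is not the PR's code: `extractValue(from:as:)` below is a hypothetical stand-in that works on `Any` instead of driver pointers, and `MapDecodingError`, `toMapSkipping`, and `toMapThrowing` are made-up names.

```swift
struct MapDecodingError: Error, Equatable {
    let index: Int
    let reason: String
}

// Hypothetical stand-in for the PR's extraction helper: returns nil when the
// raw value cannot be converted to the requested type.
func extractValue<T>(from raw: Any, as type: T.Type) -> T? {
    raw as? T
}

// Mirrors the reviewed pattern: entries that fail to extract are dropped.
func toMapSkipping<K: Hashable, V>(_ entries: [(Any, Any)],
                                   keyType: K.Type, valueType: V.Type) -> [K: V] {
    var result: [K: V] = [:]
    for (rawKey, rawValue) in entries {
        guard let key = extractValue(from: rawKey, as: K.self),
              let value = extractValue(from: rawValue, as: V.self)
        else {
            continue  // the caller never learns that an entry was lost
        }
        result[key] = value
    }
    return result
}

// Alternative: the first failed extraction aborts decoding with an error.
func toMapThrowing<K: Hashable, V>(_ entries: [(Any, Any)],
                                   keyType: K.Type, valueType: V.Type) throws -> [K: V] {
    var result: [K: V] = [:]
    for (index, (rawKey, rawValue)) in entries.enumerated() {
        guard let key = extractValue(from: rawKey, as: K.self) else {
            throw MapDecodingError(index: index, reason: "key is not \(K.self)")
        }
        guard let value = extractValue(from: rawValue, as: V.self) else {
            throw MapDecodingError(index: index, reason: "value is not \(V.self)")
        }
        result[key] = value
    }
    return result
}

// The second entry has a String where an Int32 value is expected.
let entries: [(Any, Any)] = [("a", Int32(1)), ("b", "oops")]

let skipped = toMapSkipping(entries, keyType: String.self, valueType: Int32.self)

var failure: MapDecodingError?
do {
    _ = try toMapThrowing(entries, keyType: String.self, valueType: Int32.self)
} catch let error as MapDecodingError {
    failure = error
} catch {}
```

In the skipping variant the caller gets a one-entry map and no indication that data was lost; in the throwing variant the caller gets an error that identifies the offending entry.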

Comment on lines 19 to 33
var int8Int8Map: [Int8: Int8]? {
    self.toMap(keyType: Int8.self, valueType: Int8.self)
}

var int8Int16Map: [Int8: Int16]? {
    self.toMap(keyType: Int8.self, valueType: Int16.self)
}

var int8Int32Map: [Int8: Int32]? {
    self.toMap(keyType: Int8.self, valueType: Int32.self)
}

var int8Int64Map: [Int8: Int64]? {
    self.toMap(keyType: Int8.self, valueType: Int64.self)
}
Collaborator

I am aware that the patch has a well-defined goal of adding the specific key and value types.
However, Cassandra does support more data types, as well as nested collections. The current approach seems to be a potential blocker to expanding that support. I am not a Swift expert, so I would like to hear from you.

Member

What are the other data types? I am not too familiar with Cassandra, so a list would help us judge whether we are painting ourselves into a corner here.

Collaborator

The Cassandra Java Driver has the most comprehensive implementation. Please refer to this table for all the data types. https://github.com/apache/cassandra-java-driver/tree/4.x/manual/core#cql-to-java-type-mapping

Regarding nested collections, we can expect complex cases such as map<text, frozen<list<tuple<double, double>>>>. Think of it as a collection of geo-barriers, where the outer map key is the barrier name and the list of tuples (each a geo-point) defines the barrier. The CQL grammar permits even more complex nested collections. The frozen keyword stores the collection as a blob; it is internal to Cassandra, so the driver may not need to pay much attention to it.
Similarly, the geo-barrier map can be expressed with a user-defined type (UDT), a data type that database users define themselves. In this example, the UDT below can replace the tuple.

CREATE TYPE geo_point (
    latitude double,
    longitude double
);

So... hopefully this gives a clear idea of how the number of combinations would explode if we move forward with the current approach.
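
For comparison, Swift's conditional conformances allow nested collections to compose without enumerating combinations. The sketch below decodes the geo-barrier example from a dynamically typed value. It rests on several assumptions: `CQLValue`, `CQLDecodable`, `CQLDecodeError`, and the Swift `GeoPoint` struct are hypothetical and stand in for whatever the driver would actually provide.

```swift
// Hypothetical dynamically typed value, standing in for raw driver output.
enum CQLValue {
    case double(Double)
    case text(String)
    case list([CQLValue])
    case map([(CQLValue, CQLValue)])
    case udt([String: CQLValue])
}

struct CQLDecodeError: Error {}

// Hypothetical protocol: a type that can build itself from a CQLValue.
protocol CQLDecodable {
    init(cql: CQLValue) throws
}

extension Double: CQLDecodable {
    init(cql: CQLValue) throws {
        guard case .double(let value) = cql else { throw CQLDecodeError() }
        self = value
    }
}

extension String: CQLDecodable {
    init(cql: CQLValue) throws {
        guard case .text(let value) = cql else { throw CQLDecodeError() }
        self = value
    }
}

// A list is decodable whenever its element type is.
extension Array: CQLDecodable where Element: CQLDecodable {
    init(cql: CQLValue) throws {
        guard case .list(let items) = cql else { throw CQLDecodeError() }
        self = try items.map { try Element(cql: $0) }
    }
}

// A map is decodable whenever its key and value types are.
extension Dictionary: CQLDecodable where Key: CQLDecodable, Value: CQLDecodable {
    init(cql: CQLValue) throws {
        guard case .map(let pairs) = cql else { throw CQLDecodeError() }
        self.init()
        for (rawKey, rawValue) in pairs {
            let key = try Key(cql: rawKey)
            self[key] = try Value(cql: rawValue)
        }
    }
}

// Swift counterpart of the geo_point UDT defined above.
struct GeoPoint: CQLDecodable, Equatable {
    let latitude: Double
    let longitude: Double

    init(cql: CQLValue) throws {
        guard case .udt(let fields) = cql,
              let lat = fields["latitude"],
              let lon = fields["longitude"]
        else { throw CQLDecodeError() }
        latitude = try Double(cql: lat)
        longitude = try Double(cql: lon)
    }
}

// map<text, frozen<list<geo_point>>> decodes with no dedicated property.
let raw = CQLValue.map([
    (.text("north-fence"), .list([
        .udt(["latitude": .double(52.5), "longitude": .double(13.4)]),
        .udt(["latitude": .double(52.6), "longitude": .double(13.5)])
    ]))
])
let barriers = try? Dictionary<String, [GeoPoint]>(cql: raw)
```

Each new scalar or user-defined type adds one conformance, and every collection built from it becomes decodable automatically.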

Contributor Author

@ericraio ericraio Oct 16, 2025

Yeah, to be honest, I think this approach isn't sustainable and needs a refactor, which will be a breaking change.

@FranzBusch FranzBusch added the 🆕 semver/minor Adds new public API. label Oct 16, 2025
@ericraio ericraio force-pushed the eraio/add-map-data-type-support branch from 46443fa to 3504893 Compare October 16, 2025 12:29
@ericraio ericraio requested a review from yifan-c October 18, 2025 21:37
@ericraio ericraio force-pushed the eraio/add-map-data-type-support branch from 3504893 to 5a961e9 Compare October 24, 2025 12:17
@ericraio ericraio force-pushed the eraio/add-map-data-type-support branch from 5a961e9 to ab3c61b Compare October 24, 2025 12:37
Collaborator

yifan-c commented Oct 24, 2025

Thank you! All checks have passed. The support for the other map data types can be a follow up.

@yifan-c yifan-c merged commit 08b45f6 into apple:main Oct 27, 2025
38 checks passed
@ericraio ericraio deleted the eraio/add-map-data-type-support branch October 28, 2025 13:10