Replace unnest with compactColumns for columnar data transformation #171

Open
DZakh wants to merge 23 commits into dz/decode-encode from claude/replace-unnest-compactcolumns-Zs0Za

DZakh (Owner) commented Feb 7, 2026

Summary

This PR replaces the S.unnest function with a new S.compactColumns function that provides better support for columnar data transformations. The new implementation uses a format field instead of a boolean flag and includes improved encoder/decoder logic for bidirectional transformations.

Key Changes

  • New compactColumns function: Replaces S.unnest with improved semantics for transforming between row-oriented and column-oriented data formats

    • Encoded form: array of arrays (columns)
    • Decoded form: array of objects (rows)
    • Supports bidirectional transformation via .to() chains
  • Type system updates:

    • Added arrayFormat type with CompactColumns variant
    • Extended format union to include array formats
    • Updated schema type definitions to use format?: arrayFormat instead of unnest?: bool
    • Removed unnest field from internal schema representation
  • Encoder/Decoder implementation:

    • New compactColumnsEncoder: Handles forward direction (columnar → rows)
    • New compactColumnsDecoder: Handles both forward (columnar → rows) and reverse (rows → columnar) directions
    • Proper handling of property extraction from schema chains
    • Support for .skipTo flag to prevent double-processing in chained transformations
  • Expression generation: Enhanced toExpression to display columnar data types in array-of-arrays notation (e.g., "string[][]")

  • Code formatting: Normalized S.js to use consistent line endings (removed semicolons)

  • Test infrastructure: Added e2e test files for genType integration
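
To make the "Type system updates" item more concrete, here is a minimal TypeScript sketch of moving from a boolean flag to a format field. The interface names, the `type: "array"` tag, and the `"compactColumns"` literal are assumptions for illustration; the real definitions are in S.d.ts and may differ.

```typescript
// Hypothetical, simplified shapes. Not copied from S.d.ts.
type ArrayFormat = "compactColumns"; // assumed literal value of the CompactColumns variant

// Before: a dedicated boolean flag on the array schema.
interface ArraySchemaBefore {
  type: "array";
  unnest?: boolean;
}

// After: the shared format field identifies the array flavor.
interface ArraySchemaAfter {
  type: "array";
  format?: ArrayFormat;
}

// A single check on format replaces the check on the boolean flag.
function isCompactColumns(schema: ArraySchemaAfter): boolean {
  return schema.format === "compactColumns";
}

const before: ArraySchemaBefore = { type: "array", unnest: true };
const plain: ArraySchemaAfter = { type: "array" };
const compact: ArraySchemaAfter = { type: "array", format: "compactColumns" };
```

One likely benefit of this design is that further array formats can be added as new union members without adding another boolean per format.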

Implementation Details

The new compactColumns implementation creates an outer array schema with format = CompactColumns and attaches custom encoder/decoder builders. The decoder detects whether it is running in a forward or reverse transformation chain and adjusts its behavior accordingly:

  • Forward: Validates columnar input and reconstructs row objects
  • Reverse: Takes row objects and transposes them into columnar format

This approach provides cleaner semantics than the previous unnest implementation while maintaining full bidirectional support.
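
The two directions described above amount to a transposition between column arrays and row objects. The following is a plain TypeScript sketch of that idea only; `columnsToRows` and `rowsToColumns` are hypothetical helper names, and the code is not the library's implementation or generated output.

```typescript
// Hypothetical helpers that illustrate the transposition, independent of the library.
type Row = Record<string, unknown>;

// Forward direction: validate columnar input and rebuild row objects.
function columnsToRows(keys: string[], columns: unknown[][]): Row[] {
  if (columns.length !== keys.length) {
    throw new Error(`Expected ${keys.length} columns, got ${columns.length}`);
  }
  // With no columns at all there are no rows to rebuild.
  const rowCount = columns[0]?.length ?? 0;
  const rows: Row[] = [];
  for (let r = 0; r < rowCount; r++) {
    const row: Row = {};
    keys.forEach((key, c) => {
      row[key] = columns[c]?.[r];
    });
    rows.push(row);
  }
  return rows;
}

// Reverse direction: transpose row objects into one array per key.
function rowsToColumns(keys: string[], rows: Row[]): unknown[][] {
  return keys.map((key) => rows.map((row) => row[key]));
}

const keys = ["id", "name"];
const rows = columnsToRows(keys, [[1, 2], ["a", "b"]]);
// rows is [{ id: 1, name: "a" }, { id: 2, name: "b" }]
const columns = rowsToColumns(keys, rows);
// columns is [[1, 2], ["a", "b"]]
```

The key order is passed explicitly here because the sketch has no schema to read property names from; in the PR, the properties come from the target object schema.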

https://claude.ai/code/session_01SKZEuGXnrhtTCktzecM4WE

claude and others added 23 commits February 1, 2026 14:55
- Renamed 'unnest' schema field to 'compactColumns' in type definitions
- Removed old unnestSerializer and unnest function
- Implemented new compactColumns function with decoder/encoder architecture
  inspired by jsonStringDecoder, jsonDecoder, and textEncodeDecoder
- New API: S.compactColumns(S.unknown)->S.to(objectSchema)
- Updated tests to use new API
- Tests are currently failing for reverse operations (encoder)

https://claude.ai/code/session_01SKZEuGXnrhtTCktzecM4WE
- Changed compactColumns field type from bool to schema type
- Updated S.d.ts with proper generic types for compactColumns
- Updated .resi files with correct type signatures
- Removed FIXME comment about deleting compactColumns on reverse
- Fixed S_test.ts to not use 'as' cast
- Updated test to verify schema is stored in compactColumns field

https://claude.ai/code/session_01SKZEuGXnrhtTCktzecM4WE
- Reverse compactColumns schema on S.reverse() call
- Add decoder test to CompactColumns test
- Add test with S.compactColumns(S.json) and S.bigint field
- Restore expectType check in tests

https://claude.ai/code/session_01SKZEuGXnrhtTCktzecM4WE
…ays schema

- Add arrayFormat type with CompactColumns variant
- Remove compactColumns field from schema types
- compactColumns function now creates array of arrays schema using additionalItems
- Set arrayFormat = CompactColumns as identifier
- Update type signature to t<'value> => t<array<array<'value>>>
- Update interface files and TypeScript definitions
- Update tests to use %raw for runtime type comparisons

https://claude.ai/code/session_01SKZEuGXnrhtTCktzecM4WE
- Add arrayFormat to format type spread instead of separate field
- Remove arrayFormat field from Array variant and untagged types
- Use existing format field for compactColumns identifier
- compactColumnsDecoder falls back to arrayDecoder when no .to
- Use array factory instead of base for inner array creation
- Add TypeScript types: NumberFormat, StringFormat, ArrayFormat, Format
- Update tests to check format field

https://claude.ai/code/session_01SKZEuGXnrhtTCktzecM4WE
- Fix toExpression to return inner element type for forward compactColumns schemas
- Fix toExpression to return object expression + "[]" for reversed compactColumns schemas
- Add compactColumns export to S.js for JavaScript/TypeScript users
- Update S.d.ts to use ArrayFormat type for array schemas
- Note: Field-level transformations (nullAsOption, bigint) are not applied in
  compactColumns - it creates raw objects without field transformations

https://claude.ai/code/session_01SKZEuGXnrhtTCktzecM4WE
- Fix S.to to store properties directly on compactColumns schema instead
  of chaining via .to, which prevents objectSchema decoder from running
  in the reversed direction
- Fix compactColumns encoding for nullAsOption fields by properly
  detecting direction using selfSchema.parser presence
- Add CompactColumnsSchema type and S.to overload for proper TypeScript
  type inference with compactColumns
- Fix test argument order in edge case tests
- Update test expectations for non-object schema error message

https://claude.ai/code/session_01SKZEuGXnrhtTCktzecM4WE
- Use default S.to logic instead of special-casing compactColumns
- Add skipTo handling in compactColumnsDecoder for both forward and reverse
- Add check in objectDecoder to skip when .to is compactColumns (reversed chain)
- Fix direction detection using selfSchema.to presence
- Update toExpression to show proper array types for compactColumns
- Fix toExpression for reversed compactColumns to show []
- Keep CompactColumnsSchema type and S.to overload for proper type inference

https://claude.ai/code/session_01SKZEuGXnrhtTCktzecM4WE
- Add objectDecoder skip logic for reversed compactColumns chain
- Restore direction detection in compactColumnsDecoder
- Add compactColumns and reverseConvertOrThrow to S.js exports
- Add reverseConvertOrThrow type to S.d.ts
- Update TypeScript tests to use parser and reverseConvertOrThrow

https://claude.ai/code/session_01SKZEuGXnrhtTCktzecM4WE
- Add compactColumns export to Pack.res for TypeScript/JS usage
- Update S.d.ts: remove reverseConvertOrThrow (use S.encoder), fix compactColumns type to Output[][]
- Fix toExpression for compactColumns without S.to to show inner schema type (e.g., "string[][]")
- Update S_test.ts to use S.encoder instead of reverseConvertOrThrow
- Add test for compactColumns toExpression without S.to

https://claude.ai/code/session_01SKZEuGXnrhtTCktzecM4WE
- Add Alpha.6 section documenting compactColumns feature
- Remove accidentally committed genType generated files

https://claude.ai/code/session_01SKZEuGXnrhtTCktzecM4WE
- Fix compactColumnsDecoder direction detection to be based on input
  type rather than selfSchema.to
- Fix compactColumnsEncoder to pass through when input is already
  in columnar format (happens when called as parser after reverse)
- Add U.assertCompiledCode checks to all compactColumns test cases
- Use S.untag instead of Obj.magic to access format field in tests

https://claude.ai/code/session_01SKZEuGXnrhtTCktzecM4WE
- Merge Alpha.6 into Alpha.5 in IDEAS.md
- Support empty objects in compactColumns (parse empty columnar to empty array)
- Improve error message for non-object schemas to:
  "S.compactColumns supports only object schemas. Use S.compactColumns(S.unknown)->S.to(objectSchema)."
- Update tests to reflect new behavior

https://claude.ai/code/session_01SKZEuGXnrhtTCktzecM4WE
…nto claude/replace-unnest-compactcolumns-Zs0Za
…lAsOption fields

- Forward direction (parse): Convert null to undefined for fields with null variant
- Reverse direction (encode): Detect undefined->null transformation by checking
  if field schema's anyOf contains an undefined variant with .to pointing to null
- Update tests to expect null<->undefined transformation in compiled code
- Fix TypeScript test to expect undefined output when parsing nullable fields

https://claude.ai/code/session_01SKZEuGXnrhtTCktzecM4WE
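
The null ↔ undefined handling in the commit above can be pictured as a pair of per-value conversions applied to a nullable column. This is a minimal sketch with hypothetical helper names; it does not reproduce the compiled code the tests assert on.

```typescript
// Hypothetical helpers, not the library's generated code.

// Forward (parse): a null cell becomes undefined in the row object.
function nullToUndefined<T>(value: T | null): T | undefined {
  return value === null ? undefined : value;
}

// Reverse (encode): an undefined field becomes null in the column.
function undefinedToNull<T>(value: T | undefined): T | null {
  return value === undefined ? null : value;
}

const encodedColumn: (string | null)[] = ["a", null];
const decodedColumn = encodedColumn.map((v) => nullToUndefined(v));
// decodedColumn is ["a", undefined]
const reEncodedColumn = decodedColumn.map((v) => undefinedToNull(v));
// reEncodedColumn is ["a", null]
```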
The compactColumns transformation produces/consumes an array of objects,
so the target schema must be S.array(objectSchema) not just objectSchema.

Changes:
- Update compactColumnsDecoder to extract properties from selfSchema.to.additionalItems
- Update TypeScript tests to use S.array(S.schema({...}))
- Update ReScript tests to use S.array(S.schema(...))
- Update error message to reflect the new API format
- Update compiled code expectations for the array wrapper validation

https://claude.ai/code/session_01SKZEuGXnrhtTCktzecM4WE
- Fix Example_test to use reverseConvertOrThrow instead of reverseConvertToJsonOrThrow
  (optional fields with undefined can't be converted to JSON)
- Fix S_null_test to expect throw for reverseConvertToJsonStringOrThrow with complex
  optional nullable fields
- Update S_null_test compiled code snapshot for new validation patterns
- Update S_refine_test compiled code snapshots for new refine compilation
- Fix bug in getShapedSerializerOutput where flattenedOutput.vals could be None

https://claude.ai/code/session_01SKZEuGXnrhtTCktzecM4WE
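
The note above that optional fields holding undefined cannot be converted to JSON reflects a general property of JSON rather than anything specific to this library. A small standalone TypeScript illustration using only the built-in JSON object (the `nickname` field is a made-up example):

```typescript
// undefined has no JSON representation.
const row = { id: 1, nickname: undefined as string | undefined };

// Object properties whose value is undefined are dropped when serializing.
const json = JSON.stringify(row); // '{"id":1}'

// After a round trip the property is missing entirely.
const roundTripped: Record<string, unknown> = JSON.parse(json);
const hasNickname = "nickname" in roundTripped; // false

// Inside arrays, undefined is serialized as null instead.
const arrayJson = JSON.stringify([undefined]); // '[null]'
```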
- Remove propsToUse logic in toExpression for compactColumns (compactColumns never has its own properties)
- Reuse array expression logic for compactColumns without S.to
- Remove Object with CompactColumns toExpression case
- Remove shouldSkipForCompactColumns logic from objectDecoder (only compactColumns decoder should handle this)
- Revert unrelated flatten fix
- Change mut.serializer to mut.encoder for compactColumns
- Update S_toExpression_test.res to use S.array wrapper for reversed compactColumns test

https://claude.ai/code/session_01SKZEuGXnrhtTCktzecM4WE
