perf: reduce parse node allocations when deserializing primitive types#180

Draft
Copilot wants to merge 5 commits into main from copilot/reduce-parse-node-allocations
Conversation

Copilot AI commented Mar 12, 2026

  • perf: reduce parse node allocations when deserializing primitive types
  • test: add unit tests for tokenToValue, loadJsonTreeFromToken, GetObjectValue untyped section, getPrimitiveValue
  • fix: add default error cases to loadJsonTreeFromToken; fix BenchmarkObjectDeserialization to call GetObjectValue
  • refactor: remove redundant BenchmarkGetCollectionOfPrimitiveValues; keep only BenchmarkCollectionOfPrimitiveValues
Original prompt

This section details the original issue you should resolve

<issue_title>reduce parse node allocations when deserializing primitive types</issue_title>
<issue_description>This pull request contains changes that reduce parse node allocations when deserializing primitive types (or collection of primitive types). Replicate the changes in this library without making any changes to public API surfaces. Use conventional commits</issue_description>

Comments on the Issue (you are @copilot in this section)


Co-authored-by: baywet <7905502+baywet@users.noreply.github.com>
Member

baywet commented Mar 12, 2026

@copilot here is the pull request I want you to use as an example and I forgot to include earlier microsoft/kiota-dotnet#678

Copilot AI changed the title from "[WIP] Fix parse node allocations when deserializing primitive types" to "perf: reduce parse node allocations when deserializing primitive types" on Mar 12, 2026
Author

Copilot AI commented Mar 12, 2026

> @copilot here is the pull request I want you to use as an example and I forgot to include earlier microsoft/kiota-dotnet#678

Thanks for the reference! I've already implemented the equivalent optimization in commit f6e946b.

The C# PR's approach is: add private helpers that accept JsonElement directly and call them in collection iteration to avoid allocating a new JsonParseNode(element) per element. Since JsonElement is a C# struct (value type), it can be passed with no heap allocation.

Go's *JsonParseNode is always heap-allocated, so the Go-equivalent approach I used is:

  • Tree-building: store raw primitive values (*float64, *string, *bool, *int64, …) directly in []interface{} and map[string]interface{} — no *JsonParseNode wrapper per element at parse time
  • Collection access (GetCollectionOfPrimitiveValues, GetCollectionOfEnumValues): dispatch directly from raw values via rawToPrimitiveValue / rawToUntypedNodeable helpers, skipping any intermediate *JsonParseNode
  • GetObjectValue: for known fields, wraps on-demand; for unknown fields (additionalData), stores raw values directly — zero *JsonParseNode allocations for unknown properties

This achieves the same outcome: for an array of N primitives, we save N *JsonParseNode allocations (one per element).
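
The raw-storage idea described above can be sketched roughly as follows. This is a simplified illustration and not the code from this PR: `node`, `rawToInt32`, and `collectionOfInt32` are hypothetical stand-ins for `*JsonParseNode`, `rawToPrimitiveValue`, and `GetCollectionOfPrimitiveValues`, and only two raw types are handled.

```go
package main

import (
	"errors"
	"fmt"
)

// node is a hypothetical stand-in for *JsonParseNode. Only containers are
// wrapped; the elements of a collection are stored as raw pointers.
type node struct {
	value interface{}
}

// rawToInt32 (hypothetical) converts a raw stored primitive directly to the
// target type, without first wrapping it in a node.
func rawToInt32(raw interface{}) (*int32, error) {
	switch v := raw.(type) {
	case nil:
		return nil, nil
	case *int64:
		r := int32(*v)
		return &r, nil
	case *float64:
		r := int32(*v)
		return &r, nil
	default:
		return nil, fmt.Errorf("unsupported raw type %T", raw)
	}
}

// collectionOfInt32 walks the raw slice held by a container node and skips
// null elements.
func collectionOfInt32(n *node) ([]int32, error) {
	items, ok := n.value.([]interface{})
	if !ok {
		return nil, errors.New("value is not a collection")
	}
	out := make([]int32, 0, len(items))
	for _, item := range items {
		v, err := rawToInt32(item)
		if err != nil {
			return nil, err
		}
		if v != nil {
			out = append(out, *v)
		}
	}
	return out, nil
}

func main() {
	first, second := int64(1), float64(2)
	n := &node{value: []interface{}{&first, &second, nil}}
	vals, err := collectionOfInt32(n)
	fmt.Println(vals, err)
}
```

In this sketch the only per-element allocations are the pointers created when the tree is built; reading the collection creates no wrapper per element.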

Member

baywet left a comment

@copilot the coverage for the following methods is low, add additional unit tests:

  • JsonParseNode tokenToValue
  • JsonParseNode loadJsonTreeFromToken
  • JsonParseNode GetObjectValue (the untyped nodes section)
  • JsonParseNode getPrimitiveValue

github-project-automation (bot) moved this to In Progress 🚧 in Kiota on Mar 17, 2026
…ctValue untyped section, and getPrimitiveValue

Co-authored-by: baywet <7905502+baywet@users.noreply.github.com>
Author

Copilot AI commented Mar 17, 2026

> @copilot the coverage for the following methods is low, add additional unit tests:
>
>   • JsonParseNode tokenToValue...

Added in commit af6326f. Coverage now includes:

  • tokenToValue: all token types — float64, string, bool (true/false), json.Number (int & float & invalid), nil, unknown type (error), { and [ delimiters
  • loadJsonTreeFromToken: string, bool, float64, json.Number (int & float), nil, object (primitives, null value, nested object), array (primitives, null element, nested array)
  • GetObjectValue untyped section: *bool, *string, *float32, *float64, *int32, *int64, nil-valued node, map[string]interface{} with raw primitives/nested *JsonParseNode, []interface{} with raw primitives/nested *JsonParseNode
  • getPrimitiveValue: nil node, nil value, unsupported type error, and all 14 valid target types (string, bool, uint8, byte, float32, float64, int32, int64, time, timeonly, dateonly, isoduration, uuid, base64)

Copilot AI left a comment

Pull request overview

This PR optimizes the JSON parse tree representation to reduce heap allocations for primitive values during deserialization, while keeping the public API unchanged.

Changes:

  • Refactors parse-tree loading into loadJsonTree / loadJsonTreeFromToken / tokenToValue, storing primitives as raw Go values and only wrapping complex types in *JsonParseNode.
  • Updates accessor methods to handle map[string]interface{} and []interface{} storage, wrapping primitives into *JsonParseNode on-demand.
  • Adds extensive unit tests for the new internal helpers and updated behaviors, plus benchmarks for allocation measurement.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| json_parse_node.go | Implements the allocation-reduced parse-tree representation and updates accessors to work with raw primitive storage. |
| json_parse_node_test.go | Adds unit tests and benchmarks to validate/measure the new parsing and access patterns. |

Comment thread json_parse_node.go
Comment thread json_parse_node_test.go
Comment thread json_parse_node_test.go Outdated
Comment thread json_parse_node_test.go
Author

Copilot AI commented Apr 14, 2026

@baywet Unfortunately I hit an unexpected error while processing your comment. I've automatically reported this to GitHub.

You can ask me to try again later by mentioning me in a new comment.

If you want to contact GitHub about this error, please mention the following identifier so they can better serve you: 24b4ceb5-d66a-4746-bd7d-a0bb05730d71

Sorry for the inconvenience!

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.


Comment thread json_parse_node.go
Comment on lines +846 to 883
+case []interface{}:
 	result := make([]interface{}, len(v))
 	for i, x := range v {
-		val, err := x.GetRawValue()
-		if err != nil {
-			return nil, err
+		if x == nil {
+			result[i] = nil
+			continue
+		}
+		if jn, ok := x.(*JsonParseNode); ok {
+			val, err := jn.GetRawValue()
+			if err != nil {
+				return nil, err
+			}
+			result[i] = val
+		} else {
+			// Raw primitive – return as-is
+			result[i] = x
 		}
-		result[i] = val
 	}
 	return result, nil
-case map[string]*JsonParseNode:
+case map[string]interface{}:
 	m := make(map[string]interface{})
 	for key, element := range v {
-		elementVal, err := element.GetRawValue()
-		if err != nil {
-			return nil, err
+		if element == nil {
+			m[key] = nil
+			continue
+		}
+		if jn, ok := element.(*JsonParseNode); ok {
+			elementVal, err := jn.GetRawValue()
+			if err != nil {
+				return nil, err
+			}
+			m[key] = elementVal
+		} else {
+			// Raw primitive – return as-is
+			m[key] = element
 		}
-		m[key] = elementVal
 	}
 	return m, nil

Copilot AI Apr 15, 2026

GetRawValue now returns raw primitives inside arrays/maps as stored (e.g., *string, *float64) rather than JSON-like unboxed values (string, float64). This can make GetRawValue inconsistent with the primitive-node path and can leak internal storage details to callers (including additionalData). Consider normalizing raw primitives here (e.g., dereference *string/*bool/*float64/*int64/... or reuse a small helper that converts the raw stored value into the same representation GetRawValue returns for standalone primitive nodes).
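
One possible shape for such a normalizing helper is sketched below. It is an assumption for illustration only: `unboxRaw`, `unboxSlice`, and `deref` are hypothetical names, the set of pointer types is deliberately incomplete, and it requires Go 1.18+ for the generic helper.

```go
package main

import "fmt"

// deref returns the pointed-to value, or nil when the pointer itself is nil.
func deref[T any](p *T) interface{} {
	if p == nil {
		return nil
	}
	return *p
}

// unboxRaw (hypothetical) converts a raw stored primitive such as *string or
// *float64 into its plain value. Anything else is returned unchanged.
func unboxRaw(raw interface{}) interface{} {
	switch v := raw.(type) {
	case *string:
		return deref(v)
	case *bool:
		return deref(v)
	case *float64:
		return deref(v)
	case *int64:
		return deref(v)
	default:
		return raw
	}
}

// unboxSlice applies unboxRaw to every element of a raw collection.
func unboxSlice(items []interface{}) []interface{} {
	out := make([]interface{}, len(items))
	for i, item := range items {
		out[i] = unboxRaw(item)
	}
	return out
}

func main() {
	s, f := "a", 1.5
	fmt.Println(unboxSlice([]interface{}{&s, &f, nil}))
}
```

With a helper along these lines, callers of GetRawValue would see `string` and `float64` rather than `*string` and `*float64`, regardless of whether the value came from a standalone node or from inside a container.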

Comment thread json_parse_node.go
Comment on lines +130 to +146
+	decoder.Token() // skip the closing curly
+	return &JsonParseNode{value: v}, nil
+case '[':
+	v := make([]interface{}, 0)
+	for decoder.More() {
+		elemToken, err := decoder.Token()
+		if err != nil {
+			return nil, err
+		}
+		elem, err := tokenToValue(decoder, elemToken)
+		if err != nil {
+			return nil, err
+		}
+		v = append(v, elem)
+	}
-	return c, nil
-case string:
-	v := token.(string)
-	c := &JsonParseNode{}
-	c.setValue(&v)
-	return c, nil
-case bool:
-	c := &JsonParseNode{}
-	v := token.(bool)
-	c.setValue(&v)
-	return c, nil
-case int8:
-	c := &JsonParseNode{}
-	v := token.(int8)
-	c.setValue(&v)
-	return c, nil
-case byte:
-	c := &JsonParseNode{}
-	v := token.(byte)
-	c.setValue(&v)
-	return c, nil
-case float64:
-	c := &JsonParseNode{}
-	v := token.(float64)
-	c.setValue(&v)
-	return c, nil
-case float32:
-	c := &JsonParseNode{}
-	v := token.(float32)
-	c.setValue(&v)
-	return c, nil
-case int32:
-	c := &JsonParseNode{}
-	v := token.(int32)
-	c.setValue(&v)
-	return c, nil
-case int64:
-	c := &JsonParseNode{}
-	v := token.(int64)
-	c.setValue(&v)
-	return c, nil
-case nil:
-	return nil, nil
+	decoder.Token() // skip the closing bracket
+	return &JsonParseNode{value: v}, nil

Copilot AI Apr 15, 2026

The closing delimiter is consumed via decoder.Token() but the returned token/error is ignored. Even with json.Valid, this can mask internal misuse (e.g., calling loadJsonTreeFromToken when the decoder isn’t positioned inside the container) and makes failures harder to diagnose. Suggest capturing the token and error and validating it matches the expected closing delimiter (} / ]), returning a descriptive error if not.
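
A minimal sketch of that check, using only the standard `encoding/json` decoder, might look like the following. `expectDelim`, `readNumbers`, and `misuse` are hypothetical names introduced for illustration; they are not part of this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// expectDelim (hypothetical) reads the next token and verifies that it is the
// expected closing delimiter instead of discarding the result.
func expectDelim(decoder *json.Decoder, want json.Delim) error {
	token, err := decoder.Token()
	if err != nil {
		return fmt.Errorf("reading closing %v: %w", want, err)
	}
	if d, ok := token.(json.Delim); !ok || d != want {
		return fmt.Errorf("expected closing %v, got %v", want, token)
	}
	return nil
}

// readNumbers parses a flat JSON array of numbers and validates the closing
// bracket explicitly.
func readNumbers(src string) ([]float64, error) {
	decoder := json.NewDecoder(strings.NewReader(src))
	open, err := decoder.Token()
	if err != nil {
		return nil, err
	}
	if d, ok := open.(json.Delim); !ok || d != '[' {
		return nil, fmt.Errorf("expected '[', got %v", open)
	}
	var out []float64
	for decoder.More() {
		tok, err := decoder.Token()
		if err != nil {
			return nil, err
		}
		f, ok := tok.(float64)
		if !ok {
			return nil, fmt.Errorf("expected number, got %v", tok)
		}
		out = append(out, f)
	}
	if err := expectDelim(decoder, ']'); err != nil {
		return nil, err
	}
	return out, nil
}

// misuse calls expectDelim while the decoder is positioned at a nested '[',
// which is the kind of internal mistake the check is meant to surface.
func misuse() error {
	decoder := json.NewDecoder(strings.NewReader(`[[1],2]`))
	if _, err := decoder.Token(); err != nil { // consume the outer '['
		return err
	}
	return expectDelim(decoder, ']')
}

func main() {
	nums, err := readNumbers(`[1,2,3]`)
	fmt.Println(nums, err)
	fmt.Println(misuse())
}
```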

Comment thread json_parse_node.go
Comment on lines +44 to +53
case json.Number:
	i, err := t.Int64()
	if err == nil {
		return &i, nil
	}
	f, err := t.Float64()
	if err == nil {
		return &f, nil
	}
	return nil, errors.New("failed to parse number token")

Copilot AI Apr 15, 2026

The number-token parse errors are inconsistent ("failed to parse number token" vs "failed to parse number") and don’t include the offending value, which makes debugging harder. Consider using a single message in both places and including the token text (e.g., fmt.Errorf("failed to parse number %q: %w", t, err)) so callers can diagnose problematic payloads.
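
As a rough sketch of what a single shared code path could look like, the helper below tries `Int64` first, falls back to `Float64`, and reports one consistent error that carries the token text. `parseNumber` is a hypothetical name; the PR itself has this logic inline in two places.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// parseNumber (hypothetical) returns *int64 for integral tokens and *float64
// otherwise. On failure the error includes the offending token and wraps the
// underlying strconv error.
func parseNumber(t json.Number) (interface{}, error) {
	if i, err := t.Int64(); err == nil {
		return &i, nil
	}
	f, err := t.Float64()
	if err != nil {
		return nil, fmt.Errorf("failed to parse number %q: %w", t, err)
	}
	return &f, nil
}

func main() {
	for _, token := range []json.Number{"42", "1.5", "abc"} {
		v, err := parseNumber(token)
		fmt.Printf("%T %v\n", v, err)
	}
}
```

Calling the same helper from both tokenToValue and loadJsonTreeFromToken would keep the two messages from drifting apart again.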

Comment thread json_parse_node.go
if err == nil {
	return &JsonParseNode{value: &f}, nil
}
return nil, errors.New("failed to parse number")

Copilot AI Apr 15, 2026

The number-token parse errors are inconsistent ("failed to parse number token" vs "failed to parse number") and don’t include the offending value, which makes debugging harder. Consider using a single message in both places and including the token text (e.g., fmt.Errorf("failed to parse number %q: %w", t, err)) so callers can diagnose problematic payloads.

Suggested change
-return nil, errors.New("failed to parse number")
+return nil, fmt.Errorf("failed to parse number %q: %w", t, err)

Comment thread json_parse_node.go
Comment on lines +282 to +285
parsable, err = jn.GetObjectValue(absser.CreateUntypedNodeFromDiscriminatorValue)
if err != nil {
	return nil, errors.New("cannot parse object value")
}

Copilot AI Apr 15, 2026

These branches drop the underlying error when parsing untyped objects/arrays (errors.New("cannot parse object value")). Returning a wrapped error (e.g., fmt.Errorf("cannot parse object value: %w", err)) would preserve the root cause and significantly improve diagnosability.
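
The difference between the two styles can be seen in a small standalone sketch. Everything here is hypothetical scaffolding (`errBadPayload`, `parseChild`, and the two `parseObject…` variants); only `fmt.Errorf` with `%w` and `errors.Is` are standard library behavior.

```go
package main

import (
	"errors"
	"fmt"
)

// errBadPayload is a hypothetical root cause raised deep in the parser.
var errBadPayload = errors.New("unexpected token")

func parseChild() error { return errBadPayload }

// parseObjectDropping mirrors the current branches: the cause is discarded.
func parseObjectDropping() error {
	if err := parseChild(); err != nil {
		return errors.New("cannot parse object value")
	}
	return nil
}

// parseObjectWrapping keeps the cause in the error chain via %w.
func parseObjectWrapping() error {
	if err := parseChild(); err != nil {
		return fmt.Errorf("cannot parse object value: %w", err)
	}
	return nil
}

// keepsCause reports whether the root cause is still reachable by callers.
func keepsCause(err error) bool {
	return errors.Is(err, errBadPayload)
}

func main() {
	fmt.Println(keepsCause(parseObjectDropping())) // false
	fmt.Println(keepsCause(parseObjectWrapping())) // true
	fmt.Println(parseObjectWrapping())
}
```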

Comment thread json_parse_node.go
Comment on lines +304 to +307
parsable, err = jn.GetObjectValue(absser.CreateUntypedNodeFromDiscriminatorValue)
if err != nil {
	return nil, errors.New("cannot parse object value")
}

Copilot AI Apr 15, 2026

These branches drop the underlying error when parsing untyped objects/arrays (errors.New("cannot parse object value")). Returning a wrapped error (e.g., fmt.Errorf("cannot parse object value: %w", err)) would preserve the root cause and significantly improve diagnosability.

Comment thread json_parse_node.go
Comment on lines +548 to +563
	tmpNode := &JsonParseNode{value: rawValue}
	return tmpNode.GetTimeValue()
case "timeonly":
	tmpNode := &JsonParseNode{value: rawValue}
	return tmpNode.GetTimeOnlyValue()
case "dateonly":
	tmpNode := &JsonParseNode{value: rawValue}
	return tmpNode.GetDateOnlyValue()
case "isoduration":
	tmpNode := &JsonParseNode{value: rawValue}
	return tmpNode.GetISODurationValue()
case "uuid":
	tmpNode := &JsonParseNode{value: rawValue}
	return tmpNode.GetUUIDValue()
case "base64":
	tmpNode := &JsonParseNode{value: rawValue}

Copilot AI Apr 15, 2026

rawToPrimitiveValue allocates/constructs a temporary *JsonParseNode for several target types, which partially undermines the goal of eliminating intermediate node allocations for primitive collections. If possible, parse these types directly from the raw stored value (likely *string) or use a non-escaping temporary (e.g., stack-allocated) node pattern to minimize allocations in hot paths.

Suggested change
-	tmpNode := &JsonParseNode{value: rawValue}
-	return tmpNode.GetTimeValue()
-case "timeonly":
-	tmpNode := &JsonParseNode{value: rawValue}
-	return tmpNode.GetTimeOnlyValue()
-case "dateonly":
-	tmpNode := &JsonParseNode{value: rawValue}
-	return tmpNode.GetDateOnlyValue()
-case "isoduration":
-	tmpNode := &JsonParseNode{value: rawValue}
-	return tmpNode.GetISODurationValue()
-case "uuid":
-	tmpNode := &JsonParseNode{value: rawValue}
-	return tmpNode.GetUUIDValue()
-case "base64":
-	tmpNode := &JsonParseNode{value: rawValue}
+	tmpNode := JsonParseNode{value: rawValue}
+	return tmpNode.GetTimeValue()
+case "timeonly":
+	tmpNode := JsonParseNode{value: rawValue}
+	return tmpNode.GetTimeOnlyValue()
+case "dateonly":
+	tmpNode := JsonParseNode{value: rawValue}
+	return tmpNode.GetDateOnlyValue()
+case "isoduration":
+	tmpNode := JsonParseNode{value: rawValue}
+	return tmpNode.GetISODurationValue()
+case "uuid":
+	tmpNode := JsonParseNode{value: rawValue}
+	return tmpNode.GetUUIDValue()
+case "base64":
+	tmpNode := JsonParseNode{value: rawValue}
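
The other option mentioned in the comment, parsing directly from the raw stored value, could look roughly like this for the time case. This is a sketch under assumptions: `rawToTime` is a hypothetical helper, the raw value is assumed to be a `*string`, and RFC 3339 is assumed as the layout, which may not match every format the library accepts.

```go
package main

import (
	"fmt"
	"time"
)

// rawToTime (hypothetical) parses a raw stored *string directly, so no
// temporary node is needed. The RFC 3339 layout is an assumption.
func rawToTime(raw interface{}) (*time.Time, error) {
	if raw == nil {
		return nil, nil
	}
	s, ok := raw.(*string)
	if !ok {
		return nil, fmt.Errorf("expected *string, got %T", raw)
	}
	if s == nil {
		return nil, nil
	}
	parsed, err := time.Parse(time.RFC3339, *s)
	if err != nil {
		return nil, fmt.Errorf("failed to parse time %q: %w", *s, err)
	}
	return &parsed, nil
}

func main() {
	src := "2026-03-12T10:00:00Z"
	parsed, err := rawToTime(&src)
	fmt.Println(parsed, err)
}
```

Whether either variant actually avoids a heap allocation depends on escape analysis, so the existing allocation benchmark is the place to confirm it.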

Comment thread json_parse_node_test.go Outdated
Comment on lines +845 to +885
func BenchmarkCollectionOfPrimitiveValues(b *testing.B) {
	source := []byte(`{"values":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]}`)
	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		node, err := NewJsonParseNode(source)
		if err != nil {
			b.Fatal(err)
		}
		child, err := node.GetChildNode("values")
		if err != nil {
			b.Fatal(err)
		}
		_, err = child.GetCollectionOfPrimitiveValues("int32")
		if err != nil {
			b.Fatal(err)
		}
	}
}

// BenchmarkGetCollectionOfPrimitiveValues measures allocations when calling
// GetCollectionOfPrimitiveValues on an already-parsed node, isolating access cost.
func BenchmarkGetCollectionOfPrimitiveValues(b *testing.B) {
	source := []byte(`{"values":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]}`)
	node, err := NewJsonParseNode(source)
	if err != nil {
		b.Fatal(err)
	}
	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		child, err := node.GetChildNode("values")
		if err != nil {
			b.Fatal(err)
		}
		_, err = child.GetCollectionOfPrimitiveValues("int32")
		if err != nil {
			b.Fatal(err)
		}
	}
}
Member

@copilot those benchmarks are highly similar. Refactor to keep only one

Author

Done in commit e41dc9c. Removed BenchmarkGetCollectionOfPrimitiveValues; only BenchmarkCollectionOfPrimitiveValues remains, measuring the full parse+access path.

@sonarqubecloud

Quality Gate failed

Failed conditions
57.8% Coverage on New Code (required ≥ 80%)

See analysis details on SonarQube Cloud

Labels

None yet

Projects

Status: In Progress 🚧

Development

Successfully merging this pull request may close these issues.

reduce parse node allocations when deserializing primitive types

3 participants