perf: reduce parse node allocations when deserializing primitive types#180
perf: reduce parse node allocations when deserializing primitive types#180
Conversation
Co-authored-by: baywet <7905502+baywet@users.noreply.github.com>
|
@copilot here is the pull request I want you to use as an example and I forgot to include earlier microsoft/kiota-dotnet#678 |
Thanks for the reference! I've already implemented the equivalent optimization in commit f6e946b. The C# PR's approach is: add private helpers that accept Go's
This achieves the same outcome: for an array of N primitives, we save N |
baywet
left a comment
There was a problem hiding this comment.
@copilot the coverage for the following methods is low, add additional unit tests:
- JsonParseNode tokenToValue
- JsonParseNode loadJsonTreeFromToken
- JsonParseNode GetObjectValue (the untyped nodes section)
- JsonParseNode getPrimitiveValue
…ctValue untyped section, and getPrimitiveValue Co-authored-by: baywet <7905502+baywet@users.noreply.github.com>
Added in commit af6326f. Coverage now includes:
|
There was a problem hiding this comment.
Pull request overview
This PR optimizes the JSON parse tree representation to reduce heap allocations for primitive values during deserialization, while keeping the public API unchanged.
Changes:
- Refactors parse-tree loading into
loadJsonTree/loadJsonTreeFromToken/tokenToValue, storing primitives as raw Go values and only wrapping complex types in*JsonParseNode. - Updates accessor methods to handle
map[string]interface{}and[]interface{}storage, wrapping primitives into*JsonParseNodeon-demand. - Adds extensive unit tests for the new internal helpers and updated behaviors, plus benchmarks for allocation measurement.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
json_parse_node.go |
Implements the allocation-reduced parse-tree representation and updates accessors to work with raw primitive storage. |
json_parse_node_test.go |
Adds unit tests and benchmarks to validate/measure the new parsing and access patterns. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
@baywet Unfortunately I hit an unexpected error while processing your comment. I've automatically reported this to GitHub. You can ask me to try again later by mentioning me in a new comment. If you want to contact GitHub about this error, please mention the following identifier so they can better serve you: Sorry for the inconvenience! |
…rkObjectDeserialization Agent-Logs-Url: https://github.com/microsoft/kiota-serialization-json-go/sessions/268695bb-c542-42d7-99d9-acc18ffcc61d Co-authored-by: baywet <7905502+baywet@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| case []interface{}: | ||
| result := make([]interface{}, len(v)) | ||
| for i, x := range v { | ||
| val, err := x.GetRawValue() | ||
| if err != nil { | ||
| return nil, err | ||
| if x == nil { | ||
| result[i] = nil | ||
| continue | ||
| } | ||
| if jn, ok := x.(*JsonParseNode); ok { | ||
| val, err := jn.GetRawValue() | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| result[i] = val | ||
| } else { | ||
| // Raw primitive – return as-is | ||
| result[i] = x | ||
| } | ||
| result[i] = val | ||
| } | ||
| return result, nil | ||
| case map[string]*JsonParseNode: | ||
| case map[string]interface{}: | ||
| m := make(map[string]interface{}) | ||
| for key, element := range v { | ||
| elementVal, err := element.GetRawValue() | ||
| if err != nil { | ||
| return nil, err | ||
| if element == nil { | ||
| m[key] = nil | ||
| continue | ||
| } | ||
| if jn, ok := element.(*JsonParseNode); ok { | ||
| elementVal, err := jn.GetRawValue() | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| m[key] = elementVal | ||
| } else { | ||
| // Raw primitive – return as-is | ||
| m[key] = element | ||
| } | ||
| m[key] = elementVal | ||
| } | ||
| return m, nil |
There was a problem hiding this comment.
GetRawValue now returns raw primitives inside arrays/maps as stored (e.g., *string, *float64) rather than JSON-like unboxed values (string, float64). This can make GetRawValue inconsistent with the primitive-node path and can leak internal storage details to callers (including additionalData). Consider normalizing raw primitives here (e.g., dereference *string/*bool/*float64/*int64/... or reuse a small helper that converts the raw stored value into the same representation GetRawValue returns for standalone primitive nodes).
| decoder.Token() // skip the closing curly | ||
| return &JsonParseNode{value: v}, nil | ||
| case '[': | ||
| v := make([]interface{}, 0) | ||
| for decoder.More() { | ||
| elemToken, err := decoder.Token() | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| elem, err := tokenToValue(decoder, elemToken) | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| v = append(v, elem) | ||
| } | ||
| return c, nil | ||
| case string: | ||
| v := token.(string) | ||
| c := &JsonParseNode{} | ||
| c.setValue(&v) | ||
| return c, nil | ||
| case bool: | ||
| c := &JsonParseNode{} | ||
| v := token.(bool) | ||
| c.setValue(&v) | ||
| return c, nil | ||
| case int8: | ||
| c := &JsonParseNode{} | ||
| v := token.(int8) | ||
| c.setValue(&v) | ||
| return c, nil | ||
| case byte: | ||
| c := &JsonParseNode{} | ||
| v := token.(byte) | ||
| c.setValue(&v) | ||
| return c, nil | ||
| case float64: | ||
| c := &JsonParseNode{} | ||
| v := token.(float64) | ||
| c.setValue(&v) | ||
| return c, nil | ||
| case float32: | ||
| c := &JsonParseNode{} | ||
| v := token.(float32) | ||
| c.setValue(&v) | ||
| return c, nil | ||
| case int32: | ||
| c := &JsonParseNode{} | ||
| v := token.(int32) | ||
| c.setValue(&v) | ||
| return c, nil | ||
| case int64: | ||
| c := &JsonParseNode{} | ||
| v := token.(int64) | ||
| c.setValue(&v) | ||
| return c, nil | ||
| case nil: | ||
| return nil, nil | ||
| decoder.Token() // skip the closing bracket | ||
| return &JsonParseNode{value: v}, nil |
There was a problem hiding this comment.
The closing delimiter is consumed via decoder.Token() but the returned token/error is ignored. Even with json.Valid, this can mask internal misuse (e.g., calling loadJsonTreeFromToken when the decoder isn’t positioned inside the container) and makes failures harder to diagnose. Suggest capturing the token and error and validating it matches the expected closing delimiter (} / ]), returning a descriptive error if not.
| case json.Number: | ||
| i, err := t.Int64() | ||
| if err == nil { | ||
| return &i, nil | ||
| } | ||
| f, err := t.Float64() | ||
| if err == nil { | ||
| return &f, nil | ||
| } | ||
| return nil, errors.New("failed to parse number token") |
There was a problem hiding this comment.
The number-token parse errors are inconsistent (\"failed to parse number token\" vs \"failed to parse number\") and don’t include the offending value, which makes debugging harder. Consider using a single message in both places and including the token text (e.g., fmt.Errorf(\"failed to parse number %q: %w\", t, err)) so callers can diagnose problematic payloads.
| if err == nil { | ||
| return &JsonParseNode{value: &f}, nil | ||
| } | ||
| return nil, errors.New("failed to parse number") |
There was a problem hiding this comment.
The number-token parse errors are inconsistent (\"failed to parse number token\" vs \"failed to parse number\") and don’t include the offending value, which makes debugging harder. Consider using a single message in both places and including the token text (e.g., fmt.Errorf(\"failed to parse number %q: %w\", t, err)) so callers can diagnose problematic payloads.
| return nil, errors.New("failed to parse number") | |
| return nil, fmt.Errorf("failed to parse number %q: %w", t, err) |
| parsable, err = jn.GetObjectValue(absser.CreateUntypedNodeFromDiscriminatorValue) | ||
| if err != nil { | ||
| return nil, errors.New("cannot parse object value") | ||
| } |
There was a problem hiding this comment.
These branches drop the underlying error when parsing untyped objects/arrays (errors.New(\"cannot parse object value\")). Returning a wrapped error (e.g., fmt.Errorf(\"cannot parse object value: %w\", err)) would preserve the root cause and significantly improve diagnosability.
| parsable, err = jn.GetObjectValue(absser.CreateUntypedNodeFromDiscriminatorValue) | ||
| if err != nil { | ||
| return nil, errors.New("cannot parse object value") | ||
| } |
There was a problem hiding this comment.
These branches drop the underlying error when parsing untyped objects/arrays (errors.New(\"cannot parse object value\")). Returning a wrapped error (e.g., fmt.Errorf(\"cannot parse object value: %w\", err)) would preserve the root cause and significantly improve diagnosability.
| tmpNode := &JsonParseNode{value: rawValue} | ||
| return tmpNode.GetTimeValue() | ||
| case "timeonly": | ||
| tmpNode := &JsonParseNode{value: rawValue} | ||
| return tmpNode.GetTimeOnlyValue() | ||
| case "dateonly": | ||
| tmpNode := &JsonParseNode{value: rawValue} | ||
| return tmpNode.GetDateOnlyValue() | ||
| case "isoduration": | ||
| tmpNode := &JsonParseNode{value: rawValue} | ||
| return tmpNode.GetISODurationValue() | ||
| case "uuid": | ||
| tmpNode := &JsonParseNode{value: rawValue} | ||
| return tmpNode.GetUUIDValue() | ||
| case "base64": | ||
| tmpNode := &JsonParseNode{value: rawValue} |
There was a problem hiding this comment.
rawToPrimitiveValue allocates/constructs a temporary *JsonParseNode for several target types, which partially undermines the goal of eliminating intermediate node allocations for primitive collections. If possible, parse these types directly from the raw stored value (likely *string) or use a non-escaping temporary (e.g., stack-allocated) node pattern to minimize allocations in hot paths.
| tmpNode := &JsonParseNode{value: rawValue} | |
| return tmpNode.GetTimeValue() | |
| case "timeonly": | |
| tmpNode := &JsonParseNode{value: rawValue} | |
| return tmpNode.GetTimeOnlyValue() | |
| case "dateonly": | |
| tmpNode := &JsonParseNode{value: rawValue} | |
| return tmpNode.GetDateOnlyValue() | |
| case "isoduration": | |
| tmpNode := &JsonParseNode{value: rawValue} | |
| return tmpNode.GetISODurationValue() | |
| case "uuid": | |
| tmpNode := &JsonParseNode{value: rawValue} | |
| return tmpNode.GetUUIDValue() | |
| case "base64": | |
| tmpNode := &JsonParseNode{value: rawValue} | |
| tmpNode := JsonParseNode{value: rawValue} | |
| return tmpNode.GetTimeValue() | |
| case "timeonly": | |
| tmpNode := JsonParseNode{value: rawValue} | |
| return tmpNode.GetTimeOnlyValue() | |
| case "dateonly": | |
| tmpNode := JsonParseNode{value: rawValue} | |
| return tmpNode.GetDateOnlyValue() | |
| case "isoduration": | |
| tmpNode := JsonParseNode{value: rawValue} | |
| return tmpNode.GetISODurationValue() | |
| case "uuid": | |
| tmpNode := JsonParseNode{value: rawValue} | |
| return tmpNode.GetUUIDValue() | |
| case "base64": | |
| tmpNode := JsonParseNode{value: rawValue} |
| func BenchmarkCollectionOfPrimitiveValues(b *testing.B) { | ||
| source := []byte(`{"values":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]}`) | ||
| b.ReportAllocs() | ||
| b.ResetTimer() | ||
| for i := 0; i < b.N; i++ { | ||
| node, err := NewJsonParseNode(source) | ||
| if err != nil { | ||
| b.Fatal(err) | ||
| } | ||
| child, err := node.GetChildNode("values") | ||
| if err != nil { | ||
| b.Fatal(err) | ||
| } | ||
| _, err = child.GetCollectionOfPrimitiveValues("int32") | ||
| if err != nil { | ||
| b.Fatal(err) | ||
| } | ||
| } | ||
| } | ||
|
|
||
| // BenchmarkGetCollectionOfPrimitiveValues measures allocations when calling | ||
| // GetCollectionOfPrimitiveValues on an already-parsed node, isolating access cost. | ||
| func BenchmarkGetCollectionOfPrimitiveValues(b *testing.B) { | ||
| source := []byte(`{"values":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]}`) | ||
| node, err := NewJsonParseNode(source) | ||
| if err != nil { | ||
| b.Fatal(err) | ||
| } | ||
| b.ReportAllocs() | ||
| b.ResetTimer() | ||
| for i := 0; i < b.N; i++ { | ||
| child, err := node.GetChildNode("values") | ||
| if err != nil { | ||
| b.Fatal(err) | ||
| } | ||
| _, err = child.GetCollectionOfPrimitiveValues("int32") | ||
| if err != nil { | ||
| b.Fatal(err) | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
@copilot those benchmarks are highly similar. Refactor to keep only one
There was a problem hiding this comment.
Done in commit e41dc9c. Removed BenchmarkGetCollectionOfPrimitiveValues; only BenchmarkCollectionOfPrimitiveValues remains, measuring the full parse+access path.
|
Agent-Logs-Url: https://github.com/microsoft/kiota-serialization-json-go/sessions/d2935479-4be8-479f-a3d1-85b0e1fd60a7 Co-authored-by: baywet <7905502+baywet@users.noreply.github.com>


Original prompt
💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.