@codeflash-ai codeflash-ai bot commented Oct 24, 2025

📄 358% (3.58x) speedup for marshal_json in src/mistralai/utils/serializers.py

⏱️ Runtime : 25.2 milliseconds → 5.50 milliseconds (best of 57 runs)

📝 Explanation and details

The optimized code achieves a 358% speedup through two key optimizations (plus one micro-optimization):

**1. Pydantic Model Caching (primary optimization)**

- Added `_marshaller_cache` to store created Pydantic models by type
- The original code called `create_model()` on every invocation, which is extremely expensive (93.5% of total runtime in the profiler)
- Caching reduces `create_model` calls from 70 to 31 (only for previously unseen types), with cached lookups being roughly 1000x faster
- This optimization is most effective for repeated serialization of the same types, as shown in the test results where basic type serializations see 15-40x speedups
**2. Direct Dictionary Access**

- Replaced `d[next(iter(d))]` with direct `d["body"]` access
- Since `model_dump()` always produces a dict with a single "body" key, direct access eliminates the iterator overhead
- A minor but consistent improvement across all test cases
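The difference between the two access patterns is easy to see on a single-key dict (illustrative only — both fetch the same value, but the second skips constructing an iterator):

```python
d = {"body": 42}

# Original: build an iterator just to discover the only key.
value_via_iter = d[next(iter(d))]

# Optimized: the key is known to always be "body", so index it directly.
value_direct = d["body"]

assert value_via_iter == value_direct == 42
```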

**3. Micro-optimization in `is_nullable`**

- Cached the `get_origin(arg)` result to avoid redundant calls in the loop
- A small but measurable improvement in type checking

**Performance characteristics:**

- **Basic types**: 15-40x speedup, since model caching eliminates expensive Pydantic model creation
- **Large data structures**: 3-6x speedup, as serialization overhead becomes more significant relative to model creation
- **Cache hits**: near-instant model lookup vs. an expensive `create_model()` call
- **Best for**: applications that repeatedly serialize the same types, which is common in API serialization workflows

The caching strategy is particularly effective because type objects are hashable and immutable, making them ideal cache keys.
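A quick check confirms the point: both plain classes and parameterized typing generics are hashable, and equal type expressions are interchangeable as dict keys:

```python
from typing import List

cache = {}
cache[int] = "int marshaller"
cache[List[int]] = "list-of-int marshaller"

# A freshly written List[int] expression hashes and compares equal
# to the one used as the key, so the lookup hits the same entry.
assert cache[List[int]] == "list-of-int marshaller"
assert List[int] == List[int]
```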

Correctness verification report:

| Test | Status |
|------|--------|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 64 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
import json
# Patch the marshal_json and is_nullable to use the above dummy classes for this test context
import sys
from typing import Any, Dict, List, Optional, Union, get_args

# imports
import pytest
from mistralai.types.basemodel import Nullable, OptionalNullable
from mistralai.utils.serializers import marshal_json
# Helper: a simple Pydantic model for testing
from pydantic import BaseModel, ConfigDict, Field, create_model
from typing_extensions import Annotated, get_origin

# ---- UNIT TESTS ----

# Note: Nullable and OptionalNullable are already imported above from
# mistralai.types.basemodel; defining dummy stand-ins here would shadow
# the real (subscriptable) type aliases, so none are redefined.


class User(BaseModel):
    id: int
    name: str

class AliasModel(BaseModel):
    foo: int = Field(..., alias="bar")

# 1. BASIC TEST CASES

def test_basic_int():
    # Test marshaling a simple int
    codeflash_output = marshal_json(42, int) # 314μs -> 16.7μs (1782% faster)

def test_basic_float():
    # Test marshaling a float
    codeflash_output = marshal_json(3.14, float) # 294μs -> 16.1μs (1727% faster)

def test_basic_str():
    # Test marshaling a string
    codeflash_output = marshal_json("hello", str) # 280μs -> 11.0μs (2454% faster)

def test_basic_bool():
    # Test marshaling a boolean
    codeflash_output = marshal_json(True, bool) # 288μs -> 14.2μs (1933% faster)
    codeflash_output = marshal_json(False, bool) # 233μs -> 6.53μs (3471% faster)

def test_basic_list():
    # Test marshaling a list of ints
    codeflash_output = marshal_json([1,2,3], List[int]) # 337μs -> 14.8μs (2185% faster)

def test_basic_dict():
    # Test marshaling a dict of str to int
    codeflash_output = marshal_json({"a": 1, "b": 2}, Dict[str, int]) # 369μs -> 16.5μs (2132% faster)

def test_basic_pydantic_model():
    # Test marshaling a simple Pydantic model
    u = User(id=1, name="Alice")
    codeflash_output = marshal_json(u, User) # 320μs -> 15.1μs (2021% faster)



def test_basic_optional_value():
    # Test marshaling an optional value (not None)
    codeflash_output = marshal_json(123, Optional[int]) # 362μs -> 21.4μs (1599% faster)

def test_basic_union_value():
    # Test marshaling a union type
    codeflash_output = marshal_json("foo", Union[int, str]) # 376μs -> 17.6μs (2043% faster)
    codeflash_output = marshal_json(99, Union[int, str]) # 308μs -> 9.72μs (3077% faster)

# 2. EDGE TEST CASES





def test_edge_empty_list():
    # Test marshaling an empty list
    codeflash_output = marshal_json([], List[int]) # 386μs -> 22.5μs (1615% faster)

def test_edge_empty_dict():
    # Test marshaling an empty dict
    codeflash_output = marshal_json({}, Dict[str, int]) # 367μs -> 16.3μs (2155% faster)

def test_edge_empty_string():
    # Test marshaling an empty string
    codeflash_output = marshal_json("", str) # 284μs -> 11.8μs (2318% faster)

def test_edge_zero():
    # Test marshaling zero values
    codeflash_output = marshal_json(0, int) # 281μs -> 14.1μs (1900% faster)
    codeflash_output = marshal_json(0.0, float) # 232μs -> 8.49μs (2634% faster)

def test_edge_false():
    # Test marshaling False
    codeflash_output = marshal_json(False, bool) # 278μs -> 13.3μs (1997% faster)



def test_edge_list_of_none():
    # Test marshaling a list of None values
    typ = List[Optional[int]]
    codeflash_output = marshal_json([None, None], typ) # 392μs -> 19.8μs (1881% faster)

def test_edge_dict_with_none_value():
    # Test marshaling a dict with a None value (should be excluded)
    d = {"a": 1, "b": None}
    typ = Dict[str, Optional[int]]
    codeflash_output = marshal_json(d, typ) # 421μs -> 18.9μs (2127% faster)



def test_edge_annotated_type():
    # Test marshaling an Annotated type
    typ = Annotated[int, "some meta"]
    codeflash_output = marshal_json(5, typ) # 372μs -> 23.3μs (1495% faster)

# 3. LARGE SCALE TEST CASES

def test_large_list_of_ints():
    # Test marshaling a large list of ints (length 1000)
    data = list(range(1000))
    codeflash_output = marshal_json(data, List[int]); result = codeflash_output # 427μs -> 81.8μs (423% faster)

def test_large_dict_of_str_int():
    # Test marshaling a large dict of str->int (length 500)
    data = {f"key{i}": i for i in range(500)}
    codeflash_output = marshal_json(data, Dict[str, int]); result = codeflash_output # 516μs -> 135μs (282% faster)
    # Should be a compact JSON object
    parsed = json.loads(result)
    for i in range(500):
        assert parsed[f"key{i}"] == i


def test_large_list_of_models():
    # Test marshaling a list of models (length 200)
    users = [User(id=i, name=f"user{i}") for i in range(200)]
    codeflash_output = marshal_json(users, List[User]); result = codeflash_output # 507μs -> 126μs (302% faster)
    parsed = json.loads(result)
    for i, u in enumerate(parsed):
        assert u["id"] == i
        assert u["name"] == f"user{i}"

def test_large_dict_with_nested_models():
    # Test marshaling a dict of str->User (length 100)
    users = {f"user{i}": User(id=i, name=f"u{i}") for i in range(100)}
    codeflash_output = marshal_json(users, Dict[str, User]); result = codeflash_output # 486μs -> 91.7μs (431% faster)
    parsed = json.loads(result)
    for i in range(100):
        u = parsed[f"user{i}"]
        assert u["id"] == i
        assert u["name"] == f"u{i}"


#------------------------------------------------
import json
from typing import Any, Dict, List, Optional, Union, get_args

# imports
import pytest  # used for our unit tests
from mistralai.types.basemodel import Nullable, OptionalNullable
from mistralai.utils.serializers import marshal_json
from pydantic import BaseModel, ConfigDict, Field, create_model
from typing_extensions import get_origin

# unit tests

# --- Basic Test Cases ---

def test_basic_int():
    # Basic integer serialization
    codeflash_output = marshal_json(5, int) # 309μs -> 18.1μs (1603% faster)

def test_basic_float():
    # Basic float serialization
    codeflash_output = marshal_json(3.14, float) # 293μs -> 16.4μs (1689% faster)

def test_basic_str():
    # Basic string serialization
    codeflash_output = marshal_json("hello", str) # 288μs -> 11.0μs (2512% faster)

def test_basic_bool_true():
    # Boolean True serialization
    codeflash_output = marshal_json(True, bool) # 286μs -> 14.2μs (1922% faster)

def test_basic_bool_false():
    # Boolean False serialization
    codeflash_output = marshal_json(False, bool) # 286μs -> 13.3μs (2048% faster)

def test_basic_list_int():
    # List of integers serialization
    codeflash_output = marshal_json([1, 2, 3], List[int]) # 346μs -> 16.3μs (2033% faster)

def test_basic_dict_str_int():
    # Dict of string to int serialization
    codeflash_output = marshal_json({"a": 1, "b": 2}, Dict[str, int]) # 367μs -> 17.3μs (2024% faster)

def test_basic_nested_list_dict():
    # Nested structure: list of dicts
    val = [{"a": 1}, {"b": 2}]
    typ = List[Dict[str, int]]
    codeflash_output = marshal_json(val, typ) # 420μs -> 19.6μs (2052% faster)

def test_basic_optional_present():
    # Optional value present
    codeflash_output = marshal_json(42, Optional[int]) # 360μs -> 18.7μs (1828% faster)

def test_basic_nullable_present():
    # Nullable value present
    codeflash_output = marshal_json("abc", Nullable[str]) # 409μs -> 12.1μs (3277% faster)

# --- Edge Test Cases ---



def test_edge_none_optional():
    # Optional type with None should return an empty string (not "null")
    codeflash_output = marshal_json(None, Optional[int]) # 384μs -> 12.0μs (3113% faster)

def test_edge_empty_list():
    # Empty list serialization
    codeflash_output = marshal_json([], List[int]) # 349μs -> 15.7μs (2126% faster)

def test_edge_empty_dict():
    # Empty dict serialization
    codeflash_output = marshal_json({}, Dict[str, int]) # 362μs -> 14.7μs (2368% faster)

def test_edge_empty_str():
    # Empty string serialization
    codeflash_output = marshal_json("", str) # 290μs -> 10.8μs (2597% faster)

def test_edge_zero_int():
    # Zero integer serialization
    codeflash_output = marshal_json(0, int) # 291μs -> 13.3μs (2091% faster)

def test_edge_zero_float():
    # Zero float serialization
    codeflash_output = marshal_json(0.0, float) # 290μs -> 13.2μs (2103% faster)

def test_edge_false_bool():
    # Boolean False serialization (redundant with basic, but for edge focus)
    codeflash_output = marshal_json(False, bool) # 281μs -> 13.2μs (2037% faster)

def test_edge_union_nullable():
    # Union with Nullable and None
    typ = Union[Nullable[int], None]
    codeflash_output = marshal_json(None, typ) # 3.83μs -> 3.77μs (1.57% faster)
    codeflash_output = marshal_json(7, typ) # 460μs -> 15.3μs (2905% faster)

def test_edge_union_optional():
    # Union with Optional
    typ = Union[int, None]
    codeflash_output = marshal_json(None, typ) # 348μs -> 12.6μs (2662% faster)
    codeflash_output = marshal_json(99, typ) # 303μs -> 11.4μs (2561% faster)

def test_edge_union_nullable_optional():
    # Union with Nullable and Optional
    typ = Union[Nullable[int], Optional[str]]
    codeflash_output = marshal_json(None, typ) # 3.44μs -> 3.11μs (10.7% faster)
    codeflash_output = marshal_json(3, typ) # 496μs -> 15.1μs (3188% faster)
    codeflash_output = marshal_json("abc", typ) # 425μs -> 9.89μs (4209% faster)


def test_edge_nested_optional():
    # Nested Optional inside Dict
    typ = Dict[str, Optional[int]]
    codeflash_output = marshal_json({"a": None, "b": 2}, typ) # 420μs -> 17.1μs (2356% faster)








def test_large_list_of_ints():
    # Large list of ints (performance and correctness)
    data = list(range(1000))
    codeflash_output = marshal_json(data, List[int]); result = codeflash_output # 421μs -> 80.8μs (422% faster)
    arr = json.loads(result)
    assert arr == data

def test_large_dict_of_str_to_int():
    # Large dict of str to int
    data = {str(i): i for i in range(1000)}
    codeflash_output = marshal_json(data, Dict[str, int]); result = codeflash_output # 614μs -> 227μs (170% faster)
    obj = json.loads(result)
    assert obj == data

def test_large_nested_structure():
    # Large nested structure: list of dicts with lists
    data = [{"a": [i, i+1, i+2]} for i in range(100)]
    typ = List[Dict[str, List[int]]]
    codeflash_output = marshal_json(data, typ); result = codeflash_output # 559μs -> 97.0μs (477% faster)
    arr = json.loads(result)
    assert arr == data


def test_large_list_of_nullable():
    # List of Nullable[int] with many None values
    data = [i if i % 2 == 0 else None for i in range(1000)]
    typ = List[Nullable[int]]
    codeflash_output = marshal_json(data, typ); result = codeflash_output # 564μs -> 81.0μs (597% faster)
    arr = json.loads(result)

def test_large_dict_of_optional():
    # Dict with many Optional[int] values, some None (should be excluded)
    data = {str(i): i if i % 2 == 0 else None for i in range(1000)}
    typ = Dict[str, Optional[int]]
    codeflash_output = marshal_json(data, typ); result = codeflash_output # 679μs -> 226μs (200% faster)
    obj = json.loads(result)
    # Only even keys should be present
    expected = {str(i): i for i in range(0, 1000, 2)}
    assert obj == expected

# --- Mutation Testing Guards ---

def test_mutation_guard_nullable_vs_optional():
    # Nullable[None] returns "null", Optional[None] returns ""
    codeflash_output = marshal_json(None, Nullable[int]) # 1.87μs -> 2.01μs (6.63% slower)
    codeflash_output = marshal_json(None, Optional[int]) # 366μs -> 13.8μs (2554% faster)

def test_mutation_guard_exclude_none_behavior():
    # Dict[str, Optional[int]] with None values: None keys are excluded
    data = {"a": 1, "b": None}
    typ = Dict[str, Optional[int]]
    codeflash_output = marshal_json(data, typ); result = codeflash_output # 425μs -> 19.8μs (2045% faster)


def test_mutation_guard_empty_input():
    # Empty input for various types
    codeflash_output = marshal_json(None, Nullable[int]) # 2.10μs -> 2.10μs (0.048% faster)
    codeflash_output = marshal_json(None, Optional[int]) # 383μs -> 15.9μs (2316% faster)
    codeflash_output = marshal_json([], List[int]) # 297μs -> 14.8μs (1905% faster)
    codeflash_output = marshal_json({}, Dict[str, int]) # 296μs -> 8.38μs (3434% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run `git checkout codeflash/optimize-marshal_json-mh4k0i53` and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 24, 2025 07:53
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Oct 24, 2025