serpyco-rs: a serializer for python dataclasses

What is serpyco-rs?

serpyco-rs is a serialization library for Python 3.10+ dataclasses that works just by defining your dataclasses:

import dataclasses
import serpyco_rs

@dataclasses.dataclass
class Example:
    name: str
    num: int
    tags: list[str]


serializer = serpyco_rs.Serializer(Example)

result = serializer.dump(Example(name="foo", num=2, tags=["hello", "world"]))
print(result)

>> {'name': 'foo', 'num': 2, 'tags': ['hello', 'world']}

Inspired by serpyco.

serpyco-rs works by analysing the dataclass fields and can recognize many types, such as list, tuple, and Optional. You can also embed other dataclasses in a definition.

The main use-case for serpyco-rs is to serialize objects for an API, but it can be helpful whenever you need to transform objects to/from builtin Python types.

Installation

Use pip to install:

$ pip install serpyco-rs

Development

Run tests with Python and Rust coverage:

$ cargo install cargo-llvm-cov
$ brew install lcov  # or apt-get install lcov
$ nox -s coverage

The session writes coverage/python.lcov, coverage/rust.lcov, and the combined coverage/lcov.info report. It also writes a single HTML report to coverage/html/index.html.

Features

Serialization and deserialization of dataclasses
Validation of input data
Very fast
Support recursive schemas
Generate JSON Schema Specification (Draft 2020-12)
Support custom encoders/decoders for fields
Support deserialization from query string parameters (MultiDict-like structures) with coercion from strings

Supported field types

The following types are supported, including generic types from the standard typing module:

Decimal
UUID
Time
Date
DateTime
Enum
List
Dict
Bytes (pass through)
TypedDict
Mapping
Sequence
Tuple (fixed size)
Literal[str, int, Enum.variant, ...]
Unions / Tagged unions
typing.NewType
PEP 695 (Type Parameter Syntax) - Python 3.12+

Benchmarks

Linux

Linux Debian 13 ARM64 / Python 3.14

Load

Library	Median latency (milliseconds)	Operations per second	Relative (latency)
serpyco_rs	0.05	18212.1	1
mashumaro	0.17	5975.3	3.05
pydantic	0.19	5131	3.55
serpyco	0.51	1979.7	9.19
marshmallow	2.95	338.4	53.66

Dump

Library	Median latency (milliseconds)	Operations per second	Relative (latency)
serpyco_rs	0.04	25695.5	1
serpyco	0.04	24676.2	1.04
mashumaro	0.04	22632.2	1.13
pydantic	0.13	7787.2	3.3
marshmallow	0.68	1475.2	17.45

macOS

macOS Sequoia 15.6 / Apple M4 Max / 36GB RAM / Python 3.14

Load

Library	Median latency (milliseconds)	Operations per second	Relative (latency)
serpyco_rs	0.05	20921.7	1
mashumaro	0.14	7078.3	2.96
pydantic	0.19	5359.2	3.91
serpyco	0.52	1926.8	10.88
marshmallow	2.53	394.7	53.11

Dump

Library	Median latency (milliseconds)	Operations per second	Relative (latency)
serpyco_rs	0.03	30582.2	1
serpyco	0.04	27437.9	1.11
mashumaro	0.04	22598.9	1.35
pydantic	0.11	8771	3.48
marshmallow	0.59	1700.5	17.94

Supported annotations

serpyco-rs supports changing load/dump behavior with typing.Annotated.

Currently available:

Alias
FieldFormat (CamelCase / NoFormat)
NoneFormat (OmitNone / KeepNone)
Discriminator
Min / Max
MinLength / MaxLength
CustomEncoder
NoneAsDefaultForOptional (ForceDefaultForOptional)
Flatten
JsonSchemaExtension

Alias

Alias overrides the field name in the structure used for load / dump.

from dataclasses import dataclass
from typing import Annotated
from serpyco_rs import Serializer
from serpyco_rs.metadata import Alias

@dataclass
class A:
    foo: Annotated[int, Alias('bar')]

ser = Serializer(A)

print(ser.load({'bar': 1}))
>> A(foo=1)

print(ser.dump(A(foo=1)))
>> {'bar': 1}

FieldFormat

FieldFormat lets you produce camelCase response bodies while keeping your Python code in snake_case.

from dataclasses import dataclass
from typing import Annotated
from serpyco_rs import Serializer
from serpyco_rs.metadata import CamelCase, NoFormat

@dataclass
class B:
    buz_field: str

@dataclass
class A:
    foo_field: int
    bar_field: Annotated[B, NoFormat]

ser = Serializer(Annotated[A, CamelCase])  # or ser = Serializer(A, camelcase_fields=True)

print(ser.dump(A(foo_field=1, bar_field=B(buz_field='123'))))
>> {'fooField': 1, 'barField': {'buz_field': '123'}}

print(ser.load({'fooField': 1, 'barField': {'buz_field': '123'}}))
>> A(foo_field=1, bar_field=B(buz_field='123'))

NoneFormat

With OmitNone, None values of non-required fields are dropped from the serialized dict. Required fields keep their None value:

from dataclasses import dataclass
from serpyco_rs import Serializer

@dataclass
class A:
    required_val: bool | None
    optional_val: bool | None = None

ser = Serializer(A, omit_none=True) # or Serializer(Annotated[A, OmitNone])

print(ser.dump(A(required_val=None, optional_val=None)))
>> {'required_val': None}

Unions

serpyco-rs supports unions of types.

from dataclasses import dataclass
from serpyco_rs import Serializer

@dataclass
class Foo:
    val: int

ser = Serializer(Foo | int)

print(ser.load({'val': 1}))
>> Foo(val=1)
print(ser.load(1))
>> 1

Unions are slower than single dataclasses, because every possible type in the union has to be checked. For better performance, you can use tagged unions.

Tagged unions

serpyco-rs supports tagged unions with a discriminator field.

All classes in the union must be dataclasses or attrs classes with a discriminator field of type Literal[str] or Literal[Enum.variant].

The discriminator field is always mandatory.

from typing import Annotated, Literal
from dataclasses import dataclass
from serpyco_rs import Serializer
from serpyco_rs.metadata import Discriminator

@dataclass
class Foo:
    type: Literal['foo']
    value: int

@dataclass(kw_only=True)
class Bar:
    type: Literal['bar'] = 'bar'
    value: str

ser = Serializer(list[Annotated[Foo | Bar, Discriminator('type')]])

print(ser.load([{'type': 'foo', 'value': 1}, {'type': 'bar', 'value': 'buz'}]))
>> [Foo(type='foo', value=1), Bar(type='bar', value='buz')]

Min / Max

Min / Max are supported for int, float, and Decimal types and are used only for validation on load.

By default, bounds are inclusive (matching JSON Schema minimum/maximum semantics). Use inclusive=False for exclusive bounds (exclusiveMinimum/exclusiveMaximum).

from typing import Annotated
from serpyco_rs import Serializer
from serpyco_rs.metadata import Min, Max

# Inclusive (default): 1 <= value <= 10
ser = Serializer(Annotated[int, Min(1), Max(10)])
ser.load(1)   # OK
ser.load(10)  # OK
ser.load(123)
>> SchemaValidationError: [ErrorItem(message='123 is greater than the maximum of 10', instance_path='')]

# Exclusive: 0 < value < 100
ser = Serializer(Annotated[float, Min(0, inclusive=False), Max(100, inclusive=False)])
ser.load(0.0)
>> SchemaValidationError: [ErrorItem(message='0 is less than the minimum of 0', instance_path='')]

MinLength / MaxLength

MinLength / MaxLength can be used to restrict the length of loaded strings or lists. Bounds are inclusive.

from typing import Annotated
from serpyco_rs import Serializer
from serpyco_rs.metadata import MinLength

ser = Serializer(Annotated[str, MinLength(5)])
ser.load("hello")  # OK (exactly 5 characters)

ser.load("1234")
>> SchemaValidationError: [ErrorItem(message='"1234" is shorter than 5 characters', instance_path='')]

NoneAsDefaultForOptional

ForceDefaultForOptional / KeepDefaultForOptional control whether None is used as the default value for optional (nullable) fields.

from dataclasses import dataclass
from serpyco_rs import Serializer


@dataclass
class Foo:
    val: int                 # not nullable + required
    val1: int | None         # nullable + required
    val2: int | None = None  # nullable + not required

ser_force_default = Serializer(Foo, force_default_for_optional=True)  # or Serializer(Annotated[Foo, ForceDefaultForOptional])
ser = Serializer(Foo)

# all fields except val are optional and nullable
assert ser_force_default.load({'val': 1}) == Foo(val=1, val1=None, val2=None)

# val1 field is required and nullable and val1 should be present in the dict
ser.load({'val': 1})
>> SchemaValidationError: [ErrorItem(message='"val1" is a required property', instance_path='')]

Flatten

Flatten allows you to flatten nested structures into the parent structure, similar to serde's flatten attribute in Rust.

from dataclasses import dataclass
from typing import Annotated, Any
from serpyco_rs import Serializer
from serpyco_rs.metadata import Flatten

@dataclass
class Address:
    street: str
    city: str

@dataclass
class Person:
    name: str
    address: Annotated[Address, Flatten]        # Flatten struct fields
    extra: Annotated[dict[str, Any], Flatten]   # Collect additional properties

ser = Serializer(Person)

person = Person(
    name="John",
    address=Address(street="123 Main St", city="New York"),
    extra={"phone": "555-1234"}
)

# Serialization flattens all nested fields
result = ser.dump(person)
>> {'name': 'John', 'street': '123 Main St', 'city': 'New York', 'phone': '555-1234'}

# Deserialization reconstructs nested structures and collects extra fields
loaded = ser.load({'name': 'Jane', 'street': '456 Oak Ave', 'city': 'LA', 'email': 'jane@example.com'})
>> Person(name='Jane', address=Address(street='456 Oak Ave', city='LA'), extra={'email': 'jane@example.com'})

Validation Rules:

Only one dict flatten field per dataclass/TypedDict
No field name conflicts between regular and struct flatten fields (use Alias to resolve)
Only dataclass, TypedDict, and dict types can be flattened

JSON Schema: Flattened struct fields appear as top-level properties; objects with dict flatten have additionalProperties: true

Forbidding Extra Properties

Use dict[str, Never] with Flatten to forbid any additional properties not defined in the schema:

from dataclasses import dataclass
from typing import Annotated, Never
from serpyco_rs import Serializer
from serpyco_rs.metadata import Flatten

@dataclass
class StrictPerson:
    name: str
    age: int
    _: Annotated[dict[str, Never], Flatten]  # Forbid extra properties

ser = Serializer(StrictPerson)

# Valid data loads successfully
ser.load({'name': 'John', 'age': 30})
>> StrictPerson(name='John', age=30, _={})

# Extra properties cause validation error
ser.load({'name': 'John', 'age': 30, 'extra': 'field'})
>> SchemaValidationError: [ErrorItem(message='"field" is not of type "Never (no value allowed)"', instance_path='extra')]

# JSON Schema has additionalProperties: false
ser.get_json_schema()
>> {..., 'additionalProperties': False, ...}

JsonSchemaExtension

JsonSchemaExtension allows attaching arbitrary extension fields to the generated JSON Schema via Annotated. Multiple extensions on the same type are merged automatically.

from dataclasses import dataclass
from typing import Annotated
from serpyco_rs import Serializer
from serpyco_rs.metadata import JsonSchemaExtension

@dataclass
class User:
    email: Annotated[str, JsonSchemaExtension({"x-custom-tag": "pii"})]
    name: str

ser = Serializer(User)
schema = ser.get_json_schema()
# schema["components"]["schemas"]["User"]["properties"]["email"]["x-custom-tag"] == "pii"

Extensions only affect JSON Schema output — serialization and deserialization behavior is unchanged.

Custom encoders for fields

You can provide a CustomEncoder with serialize and deserialize functions, or use the serialize_with and deserialize_with annotations.

from typing import Annotated
from dataclasses import dataclass
from serpyco_rs import Serializer
from serpyco_rs.metadata import CustomEncoder

@dataclass
class Foo:
    val: Annotated[str, CustomEncoder[str, str](serialize=str.upper, deserialize=str.lower)]

ser = Serializer(Foo)
val = ser.dump(Foo(val='bar'))
>> {'val': 'BAR'}
assert ser.load(val) == Foo(val='bar')

Note: CustomEncoder has no effect on validation or JSON Schema generation.

Bytes fields

serpyco-rs can load bytes fields as is (without base64 encoding or validation).

from dataclasses import dataclass
from serpyco_rs import Serializer

@dataclass
class Foo:
    val: bytes

ser = Serializer(Foo)
assert ser.load({'val': b'123'}) == Foo(val=b'123')

PEP 695 Support

serpyco-rs supports the type parameter syntax from PEP 695, which was introduced in Python 3.12. This allows you to use a more concise and readable syntax for generic types.

Generic Dataclasses

from dataclasses import dataclass
from serpyco_rs import Serializer

@dataclass
class Container[T]:
    value: T
    items: list[T]

# Usage with concrete type
ser = Serializer(Container[int])

result = ser.dump(Container(value=42, items=[1, 2, 3]))
print(result)
>> {'value': 42, 'items': [1, 2, 3]}

loaded = ser.load({'value': 42, 'items': [1, 2, 3]})
print(loaded)
>> Container(value=42, items=[1, 2, 3])

Type Aliases

from dataclasses import dataclass
from serpyco_rs import Serializer

# New type alias syntax from PEP 695
type StrList = list[str]
type StrKeyDict[T] = dict[str, T]

@dataclass
class Data:
    names: StrList
    values: StrKeyDict[int]

ser = Serializer(Data)

result = ser.dump(Data(names=['alice', 'bob'], values={'a': 1, 'b': 2}))
print(result)
>> {'names': ['alice', 'bob'], 'values': {'a': 1, 'b': 2}}

Getting JSON Schema

serpyco-rs can generate JSON Schema for your dataclasses (Draft 2020-12).

from dataclasses import dataclass
from serpyco_rs import Serializer

@dataclass
class A:
    """Description of A"""
    foo: int
    bar: str

ser = Serializer(A)

print(ser.get_json_schema())
>> {
    '$schema': 'https://json-schema.org/draft/2020-12/schema',
    '$ref': '#/components/schemas/A',
    'components': {
        'schemas': {
            'A': {
                'properties': {
                    'foo': {'type': 'integer'},
                    'bar': {'type': 'string'}
                },
                'required': ['foo', 'bar'],
                'type': 'object',
                'description': 'Description of A'
            }
        }
    }
}

Also, you can configure the schema generation via JsonSchemaBuilder.

from dataclasses import dataclass
from serpyco_rs import Serializer, JsonSchemaBuilder

@dataclass
class A:
    foo: int
    bar: str

ser = Serializer(A)

builder = JsonSchemaBuilder(
  add_dialect_uri=False,
  ref_prefix='#/definitions',
)

print(builder.build(ser))
>> {'$ref': '#/definitions/__main__.A'}

print(builder.get_definitions())
>> {
  "__main__.A": {
    "properties": {
      "foo": {
        "type": "integer"
      },
      "bar": {
        "type": "string"
      }
    },
    "required": [
      "foo",
      "bar"
    ],
    "type": "object"
  }
}

Query string deserialization

serpyco-rs can deserialize query string parameters (MultiDict-like structures), coercing string values to the field types.

from dataclasses import dataclass
from urllib.parse import parse_qsl

from serpyco_rs import Serializer
from multidict import MultiDict

@dataclass
class A:
    foo: int
    bar: str

ser = Serializer(A)

print(ser.load_query_params(MultiDict(parse_qsl('foo=1&bar=2'))))
>> A(foo=1, bar='2')
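To see why a MultiDict-like structure and string coercion are involved, it helps to look at what a parsed query string contains. This standard-library sketch does not use serpyco-rs at all; it only shows that every value arrives as a string and that a repeated key cannot survive conversion to a plain dict.

```python
from urllib.parse import parse_qsl

# Every value is a string, and 'tag' appears twice.
pairs = parse_qsl("foo=1&tag=a&tag=b")
print(pairs)
# [('foo', '1'), ('tag', 'a'), ('tag', 'b')]

# A plain dict keeps only the last value of a repeated key.
print(dict(pairs))
# {'foo': '1', 'tag': 'b'}
```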

Custom Type Support

In serpyco-rs, you can add support for your own types by using the custom_type_resolver parameter and the CustomType class. This allows you to define how your custom types should be serialized and deserialized.

CustomType

The CustomType class is a way to define how a custom type should be serialized and deserialized. It is a generic class that takes two type parameters: the type of the object to be serialized/deserialized and the type of the serialized/deserialized object.

Here is an example of a CustomType for IPv4Address:

from serpyco_rs import CustomType
from ipaddress import IPv4Address, AddressValueError

class IPv4AddressType(CustomType[IPv4Address, str]):
    def serialize(self, obj: IPv4Address) -> str:
        return str(obj)

    def deserialize(self, data: str) -> IPv4Address:
        try:
            return IPv4Address(data)
        except AddressValueError:
            raise ValueError(f"Invalid IPv4 address: {data}")

    def get_json_schema(self) -> dict:
        return {"type": "string", "format": "ipv4"}

In this example, IPv4AddressType is a CustomType that serializes IPv4Address objects to strings and deserializes strings to IPv4Address objects. The get_json_schema method returns the JSON schema for the custom type.

custom_type_resolver

The custom_type_resolver is a function that takes a type as input and returns an instance of CustomType if the type is supported, or None otherwise. This function is passed to the Serializer constructor.

Here is an example of a custom_type_resolver that supports IPv4Address:

def custom_type_resolver(t: type) -> CustomType | None:
    if t is IPv4Address:
        return IPv4AddressType()
    return None

ser = Serializer(MyDataclass, custom_type_resolver=custom_type_resolver)

In this example, the custom_type_resolver function checks if the type is IPv4Address and returns an instance of IPv4AddressType if it is. Otherwise, it returns None. This function is then passed to the Serializer constructor, which uses it to handle IPv4Address fields in the dataclass.

Full Example

from dataclasses import dataclass
from ipaddress import IPv4Address
from serpyco_rs import Serializer, CustomType

# Define custom type for IPv4Address
class IPv4AddressType(CustomType[IPv4Address, str]):
    def serialize(self, value: IPv4Address) -> str:
        return str(value)

    def deserialize(self, value: str) -> IPv4Address:
        return IPv4Address(value)

    def get_json_schema(self):
        return {
            'type': 'string',
            'format': 'ipv4',
        }

# Defining custom_type_resolver
def custom_type_resolver(t: type) -> CustomType | None:
    if t is IPv4Address:
        return IPv4AddressType()
    return None

@dataclass
class Data:
    ip: IPv4Address

# Use custom_type_resolver in Serializer
serializer = Serializer(Data, custom_type_resolver=custom_type_resolver)

# Example usage
data = Data(ip=IPv4Address('1.1.1.1'))
serialized_data = serializer.dump(data)  # {'ip': '1.1.1.1'}
deserialized_data = serializer.load(serialized_data)  # Data(ip=IPv4Address('1.1.1.1'))

Date and time formats

serpyco-rs parses and emits datetime.datetime, datetime.date, and datetime.time using the RFC 3339 profile of ISO 8601 (via the speedate crate). Strings shorter or longer than RFC 3339 (e.g. Python's isoformat() with a space separator, or trailing nanoseconds) are accepted with the rules below; output is always RFC 3339.

`datetime.datetime`

Load accepts a string in the form YYYY-MM-DDTHH:MM:SS[.ffffff][Z|±HH:MM]:

Input	Result
`2022-10-10T14:23:43`	naive `datetime(2022, 10, 10, 14, 23, 43)`
`2022-10-10T14:23:43.123456`	naive with microseconds
`2022-10-10T14:23:43Z`	aware, `tzinfo=UTC`
`2022-10-10T14:23:43+01:00`	aware, fixed offset `+01:00`
`2024-04-02T12:21:53.725421224`	sub-microsecond digits are truncated to microseconds (`725421`)

Dump produces an RFC 3339 string. Aware datetimes are written with an explicit Z or ±HH:MM offset; naive datetimes are written without one. Pass Serializer(datetime, naive_datetime_to_utc=True) to force naive datetimes to be emitted as UTC (...Z).

`datetime.date`

Load accepts YYYY-MM-DD. Dump produces YYYY-MM-DD. Dumping a datetime into a date field drops the time component:

Serializer(date).load('2022-10-14')                  # date(2022, 10, 14)
Serializer(date).dump(datetime(2022, 10, 13, 12, 34, 56))  # '2022-10-13'

`datetime.time`

Load accepts HH:MM, HH:MM:SS, HH:MM:SS.ffffff, optionally followed by Z or ±HH:MM:

Input	Result
`12:34`	`time(12, 34)`
`12:34:56`	`time(12, 34, 56)`
`12:34:56.000078`	`time(12, 34, 56, 78)`
`12:34:57.000095987`	sub-microsecond digits truncated (`microsecond=95`)
`12:34:56.000078+03:00`	aware time with fixed `+03:00` offset

Dump produces HH:MM:SS (or HH:MM:SS.ffffff when microseconds are non-zero), with Z / ±HH:MM appended for aware times.
