Can buffers be efficiently allocated ahead of time? #1880
I think either 1 or 2 are the best options. If you don't care about having this buffer sit around in memory, then you can use a persistent buffer that stays valid until the next call and reuse it.

Option 2 would be nice and is definitely feasible, but it does take some work. I don't think the callbacks would be too expensive.

I really hope to add support for option 3 soon, but it would just give an error if the buffer was too small, so you'd then have to allocate a larger buffer and call serialize again, which would be slow.

Option 4 would probably require serializing the data twice: once into a temporary buffer to get the size, and then a second time into the target buffer. This would be half the speed.
Glaze doesn't have a way to compute the buffer size needed without serializing, so while this in theory might be better in edge cases, it isn't generally a good idea. A primary reason is that if you call your serialize function more than once, it just makes sense to reuse the buffer. Glaze is designed to make reusing buffers easy and fast, because a buffer resized by the first serialization will often contain enough capacity for future serialization passes.

On a side note, I'm currently working on a C shared library interface for Glaze that provides type information so that other languages can directly access memory from C++ standard library structures. For example, it enables accessing std::string buffers from C++ across a C interface. This will be able to support lots of languages, but I'm currently focused on interfacing directly with Julia. What programming language are you loading your C++ shared library into?
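To make the reuse point concrete, here is a minimal sketch (mine, not from the thread). It assumes a recent Glaze where `glz::write_json(value, buffer)` writes into a caller-owned `std::string` and returns an error context, and where a plain aggregate like the made-up `Telemetry` type is reflected automatically:

```cpp
#include <glaze/glaze.hpp>
#include <string>
#include <vector>

struct Telemetry // made-up example type
{
   int id{};
   std::vector<double> samples{};
};

int main()
{
   std::string buffer{}; // allocated once, reused below
   Telemetry t{.id = 0, .samples = {1.0, 2.0, 3.0}};

   for (int i = 0; i < 1000; ++i) {
      t.id = i;
      // Writes into `buffer`, growing it only if needed. After the first call
      // the capacity usually suffices, so later calls don't reallocate.
      if (glz::write_json(t, buffer)) {
         return 1; // serialization error
      }
      // ... use buffer.data() / buffer.size() ...
   }
}
```

Only the first iteration typically pays for reallocation; every later pass writes into capacity that already exists.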
Thanks for the help, Stephen!
That's an interesting idea! Basically just keep the buffer valid until the next call.
I think I see your point. There is a theoretical world where dynamically resizing a buffer is slower than a check-and-allocate approach the first time both are run. However, assuming the buffer is reused, the average serialization time will be faster with dynamic resizes, since resizing will almost never happen, whereas a check-and-allocate approach will need to process the data twice every time it's run.

This might be highlighting our different usage patterns. For something like a message-passing library, this makes tons of sense--the buffer sizes are small, and reused frequently. For my case, I'm editing a large set of data and saving/loading it infrequently. I can't keep the buffers around, because the memory consumption would be too high. I'm guessing, given this, that option 2 is the best, so I'll gravitate towards that for now. Thanks for the help!
I'm using C++ on both sides, haha. I just have the C bottleneck between them to avoid C++ ABI dependencies. I'd probably not use your library for STL structures, but if it also acts as a general structure descriptor, that might be super useful for me to check that structures are binary-compatible across different plugins! Is the code visible anywhere?
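For reference, the "keep the buffer valid until the next call" idea acknowledged above would look roughly like the sketch below across a C bottleneck. Every name here (`Scene`, `scene_to_json`, the thread-local ownership policy) is hypothetical, and it assumes the same `glz::write_json(value, buffer)` overload as before. Note this keeps the buffer resident in the library, which is exactly the memory trade-off discussed in this reply:

```cpp
// scene_serializer.cpp, compiled into the shared library (hypothetical names).
#include <glaze/glaze.hpp>
#include <cstddef>
#include <string>

struct Scene { int version{}; /* ... */ };

extern "C" {

// Returns a pointer into a buffer owned by the library. The pointer and size
// stay valid only until the next call on the same thread, so the caller must
// copy or finish consuming the bytes before calling again.
const char* scene_to_json(const Scene* scene, std::size_t* out_size)
{
   thread_local std::string buffer; // reused across calls, capacity retained
   if (glz::write_json(*scene, buffer)) {
      *out_size = 0;
      return nullptr; // serialization failed
   }
   *out_size = buffer.size();
   return buffer.data();
}

} // extern "C"
```

The caller treats the returned pointer as borrowed memory: copy it (or finish writing it to disk) before the next `scene_to_json` call on that thread.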
I've got a struct I want to serialize, but it lives in a dynamic library (.dll, .so). The library exposes this `Serialize()` function as part of a C API, so while the function itself can use whatever C++ goodies I want, the function parameters can't use things like templates. From what I know, this gives me 4 options with Glaze:

1. Have the library allocate the output and return it as a `const char*`. This means the string has to be managed somehow, adding complexity to the interaction between the library and the caller.
2. Let the library write directly into the caller's buffer, using callbacks to allocate/resize it as needed.
3. Pass a `const char*` and a size to the library. Glaze then gracefully handles the cases where the buffer is too small, and this is reported to the user. Beyond depending on functionality I don't think Glaze has, this also doesn't answer the question of what to do if the buffer is too small--just double its capacity until the function succeeds?
4. Compute the required size ahead of time, allocate exactly that much, and then serialize into it.

I'm leaning towards (2) since only it and (1) are available, and (2) feels safer and more performant (can use the target buffer directly). I'm wondering, however, how viable (3) and (4) would be long-term. Is there a world where the cost of going through the struct twice (once to compute the buffer needed; once to actually serialize) is cheaper than the cost of checking when the buffer needs to be resized, multiple reallocations, and all the `memcpy` calls? If so, not only would this be a boon for performance, it would also unlock a way to easily make C APIs!
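Since option 2 is the direction the thread converges on, here is a minimal sketch (my own, not from Glaze or the thread) of what that C interface could look like. All names (`Document`, `serialize_document`, `grow_fn`) are hypothetical, and it assumes a recent Glaze where `glz::write_json(value, buffer)` fills a `std::string` and returns an error context. Because Glaze doesn't yet drive caller callbacks directly (the reply above notes option 2 "does take some work"), this version still serializes into an internal scratch buffer and performs one copy into caller-owned memory; a real option-2 implementation would let Glaze write through the callback-grown buffer directly and skip that copy:

```cpp
#include <glaze/glaze.hpp>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>
#include <vector>

struct Document { int id{}; std::vector<double> values{}; }; // made-up type

// Caller-supplied callback: given a required size, return a writable region of
// at least that many bytes (e.g. by resizing a container the caller owns).
using grow_fn = char* (*)(void* user_data, std::size_t required_size);

extern "C" {

// Serializes *doc into memory owned by the caller.
// Returns the number of bytes written, or 0 on failure.
std::size_t serialize_document(const Document* doc, grow_fn grow, void* user_data)
{
   thread_local std::string scratch; // reused; ideally Glaze would write straight through `grow`
   if (glz::write_json(*doc, scratch)) {
      return 0; // serialization error
   }
   char* dst = grow(user_data, scratch.size());
   if (!dst) return 0;
   std::memcpy(dst, scratch.data(), scratch.size());
   return scratch.size();
}

} // extern "C"

// Caller side (also C++ here, matching the thread): the target buffer is a
// plain std::vector that the callback resizes on demand.
int main()
{
   std::vector<char> buffer;
   auto grow = [](void* user_data, std::size_t n) -> char* {
      auto& buf = *static_cast<std::vector<char>*>(user_data);
      buf.resize(n);
      return buf.data();
   };
   Document doc{.id = 7, .values = {1.5, 2.5}};
   const std::size_t n = serialize_document(&doc, grow, &buffer);
   std::printf("%.*s\n", static_cast<int>(n), buffer.data());
}
```

The interface keeps the C boundary free of templates and STL types while letting the caller decide how and where the output memory lives, which is the "safer and more performant" property option 2 is after.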