Infrastructure for saving/loading hls4ml models #1158

vloncar · 2024-12-22T22:51:34Z

Description

Adds the ability to save and load hls4ml models (serialize/deserialize them). Given a ModelGraph, this will serialize it in a a single file which can be loaded at a later stage. The saved model doesn't depend on the original Keras/PyTorch/ONNX model in any way.

The feature is in part inspired by Keras' model saving feature. The main format used for serialization is JSON, all objects save their state in dictionaries which are serialized into JSON. Assuming disk space is not a problem, generated JSON is nicely formatted during writing to file. No objects are pickled, as this is way too unsafe. The numpy arrays (weights) are saved in npz format. We save model graph (list of layers), the model information and config into separate files. This (along with some versioning information is packaged into a .fml file, which is just a .tar.gz with a different name.

Internally, this works by adding a few methods to types, quantizers, layers and model graph itself. The interface is defined by the Serializable class. Classes would typically implement serialize_state() method, which should return a dictionary of current state of the object. Additionally, there's also a serialize_class_name() which is needed to know what instance are we saving, but most classes won't need to deal with this. Deserialization is done with a class method deserialize(). To support this feature some restructuring had to be done. ModelGraph has been intended to be created only with a layer list from a converter, which is not compatible with (de)serialization, so it was split into initialization of empty ModelGraph and conversion of layer list from converters to Layer objects. Furthermore, Layer's initialization has to be skipped, as we're basically restoring a state post-initialization. Types and quantizers are more straightforward to save/load. Loaded model should be indistinguishable from the original, but there may be some corner cases of some hacks of internal state of layers (or partially optimized models) not working on loaded models, we can catch these over time. But for "final" models (one you're happy enough with to call write()/compile()/build() on) saving/loading should always work.

One somewhat ugly part in the current implementation is that due to the creation of dynamic wrapper classes, we cannot directly deserialize to them, instead we create the original types, and have to run <backend>:specific_types optimizer to truly get an object that is identical to the original one. Running that optimizer for a given backend looks a bit hacky, but is ok for now since all backends have an optimizer by that name.

Type of change

New feature (non-breaking change which adds functionality)

Tests

Included is a test in test_serialization.py that tests saving/loading QKeras and QONNX models. These cover serialization of most types and quantizers that can appear in a model, but obviously not all possible layers. Maybe a more thorough test would be to extend most existing tests to save and load a model and then continue working with a loaded model. But I'll leave that to a future PR.

Checklist

I've done all the usual checks prior to opening this PR.

jmitrevs · 2025-01-10T15:38:05Z

Even though the running time exceeded the limits, there were failures in test_serialization beforehand.

JanFSchulte · 2025-01-10T15:40:35Z

Even though the running time exceeded the limits, there were failures in test_serialization beforehand.

Indeed. I am just rerunning the tests to see which part is taking so long, and the QONNX test runs really fast, so it is the serialization test itself that is very slow.

jmitrevs · 2025-01-10T16:02:02Z

Running locally on my linux machine the serialization tests are pretty quick (1-2min), but I get:

FAILED test_serialization.py::test_qkeras_model[io_stream-oneAPI] - subprocess.CalledProcessError: Command 'make lib' returned non-zero exit status 2.
FAILED test_serialization.py::test_qonnx_model[oneAPI] - subprocess.CalledProcessError: Command 'make lib' returned non-zero exit status 2.
FAILED test_serialization.py::test_qkeras_model[io_parallel-oneAPI] - subprocess.CalledProcessError: Command 'make lib' returned non-zero exit status 2.

The exact failure is:

icpx: error: fpga compiler command failed with exit code 14 (use -v to see invocation)

I will investigate. One thing that is kind of annoying is that I get more failures on my mac since ap_math doesn't really support clang:

firmware/ac_math/include/ac_math/ac_pow_pwl.h:300:70: error: typedef 'pit_t' cannot be referenced with a class specifier
  300 |     typedef class comp_pii_exp<W, I, S, n_frac_bits + extra_f_bits>::pit_t input_inter_type;
      |                                                                      ^

But that's unrelated to this PR.

…o run

bo3z

First round of reviews of the saving/loading infrastructure of hls4ml models. Generally, looks very good - most of the comments are minor with some questions for my better understanding. Two high-level questions / comments, for here as well:

Would it be better to name the methods save_model and load_model? This is closer to what Keras does and maybe a bit clearer to most users?
At what stage can a model be saved? After conversion but before calling .compile(...) or also after compilation? Most of the examples/tests I've seen in this PR are before compilation. Generally, it would be useful to be able to save/load the model after compilation (e.g. I converted the model on my local machine, compiled it to verify accuracy with predict(...) and then saved it, to synthesize it on a remote node). This is trickier though and can open up non-clear issues....e.g, if the CPU ISA is different between nodes the comiled hls4ml library is useless. But also useful, since compilation can sometimes take long (especially with lots of code generation).

test/pytest/test_serialization.py

bo3z · 2025-04-07T14:38:44Z

hls4ml/model/graph.py

+        config.pop('OnnxModel', None)
+        config.pop('PytorchModel', None)
+
+        # Much of this may not be needed and is already in 'config' dict but is kept here to be sure


This seems a bit error-prone, because if there's a new attribute added to HLS config and it's not reflected in these two functions, it could fail later on. I guess there are three ways to handle this:

As the comment suggests, assume most (all?) of the variables are reflected in config, so the others are simply fail-safe mechanisms... do we know which of this variables aren't directly stored in config?

Find a way to implicitly iterate through all the internal member variables of HLSConfig and store them to state. This seems like something that's doable in Python

Ignore for now, as it doesn't seem the HLS config will change a lot in the (near) future.

Another one of those cases of "if only we had better config infrastructure" 😄

The problem we have is that there's the initial config dict, and the current config class. Then there may be code that uses one or the other, so we need to save both. Your second solution is error prone too, as it would assume we know how to handle all possible values of internal members so we can save them.

In another project we're investigating Pydantic for for config schemas, maybe we can apply that to hls4ml in the future. For now I feel we could maybe make it clear to future developers that if they add new internal members they need to also follow up with serialization functions.

bo3z · 2025-04-07T14:42:30Z

hls4ml/model/graph.py

-        self._applied_flows = []
+    @classmethod
+    def from_layer_list(cls, config_dict, layer_list, inputs=None, outputs=None):
+        def _find_output_variable_names(layer_list, layer_names):


Just a personal preference - but I am not the biggest fan of functions inside of functions unless absolutely necessary. Is there a reason this function got moved?

No other function uses it and it doesn't depend on self. If it was separate it would also have to be a classmethod, but it felt weird to have a class method intended only for internal use. I actually considered moving it to a separate utility file, but that also felt like too much given that it really is used only once.

bo3z · 2025-04-07T14:43:57Z

hls4ml/model/graph.py

+    @classmethod
+    def deserialize(cls, state):
+        raise Exception(
+            '{cls.__name__} is not intended to be deserialized directly. Use {cls.__name__}.from_saved_state instead.'


Is this missing a string formatter? f'...{}...'

indeed it does

bo3z · 2025-04-07T14:47:51Z

hls4ml/model/quantizers.py

@@ -76,6 +90,10 @@ def __call__(self, data):
        ones = np.ones_like(data)
        return np.where(data > 0.5, ones, np.where(data <= -0.5, -ones, zeros))

+    def serialize_state(self):
+        state = {}


Minor, but not sure if I am missing something obvious here: the BinaryQuantizer has a state but the TernaryQuantizer doesn't? In my understanding of the serialization, neither should have the state? Or if yes, both?

This is a weird one... The "state" here is simply the choice of the number of bits used (1 or 2). For ternary we will always use 2 bits so there's no state to save. State is something you will need to pass to __init__(...) to recreate the object.

bo3z · 2025-04-07T14:52:05Z

hls4ml/model/types.py

@@ -588,5 +731,16 @@ def __init__(self, code):
    def __str__(self):
        return str(self.code)

+    def serialize_class_name(self):
+        cls = self.__class__


Why is this function re-implemented here? To me it seems the same as the one in the Serializable class?

It's different. The wrangling of the name is needed because types are created dynamically (belonging to a backend). Source isn't one of them

bo3z · 2025-04-07T14:53:10Z

hls4ml/utils/serialization.py

+from .._version import version
+
+
+def serialize_model(model, file_path):


Could we add a short docstring to this function? Could be a copy of the docs, but still nice to have since it's a user-facing function.

Sure I'll do that

vloncar · 2025-04-08T17:48:58Z

For the general comments:

Can be done. I actually tried not to copy Keras naming convention, but if that's what people prefer, I'm fine with renaming the function
You can call it afterwards. We save the IR, not what the IR ultimately generates (writes as a project). But we ensure that what is written is the same whether the model was serialized or not. write() doesn't change the internal state, so you're safe to call it many times at whichever point you want. compile() will call write(), then compile to create the .so and link it. This .so and the subsequent link to python runtime cannot be saved as they are not portable at all. Even on the same CPU type, but different OS family. For example on a desktop/laptop people would use Ubuntu or something else modern, and then move it to server for synthesis which is likely to run RHEL or clone. The compiled .so won't work and will need recompilation. Luckily you can be sure that when you call write(), compile() or build() you that first stage of writing the project to disk will result in the the same thing as the original model.

vloncar · 2025-04-29T20:42:13Z

Now also includes an option to "convert" from an existing project written on disk which we can argue is a form of model loading. Adds a function convert_from_existing_project() that returns a special type of ModelGraph where only the compile(), predict() and build() can run. This is useful if you want to use python integration after the project already exists. I used this "feature" for a while for development, as it allows one to generate an HLS project, manually make changes to it, and then go back to python to test stuff with predict() without the use of C++ testbench (which let's be honest, is a cumbersome way to test out the project). The person that asked for this to be an official feature uses it for testing in a separate environment that doesn't have ML frameworks

JanFSchulte · 2025-04-29T20:51:51Z

That looks pretty cool! I'm just wondering if convert_from_existing_project() is the best name for the function, as no actual conversion is happening IIUC this is not a way to modify the project, just to be able to re-run these functions. Maybe load_existing_project() would be a bit more clear on the scope.

vloncar · 2025-04-29T20:59:07Z

I didn't know where to put it, in converters or in utilities. If it is in converter, then it should be a convert_from. Also not sure where to put the FilesystemModelGraph class definition either. Feels like a very weird appendage currently in graph.py. We need to converge on the naming of functions in this PR anyway, so let's come up with a solution to all of this. Otherwise the PR is ready.

jmitrevs · 2025-04-29T22:30:18Z

Should we maybe rename serialize_model to save_model and convert_from_existing_project load_model? Or something similar? One could also potentially have hls_model.save(filename) too.

vloncar · 2025-04-30T09:40:42Z

Currently there is serialize()/deserialize() that work with a specific format that contains all information about a model in a .fml file. Then there is convert_from_existing_project() which creates a limited graph from existing project created with write() (full reconstruction is not possible). Calling that feature load_model() doesn't feel right. What would we do with deserialize() in that case?

I thought about renaming the main functions as export()/import() but it still feels that load_model for existing project is out of place as there's no equivalent save() or save_model().

JanFSchulte · 2025-04-30T12:09:03Z

I like export() and import() for the proper saving/loading of models. For the other case, I'd still go with something like load_project() to make sure that it allows to interact with an HLS project and not an hls4ml model. This specific functionality, including the FilesystemModelGraph could live in a hls4ml/utils/projects.py, or similar.

vloncar · 2025-04-30T13:43:34Z

But load_model() doesn't load a "model", it creates a model that links to an existing hls project. link_project() or link_existing_project() may make sense too (then I'd put the whole feature under hls4ml/utils/link.py). I'd like to avoid having the word "HLS" or "hls" in whatever name we end up with.

JanFSchulte · 2025-04-30T13:47:34Z

But load_model() doesn't load a "model", it creates a model that links to an existing hls project.

Yeah, that's why I proposed load_project() ;) load_existing_project() would be even more clear I think. Maybe link is even more precise than load, but might be a bit harder to comprehend what it's for. Either would be fine with me.

vloncar · 2025-05-02T17:57:46Z

I simplified the API a bit: Saving is done with ModelGraph.save() (as suggested by Jovan), loading with hls4ml.converters.load_saved_model(). The existing project can be linked with hls4ml.converters.link_existing_project(). That way all user-facing model-creation calls are in the same place and all model manipulation calls are on the model object, no need for callinfg functions from other parts of the API.

JanFSchulte · 2025-05-02T18:13:12Z

Thanks! I think this is ready for merge, except that the documentation needs to be updated to reflect this latest change to the interface.

Infrastructure for saving/loading hls4ml models

21c30b6

vloncar added the please test Trigger testing by creating local PR branch label Dec 22, 2024

bo3z added this to the v1.1.0 milestone Jan 7, 2025

JanFSchulte added please test Trigger testing by creating local PR branch and removed please test Trigger testing by creating local PR branch labels Jan 7, 2025

vloncar added 4 commits April 1, 2025 18:42

Merge branch 'main' into serialization

76dc49a

Handle serialization of oneAPI types correctly

f1af921

Remove oneAPI from qonnx serialization test since it takes too long t…

b06e016

…o run

Fix oneAPI type transformation

5a59d32

vloncar force-pushed the serialization branch from a36d007 to 5a59d32 Compare April 1, 2025 19:13

vloncar added please test Trigger testing by creating local PR branch and removed please test Trigger testing by creating local PR branch labels Apr 1, 2025

Merge remote-tracking branch 'upstream/main' into serialization

29c0f5a

calad0i mentioned this pull request Apr 5, 2025

Support for multi graph build #1174

Merged

7 tasks

bo3z requested changes Apr 7, 2025

View reviewed changes

bo3z added please test Trigger testing by creating local PR branch and removed please test Trigger testing by creating local PR branch labels Apr 7, 2025

bo3z modified the milestones: v1.1.0, v1.2.0 Apr 7, 2025

Add serialization doctrings

adcbd6c

vloncar force-pushed the serialization branch from eeb83f6 to adcbd6c Compare April 8, 2025 18:15

vloncar added 3 commits April 25, 2025 20:29

Merge remote-tracking branch 'upstream/main' into serialization

80134ac

Support for pseudo-deserialization from existing projects

42c8183

Merge remote-tracking branch 'upstream/main' into serialization

110c87e

vloncar added 2 commits May 2, 2025 19:54

Finalize saving/loading API

889d3c1

Merge remote-tracking branch 'upstream/main' into serialization

9169470

Update docs on saving/loading/linking

2cd79b9

JanFSchulte approved these changes May 3, 2025

View reviewed changes

bo3z self-requested a review May 5, 2025 20:50

bo3z approved these changes May 5, 2025

View reviewed changes

bo3z merged commit 631332c into fastmachinelearning:main May 5, 2025
5 checks passed

		from .._version import version


		def serialize_model(model, file_path):

Infrastructure for saving/loading hls4ml models #1158

Infrastructure for saving/loading hls4ml models #1158

Uh oh!

Conversation

vloncar commented Dec 22, 2024

Description

Type of change

Tests

Checklist

Uh oh!

jmitrevs commented Jan 10, 2025

Uh oh!

JanFSchulte commented Jan 10, 2025

Uh oh!

jmitrevs commented Jan 10, 2025

Uh oh!

bo3z left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

vloncar commented Apr 8, 2025

Uh oh!

vloncar commented Apr 29, 2025

Uh oh!

JanFSchulte commented Apr 29, 2025

Uh oh!

vloncar commented Apr 29, 2025

Uh oh!

jmitrevs commented Apr 29, 2025

Uh oh!

vloncar commented Apr 30, 2025

Uh oh!

JanFSchulte commented Apr 30, 2025

Uh oh!

vloncar commented Apr 30, 2025

Uh oh!

JanFSchulte commented Apr 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

vloncar commented May 2, 2025

Uh oh!

JanFSchulte commented May 2, 2025

Uh oh!

Uh oh!

Uh oh!

JanFSchulte commented Apr 30, 2025 •

edited

Loading