Expose structure of serialized state #148

pcarrott · 2025-10-21T15:29:58Z

State_intf now exposes the definition of the serialized type, including global information and the serialized_atom type, which are both needed for RUXt. The serialized structures had to be factored out of the corresponding Make functor for modules Freeable, Tree_block and Rtree_block.

N1ark · 2025-10-21T17:03:24Z

soteria-rust/lib/state_intf.ml

+
+  type serialized = (serialized_atom list, sloc Typed.t list) Template.t
+
+  and serialized_atom =
+    sloc Typed.t
+    * (Sptr.t Rtree_block.serialized_val, sint Value.t) Tree_block.serialized
+      Freeable.serialized
+


We can't do this -- if we want to add a new state model for polymorphic analysis the state shape will be different. Instead, in State.mli add with type serialized = ..., to expose the signature for that one module, and then your implementation will just rely on the fact that's the state model used, which is ok

N1ark · 2025-10-21T17:04:37Z

soteria-rust/lib/state_intf.ml

+module Template = struct
+  type ('a, 'b) t = { heap : 'a; globals : 'b }
+  [@@deriving show { with_path = false }]
+end


also this doesn't necessarily apply to all states either, so define it within state.ml and expose it in state.mli if needed

N1ark · 2025-10-21T17:08:21Z

soteria-rust/lib/state.ml

+and serialized_globals = T.sloc Typed.t list
+[@@deriving show { with_path = false }]
+
+let serialize_globals globals : serialized_globals =
+  ListLabels.fold_left (GlobMap.bindings globals) ~init:[]
+    ~f:(fun acc (_, ((ptr : Sptr.t), _)) -> Typed.Ptr.loc ptr.ptr :: acc)


this doesn't make sense to me -- just knowing "a global" is in a pointer isn't enough; what you should have is probably serialized_global = Charon.Types.global_decl_id * T.sloc Typed.t.
Sidenote, I'm not sure why you have serialized_globals as a list rather than serialized_global and then a list of those, like with serialized_atom

Should glob_map be an instance of Pure_fun and then serialize comes for free?

hmmm that'd require changing the state to be PMap (Product (Pure_fun (GlobalThing)) (Freeable (TreeBlock))) i think, which could be done but id rather avoid it, given i have plans of putting Freeable (TreeBlock) into a With_info thing that tracks what kind of allocation it is like here
https://github.com/rust-lang/rust/blob/be0ade2b602bdfe37a3cc259fcc79e8624dcba94/compiler/rustc_middle/src/mir/interpret/mod.rs#L260-L276

N1ark · 2025-10-21T17:11:48Z

soteria-rust/lib/state.ml

+type serialized =
+  (serialized_atom list, serialized_globals) State_intf.Template.t


Rather than State_intf.Template.t, with { heap; globals } i would much rather have

type serialized = | Heap of serialized_atom | Global of serialized_global

We'll probably need more information later on: for treeborrows maybe, for pointer decays (mapping a location to it's exposed integer value), for functions (mapping a function reference to a pointer? though i have other plans there anyways), and for errors (if the pre/post condition imply unwinding, what the error is)

this also corresponds much more closely to work on CSE with core predicates, so I'd prefer we stick to what we know works :P

I believe this change is directly inspired by a similar thing in Coteria. A serialized heap need not be decomposed in disjunctions of things? As long as I can produce/consume it it's fine

But, now that we're discussing this, maybe Opale is correct here, in that we have discovered with Pedro's work that we want atoms of the assertion to be as small as possible as to enable feasible matching plans

to me it never made sense to have more than an atom is serialized -- it makes using them harder; it should just be Missing of serialized list list, where serialized is just atoms and is built to be kept small :P

my intuition comes mostly from using core predicates within Gillian but i don't see a reason to complexify it

Efficiency. It might be that knowing at one time what you are going to produce/consume is cheaper than consuming atom by atom.

E.g. if I know that I'm going to produce a whole sub-tree, or maybe a whole object into memory, it's cheaper to do it in one step rather than doing it bit by bit and re-going through pmap and treeblock operations for each bit

fair enough :) but in that case id rather this only be done where it's relevant (PMap and PList only) and nowhere else; e.g. there is no advantage in producing globals and heap data "together" vs splitting it into two

also to ease processing i wonder if as a utility function in e.g. the matching plan one can have some flatten : serialized -> serialized list and merge : serialized list -> serialized list that e.g. converts (loc, [ ser1; ser2; ser3 ])to [(loc, [ ser1 ]); (loc, [ ser2 ]); (loc [ ser3 ])] and inversely (purely syntactically for the merge) -- maybe Pedro already has this

N1ark · 2025-10-21T17:14:21Z

soteria-rust/lib/state.ml

+  let iter_globals =
+    let append acc x = Iter.append acc (Typed.iter_vars x) in
+    List.fold_left append Iter.empty s.globals
+  in


instead of using list and calling Iter.append a bunch of times (bad), do Iter.of_list globals |> Iter.map ...

N1ark · 2025-10-21T17:16:32Z

soteria-rust/lib/state.ml

+let lift_fix_globals globals res =
+  let+? heap = res in
+  State_intf.Template.{ heap; globals = serialize_globals globals }


also here youre adding all globals to all fixes raised -- i don't think this is normal? i wouldve expected globals = [], since you're lifting the fixes, not adding globlals which already exist (ie. producing them would be a no-op)

instead (with the change im recommending), youd just have let+? heap_fixes = res in List.map (fun f -> Heap f) heap_fixes, i think

N1ark · 2025-10-21T17:20:05Z

soteria/lib/sym_states/freeable.ml

-  let subst_serialized subst_inner subst_var = function
-    | Freed -> Freed
-    | Alive a -> Alive (subst_inner subst_var a)
-
-  let iter_vars_serialized iter_inner serialized f =
-    match serialized with Freed -> () | Alive a -> iter_inner a f
-
-  let pp = pp
-  let pp_serialized = pp
-


another thing :p i'm not sure these should be moved out of the functor; they could be but really im not sure why you'd need it; instead you need to e.g. use State.Freeable.subst_serialized, which will be the correct instantiated module

N1ark · 2025-10-21T17:21:10Z

soteria/lib/sym_states/tree_block.ml

-    List.map subst_atom serialized
+    subst_serialized subst_var serialized

-  let iter_vars_serialized serialized f =


same comment here -- i dont think any of these definitions should be moved out of the functor, just use the right module instead

giltho · 2025-10-21T17:37:54Z

@pcarrott could you please provide context to Opale as to why the types need to be extracted from the functors?
I remember there was a circular dependency somewhere that would prevent you from using the instantiated functors

pcarrott · 2025-10-21T18:12:40Z

soteria-rust/lib/state_intf.ml

+  type serialized = (serialized_atom list, sloc Typed.t list) Template.t
+
+  and serialized_atom =
+    sloc Typed.t
+    * (Sptr.t Rtree_block.serialized_val, sint Value.t) Tree_block.serialized
+      Freeable.serialized


We need to extract the types from the functors because otherwise I can't write this in the include line in the .mli file. Freeable.serialized, Tree_block.serialized and Rtree_block.MemVal.serialized are only accessible after their respective functor application, which is done internally in state.ml. Extracting them from the functors makes it possible to access the serialized structures when I don't have knowledge of how the functor was applied (as is the case in the .mli). Maybe an alternative could be to provide each of those modules in the .mli file where the signature includes those serialized structures. That way the general interface stays the same.

Unfortunately, that is still not enough because Sptr.t is only accessible after the include. I can't access Sptr in the .mli, which is why I need the type to be defined in the interface. Again, an alternative here would be to define an Sptr module in the .mli, but in this case I think I would have to redefine the interface that Sptr has, if I'm thinking correctly.

Basically, as is, I can't expose the serialized structure in the .mli in a clean way. I'm honestly alright with adding the modules Freeable, Tree_block and MemVal to the .mli (along with their serialized types), but the Sptr module is the main issue.

Why does this not work ? 2fef10c

(the example doesn't compile bc i was lazy but supposedly the interface works fine -- maybe i missed sth)

I get the idea. My main issue with this is that ArithPtr is being hardcoded, which I was trying to avoid as I should probably only rely on the Sptr interface. But if that just sounds like overthinking of my part, then that should do the work, thanks!

i mean you're hardcoding the entire serialised structure anyways so adding Sptr doesn't matter much -- this is a particular module that is made for ArithPtr, not State_intf, which is the general one

it's ok to specialise State because all the engine relies on State_intf, it just allows to expose information for your particular tool

Alright, thanks! I'll make things work like this for RUXt then, so I guess this PR can be ignored for now

sure! if you want to edit this / re-open a PR to expose serialized in State directly I'm happy to merge, there's no harm in that

giltho · 2025-10-21T21:11:44Z

The merging of this PR will happen after PLDI anyway, but I'll try to understand precisely what Pedro needs and why it didn't work for Opale, and we'll make that PR work :)

pcarrott added 2 commits October 21, 2025 16:17

expose serialized structure in state interface

bf42e06

fix printing of serialized blocks

de8fc0c

pcarrott requested review from N1ark and giltho as code owners October 21, 2025 15:29

N1ark requested changes Oct 21, 2025

View reviewed changes

pcarrott commented Oct 21, 2025

View reviewed changes

N1ark added enhancement New feature or request soteria-rust Issues related to soteria-rust labels Nov 11, 2025

		type serialized =
		(serialized_atom list, serialized_globals) State_intf.Template.t

Expose structure of serialized state #148

Are you sure you want to change the base?

Expose structure of serialized state #148

Uh oh!

Conversation

pcarrott commented Oct 21, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

giltho Oct 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

N1ark Oct 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

giltho commented Oct 21, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

N1ark Oct 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

giltho commented Oct 21, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

giltho Oct 21, 2025 •

edited

Loading

N1ark Oct 21, 2025 •

edited

Loading

N1ark Oct 21, 2025 •

edited

Loading