-
Notifications
You must be signed in to change notification settings - Fork 6
Expose structure of serialized state #148
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
|
||
| type serialized = (serialized_atom list, sloc Typed.t list) Template.t | ||
|
|
||
| and serialized_atom = | ||
| sloc Typed.t | ||
| * (Sptr.t Rtree_block.serialized_val, sint Value.t) Tree_block.serialized | ||
| Freeable.serialized | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't do this -- if we want to add a new state model for polymorphic analysis the state shape will be different. Instead, in State.mli add with type serialized = ..., to expose the signature for that one module, and then your implementation will just rely on the fact that's the state model used, which is ok
| module Template = struct | ||
| type ('a, 'b) t = { heap : 'a; globals : 'b } | ||
| [@@deriving show { with_path = false }] | ||
| end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also this doesn't necessarily apply to all states either, so define it within state.ml and expose it in state.mli if needed
| and serialized_globals = T.sloc Typed.t list | ||
| [@@deriving show { with_path = false }] | ||
|
|
||
| let serialize_globals globals : serialized_globals = | ||
| ListLabels.fold_left (GlobMap.bindings globals) ~init:[] | ||
| ~f:(fun acc (_, ((ptr : Sptr.t), _)) -> Typed.Ptr.loc ptr.ptr :: acc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this doesn't make sense to me -- just knowing "a global" is in a pointer isn't enough; what you should have is probably serialized_global = Charon.Types.global_decl_id * T.sloc Typed.t.
Sidenote, I'm not sure why you have serialized_globals as a list rather than serialized_global and then a list of those, like with serialized_atom
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should glob_map be an instance of Pure_fun and then serialize comes for free?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmmm that'd require changing the state to be PMap (Product (Pure_fun (GlobalThing)) (Freeable (TreeBlock))) i think, which could be done but id rather avoid it, given i have plans of putting Freeable (TreeBlock) into a With_info thing that tracks what kind of allocation it is like here
https://github.com/rust-lang/rust/blob/be0ade2b602bdfe37a3cc259fcc79e8624dcba94/compiler/rustc_middle/src/mir/interpret/mod.rs#L260-L276
| type serialized = | ||
| (serialized_atom list, serialized_globals) State_intf.Template.t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than State_intf.Template.t, with { heap; globals } i would much rather have
type serialized =
| Heap of serialized_atom
| Global of serialized_globalWe'll probably need more information later on: for treeborrows maybe, for pointer decays (mapping a location to it's exposed integer value), for functions (mapping a function reference to a pointer? though i have other plans there anyways), and for errors (if the pre/post condition imply unwinding, what the error is)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this also corresponds much more closely to work on CSE with core predicates, so I'd prefer we stick to what we know works :P
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this change is directly inspired by a similar thing in Coteria. A serialized heap need not be decomposed in disjunctions of things? As long as I can produce/consume it it's fine
But, now that we're discussing this, maybe Opale is correct here, in that we have discovered with Pedro's work that we want atoms of the assertion to be as small as possible as to enable feasible matching plans
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to me it never made sense to have more than an atom is serialized -- it makes using them harder; it should just be Missing of serialized list list, where serialized is just atoms and is built to be kept small :P
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my intuition comes mostly from using core predicates within Gillian but i don't see a reason to complexify it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Efficiency. It might be that knowing at one time what you are going to produce/consume is cheaper than consuming atom by atom.
E.g. if I know that I'm going to produce a whole sub-tree, or maybe a whole object into memory, it's cheaper to do it in one step rather than doing it bit by bit and re-going through pmap and treeblock operations for each bit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair enough :) but in that case id rather this only be done where it's relevant (PMap and PList only) and nowhere else; e.g. there is no advantage in producing globals and heap data "together" vs splitting it into two
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also to ease processing i wonder if as a utility function in e.g. the matching plan one can have some flatten : serialized -> serialized list and merge : serialized list -> serialized list that e.g. converts (loc, [ ser1; ser2; ser3 ])to [(loc, [ ser1 ]); (loc, [ ser2 ]); (loc [ ser3 ])] and inversely (purely syntactically for the merge) -- maybe Pedro already has this
| let iter_globals = | ||
| let append acc x = Iter.append acc (Typed.iter_vars x) in | ||
| List.fold_left append Iter.empty s.globals | ||
| in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of using list and calling Iter.append a bunch of times (bad), do Iter.of_list globals |> Iter.map ...
| let lift_fix_globals globals res = | ||
| let+? heap = res in | ||
| State_intf.Template.{ heap; globals = serialize_globals globals } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also here youre adding all globals to all fixes raised -- i don't think this is normal? i wouldve expected globals = [], since you're lifting the fixes, not adding globlals which already exist (ie. producing them would be a no-op)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead (with the change im recommending), youd just have let+? heap_fixes = res in List.map (fun f -> Heap f) heap_fixes, i think
| let subst_serialized subst_inner subst_var = function | ||
| | Freed -> Freed | ||
| | Alive a -> Alive (subst_inner subst_var a) | ||
|
|
||
| let iter_vars_serialized iter_inner serialized f = | ||
| match serialized with Freed -> () | Alive a -> iter_inner a f | ||
|
|
||
| let pp = pp | ||
| let pp_serialized = pp | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
another thing :p i'm not sure these should be moved out of the functor; they could be but really im not sure why you'd need it; instead you need to e.g. use State.Freeable.subst_serialized, which will be the correct instantiated module
| List.map subst_atom serialized | ||
| subst_serialized subst_var serialized | ||
| let iter_vars_serialized serialized f = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment here -- i dont think any of these definitions should be moved out of the functor, just use the right module instead
|
@pcarrott could you please provide context to Opale as to why the types need to be extracted from the functors? |
| type serialized = (serialized_atom list, sloc Typed.t list) Template.t | ||
|
|
||
| and serialized_atom = | ||
| sloc Typed.t | ||
| * (Sptr.t Rtree_block.serialized_val, sint Value.t) Tree_block.serialized | ||
| Freeable.serialized |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to extract the types from the functors because otherwise I can't write this in the include line in the .mli file. Freeable.serialized, Tree_block.serialized and Rtree_block.MemVal.serialized are only accessible after their respective functor application, which is done internally in state.ml. Extracting them from the functors makes it possible to access the serialized structures when I don't have knowledge of how the functor was applied (as is the case in the .mli). Maybe an alternative could be to provide each of those modules in the .mli file where the signature includes those serialized structures. That way the general interface stays the same.
Unfortunately, that is still not enough because Sptr.t is only accessible after the include. I can't access Sptr in the .mli, which is why I need the type to be defined in the interface. Again, an alternative here would be to define an Sptr module in the .mli, but in this case I think I would have to redefine the interface that Sptr has, if I'm thinking correctly.
Basically, as is, I can't expose the serialized structure in the .mli in a clean way. I'm honestly alright with adding the modules Freeable, Tree_block and MemVal to the .mli (along with their serialized types), but the Sptr module is the main issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this not work ? 2fef10c
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(the example doesn't compile bc i was lazy but supposedly the interface works fine -- maybe i missed sth)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get the idea. My main issue with this is that ArithPtr is being hardcoded, which I was trying to avoid as I should probably only rely on the Sptr interface. But if that just sounds like overthinking of my part, then that should do the work, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i mean you're hardcoding the entire serialised structure anyways so adding Sptr doesn't matter much -- this is a particular module that is made for ArithPtr, not State_intf, which is the general one
it's ok to specialise State because all the engine relies on State_intf, it just allows to expose information for your particular tool
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, thanks! I'll make things work like this for RUXt then, so I guess this PR can be ignored for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure! if you want to edit this / re-open a PR to expose serialized in State directly I'm happy to merge, there's no harm in that
|
The merging of this PR will happen after PLDI anyway, but I'll try to understand precisely what Pedro needs and why it didn't work for Opale, and we'll make that PR work :) |
State_intfnow exposes the definition of theserializedtype, including global information and theserialized_atomtype, which are both needed for RUXt. The serialized structures had to be factored out of the correspondingMakefunctor for modulesFreeable,Tree_blockandRtree_block.