@giltho giltho commented Sep 10, 2025

This PR introduces the Sym_data folder, which contains utilities for what I call (in my thesis) Symbolic Abstractions. Really... simple abstract domains.

S_eq is an abstraction that implements equality, while S_int is an abstraction over integers (i.e. a symbolic "thing" that can be interpreted as an integer).
I also add S_range (used for TreeBlocks) and S_map (used for PMap) -- We really need a linter to enforce naming conventions 😭

I also extract more clearly and explicitly what sbool is, i.e. an abstraction over booleans. Some of the resulting code is slightly more verbose but I think that's ok.

Also, some of the code in TreeBlock is a bit verbose because of MemVal.S_int.zero () everywhere but #125 fixes this already :)
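
To give a rough idea of the shape of such a symbolic abstraction, here is a minimal sketch. The signatures `S_eq` and `S_int` below are assumptions made up for this illustration (the ones in the PR may well differ), `Concrete_int` is a hypothetical toy instance that uses plain OCaml integers and booleans, and `Range` only mimics the spirit of S_range by being written against `S_int` alone.

```ocaml
(* Hypothetical signatures; the real Sym_data ones may differ. *)
module type S_eq = sig
  type t
  type sbool

  (* Symbolic equality: yields a symbolic boolean, not an OCaml bool. *)
  val sem_eq : t -> t -> sbool
end

module type S_int = sig
  include S_eq

  val zero : unit -> t
  val of_int : int -> t
  val add : t -> t -> t
  val lt : t -> t -> sbool
end

(* Toy instance: concrete ints, with OCaml bools standing in for sbool. *)
module Concrete_int : S_int with type t = int and type sbool = bool = struct
  type t = int
  type sbool = bool

  let sem_eq = Int.equal
  let zero () = 0
  let of_int n = n
  let add = ( + )
  let lt (a : int) (b : int) = a < b
end

(* A client that only knows about S_int, in the spirit of S_range. *)
module Range (I : S_int) = struct
  type t = I.t * I.t (* half-open interval [low, high) *)

  let of_low_and_size low size = (low, I.add low size)
  let is_empty ((low, high) : t) = I.sem_eq low high
  let starts_before ((l1, _) : t) ((l2, _) : t) = I.lt l1 l2
end

module R = Range (Concrete_int)
```

Swapping `Concrete_int` for a solver-backed implementation would leave `Range` untouched, which is the point of programming against the abstraction.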

@giltho giltho requested a review from N1ark as a code owner September 10, 2025 20:58
@N1ark N1ark left a comment

Conceptually this is cool but I am not very convinced, because rn it adds a lot of code and signatures that are barely used, even where they seem to fit (Symex).

The dependency Symex -> Sym_data is quite impractical, and I wonder if there is not a cleverer split to do, like:

  1. Sym_val for symbolic values, ie. just defining the shape of different sorts of values (with S_bool, S_int, S_eq, S_elt, S_range)
  2. Symex, defines symbolic processes using symbolic values
  3. Sym_data, defines data structures using symbolic values and processes. s_map right now, but could include sets, maybe binary trees, idk.

We could just re-export Sym_val from Sym_data to only provide one module of symbolic things, but at least it would avoid the weird split we have rn, because Symex not reusing S_eq etc. is super weird to me

  let open Typed.Infix in
  let* loc = Csymex.nondet Typed.t_loc in
- let+ () = Symex.assume [ Typed.not (loc ==@ Typed.Ptr.null_loc) ] in
+ let+ () = Symex.assume [ Typed.S_bool.not (loc ==@ Typed.Ptr.null_loc) ] in
Contributor

What I did for bv_values to avoid this (it's super impractical) was to include S_bool in Typed. That would make these simpler, since bool operations are primitive enough that you always want them imo

Contributor

ah nvm you can't because the type is t so it'd override stuff.... hmm

| _ -> Unop (Not, sv) <| TBool
let conj l = List.fold_left S_bool.and_ v_true l

let rec split_ands (sv : t) (f : t -> unit) : unit =
Contributor

why not include split_ands and conj in S_bool? You're not restricting its type to the module type S_bool anyways so you can add stuff

Comment on lines +1 to +19
module S_bool = struct
module type S = sig
type +'a v
type t

val not : t v -> t v
val and_ : t v -> t v -> t v
val or_ : t v -> t v -> t v
val to_bool : t v -> bool option
val of_bool : bool -> t v
end

module Make_syntax (S_bool : S) = struct
let ( &&@ ) = S_bool.and_
let ( ||@ ) = S_bool.or_
let not = S_bool.not
end
end

Contributor

Maybe I'm being picky but it was very weird seeing s_int.ml in sym_data but having to go here for S_bool

Contributor

ok it's because module sym_data depends on Symex but Symex depends on this.... :s

- (** Typed constructors *)
+ (** {3 Typed constructors} *)

  module S_bool : Symex.Value.S_bool.S with type 'a v = 'a t and type t = sbool
Contributor

here you could do sig include Symex.Value.S_bool.S ... val split_ands : ... end

Contributor

at least to me it makes a lot more sense if you package all bool operations together
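
A minimal sketch of what packaging everything together could look like, using the `sig include ... end` pattern suggested above. The module type `S` is copied from the PR snippet; the `expr` type and the simplification rules are hypothetical stand-ins for the real symbolic values, introduced only so the example is self-contained.

```ocaml
module type S = sig
  type +'a v
  type t

  val not : t v -> t v
  val and_ : t v -> t v -> t v
  val or_ : t v -> t v -> t v
  val to_bool : t v -> bool option
  val of_bool : bool -> t v
end

(* Toy untyped expressions; the phantom parameter of ['a v] is ignored. *)
type expr =
  | Var of string
  | Bool of bool
  | Not of expr
  | And of expr * expr
  | Or of expr * expr

module S_bool : sig
  type sbool

  include S with type 'a v = expr and type t = sbool

  (* Extra operations living next to the ones required by S. *)
  val conj : t v list -> t v
  val split_ands : t v -> (t v -> unit) -> unit
end = struct
  type sbool
  type 'a v = expr
  type t = sbool

  let of_bool b = Bool b
  let to_bool = function Bool b -> Some b | _ -> None
  let not = function Bool b -> Bool (Stdlib.not b) | Not e -> e | e -> Not e

  let and_ a b =
    match (a, b) with
    | Bool true, e | e, Bool true -> e
    | Bool false, _ | _, Bool false -> Bool false
    | _ -> And (a, b)

  let or_ a b =
    match (a, b) with
    | Bool false, e | e, Bool false -> e
    | Bool true, _ | _, Bool true -> Bool true
    | _ -> Or (a, b)

  (* Conjunction of a list, starting from true. *)
  let conj l = List.fold_left and_ (Bool true) l

  (* Call [f] on every conjunct of a (possibly nested) conjunction. *)
  let rec split_ands (e : expr) (f : expr -> unit) : unit =
    match e with
    | And (a, b) ->
        split_ands a f;
        split_ands b f
    | e -> f e
end
```

Any functor that expects an `S` (such as `Make_syntax`) would still accept this module, since the extra values are simply ignored by the narrower signature.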

Comment on lines 112 to 114
open Sym_data.S_int.Make_syntax (MemVal.S_int)
open Make_bool_syntax (Symex.Value.S_bool)
module Range = Sym_data.S_range.Make (MemVal.S_int)
Contributor

ok this is quite nice -- only thing is the weird mismatch in names/paths between Sym_data.S_int.Make_syntax and Make_bool_syntax but ok

Contributor Author

It's because Symex is overridden so I can't go and fetch Symex.Value.S_bool.Syntax :/

Comment on lines 28 to 29
val iter_vars : 'a t -> 'b ty Var.iter_vars
val subst : (Var.t -> Var.t) -> 'a t -> 'a t
Contributor

again maybe i'm being slow but I find it confusing we still have this here despite defining the same functions in s_elt

Contributor Author

yeah :/ It's a bit messy I agree

N1ark commented Sep 11, 2025

The State_monad i made for rust could also fit in sym_data maybe

giltho commented Sep 11, 2025

> The State_monad i made for rust could also fit in sym_data maybe

I don't think so, the state_monad you have is not necessarily symbolic. It's just a state monad transformer
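
For context, a generic state monad transformer can be sketched as below. This is not the State_monad from the Rust frontend; `Monad`, `State_t`, `Option_monad` and `Counter` are hypothetical names for this illustration. Nothing in the functor refers to symbolic values, which is the point being made above.

```ocaml
module type Monad = sig
  type 'a t

  val return : 'a -> 'a t
  val bind : 'a t -> ('a -> 'b t) -> 'b t
end

(* Adds a threaded state of type S.t on top of any base monad M. *)
module State_t (M : Monad) (S : sig
  type t
end) =
struct
  type 'a t = S.t -> ('a * S.t) M.t

  let return x s = M.return (x, s)
  let bind m f s = M.bind (m s) (fun (x, s') -> f x s')

  (* Run a computation of the base monad without touching the state. *)
  let lift (m : 'a M.t) : 'a t = fun s -> M.bind m (fun x -> M.return (x, s))
  let get : S.t t = fun s -> M.return (s, s)
  let put (s : S.t) : unit t = fun _ -> M.return ((), s)
  let run (m : 'a t) (s : S.t) = m s
end

module Option_monad = struct
  type 'a t = 'a option

  let return x = Some x
  let bind = Option.bind
end

(* An int counter layered over the option monad. *)
module Counter =
  State_t
    (Option_monad)
    (struct
      type t = int
    end)
```

The base monad could just as well be a symbolic execution monad, but the transformer itself stays agnostic.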

giltho commented Sep 11, 2025

Gave it a thought and I agree: I should declare S_eq and S_elt in Value, and then declare that value is just

include S_eq
include S_elt
module S_bool : S_bool

...
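
Spelled out a bit more, the composition could look like the sketch below. All three component signatures are simplified guesses written for this example (variables are plain strings here, and the real `S_bool` has a `'a v` parameter), and `String_value` is a hypothetical toy instance whose only purpose is to check that the signature is implementable.

```ocaml
(* Hypothetical, simplified signatures. *)
module type S_eq = sig
  type t
  type sbool

  val sem_eq : t -> t -> sbool
end

module type S_elt = sig
  type t

  val iter_vars : t -> (string -> unit) -> unit
  val subst : (string -> string) -> t -> t
end

module type S_bool = sig
  type t

  val not : t -> t
  val and_ : t -> t -> t
  val of_bool : bool -> t
  val to_bool : t -> bool option
end

(* Value is just the pieces glued together over shared types. *)
module type Value = sig
  type t
  type sbool

  include S_eq with type t := t and type sbool := sbool
  include S_elt with type t := t
  module S_bool : S_bool with type t = sbool
end

(* Toy instance: a value is a variable name, booleans are concrete. *)
module String_value : Value with type t = string and type sbool = bool = struct
  type t = string
  type sbool = bool

  let sem_eq = String.equal
  let iter_vars v f = f v
  let subst f v = f v

  module S_bool = struct
    type t = bool

    let not = Stdlib.not
    let and_ = ( && )
    let of_bool b = b
    let to_bool b = Some b
  end
end
```

The destructive substitutions (`:=`) avoid declaring `t` and `sbool` twice when several signatures are included.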

giltho and others added 20 commits September 16, 2025 10:41
Signed-off-by: Sacha Ayoun <[email protected]>
S with type key = Key.t and module Symex = Symex = struct
module Symex = Symex
open Symex.Syntax
module Raw_map = Stdlib.Map.Make (Key)
Contributor

looking back -- why don't you include Raw_map? or at least reexpose some of it, since e.g. one probably wants add, empty functions
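
One way to re-expose part of the underlying map without including all of it is sketched below. `Int_key` and this cut-down `S_map` are hypothetical and purely illustrative; whether a syntactic `add` is appropriate when keys are symbolic is a design question the sketch does not try to settle.

```ocaml
module Int_key = struct
  type t = int

  let compare = Int.compare
end

module Raw_map = Stdlib.Map.Make (Int_key)

(* Hypothetical wrapper: only a chosen subset of Raw_map is visible. *)
module S_map : sig
  type 'a t

  val empty : 'a t
  val add : int -> 'a -> 'a t -> 'a t
  val find_opt : int -> 'a t -> 'a option
end = struct
  type 'a t = 'a Raw_map.t

  (* Plain re-exports of the standard map operations. *)
  let empty = Raw_map.empty
  let add = Raw_map.add
  let find_opt = Raw_map.find_opt
end
```

Writing `include Raw_map` in the structure instead would expose every operation at once, at the cost of also exposing ones that may not make sense for the symbolic version.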

@N1ark N1ark added soteria-core Issues related to the Soteria core library cleanliness Quality of life improvements, to keep the codebase clean and healthy labels Nov 11, 2025
Signed-off-by: Sacha Ayoun <[email protected]>
@N1ark N1ark left a comment

still not convinced tbh :/ (sorry if this wasn't ready for review)

Comment on lines +32 to +33
let add = Infix.( +!@ )
let sub = Infix.( -!@ )
Contributor

you should mark these as checked since we assume no overflows within blocktrees

Suggested change
- let add = Infix.( +!@ )
- let sub = Infix.( -!@ )
+ let add = Infix.( +!!@ )
+ let sub = Infix.( -!!@ )

Comment on lines +42 to +43
let add = Infix.( +!@ )
let sub = Infix.( -!@ )
Contributor

same here, see original code

Suggested change
- let add = Infix.( +!@ )
- let sub = Infix.( -!@ )
+ (* We assume addition/subtraction within the range of an allocation may never overflow.
+    This allows extremely good reductions around inequalities, which Tree_block relies on. *)
+ let ( +@ ) = Infix.( +!!@ )
+ let ( -@ ) = Infix.( -!!@ )

  | Unop (Not, e) ->
    let e' = simplify e in
-   if Svalue.equal e e' then fallback v else Svalue.Bool.not e'
+   if Svalue.equal e e' then v else Svalue.S_bool.not e'
Contributor

So here fallback is basically Analyses.simplify. If you remove the fallback here you get fewer reductions, in particular in some corner cases.

e.g. if you have a state where x is in [3, INT_MAX] you cannot simplify x < 5 (that would be OX), but you can simplify !(x < 5) (which is x >= 5) into true.

(ofc this example in particular cannot happen since we'd reduce !(x < 5) to x >= 5 directly but i know there are cases where this applies)

i think i would rather keep all these inner fallbacks if possible, we can always do some benchmarking at some point to see how useful they are
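
A toy model of why the inner fallback matters is sketched below. The `expr` type, `known_facts` and the `~inner_fallback` switch are all assumptions made for this illustration; in particular the "analysis" here is a purely syntactic lookup in a list of facts, which is not how Analyses.simplify works.

```ocaml
type expr =
  | Var of string
  | Int of int
  | Bool of bool
  | Lt of expr * expr
  | Not of expr

(* Hypothetical stand-in for an analysis: expressions known to hold. *)
let known_facts = [ Not (Lt (Var "x", Int 5)) ]

(* Whole-expression fallback: succeeds only if the exact expression is a fact. *)
let fallback (e : expr) : expr =
  if List.mem e known_facts then Bool true else e

let rec simplify ~inner_fallback (e : expr) : expr =
  match e with
  | Not inner ->
      let inner' = simplify ~inner_fallback inner in
      if inner' <> inner then Not inner'
        (* The sub-term made no progress: optionally retry on the whole term. *)
      else if inner_fallback then fallback e
      else e
  | Lt _ | Var _ | Int _ | Bool _ -> fallback e
```

With `known_facts` as above, `simplify ~inner_fallback:true` turns `Not (Lt (Var "x", Int 5))` into `Bool true`, whereas with `~inner_fallback:false` the term comes back unchanged because only the sub-term `Lt (Var "x", Int 5)` was ever looked up.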

Contributor Author

Yes, this is just a mismanaged merge conflict, thanks for catching this

Comment on lines 100 to 104
val v_true : [> sbool ] t
val v_false : [> sbool ] t
val bool : bool -> [> sbool ] t
val as_bool : 'a t -> bool option
val and_ : [< sbool ] t -> [< sbool ] t -> [> sbool ] t
val conj : [< sbool ] t list -> [> sbool ] t
val split_ands : [< sbool ] t -> ([> sbool ] t -> unit) -> unit
val or_ : [< sbool ] t -> [< sbool ] t -> [> sbool ] t
val not : sbool t -> sbool t
Contributor

curiosity: why would you unexport some of these but not all? Maybe in this case it makes sense, in the module S_bool : Symex.Value.S_bool.S declaration, to instead have something like

module S_bool : sig 
  include Symex.Value.S_bool.S with ...

  val v_true : ...
  etc
end

Contributor Author

I really messed up that merge didn't I? 😆

Comment on lines +355 to +359
module type S_bool = sig
val v_true : t
val v_false : t
val as_bool : t -> bool option
val bool : bool -> t
val to_bool : t -> bool option
val of_bool : bool -> t
Contributor

here you could do

module type S_bool  = sig
  include Symex.Value.S_bool.S with ...

  val v_true : ...
  etc
end

Signed-off-by: Sacha Ayoun <[email protected]>
giltho commented Dec 15, 2025

I think I won't merge this, but here's a list of things that are in this PR, and I will import them as needed in other PRs.

  • ✅ means I'm sure I want Soteria to have this
  • ❓ means I'm not sure

  • Sym_data.Map and rewriting Sym_states.PMap using Sym_data.Map ✅
  • S_bool as a module ❓
  • S_int ❓
  • S_elt (t + iter_vars + subst) ✅
  • S_range ❓ (would be sure, but not sure I want it without S_int)

Opinions are welcome btw
