Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CN] Difficult to work with multiple multidimensional arrays that share indices #357

Open
peterohanley opened this issue Jul 1, 2024 · 11 comments
Labels
cn language Related to design of the CN language resource reasoning Related to reasources in specs

Comments

@peterohanley
Copy link

This may just be working as intended but it seems surprising. These functions process arrays and access them in the same way, and each works fine isolated but if they're in the same loop body it doesn't work. I thought this should work because the extracts are given different types so they could not interact.

#include <stdint.h>

extern uint8_t three[2][2][2];

void handle_three(void)
/*@
    accesses three;
@*/
{
    for (int c = 0; c < 2; ++c)
    /*@ inv c >= 0i32; c <= 2i32;
    @*/
    {
        for (int s = 0; s < 2; ++s)
        /*@ inv s >= 0i32; s <= 2i32;
            c < 2i32;
            c >= 0i32;
         @*/
        {
            /*@ extract Owned<uint8_t[2][2]>, (u64)c; @*/
            /*@ extract Owned<uint8_t[2]>, (u64)s; @*/
            /*@ extract Owned<uint8_t>, 0u64; @*/
            /*@ extract Owned<uint8_t>, 1u64; @*/

            three[c][s][0] = 0;
            three[c][s][1] = 0;
        }
    }
}

extern uint8_t two[2][2];

void handle_two(void)
/*@
    accesses two;
@*/
{
    for (int c = 0; c < 2; ++c)
    /*@ inv c >= 0i32; c <= 2i32;
     @*/
    {
        for (int s = 0; s < 2; ++s)
        /*@ inv s >= 0i32; s <= 2i32;
                c < 2i32;
                c >= 0i32;
        @*/
        {
            /*@ extract Owned<uint8_t[2]>, (u64)c; @*/
            /*@ extract Owned<uint8_t>, (u64)s; @*/

            two[c][s] = 0;
        }
    }
}
void handle_both(void)
/*@
    accesses two;
    accesses three;
@*/
{
    for (int c = 0; c < 2; ++c)
    /*@ inv c >= 0i32; c <= 2i32;
    @*/
    {
        for (int s = 0; s < 2; ++s)
        /*@ inv s >= 0i32; s <= 2i32;
            c < 2i32;
            c >= 0i32;
         @*/
        {
            // either block is ok by itself but if both are in it fails
            /*@ extract Owned<uint8_t[2]>, (u64)c; @*/
            /*@ extract Owned<uint8_t>, (u64)s; @*/
            two[c][s] = 0;

            /*@ extract Owned<uint8_t[2][2]>, (u64)c; @*/
            /*@ extract Owned<uint8_t[2]>, (u64)s; @*/
            /*@ extract Owned<uint8_t>, 0u64; @*/
            /*@ extract Owned<uint8_t>, 1u64; @*/
            three[c][s][0] = 0;
            three[c][s][1] = 0;
        }
    }
}
@dc-mak
Copy link
Contributor

dc-mak commented Jul 3, 2024

Potentially related #320, #311, #256

@cp526
Copy link
Collaborator

cp526 commented Oct 31, 2024

Here's a version that works.

void handle_both(void)
/*@
    accesses two;
    accesses three;
@*/
{
    for (int c = 0; c < 2; ++c)
    /*@ inv c >= 0i32; c <= 2i32;
    @*/
    {
        for (int s = 0; s < 2; ++s)
        /*@ inv s >= 0i32; s <= 2i32;
            c < 2i32;
            c >= 0i32;
         @*/
        {
            // either block is ok by itself but if both are in it fails
            /*@ extract Owned<uint8_t[2]>, (u64)c; @*/
            /*@ extract Owned<uint8_t>, (u64)s; @*/
            two[c][s] = 0;

            /*@ extract Owned<uint8_t[2][2]>, (u64)c; @*/
            /*@ extract Owned<uint8_t[2]>, (u64)s; @*/
            /*@ extract Owned<uint8_t>, 0u64; @*/
            /*@ extract Owned<uint8_t>, 1u64; @*/
            /*@ split_case (c == s); @*/
            /*@ split_case (s == 0i32); @*/
            three[c][s][0] = 0;
            three[c][s][1] = 0;
        }
    }
}

(note the split_case instructions).

The trigger for this problem seems to indeed be interference between different extract's. The upper extract's are meant to affect only two, the lower ones only three, but because extract only takes the resource type and the index as arguments (not the array base pointer), they have this unintended "side effect" here. In this instance this would be addressed by #311 .

What's happening here is maybe easiest illustrated with this example:

extern int arr[10];

void f (unsigned long i, unsigned long j)
/*@ accesses arr;
    requires i < 10u64;
             j < 10u64; 
@*/
{
  /*@ extract Owned<int>, i; @*/
  /*@ extract Owned<int>, j; @*/
  arr[i];
  arr[j];
}

As-is, CN rejects this:

17:53 % cn verify ~/Desktop/extract-problem.c
/Users/christopher/Desktop/extract-problem.c:10:7: warning: extract: index added, no resources (yet) extracted.
  /*@ extract Owned<int>, j; @*/
      ^~~~~~~~~~~~~~~~~~~~~~
[1/1]: f -- fail
/Users/christopher/Desktop/extract-problem.c:12:3: error: Missing resource for reading
  arr[j];
  ^~~~~~
Resource needed: Owned<signed int>(&(&&arr[(u64)0'i32])[j])

The precondition ensures that i and j are valid array indices, but CN cannot tell whether they are equal or distinct. The first extract moves ownership for i out of the each, but for the second one, CN can't prove that ownership for j is available to be extracted, because it does not know how i and j relate. (We have to somehow make CN better explain this error.)

Adding a /*@ split_case (i == j); @*/ before the array accesses fixes the problem by analysing both situations (i == j and i != j) separately, and in both cases the required ownership can be obtained.

Going back to the nested array example, this adds split_case for any pair of indices for which there's previous extract statements at the same C-type. First pair: /*@ extract Owned<uint8_t[2]>, (u64)c; @*/ and /*@ extract Owned<uint8_t[2]>, (u64)s; @*/
Second pair: /*@ extract Owned<uint8_t>, (u64)s; @*/ and /*@ extract Owned<uint8_t>, 0u64; @*/.

In principle there's also a third pair, which is the same as the second but with 1u64. That doesn't need another case split because s can only be either 0 or 1, so the previous split already handles that.

We should do #311, or avoid the need for extract altogether, if possible, to make this less of a problem.

@dc-mak dc-mak added ui/ux Issue with presentation or user experience resource reasoning Related to reasources in specs language Related to design of the CN language and removed ui/ux Issue with presentation or user experience labels Nov 4, 2024
@dc-mak
Copy link
Contributor

dc-mak commented Nov 5, 2024

Can I close this because #311 exists?

@septract
Copy link
Collaborator

Noting that @lwli11 at BAE experienced an instance of this issue.

@lwli11
Copy link

lwli11 commented Dec 5, 2024

Confirmed that split_case works for us, but adds significant lines of annotation and seems split_case slows down verification. We have an example where a function originally 30 lines long with way too much array indexing takes 104m23.382s to verify. Any fixes would be very welcome by us! Thanks!

@dc-mak
Copy link
Contributor

dc-mak commented Dec 6, 2024

104 minutes for 30 lines of code? Have you tried with CVC5 as the solver? Would you be able to share (a version of) this example with us, since it would be very helpful to have as an optimising test cases.

Just fyi, split_case bifurcates the type checking control flow, so N split cases will lead to 2^N different control flow path checks. One might hope that enough of them are obviously inconsistent to not spend too long on them.

But the real solution might just be to make #311 more of a priority.

@dc-mak
Copy link
Contributor

dc-mak commented Dec 17, 2024

@lwli11 are you able to share the example of the code that takes 104 minutes to verify 30 lines of code? Thanks

@lwli11
Copy link

lwli11 commented Dec 17, 2024

We're in the process of clearing it through export control so everyone on your team can view it instead of just the US citizens. It shouldn't be too long. We're hoping to get everyone to sign off on not export controlling anything we generate to make this process easier in the future.

@lwli11
Copy link

lwli11 commented Jan 14, 2025

Cleared code below. Happy to discuss this week:

// Not export controlled per ES-FL-010925-0004
// Copyright 2024 BAE Systems

#define NULL          __cerbvar_NULL



#define __SIZE_TYPE__ long unsigned int
typedef __SIZE_TYPE__ size_t;

int b64invs[64] = { 62, -1, -1, -1, 63, 52, 53, 54, 55, 56, 57, 58,
	59, 60, 61, -1, -1, -1, -1, -1, -1, -1, 0, 1, 2, 3, 4, 5,
	6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
	21, 22, 23, 24, 25, -1, -1, -1, -1, -1, -1, 26, 27, 28,
	29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42,
	43, 44, 45, 46, 47, 48, 49, 50, 51 };

int b64_isvalidchar(char c)
{
	if (c >= '0' && c <= '9')
		return 1;
	if (c >= 'A' && c <= 'Z')
		return 1;
	if (c >= 'a' && c <= 'z')
		return 1;
	if (c == '+' || c == '/' || c == '=')
		return 1;
	return 0;
}


size_t b64_decoded_size(const char *in)
/*@
	requires take x =each(u64 i; i>=0u64 && i<10u64){ Owned<char>(array_shift<char>(in, i))};
	ensures take y = each(u64 i; i>=0u64 && i<10u64){ Owned<char>(array_shift<char>(in, i))};

@*/
{
	size_t len;
	size_t ret;
	size_t i;

	if (in == NULL)
		return 0;

	len = 9;
	ret = len;
	for (i=len; i-->0; ) 
/*@
        inv take x2 =each(u64 j; 0u64<=j && j<10u64){ Owned<char>(array_shift<char>(in, j))};
	{in} unchanged;
	i>=0u64 && i<10u64;
@*/

    {
		/*@ extract Owned<char>, (u64) i; @*/
		if (in[i] == '=') 
        {
			ret--;
		} 
        else 
        {
			break;
		}
	}

	return ret;
}

int b64_decode(const char *in, char *out, size_t outlen)
/*@
	requires take x = each(u64 i; i>=0u64 && i<10u64){Owned<char>(array_shift<char>(in, i))};
	take y = each(u64 i; i>=0u64 && i<10u64){Owned<char>(array_shift<char>(out, i))};
	take g =  each(u64 i; i>=0u64 && i<20u64){Owned<signed int>(array_shift<signed int>(&b64invs, i))};

	ensures  take x1 = each(u64 i; i>=0u64 && i<10u64){Owned<char>(array_shift<char>(in, i))};
        take y1 = each(u64 i; i>=0u64 && i<10u64){Owned<char>(array_shift<char>(out, i))};
	take g1=   each(u64 i; i>=0u64 && i<20u64){Owned<signed int>(array_shift<signed int>(&b64invs, i))};
@*/
{
	size_t len;
	size_t i=0;
	size_t j=0;
	int    v=0;
	if (in == NULL || out == NULL)
		return 0;

	len = 9;
	if (outlen < b64_decoded_size(in) || len % 4 != 0)
		return 0;
	len=9;
	for (i=0; i<len; i++)
	/*@ inv take x2 = each(u64 k; k>=0u64 && k<10u64){Owned<char>(array_shift<char>(in, k))};
	take y2= each(u64 k; k>=0u64 && k<10u64){Owned<char>(array_shift<char>(out, k))};
	take g2 =  each(u64 k; k>=0u64 && k<20u64){Owned<signed int>(array_shift<signed int>(&b64invs, k))};
	{&b64invs} unchanged; {in} unchanged;	{out} unchanged;
	@*/
    {
	if (i>=0 && i<10){
		/*@ extract Owned<char>, i; @*/
		if (!b64_isvalidchar(in[i])) 
        	{
			return 0;
		}
	}
    }
	for (i=0, j=0; i<len; i+=4, j+=3) 
	/*@ inv take x3 = each(u64 k; k>=0u64 && k<10u64){Owned<char>(array_shift<char>(in, k))};
	take y3= each(u64 k; k>=0u64 && k<10u64){Owned<char>(array_shift<char>(out, k))};
	take g3 =  each(u64 k; k>=0u64 && k<20u64){Owned<signed int>(array_shift<signed int>(&b64invs, k))};
	{&b64invs} unchanged; {in} unchanged; {out} unchanged;
	@*/
    {
	if (i>=0 && i<10){
	/*@ extract Owned<char>, i; @*/
            int test = in[i]-43;
	    if (test>=0 && test<10){
			/*@ extract Owned<int>, (u64) test; @*/
			v = b64invs[test];
			if (v>=0 && v<33554431){
				int tmp = v << 6;
				int tmp2 = v;
				v = tmp | tmp2;		
			    if (i+2 >=0 && i+2 <10 && i+3>=0 && i+3<10){
					/*@ extract Owned<char>, i+2u64;  @*/
					int test2 = in[i+2]-43;
					if (test2>=0 && test2<10){
	
				    	/*@ split_case (test2==test); @*/
					    /*@ extract Owned<signed int>, (u64) test2; @*/
			
					    int tmp3 = b64invs[test2];
			    		if (v>=0 && v<33554431){
			    			v = in[i+2]=='=' ? v << 6 : (v << 6) | tmp3;		
					    	/*@ extract Owned<char>, i+3u64; @*/
				 	    	int test3 = in[i+3]-43;
					    	if (test3>=0 && test3<10){
								/*@ split_case (test==test3); @*/
								/*@ split_case (test2==test3); @*/
								/*@ extract Owned<int>, (u64) test3; @*/
				    			if (v>=0 && v<33554431){
									v = in[i+3]=='=' ? v << 6 : (v << 6) | b64invs[test3];
									if (j>=0 && j<10){
						      	  		/*@ split_case ((u64) j==(u64) i+3u64); @*/
        					        	/*@ split_case ((u64) j==(u64) i+2u64); @*/
			 		    				/*@ split_case ((u64) j==(u64) i); @*/
					    				/*@ extract Owned<char>, (u64) j; @*/
									    out[j] = (v >> 16) & 0xFF;
									    if (in[i+2] != '='){
											if (j+1>=0 && j+1<10){
        							        	/*@ split_case ((u64) j+1u64==(u64) i+3u64); @*/
            				  					/*@ split_case ((u64) j+1u64==(u64) i+2u64); @*/
     		         							/*@ split_case ((u64) j+1u64==(u64) i); @*/
										    	/*@ extract Owned<char>, (u64) j+1u64; @*/
										    	out[j+1] = (v >> 8) & 0xFF;
											}	
					    				}
					    				if (i+3>=0 && i+3<10){
									        /*@ extract Owned<char>, (u64) i + 3u64; @*/
			   						        if (in[i+3] != '='){
										    	if (j+2>=0 && j+2<10){
                	            				    /*@ split_case ((u64) j+2u64==(u64) i+3u64); @*/
        	        				                /*@ split_case ((u64) j+2u64==(u64) i+3u64); @*/
       		        				                /*@ split_case ((u64) j+2u64==(u64) i); @*/
													/*@ extract Owned<char>, (u64) j+2u64; @*/
													out[j+2] = v & 0xFF;
					    	    				}
											}
					    				}
				    				}
				    			}
							}
			    		}
					}
		    	}
			}
	    }
	}
}

	return 1;
}

@septract
Copy link
Collaborator

septract commented Feb 14, 2025 via email

@lwli11
Copy link

lwli11 commented Feb 17, 2025

@septract Sure! I appreciate you taking a look at it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cn language Related to design of the CN language resource reasoning Related to reasources in specs
Projects
None yet
Development

No branches or pull requests

5 participants