@@ -498,21 +498,39 @@ those conditions. We prove for each case of `I_i`.
498
498
499
499
Quite obvious.
500
500
501
- - Case 2: ` I_i ` is ` pop() ` no taking the irregular path.
501
+ - Case 2: ` I_i ` is ` pop() ` taking the regular path.
502
502
503
503
Let ` x ` and ` y ` be the values ` I_i ` read from ` bottom ` at ` 'L201 ` and ` top ` at ` 'L204 ` ,
504
504
respectively. Then we have ` x = b_i ` , as ` bottom ` is modified only by the writer. Since ` I_i ` is
505
- not taking the irregular path, we have ` y+2 <= x ` . We also prove ` t_i <= y+1 ` . Consider the
506
- invocation ` I ` that writes ` y+2 ` to ` top ` , if exists. (Otherwise, ` t_i <= y+1 ` should obviously
507
- hold.) If ` I ` is ` pop() ` , then ` I ` should be linearized after ` I_i ` thanks to the coherence of
508
- ` top ` ; if ` I ` is ` steal() ` , by the synchronization of the SC fences from ` I_i ` to ` I ` via ` top ` ,
509
- ` I ` should load a value that is coherence-after-or ` WL_j ` from ` bottom ` , where ` j ` is such an
510
- index that ` I_i = O_j` . Thus ` I ` is linearized after ` I_i ` , and ` t_i <= y+1 ` holds.
505
+ taking the regular path, we have ` y+2 <= x ` . We also prove ` t_i <= y+1 ` . Consider the invocation
506
+ ` I ` that writes ` y+2 ` to ` top ` , if exists. (Otherwise, ` t_i <= y+1 ` should obviously hold.) If ` I `
507
+ is ` pop() ` , then ` I ` should be linearized after ` I_i ` thanks to the coherence of ` top ` ; if ` I ` is
508
+ ` steal() ` , by the synchronization of the SC fences from ` I_i ` to ` I ` via ` top ` , ` I ` should load a
509
+ value that is coherence-after-or ` WL_j ` from ` bottom ` , where ` j ` is such an index that `I_i =
510
+ O_j` . Thus ` I` is linearized after ` I_i` , and ` t_i <= y+1` holds.
511
511
512
512
Then we have ` t_i <= y+1 < y+2 <= x = b_i ` , and it is legit to pop a value from the bottom end of
513
513
the deque and decrease ` bottom ` .
514
514
515
- FIXME(todo): the right value?
515
+ It remains to prove that ` I_i ` returns the right value. Let ` O_k ` be the last ` push() ` operation
516
+ in ` I_0 ` , ..., ` I_(i-1) ` that pushes to the index ` x ` and writes ` bottom = x+1 ` , and ` v ` be the
517
+ value ` O_k ` pushes. Let's prove that ` I_i ` returns ` v ` .
518
+
519
+ For all ` l ` , let ` WB_l ` be the value of ` buffer ` at the end of the invocation ` O_l ` . Also, for all
520
+ ` z ` , let ` WC[l, z] ` be the ` z ` -th contents of the buffer ` WB_l ` at the end of the invocation
521
+ ` O_l ` . They are well-defined since the pointer ` buffer ` and the contents of the buffer are
522
+ modified only by the owner.
523
+
524
+ Let's prove by induction that for all ` l ∈ [k, j) ` , ` WC[l, x % size(WB_l)] = v ` . Since ` O_k ` just
525
+ pushed a value to ` x ` , it trivially holds for the base case ` l = k ` . Now suppose that it holds for
526
+ ` l = m ` for some ` m ∈ [k, j-1) ` and prove that it holds for ` l = m+1 ` . Since ` O_k ` is the last
527
+ operation that writes to the index ` x ` , ` O_(m+1) ` is not a regular ` pop() ` that writes `bottom =
528
+ x` . Let ` z` and ` w` be the values ` O_ (m+1)` read from ` bottom` at ` 'L301` and ` top` at ` 'L302`,
529
+ respectively, if ` O_(m+1) ` is resizing. Then we have ` z > x ` by the choice of ` O_k ` , and ` w <= x `
530
+ by the coherence on ` top ` . Thus ` WC[m, x % size(WB_m)] = v ` is copied to `WC[ m+1, x %
531
+ size(WB_ (m+1))] `.
532
+
533
+ Thus ` I_i = O_j ` returns ` WC[j-1, x % size(WB_(j-1))] = v ` .
516
534
517
535
- Case 3: ` I_i ` is ` pop() ` taking the irregular path.
518
536
@@ -563,7 +581,7 @@ those conditions. We prove for each case of `I_i`.
563
581
that is coherence-before ` WL_j ` from ` bottom ` , where ` j ` is such an index that ` I_i = O_j ` . Thus
564
582
` I ` is linearized before ` I_i ` , and ` x <= t_i ` holds. Thus we have ` b_i = x <= t_i ` , and it is
565
583
legit for ` I_i ` to return ` EMPTY ` .
566
-
584
+
567
585
If the CAS succeeds, then ` I_i ` updates ` top ` from ` x-1 ` to ` x ` at ` 'L213 ` and writes `bottom =
568
586
x` at ` 'L216` . Let's prove that ` x-1 <= t_i` . Consider the invocation ` I` that writes ` x-1` to
569
587
` top ` . If ` I ` is ` pop() ` , then ` I ` should be linearized before ` I_i ` thanks to the coherence of
@@ -579,7 +597,7 @@ those conditions. We prove for each case of `I_i`.
579
597
Since ` I_i ` writes ` x ` to ` top ` , we have ` t_i = x-1 ` . Thus ` t_i = y = x-1 = (b_i)-1 ` , and it is
580
598
legit to pop the value from the ` bottom ` end of the deque and increase ` top ` .
581
599
582
- FIXME(todo): the right value?
600
+ ` I_i ` returns the right value for the same reason as above.
583
601
584
602
<!-- r t x-2 -->
585
603
<!-- ------- -->
@@ -637,7 +655,81 @@ those conditions. We prove for each case of `I_i`.
637
655
Since ` I_i ` writes ` y+1 ` to ` top ` , we have ` t_i = y ` . Thus we have ` t_i = y <= x-1 = (b_i)-1 ` , and
638
656
it is legit to steal the value from the ` top ` end of the deque and increase ` top ` .
639
657
640
- FIXME(todo): the right value?
658
+
659
+ It remains to prove that ` I_i ` returns the right value. Since ` I_i ` reads ` bottom > y ` , there
660
+ exists a ` push() ` invocation that pushes to the index ` y ` and writes ` bottom = y+1 ` . Let ` O_k ` be
661
+ the last such an invocation in ` I_0 ` , ..., ` I_n ` , and ` v ` be the value ` O_k ` pushes. Let's prove
662
+ that ` O_k ` is linearized before ` I_i ` , and ` I_i ` returns ` v ` .
663
+
664
+ Let's first prove that for all regular ` pop() ` invocation ` J ` that writes ` bottom = y ` at ` 'L202 ` ,
665
+ ` J ` should be linearized before ` I_i ` . In order for ` J ` to enter the regular path, ` J ` should have
666
+ read from ` top ` a value ` <= y-1 ` . Then by the synchronization of the SC fences from ` J ` to ` I_i `
667
+ via ` top ` , the value ` I_i ` read from ` bottom ` at ` 'L404 ` is coherence-after-or the value written
668
+ by ` J ` at ` 'L202 ` . This ` J ` is linearized before ` I_i ` .
669
+
670
+ Now let's prove that ` O_k ` is linearized before ` I_i ` . If ` O_k ` is the only such an invocation
671
+ that writes ` bottom = y+1 ` at ` 'L110 ` , then the value ` I_i ` read from ` bottom ` should be
672
+ coherence-after-or the value written by ` O_k ` . If there is another such invocation ` O_l ` , then
673
+ there should be a regular ` pop() ` invocation ` O_m ` such that ` l < m < k ` , and ` O_m ` writes `bottom
674
+ = y` at ` 'L202` . Thus ` O_m` is linearized before ` I_i` , and the value ` I_i` read from ` bottom` is
675
+ coherence-after the value written by ` O_m ` . Since ` I_i ` should read ` bottom >= y+1 ` , the value
676
+ ` I_i ` read from ` bottom ` should be coherence-after-or the value written by ` O_k ` . In either case,
677
+ ` O_k ` should be linearized before ` I_i ` .
678
+
679
+ Also, for all ` l > k ` , ` b_l > y ` should hold. This is because there are no regular ` pop() `
680
+ invocations that writes ` bottom = y ` after ` O_k ` .
681
+
682
+ <!-- [I_i] -->
683
+ <!-- read t y -->
684
+ <!-- -------- -->
685
+ <!-- read b x (>= y+1) -->
686
+ <!-- update t y y+1 -->
687
+
688
+ <!-- [O_k: push(n)] -->
689
+ <!-- read b y -->
690
+ <!-- read t w -->
691
+ <!-- write b y+1 (release) -->
692
+
693
+ <!-- [pop(n)] -->
694
+ <!-- read b y+1 -->
695
+ <!-- write b y -->
696
+ <!-- ---------- -->
697
+ <!-- read t f (<= y-1) -->
698
+
699
+ Let ` O_l ` be the owner invocation that writes ` WB_l ` to ` buffer ` that is read by ` I_i ` . Let ` z `
700
+ and ` w ` be the values ` O_k ` read from ` bottom ` at ` 'L101 ` and ` top ` at ` 'L102 ` , respectively.
701
+ Since ` I_i ` is linearized after ` O_k ` , there is a release-acquire synchronization from ` O_k ` 's
702
+ write to ` bottom ` at ` 'L110 ` to ` I_i ` 's read from ` bottom ` at ` 'L404 ` . So we have ` w <= y ` , ` WB_l `
703
+ is ` WB_k ` or a later value, and the value ` v ` that ` O_k ` wrote to the buffer at ` 'L109 ` should be
704
+ acknowledged by ` I_i ` at ` 'L409 ` . Also, we have ` view_beginning(O_k) <= view_end(I_i) ` , as
705
+ required by ` (SYNC) ` .
706
+
707
+ + Case ` l <= k ` .
708
+
709
+ Then ` WB_l = WB_k ` , and ` I_i ` read the return value from ` WB_l[y % size(WB_l)] ` . The read value
710
+ should be the value ` v ` written by ` O_k ` at ` 'L109 ` or a later value. Let's think of ` O_m ` that
711
+ overwrites to ` WB_l[y % size(WB_l)] ` after ` O_k ` . (If such ` O_m ` doesn't exist, then ` I_i `
712
+ should have read ` v ` and the conclusion follows.) Since ` O_m ` is after ` O_k ` and it read
713
+ ` bottom > y ` at ` 'L101 ` , ` O_m ` is a ` push() ` operation that read ` top > y ` at ` 'L102 ` . Then
714
+ there is a release-acquire synchronization from ` I_i ` 's write to ` top ` at ` 'L410 ` to ` O_m ` 's
715
+ read from ` top ` at ` 'L102 ` so that ` I_i ` 's read from ` WB_l[y % size(WB_l)] ` at ` 'L409 ` happens
716
+ before ` O_m ` 's write to the same location. Thus ` I_i ` should have read ` v ` at ` 'L409 ` .
717
+
718
+ + Case ` k < l ` .
719
+
720
+ Let's prove by induction that for all ` m ∈ [k, l] ` , ` WC[m, y % size(WB_m)] = v ` . By assumption,
721
+ it holds for ` m = k ` . Now suppose that it holds for ` m = n ` for some ` n ∈ [k, l) ` and let's
722
+ prove that it holds for ` m = n+1 ` . Let ` f ` and ` g ` be the values ` O_(n+1) ` read from ` bottom `
723
+ and ` top ` . Then we have ` g <= w <= y < b_(n+1) = f ` . Since ` k < n+1 ` , ` O_(n+1) ` is not a regular
724
+ ` pop() ` that writes ` bottom = y ` . If ` O_(n+1) ` is resizing, then since ` g <= y < f ` , `WC[ n, y %
725
+ size(WB_n)] ` is copied to ` WC[ n+1, y % size(WB_ (n+1))] ` . Thus we have ` WC[ n+1, y %
726
+ size(WB_ (n+1))] = v`.
727
+
728
+ By the release-acquire synchronization from ` O_l ` 's write to ` buffer ` at ` 'L307 ` to ` I_i ` 's read
729
+ from ` buffer ` at ` 'L408 ` , the value ` I_i ` read from the buffer's content at ` 'L409 ` should be
730
+ coherence-after-or the value ` WC[l, y % size(WB_l)] = v ` that ` O_l ` wrote at ` 'L305 ` . Similarly
731
+ to the above case, ` I_i ` 's read happens before any overwrites to the same location. Thus ` I_i `
732
+ should have read ` v ` at ` 'L409 ` .
641
733
642
734
<!-- We also prove `t_i <= y` as follows. Consider the invocation `I` that writes `y+2` to `top`. If -->
643
735
<!-- `I` is `pop()` whose last write to `bottom` is `WL_I`, then `I` should have read `y+1` from `top` -->
@@ -827,7 +919,7 @@ Even though `Release`/`Acquire` orderings are enough for the success/failure cas
827
919
` 'L213 ` , C11 (and effectively also LLVM and Rust) requires that "the failure argument shall not be
828
920
memory\_ order\_ release nor memory\_ order\_ acq\_ rel. The failure argument shall be no stronger than
829
921
the success argument." ([ C11] [ c11 ] 7.17.7.4 paragraph 2). So we used ` AcqRel ` /` Acquire ` in the
830
- implementation. I think this C11 requirement may be failsafe for most use cases, but is actually
922
+ implementation. I think this C11 requirement may be fail-safe for most use cases, but is actually
831
923
inefficient for the Chase-Lev deque.
832
924
833
925
0 commit comments