WIP Add a steal_half method to sync::chase_lev #72

fitzgen · 2016-05-04T22:25:06Z

Questions:

* We could change the signature to make callers supply a vec, rather than
  allocate our own in `steal_half`. This could potentially be more efficient if
  the caller is repeatedly retrying.
* Is this safe? I think so, but I'm not 100% sure. The paper mentions that it
  would be interesting to apply a steal-half operation to this deque, but they
  didn't do it. I couldn't find any other papers or libraries that implemented
  this operation either, although my google-fu may be weak in this regard. The
  most important part, as far as I can tell, is populating the results array
  before performing the CAS to ensure that we get the correct set of resulting
  items (after the CAS, the circular buffer could circle back over the slots
  we're stealing and overwrite them with new values). Everything seems to be
  pretty much the same as stealing one item, which leads to my next question.
* Should we factor out the common parts between `steal` and `steal_half`?
  There's a ton of duplication here, as I've authored it.

If you think this is a valuable operation to have, then I can write some tests
and fix this up.
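
For readers following along, here is a minimal sketch of the copy-then-CAS ordering described in the second question. It assumes a hypothetical fixed-capacity `Model` that keeps plain `usize` values in atomic slots, with no resizing and no memory reclamation. `Model`, `push`, and the `Abort`/`Data` variants are names made up for illustration, and "half" is assumed to mean half rounded up. The sketch only shows that the items are read before `top` is advanced; it is not evidence that the scheme is safe against a concurrent owner `pop`, which is exactly the open question.

```rust
use std::sync::atomic::{fence, AtomicIsize, AtomicUsize, Ordering};

/// Outcome of a steal-half attempt (variant names are assumptions).
#[derive(Debug, PartialEq)]
enum StealHalf {
    Empty,
    Abort,
    Data(Vec<usize>),
}

/// Hypothetical fixed-capacity model of the deque, for illustration only.
struct Model {
    top: AtomicIsize,
    bottom: AtomicIsize,
    slots: Vec<AtomicUsize>,
}

impl Model {
    fn new(cap: usize) -> Self {
        Model {
            top: AtomicIsize::new(0),
            bottom: AtomicIsize::new(0),
            slots: (0..cap).map(|_| AtomicUsize::new(0)).collect(),
        }
    }

    /// Owner-side push; assumes the caller never overflows the buffer.
    fn push(&self, value: usize) {
        let b = self.bottom.load(Ordering::Relaxed);
        let idx = (b as usize) % self.slots.len();
        self.slots[idx].store(value, Ordering::Relaxed);
        self.bottom.store(b + 1, Ordering::Release);
    }

    fn steal_half(&self) -> StealHalf {
        let t = self.top.load(Ordering::Acquire);
        fence(Ordering::SeqCst);
        let b = self.bottom.load(Ordering::Acquire);
        let size = b - t;
        if size <= 0 {
            return StealHalf::Empty;
        }
        let half = (size + 1) / 2;

        // Step 1: copy the candidate items out while `top` still covers them.
        let items: Vec<usize> = (t..t + half)
            .map(|i| self.slots[(i as usize) % self.slots.len()].load(Ordering::Relaxed))
            .collect();

        // Step 2: only now try to claim them. Once this CAS succeeds, the
        // owner may wrap around and overwrite the slots we just read.
        match self
            .top
            .compare_exchange(t, t + half, Ordering::SeqCst, Ordering::Relaxed)
        {
            Ok(_) => StealHalf::Data(items),
            Err(_) => StealHalf::Abort,
        }
    }
}
```

If the CAS fails, the copied items are simply discarded and the caller can retry, which is also where a caller-supplied vec would save an allocation per attempt.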

fitzgen · 2016-05-04T22:32:27Z

Maybe folks have tried this and it doesn't really give any better load balancing than stealing one item at a time does?

schets · 2016-05-05T13:07:04Z

> Maybe folks have tried this and it doesn't really give any better load balancing than stealing one item at a time does?

I can't find the paper, but steal_half performs far better in cases where jobs don't generate many children.

jeehoonkang · 2016-12-30T10:42:21Z

src/sync/chase_lev.rs

+            return StealHalf::Empty;
+        }
+
+        let half = size + 1 / 2;


maybe `(size + 1) / 2`?

I think this is indeed a bug. The stealer effectively tries to steal all the remaining elements, breaking the invariant for the popper that the last element will not be stolen. I think it should be `let half = (size + 1) / 2;`
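
A small worked example of the precedence issue, with two hypothetical helper functions that isolate the arithmetic. Division binds tighter than addition, and `1 / 2` is `0` in integer arithmetic, so the expression in the diff reduces to `size`.

```rust
/// The expression as written in the diff: parsed as `size + (1 / 2)`,
/// which is `size + 0`.
fn half_as_written(size: isize) -> isize {
    size + 1 / 2
}

/// The suggested fix: half of the queue, rounded up.
fn half_intended(size: isize) -> isize {
    (size + 1) / 2
}
```

For a queue of 5 items, the first form asks for all 5 and the second asks for 3.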

jeehoonkang · 2016-12-30T10:51:50Z

> The most important part, as far as I can tell, is populating the results array
> before performing the CAS to ensure that we get the correct set of resulting
> items (after the CAS, the circular buffer could circle back over the slots
> we're stealing and overwrite them with new values).

I "guess" this would hurt performance to some degree. Maybe we can relax it by using two locations (say, `bottom1` and `bottom2`) for the two purposes the present `bottom` variable serves: (1) signalling that the circular buffer is reclaimed and reusable, and (2) signalling that elements before `bottom` are already stolen by some threads. This is just wild speculation; I would like to think about it more.

The implementation seems correct to my eyes, for the same reason that `steal` is correct. Though I am not 100% confident, as I don't have a formal proof yet...

fitzgen · 2016-12-30T16:58:15Z

> The implementation seems correct to my eyes, for the same reason that `steal` is correct. Though I am not 100% confident, as I don't have a formal proof yet...

Have you seen Correct and Efficient Work-Stealing for Weak Memory Models? For someone who is familiar with this area of research (which I am definitely not, but I think you are ;)) I assume it probably shouldn't be too hard to extend the proofs for this case.

I really enjoyed the Promising paper, btw 👍

jeehoonkang · 2016-12-31T02:46:42Z

> Have you seen Correct and Efficient Work-Stealing for Weak Memory Models?

Yes, I've read it, but the proof is beyond my level and I couldn't grasp it. I do understand the key idea, which the authors call "cumulativity": it implies that concurrent invocations of `steal` and `pop` cannot both see old values of `bottom` and `top` at the same time. The exact same logic applies to `steal_half`, AFAICT.

These days I am working on re-proving the (weak) Chase-Lev deque in a (hypothetical) program logic, and I will try to come up with a proof of `steal_half`.

> I really enjoyed the Promising paper, btw 👍

Thank you very much for reading the paper ;-) As soon as the slides for POPL are done, I will share them.

jeehoonkang

Beyond the comment I left, I think some test code will be very helpful ;)

jeehoonkang · 2017-04-04T15:01:55Z

src/sync/chase_lev.rs

+            return StealHalf::Empty;
+        }
+
+        let half = size + 1 / 2;


I think this is indeed a bug. The stealer effectively tries to steal all the remaining elements, breaking the invariant for the popper that the last element will not be stolen. I think it should be `let half = (size + 1) / 2;`

ghost · 2017-07-05T19:22:12Z

Do we have any updates on this? I think Rayon might benefit from a steal_half method.

I did some googling, but couldn't find any papers about a steal-half operation in Chase-Lev deques. Papers that do talk about steal-half usually have the following structure: they mention Chase-Lev as previous work, state that steal-half is a difficult operation to implement, and then present a totally new deque design that supports steal-half.

While papers never say that steal-half cannot be supported in Chase-Lev deques, the fact that apparently no one has done it makes me think it's definitely non-trivial.

That said, we could still add a steal-half method to our Chase-Lev implementation that simply repeatedly calls steal(). Note that every call to steal() pins the current thread, so we could actually do steal-half by enclosing all those steals with a single call to pin(). That could be much faster than naively calling steal() multiple times in a row.
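
A rough sketch of that fallback, assuming a hypothetical `Steal<T>` result type and a caller-supplied `steal_one` closure standing in for the deque's `steal()`. None of these names come from the crate. The epoch `pin()` is deliberately left out; in the real deque the loop body would sit inside a single pinned scope. The output goes into a caller-supplied `Vec`, as floated in the first question of the PR description.

```rust
/// Result of a single steal attempt (variant names are assumptions).
#[derive(Debug, PartialEq)]
enum Steal<T> {
    Empty,
    Abort,
    Data(T),
}

/// Steals up to half of `len_hint` items (rounded up) by calling `steal_one`
/// repeatedly, appending them to `out`. Returns how many items were taken.
fn steal_half_by_repeating<T>(
    len_hint: usize,
    mut steal_one: impl FnMut() -> Steal<T>,
    out: &mut Vec<T>,
) -> usize {
    let want = (len_hint + 1) / 2;
    let mut taken = 0;
    while taken < want {
        match steal_one() {
            Steal::Data(item) => {
                out.push(item);
                taken += 1;
            }
            // Stop on an empty queue or on contention; the caller may retry.
            Steal::Empty | Steal::Abort => break,
        }
    }
    taken
}
```

Because each item is claimed by an ordinary single-item steal, this variant relies only on whatever guarantees `steal()` already has; the cost is one CAS per item instead of one per batch.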

fitzgen · 2017-07-05T22:49:33Z

Please steal this PR from my work queue :)

jeehoonkang · 2017-07-06T01:56:12Z

@stjepang Would you please give me references to the papers that say it's hard to support `steal_half()` in Chase-Lev? @fitzgen's implementation still looks correct to my eyes (apart from the comment I left).

ghost · 2017-07-06T08:25:40Z

@jeehoonkang

The Chase-Lev paper simply says at the end:

"It may be interesting to see how our techniques are applied to other schemes that improve on ABP-work stealing such as the locality-guided work-stealing of Acar, Blelloch and Blumofe [1] or the steal-half algorithm of Hendler and Shavit [6]."

There is also a paper from 2013 that mentions Chase-Lev:

"Several years later the nonblocking algorithm was extended by Chase and Lev to support dynamic resizing [10]. Their first and, to our knowledge, only proof of correctness of a nonblocking work-stealing algorithm is not trivial: it spans over thirty pages [11]. Moreover, there is, in the literature, no nonblocking algorithm which combines resizeability with other extensions, such as steal half, possibly owing to the complexity involved in extending the proof of correctness."

jeehoonkang · 2017-07-18T06:57:30Z

@stjepang The paper @fitzgen mentioned was also published at PPoPP 2013, and it contains a proof of an implementation of Chase-Lev in C/C++11. I am quite confident that the proof scales to `steal_half`, though I agree that this question can only be answered decisively with a rigorous mathematical proof. (Decades of research show that "confident" means nothing in concurrency.)

AFAICT, the Chase-Lev deque is difficult to reason about formally for two reasons: (i) its use of relaxed atomics, and (ii) its use of SC fences. I am interested in both difficulties, and they make Chase-Lev very interesting to me. I will dig further into it.

For the time being, I prefer merging it (after fixing the bug I mentioned in the comment), but I will hold off if anyone objects.

ghost · 2017-07-18T07:54:11Z

That's great then! :) No objections to merging from my side either.

Two things we should probably do that'd increase our confidence of correctness:

* Reaching out to the authors of "Correct and Efficient Work-Stealing for Weak Memory Models" and asking about the steal-half operation.
* Stress testing on a weakly ordered architecture.

jeehoonkang · 2017-07-24T10:35:40Z

Closing this for the same reason as #148.

jeehoonkang reviewed Dec 30, 2016

jeehoonkang requested changes Apr 4, 2017

jeehoonkang mentioned this pull request Jul 19, 2017

Steal-half in Chase-Lev #148

Closed

jeehoonkang closed this Jul 24, 2017
