Oz Heartbeating #4

Open
sjmackenzie opened this issue Feb 15, 2013 · 7 comments

@sjmackenzie
Member

Overall theory of a heartbeat:

  • Each node should have its own heartbeat.
  • Failure detection works at the level of single language entities (class instantiations, ports, etc.).
  • Each node determines on its own whether a remote node has failed.
  • Failure detection is done on each node.
  • Nodes don't tell other remote nodes that a certain remote node has failed.
  • If there is a failure (permFail), the language-level entities adapt, wait, restart, etc. (this is programmed at another level).
  • A single slow node should not slow down the failure detection of other nodes.
  • Failure detection should be as fast as possible.
  • Only if the delay has been increased a certain number of times for a given node should the failure detector indicate a failure for that node.
  • The heartbeat is adaptive:
  • A forever loop sending messages.
  • If no messages are present within a delay, then increase the delay.
  • If there are messages, then decrease the delay.
  • Increase or decrease the delay so that it is just high enough for successful communication to be achieved.
  • The delay is not a timeout, yet it should be as small as possible.

Some points I want to clarify:

Two or more approaches are available. Note that each approach uses asynchronous I/O (i.e. zeromq or nanomsg).

  1. Single-phase protocol approach

     for each remote node, set its delay to, say, 3 secs to allow for the initial TCP connection
     queue a heartbeat message to each remote node
     send queue
     spawn an Oz thread for each node and loop infinitely:
          sleep for the time delay associated with the remote node
          poll for heartbeats from that node
          if the poll is empty
               increase the node delay
               if the node delay has been increased X times
                     flag the remote node with permFail state
               else
                     flag the remote node with tempFail state
               queue a heartbeat message to the remote node
          else
               flag the remote node with alive state
               decrease the node delay (factor in how many heartbeats there are)
               queue actor/language-entity messages to the remote node
               queue a heartbeat message to the remote node
          send queue
          poll for actor messages and process them, if any
          loop
  2. Two-phase protocol approach

     for each remote node, set its delay to, say, 3 secs to allow for the initial TCP connection
     queue a heartbeat message to each remote node
     send queue
     spawn an Oz thread for each node and loop infinitely:
          sleep for the time delay associated with the remote node
          poll for heartbeat responses from that node
          if the poll is empty
               increase the node delay
               if the node delay has been increased X times
                     flag the remote node with permFail state
               else
                     flag the remote node with tempFail state
               queue a heartbeat message to the remote node
          else
               flag the remote node with alive state
               decrease the node delay
               queue actor messages to the remote node
               queue a heartbeat message to the remote node
          send queue
          poll for actor messages and process them, if any
          loop

I believe version two will be slower, as it has to wait for a round trip, and heartbeats could be lost on the wire. Version one, by contrast, operates on the data at hand, and is therefore faster to detect failures and send messages.
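For concreteness, here is a rough Oz sketch of the per-node loop of version one. Only the delay and miss-count bookkeeping is real code; PollHeartbeats, FlagNode, QueueHeartbeat, QueueActorMessages and SendQueue are stand-ins for the zeromq/nanomsg glue, and the numeric constants are placeholders, not proposals:

```oz
declare
InitialDelay = 3000   % ms; generous, to allow the initial TCP connection
MinDelay     = 100    % ms; keeps the delay "as small as possible" without busy-looping
MaxMisses    = 5      % the "X times" threshold from the pseudocode

%% Stubs standing in for the zeromq/nanomsg glue -- assumptions, not existing APIs.
fun  {PollHeartbeats _} nil end            % would return the heartbeats received from the node
proc {FlagNode Node State} {Show Node#State} end
proc {QueueHeartbeat _} skip end
proc {QueueActorMessages _} skip end
proc {SendQueue _} skip end

proc {MonitorNode Node}
   Delay  = {NewCell InitialDelay}
   Misses = {NewCell 0}
   proc {Loop}
      {Time.delay @Delay}
      if {PollHeartbeats Node} == nil then
         Delay  := @Delay * 2                       % one possible "increase" policy
         Misses := @Misses + 1
         if @Misses >= MaxMisses then
            {FlagNode Node permFail}                % as in the pseudocode; see below for
         else                                       % whether the FD may report permFail at all
            {FlagNode Node tempFail}
         end
         {QueueHeartbeat Node}
      else
         {FlagNode Node ok}
         Misses := 0
         Delay  := {Max MinDelay (@Delay div 2)}    % one possible "decrease" policy
         {QueueActorMessages Node}
         {QueueHeartbeat Node}
      end
      {SendQueue Node}
      {Loop}
   end
in
   thread {Loop} end
end
```

The stubs mark exactly where the asynchronous I/O layer would plug in; everything else is ordinary Oz threads, cells and Time.delay.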

Please check the logic; also, am I missing anything?

@sjrd
Member

sjrd commented Feb 15, 2013

I think the first version is better. It does not make sense to me to request heartbeats. We know we'll always need some, so assume the other node wants us to send some.

Failure detection works at the level of single language entities

Why? Why not make failure detection work at the level of nodes? When a node's status changes, then change the status of all entities associated with that node. We don't need an instance of the FD for every entity, do we?

also am I missing anything?

I think your understanding of permFail is wrong. IIRC, permFail is never set by the FD. It can only be set explicitly through the Kill operation. The FD is only there to switch back and forth between 'ok' and 'tempFail'. It should be in Raphaël Collet's thesis. I'll try and find a pointer.

Oh and... One of the critical design goals of Mozart 2 was to be able to write the DSS entirely in Oz. Make sure you do ^^ If you have trouble figuring out how it can be done, let me know. But basically it is supported by the following three core mechanisms:

  • TCP connections, in modules OS and/or Open
  • Serialization, in module Pickle, but also directly the boot module Serializer
  • The reflective layer (doc, work in progress: https://github.com/mozart/mozart2/wiki/Reflective-layer)
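
For the first of these, a minimal sketch of what the Oz side of a heartbeat push might look like, assuming the Mozart 1.x Open API carries over; the host, port and payload are placeholders, and real entity messages would go through Pickle / the Serializer rather than a literal virtual string:

```oz
declare
Host = "127.0.0.1"   % placeholder peer address; would come from connection setup
Port = 9000          % placeholder port

Sock = {New Open.socket init}
{Sock connect(host:Host port:Port)}
{Sock write(vs:"heartbeat\n")}   % a real message would be a serialized term, not a literal
{Sock close}
```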

@sjmackenzie
Member Author

I understand the need for fast failure detection, but isn't this very aggressive heartbeating? Possibly too aggressive? Is it not possible to use actor messages as a heartbeat too?

@sjrd
Member

sjrd commented Feb 16, 2013

It is possible to avoid sending a heartbeat if a useful message was sent not long ago, and to accept a received useful message as also being a heartbeat.

I do not know how many messages are avoided by doing so, but IIRC it is indeed a common optimization.
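
As a hedged sketch of that optimization (QueueHeartbeat is the same stand-in as in the earlier loop sketch; SentRecently is an assumed cell that the normal send path would set to true whenever an actor message goes out):

```oz
declare
proc {QueueHeartbeat _} skip end      % stand-in for the real send-queue call
SentRecently = {NewCell false}        % set to true by the normal send path

proc {MaybeQueueHeartbeat Node}
   if @SentRecently then skip         % a useful message already acted as the heartbeat
   else {QueueHeartbeat Node}
   end
   SentRecently := false              % reset for the next round of the loop
end
```

On the receiving side, any message from the node, heartbeat or not, would reset its miss count.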

@bmejias

bmejias commented Feb 16, 2013

Hi Ozers,

Very interesting discussion. Please find my comments below.

On Fri, Feb 15, 2013 at 7:23 PM, Sébastien Doeraene <[email protected]> wrote:

I think the first version is better. It does not make sense to me to
request heartbeats. We know we'll always need some, so assume the other
node wants us to send some.

Failure detection works at the level of single language entities

Why? Why not make failure detection work at the level of nodes? When a
node's status changes, then change the status of all entities associated
with that node. We don't need an instance of the FD for every entity, do we?

I agree that failure detection should be node-based instead of per language entity. It will save a lot of bandwidth. It is of course important to clearly define what a node is. I wouldn't do node = machine = IP-based, but node = process. So if you are talking to two Oz processes on the same machine, one of them could crash while the other is still running.

also am I missing anything?

I think your understanding of permFail is wrong. IIRC, permFail is never
set by the FD. It can only be set explicitly through the Kill operation.
The FD is there only to switch back and forth from 'ok' and 'tempFail'. It
should be in Rahaël Collet's thesis. I'll try and find a pointer.

+1 for the FD just switching between ok and tempFail, and permFail only being set by the kill operation. Question here: is the kill operation performed on a node, or on a distributed entity? AFAIR, it could be performed on both, and it would affect all language entities associated with the node.

voilà,
cheers
Boriss


@sjrd
Member

sjrd commented Feb 16, 2013

Question here: is the kill operation performed on a node, or on a distributed entity? AFAIR, it could be performed on both, and it would affect all language entities associated with the node.

I'm quite certain you can kill a language entity. I remember having read this in Raphaël's thesis. It might be the case that one can kill a node too, I don't know.

@bmejias

bmejias commented Feb 17, 2013

On Sat, Feb 16, 2013 at 10:16 PM, Sébastien Doeraene <[email protected]> wrote:

Question here: Is the kill operation perform on a node, or on a
distributed entitiy? AFAIR, it could be performed in both, and it would
affect all language entities associated to the node.

I'm quite certain you can kill a language entity. I remember having read
this in Raphaël's thesis. It might be the case that one can kill a node
too, I don't know.

OK. So, if you have two entities, A and B, from node P, both entities A and B share the same failure stream, and you can monitor both. As an extra, you could get a variable associated with the node P from A or from B. All three streams can be monitored.

If you do {Kill A}, the permFail value will appear on the failure stream of A, B, and P.

Do I get this right?

cheers
Boriss



@sjrd
Member

sjrd commented Feb 17, 2013

If you do {Kill A}, the permFail value will appear on the failure stream of A, B, and P.

No, if you do {Kill A}, permFail will appear on the fault stream of A, but not B. I don't think there exists such a thing as the fault stream of a node.

The fact that the FD is node-based does not mean that the fault streams are node-based too. The fault stream of an entity A is derived from two sources of information: the suspicion state of its node, and its own explicit state.

Internally, we have:

  • Node P can be suspected or not; this is computed by the FD running for node P. InternalStateOfP is either tempFail or ok.
  • Entity A can be OK, or could have been explicitly broken (Break -> localFail) or killed (Kill -> permFail). InternalStateOfA is either ok, localFail or permFail.

Given these two sources of information, the observable fault state of A (the state appearing at the end of its fault stream) is computed as follows:

case InternalStateOfA
of ok then InternalStateOfP
[] X then X
end

Does that make sense?
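
Restated as a tiny runnable sketch (FaultState is purely illustrative, not an existing API; the atoms mirror the states described above), together with the {Kill A} scenario from the previous comments:

```oz
declare
%% Observable fault state of an entity, derived from the entity's own internal
%% state and the suspicion state of its node.
fun {FaultState InternalStateOfEntity InternalStateOfNode}
   case InternalStateOfEntity
   of ok then InternalStateOfNode
   [] X then X
   end
end

%% After {Kill A}: A's own state is permFail, B's is still ok, and node P is not suspected.
{Show {FaultState permFail ok}}    % A observes permFail
{Show {FaultState ok ok}}          % B still observes ok
{Show {FaultState ok tempFail}}    % if P were suspected, B would observe tempFail
```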
