[WIP] Introduce Firenest.ReplicatedState #21
base: master
Conversation
@josevalim I think the local part is done. Now on to figure out the hard part - distributing the state. I also haven't introduced the callbacks yet.
For anybody looking at the progress of this: we've changed the abstraction slightly. The code is not there yet, but the documentation is close to how I'd like to see it in the end.
It takes in the local delta value constructed in the `c:local_update/4`
calls and returns a remote delta value that will be replicated to other
servers. The value of the local delta is reset to the initial delta value
I'm pretty sure I'm missing something obvious here 😅 Does it mean that the `local_delta` argument will be the delta returned by `local_update/4`?
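A minimal sketch of how I read that flow, purely for illustration - the callback names other than `local_update/4`, along with all argument orders and return shapes, are assumptions and not the actual behaviour:

```elixir
# Purely illustrative: everything except the name local_update/4 is an
# assumption, not the real Firenest.ReplicatedState behaviour.
defmodule MyApp.Counter do
  # Assumed shape: update the local value and fold the change into the
  # local delta the server keeps for this entry.
  def local_update(_key, {:add, n}, value, local_delta) do
    {value + n, local_delta + n}
  end

  # Implied by the quoted docs ("takes in the local delta value ... and
  # returns a remote delta value"); the name prepare_remote_delta is made up.
  # After this call the server would reset the local delta to its initial
  # value (0 for this counter).
  def prepare_remote_delta(local_delta) do
    local_delta
  end
end
```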
local state. If keeping track of incremental changes is not convenient
for a particular state type, the value of a delta can be set to be equal
to the current state - this will always cause full state transfers.
"""
I would add a full working example to the docs or maybe even two (maps and rate limiter).
Should it be in here or should we have an `examples` directory? And if we have them as examples - would it actually make sense to release them as components?
I would have them inline for now, unless they become too big.
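For a feel of what such an inline example might look like, here is a sketch of the maps case, where the delta is simply the whole state, so (as the docs above note) every replication is a full state transfer. The callback shape mirrors the hypothetical one sketched earlier and is not the PR's actual API:

```elixir
# Illustrative only: when tracking incremental changes is inconvenient,
# the delta can be set to the current state, forcing full state transfers.
defmodule MyApp.MapState do
  def local_update(_key, {:put, k, v}, map, _local_delta) do
    new_map = Map.put(map, k, v)
    # Delta == state, so the entire map would be replicated each time.
    {new_map, new_map}
  end
end
```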
* `:name` - name for the process, required;
* `:topology` - name of the supporting topology, required;
* `:partitions` - number of partitions, defaults to 1;
PubSub's docs say it "By default uses one partition for every 4 cores". Contention in PubSub is mostly ETS, so we don't need many partitions. My point is, we have precedent for not having a default of 1 in case we want to start with many. Although more partitions make it more chatty, so we may want to document that.
I went with one for now for compatibility with Phoenix Presence, but we could split it up more. I'm fine with whatever default.
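For concreteness, passing the documented options could look roughly like this; the tuple child spec and the module names here are assumptions, not the PR's actual API:

```elixir
# Hypothetical usage of the documented options under a supervisor; the
# child-spec shape and module names are assumptions.
children = [
  {Firenest.ReplicatedState,
   name: MyApp.ReplicatedState,
   topology: MyApp.Topology,
   # Defaults to 1 in the PR; a PubSub-style default would instead derive
   # this from System.schedulers_online() / 4, rounded up.
   partitions: 4}
]

Supervisor.start_link(children, strategy: :one_for_one)
```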
end
# TODO: should we store somewhere we're catching up with the server?
# if so, then we should accumulate the broadcasts we can't handle, until we can.
Yes, we always need to do this because I believe the topology does not guarantee ordering between sends and broadcasts (or it should not, as that would by definition imply using the same connection). So if you ask a node to catch you up and it replies to you and then broadcasts, there is a chance the broadcast will arrive before the reply, and you would have lost it.
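A rough sketch of that accumulation, just to make the idea concrete - the field names (`catching_up?`, `pending`), the module, and the `apply_*` helpers are all invented:

```elixir
# Sketch only: while a catch-up request is in flight, stash incoming
# broadcasts instead of applying them, then replay them once the catch-up
# reply has been applied. All names here are invented.
defmodule CatchUpBuffer do
  def handle_broadcast(msg, %{catching_up?: true} = state) do
    %{state | pending: [msg | state.pending]}
  end

  def handle_broadcast(msg, state) do
    apply_broadcast(state, msg)
  end

  def handle_catch_up_reply(reply, state) do
    state = apply_catch_up(state, reply)

    state.pending
    |> Enum.reverse()
    |> Enum.reduce(state, &apply_broadcast(&2, &1))
    |> Map.merge(%{pending: [], catching_up?: false})
  end

  # Placeholder implementations so the sketch compiles; the real logic
  # would merge remote data into the local replica.
  defp apply_broadcast(state, _msg), do: state
  defp apply_catch_up(state, _reply), do: state
end
```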
{:catch_up, old_clock, state}
# We were caught up with a newer clock than the current, ignore
# TODO: is that even possible?
It is possible if we are getting information from different nodes instead of the node itself. So one node can see A at clock 11 but for the other node A is at clock 9.
The topology/synced_server tells us where the information is coming from - we rely only on that to know which node is communicating with us. We don't tag the messages ourselves with the node name.
Ok, I think we are agreeing. What I was saying is that, if you can receive the clock for node A from node B, then this can happen. But if you always receive the clock for node A from node A, then this CANNOT happen and I would raise accordingly.
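Following that conclusion, the clock bookkeeping could raise on the impossible case instead of ignoring it. A sketch with invented names:

```elixir
defmodule ClockCheck do
  # Sketch: if the clock for a node only ever arrives from that node itself,
  # a clock that goes backwards indicates a bug, so raise instead of
  # silently ignoring it. The clocks map and this module are invented.
  def update_clock(clocks, node, incoming) do
    case clocks do
      %{^node => known} when known > incoming ->
        raise "received clock #{incoming} for #{inspect(node)}, " <>
                "but already know clock #{known}"

      _ ->
        Map.put(clocks, node, incoming)
    end
  end
end
```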
Review is done, this looks fantastic! I really like the organization into three modules; it really helps keep things contained!
@@ -269,8 +269,7 @@ defmodule Firenest.PubSub.Supervisor do
   end

   def init({pubsub, topology, options}) do
-    partitions =
-      options[:partitions] || System.schedulers_online() |> Kernel./(4) |> Float.ceil() |> trunc()
+    partitions = options[:partitions] || ceil(System.schedulers_online() / 4)
Just came across this, cool project! It doesn't look like there has been much movement recently, so I figured I'd just point out that the tests are failing due to this line: `ceil/1` doesn't exist; perhaps you meant to leave it as `Float.ceil/1`.
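If the project needs to keep supporting Elixir versions that predate `Kernel.ceil/1` (added in 1.8, if I recall correctly), one way to keep the one-liner inside `init/1` is to round via `Float.ceil/1`, roughly:

```elixir
# Equivalent to ceil(System.schedulers_online() / 4) on older Elixir versions:
# `/` always returns a float, Float.ceil/1 rounds it up, and trunc/1 turns it
# back into an integer partition count. Assumes the `options` binding from init/1.
partitions = options[:partitions] || trunc(Float.ceil(System.schedulers_online() / 4))
```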