-
Notifications
You must be signed in to change notification settings - Fork 306
local echo (5/n): Create outbox messages on send #1472
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
eeb6ef2
to
ac35860
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Comments below.
lib/model/store.dart
Outdated
/// Always equal to `connection.zulipFeatureLevel` | ||
/// and `account.zulipFeatureLevel`. | ||
int get zulipFeatureLevel => connection.zulipFeatureLevel!; | ||
|
||
String get zulipVersion => account.zulipVersion; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about also moving zulipVersion
along with zulipFeatureLevel
, so they stay together?
lib/model/message_list.dart
Outdated
@@ -626,6 +627,17 @@ class MessageListView with ChangeNotifier, _MessageSequence { | |||
} | |||
} | |||
|
|||
void handleOutboxMessage(OutboxMessage outboxMessage) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about a name like _addOutboxMessage
? There are multiple tasks that could accurately be described as a message list "handling" an outbox message.
lib/model/message_list.dart
Outdated
/// Remove the [outboxMessage] from the view. | ||
/// | ||
/// This is a no-op if the message is not found. | ||
void removeOutboxMessageIfExists(OutboxMessage outboxMessage) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about just removeOutboxMessage
? I think we can leave the "if exists" part as implied.
lib/model/message.dart
Outdated
/// ``` | ||
/// | ||
/// During its lifecycle, it is guaranteed that the outbox message is deleted | ||
/// as soon an message event with a matching [MessageEvent.localMessageId] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: "a message event"
lib/model/message.dart
Outdated
/// ``` | ||
/// ┌─────────────────────────────────────┐ | ||
/// │ Event received, │ | ||
/// Send │ or we abandoned │ | ||
/// immediately. │ 200. the queue. ▼ | ||
/// (create) ──────────────► sending ────────► sent ──────────────► (delete) | ||
/// │4xx, │ ▲ | ||
/// │other error, │Reached User │ | ||
/// │or reached │time limit. cancels.│ | ||
/// │time limit. ▼ │ | ||
/// └───────────► failed ─────────────────┘ | ||
/// ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we leave out the "reached time limit" parts in this first version? The "failed" state doesn't feel accurate for when that happens (especially when coming there from "sent"), and it's not part of the spec for #133, the more complicated outbox design, so we'd have to remove it when implementing that. It's also not necessary for the main idea of #1441:
The point is to better handle the case where you type out a message and hit send, and it fails because you don't have working network at that moment.
It looks like "reached time limit" comes from a parenthetical in the #1441 spec, with "perhaps":
Then if a message fails to send, we show on the local-echo placeholder an option that lets you recover it. (Similarly perhaps if it's been a while, like 10s, since trying to send and the request hasn't completed one way or another.)
If we want to keep "reached time limit" in this PR, how about adding a new node for it in the diagram, separate from "failed"? Then I think it'll be easier to reason clearly about some things later: what should the UI say for this state (not "failed" because we haven't been told that it failed); what should happen if the send-message event arrives when in this state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the "User cancels" label comes from #133. In that context, it means the user decided not to press a "Retry" button to retry the message-send request, and they want us to just forget about the message-send attempts and their failures.
That label doesn't feel accurate in this context, where a retry button isn't part of the picture. I think the word from the #1441 spec is "recover"; how about we say "User recovers the draft" or similar:
Then if a message fails to send, we show on the local-echo placeholder an option that lets you recover it. […]
You might want to retry sending, or just copy the text to save elsewhere. To cover both options, it can take the text and just put it back in the compose box. […] The placeholder in the message list then disappears.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the "sent" state needed? What would happen if we just ignored a 200 response and removed "sent" from the diagram, and didn't consider the message to be sent until we got its event?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the "Send immediately" label is implied and can be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Combining all those points, what do you think of this as an updated diagram:
/// We abandoned the queue.
/// ┌──────────────────────────────────────┐
/// │ │
/// │ Event received. ▼
/// (create) ─► sending ──────────────────────────────► (delete)
/// │ ▲
/// │ 4xx or other User restores │
/// │ error. the draft. │
/// └──────────────► failed ───────────────┘
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding time limit, started a discussion here: #mobile > handle failed send @ 💬
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the diagram update, started a discussion here: #mobile > #F1441 Handle retry state machine @ 💬
check(connection.lastRequest).isA<http.Request>() | ||
..bodyFields['queue_id'].equals(store.queueId) | ||
..bodyFields['local_id'].equals('${outboxMessage.localMessageId}'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this connection.lastRequest
needed, and if so, should the other similar tests have a check like it too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could move these checks to the helper, but I moved them from there to this test in a previous revision since just checking it once seems fine to me; this de-duplicates some helper code and most tests focus on other things.
..hidden.isTrue(); | ||
})); | ||
|
||
test('while message is being sent, message event arrives, then the send fails', () => awaitFakeAsync((async) async { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps worth a comment (if there isn't one in the implementation) that this can actually happen: the message-send can succeed, but then the message-send request has a network issue that doesn't affect the event-poll request.
test/model/message_test.dart
Outdated
|
||
// Handle the event after the message is sent but before the debounce | ||
// timeout. The outbox message should remain hidden since the send | ||
// request was sucessful. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: "successful"
test/model/message_test.dart
Outdated
check(store.outboxMessages).isEmpty(); | ||
check(outboxMessage) | ||
..state.equals(OutboxMessageLifecycle.sent) | ||
..hidden.isTrue(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lot of these tests have checks on outboxMessage
after checking that store.outboxMessages
is empty. Do they all need checks like that? Anyway, this test and later ones do it without a comment explaining why:
// […] The outbox message should no
// longer get updated because it is not in the store any more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The second sentence of the comment before store.handleEvent
is meant to address this check:
// Handle the event after the message is sent but before the debounce
// timeout. The outbox message should remain hidden since the send
// request was successful.
Perhaps it will be clearer to move this right before the check.
test/model/message_test.dart
Outdated
..hidden.isFalse(); | ||
}); | ||
|
||
test('send request pending until after kSendMessageTimeLimit, completes successfully, then message event arrives', () => awaitFakeAsync((async) async { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I skipped reading the tests with kSendMessageTimeLimit
in their names, pending an earlier comment that might lead to removing them)
Updated the PR! Thanks for the review. There are some pending questions mainly with regards to the state diagram and the send time limit. I introduced the new |
Will be working on a new revision to reorganize some of the implementation code with further state machine changes. |
The PR has been updated to implement the state diagram discussed in chat: #mobile > #F1441 Handle retry state machine @ 💬 |
b96c5e9
to
13edb83
Compare
(pushed another update to make timestamps used in sendMessage testable) |
975bc28
to
e6c9192
Compare
The point of this helper is to replicate what a topic sent from the client will become, after being processed by the server. This important when trying to create a local copy of a stream message, whose topic can get translated when it's delivered by the server.
This will be the same as `DateTime.timestamp()` in live code (therefore the NFC). For testing, utcNow uses a clock instance that can be controlled by FakeAsync. We could have made call sites of `DateTime.now()` use it too, but those for now don't need it for testing.
While we do create outbox messages, there are in no way user-visible changes since the outbox messages don't end up in message list views. We create skeletons for helpers needed from message list view, but don't implement them yet, to make the diff smaller. For testing, similar to TypingNotifier.debugEnable, we add MessageStoreImpl.debugOutboxEnable for tests that do not intend to cover outbox messages. Some of the delays to fake responses added in tests are not necessary because the future of sendMessage is not completed immediately, but we still add them to keep the tests realistic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Here's a review of the first four commits:
0035700 api [nfc]: Add TopicName.interpretAsServer
d7cb9ec store [nfc]: Move zulip{FeatureLevel,Version} to PerAccountStoreBase
d09a52d test [nfc]: Generate timestamps
44af0d3 binding [nfc]: Add utcNow
and a partial review of the fifth, the main commit:
d61e252 message: Create an outbox message on send; manage its states
/// Convert this topic to match how it would appear on a message object from | ||
/// the server, assuming the topic is originally for a send-message request. | ||
/// | ||
/// For a client that does not support empty topics, | ||
/// a modern server (FL>=334) would convert "(no topic)" and empty topics to | ||
/// `store.realmEmptyTopicDisplayName`. | ||
/// | ||
/// See also: https://zulip.com/api/send-message#parameter-topic | ||
TopicName interpretAsServer({ | ||
required int zulipFeatureLevel, | ||
required String? realmEmptyTopicDisplayName, | ||
}) { | ||
if (zulipFeatureLevel < 334) { | ||
assert(_value.isNotEmpty); | ||
return this; | ||
} | ||
if (_value == kNoTopicTopic || _value.isEmpty) { | ||
// TODO(#1250): this assumes that the 'support_empty_topics' | ||
// client_capability is false; update this when we set it to true | ||
return TopicName(realmEmptyTopicDisplayName!); | ||
} | ||
return TopicName(_value); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add a // TODO(server-10)
for…removing this method, I guess? Or at least simplifying.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is whitespace-trimming also a step in any servers' interpretation of topics? Not that this method has to mirror that logic necessarily. But we'd probably want to explicitly expect (and assert) that that step has already been done when this method is called, otherwise the summary line wouldn't be quite accurate:
/// Convert this topic to match how it would appear on a message object from
/// the server, assuming the topic is originally for a send-message request.
const kLocalEchoDebounceDuration = Duration(milliseconds: 300); // TODO(#1441) find the right values for this | ||
const kSendMessageRetryWaitPeriod = Duration(seconds: 10); // TODO(#1441) find the right values for this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: "find the right value for this"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, kSendMessageRetryWaitPeriod
doesn't sound like the right name to me. There's no automatic retry, and, as I mentioned at #1472 (comment), no "Retry" button. This wait-period logic will need to be cleanly removed as part of #133 (see my comment just before that one) which has both kinds of retry, and that should be easier if this thing doesn't also have the "retry" label :)
Maybe kSendMessageOfferRestoreWaitPeriod
?
// Because either of the values can get updated, the actual topic | ||
// can change, for example, between "(no topic)" and "general chat", | ||
// or between different names of "general chat". This should be | ||
// uncommon during the lifespan of an outbox message. | ||
// | ||
// There's also an unavoidable race that has the same effect: | ||
// an admin could change the name of "general chat" | ||
// (i.e. the value of realmEmptyTopicDisplayName) concurrently with | ||
// the user making the send request, so that the setting in effect | ||
// by the time the request arrives is different from the setting the | ||
// client last heard about. The realm update events do not have | ||
// information about this race for us to update the prediction | ||
// correctly. | ||
zulipFeatureLevel: zulipFeatureLevel, | ||
realmEmptyTopicDisplayName: realmEmptyTopicDisplayName), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a draft where I try to explain the bug a bit more (especially with the first line):
// Doing this interpretation just once on creating the outbox message
// allows an uncommon bug, because either of these values can change.
// During the outbox message's life, a predicted "(no topic)" topic
// could become stale/wrong when zulipFeatureLevel changes,
// or a predicted "general chat" topic could become stale/wrong
// when realmEmptyTopicDisplayName changes.
//
// Shrug. The same effect is caused by an unavoidable race:
// an admin could change the name of "general chat"
// (i.e. the value of realmEmptyTopicDisplayName)
// concurrently with the user making the send request,
// so that the setting in effect by the time the request arrives
// is different from the setting the client last heard about.
zulipFeatureLevel: zulipFeatureLevel,
realmEmptyTopicDisplayName: realmEmptyTopicDisplayName),
_outboxMessageDebounceTimers[localMessageId] = Timer(kLocalEchoDebounceDuration, () { | ||
assert(outboxMessages.containsKey(localMessageId)); | ||
_outboxMessageDebounceTimers.remove(localMessageId); | ||
_updateOutboxMessage(localMessageId, newState: OutboxMessageState.waiting); | ||
}); | ||
|
||
_outboxMessageWaitPeriodTimers[localMessageId] = Timer(kSendMessageRetryWaitPeriod, () { | ||
assert(outboxMessages.containsKey(localMessageId)); | ||
_outboxMessageWaitPeriodTimers.remove(localMessageId); | ||
_updateOutboxMessage(localMessageId, newState: OutboxMessageState.waitPeriodExpired); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The assert
s don't look right to me; wouldn't they throw if an outbox message was removed before the timer finished?, which can happen in this part of the state diagram:
/// Event received.
/// Or we abandoned the queue.
/// (any state) ────────────────────────────► (delete)
// `localMessageId` is not necessarily in the store. This is because | ||
// message event can still arrive before the send request fails to | ||
// networking issues. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe "fails with networking issues" or "fails because of networking issues"
required OutboxMessageState newState, | ||
}) { | ||
final outboxMessage = outboxMessages[localMessageId]; | ||
if (outboxMessage == null || outboxMessage.state == newState) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a legitimate need for callers to pass a new state that's the same as the old state? Reading this, I'm wondering if it might be accidentally covering up some confusion in how the state machine is implemented.
This is stacked atop #1463.
No UI/user-facing change in this PR.