Fix multiple index concurrency issues #308
Merged
Fixes #293, #292, #289 (ran over 1000 times with no FAILs) and should also fix #250.
So, there's a lot to unpack here:
`WaitGroup` for a short time after each message. This delay is theoretically unnecessary, but its net effect should be to delay publishing of new messages (not appending, because appends rely on previous sequence numbers) by about 100 milliseconds. In exchange, it fixes the problem of the index wait terminating prematurely and letting other processes proceed before the indexes have actually caught up. It is theoretically unnecessary because, if we had a way to query the Luigi pumps for whether anything is left in the source queue, we could use that directly to decide whether to keep waiting. Unfortunately, I did not find a way to do that, or even a way to patch it into Luigi without major restructuring of Luigi. So this is the best solution I could come up with: it should have the same effect, and it can be patched out later because it is still fully encapsulated within `sbot`'s indexes system.
This, I'm pretty sure, is why we got that one stray `TestNames` failure in "TestNames test flaky" (#250). It actually makes a lot of sense now that I've had a chance to chew on it for a while.
`TestSignMessages` doesn't use `sbot`. It has a completely separate system for indexes, and that system has no mechanism for waiting until the indexes catch up. So that the test is at least being used and is usefully testing the functionality it's trying to test, I've added some short delays to let the indexes catch up. I ran this test over 1000 times without any FAILs, so it should be fixed now.
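To illustrate the `WaitGroup` hold described above, here is a minimal sketch of the general pattern, not the code from this PR. The names `indexWaiter`, `handle`, `indexedAtRelease`, and `runDemo` are hypothetical, and the 10 ms gap between messages is an assumption chosen for the demo; only the roughly 100 ms hold comes from the description. The idea is that each message keeps the `WaitGroup` held for a grace period, so a message still in flight can extend the wait before anyone blocked on it is released.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// indexWaiter is a hypothetical stand-in for an index's bookkeeping.
// It is not a type from go-ssb or Luigi.
type indexWaiter struct {
	wg   sync.WaitGroup
	hold time.Duration // grace period; the PR describes roughly 100 ms

	mu      sync.Mutex
	indexed []string
}

// handle indexes one message, then keeps the WaitGroup held for w.hold
// so that a message still sitting in the source queue can arrive and
// extend the wait before waiters are released.
func (w *indexWaiter) handle(msg string) {
	w.wg.Add(1)
	w.mu.Lock()
	w.indexed = append(w.indexed, msg) // stand-in for the real index update
	w.mu.Unlock()
	time.AfterFunc(w.hold, w.wg.Done)
}

// count reports how many messages have been indexed so far.
func (w *indexWaiter) count() int {
	w.mu.Lock()
	defer w.mu.Unlock()
	return len(w.indexed)
}

// indexedAtRelease feeds msgs into the index with a pause of gap between
// them and returns how many were indexed when wg.Wait() returned.
func indexedAtRelease(msgs []string, gap, hold time.Duration) int {
	w := &indexWaiter{hold: hold}
	if len(msgs) == 0 {
		return 0
	}
	// Handle the first message synchronously so the counter is already
	// non-zero before Wait is called.
	w.handle(msgs[0])
	go func() {
		for _, m := range msgs[1:] {
			time.Sleep(gap) // assumed arrival gap, shorter than the hold
			w.handle(m)
		}
	}()
	w.wg.Wait()
	return w.count()
}

// runDemo uses an assumed 10 ms gap and the ~100 ms hold from the description.
func runDemo() (indexed, total int) {
	msgs := []string{"a", "b", "c"}
	return indexedAtRelease(msgs, 10*time.Millisecond, 100*time.Millisecond), len(msgs)
}

func main() {
	indexed, total := runDemo()
	fmt.Printf("indexed %d of %d messages before waiters were released\n", indexed, total)
}
```

The sketch only works while messages arrive closer together than the hold time; a direct "is the source queue empty?" query, which the description says Luigi does not currently offer, would not have that limitation.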