Making the producer stalling configurable #6413
Comments
The stall gate is there to prevent the server's memory usage from spiking as queues build up when a subscriber fails to keep up; otherwise the server could OOM and lose everything anyway. If your usage pattern features very spiky producers but stable or throttled subscribers, you may want to look at funnelling the data through streams instead.
Thank you @neilalexander, are you talking about https://docs.nats.io/nats-concepts/jetstream/streams ? We were wondering if we could have a "Fire and Forget" operating mode where the producer publishes at its own speed and never gets throttled. The consumer may or may not receive the message, depending on its health.
Flowdesk is looking for (mostly) real-time data distribution. More generally, this is data that ages fast, where old data is of little use.
I think we can look into it.
Tried out disabling the stalled wait and doing a bench where the subscriber is too slow. This pretty quickly makes the server the producer is connected to freeze/OOM. (Just because you can publish WAY faster than you can receive messages.) So it's not as simple as just adding a toggle to either stall or not. But it should be a toggle that either:
Would need to figure out exactly which queues get backed up and ensure they don't grow too much.
We have tests for this already for fan in and fan out. Could you describe your test setup a bit more?
Hello, @MauriceVanVeen @derekcollison
Normally, when a producer detects that one of the consumers of a message is falling behind, it stalls. This means that if a message has 2 consumers and the first is "slow", timely delivery to the second consumer is affected as well. With the new option `no_fast_producer_stall=true`, the server simply drops a message destined for a consumer that would have caused the producer to stall. The message is still delivered to consumers that are not falling behind. The option can be changed with a config reload. If a message is dropped due to fast-producer/slow-consumer and the message was traced (with the deliver option), the message trace egress event will carry an error indicating why the message was not delivered. Resolves #6413 Signed-off-by: Ivan Kozlovic <[email protected]>
@derekcollison @neilalexander @MauriceVanVeen I think there could be value in having a way to completely disable producer stalling in some situations, as described by @jing-flowdesk. But of course, we can't simply ignore the stall and still attempt to deliver. Instead, if we drop the message for a slow consumer, the server can still deliver it to the other, non-slow consumers. This is not the default behavior, so it should not impact users who do not want it and prefer the current behavior. I have PR #6500 for consideration. PS: I had issues with the tests running on Travis, so I had to tweak them several times...
Proposed change
Making the producer stalling configurable.
Currently, when you have a slow consumer on a subject, it can make the server throttle the producer.
ex:
I read in another issue that this is meant to protect gateways (GW) and routes?
The proposal is to make this configurable, allowing people to switch it off if they need to.
Use case
Better handling of event spikes by not stalling the producers when there is a slow consumer on some subject.
Contribution
The change does not look huge, but we would love to have some more input before committing to doing it ourselves.