Add shard-aware reconnection policies with support for scheduling constraints #473

Open
dkropachev wants to merge 2 commits into master from dk/add-connection-pool-delay

Conversation

dkropachev
Collaborator

@dkropachev dkropachev commented May 30, 2025

Introduce ShardReconnectionPolicy and its implementations:

  • NoDelayShardReconnectionPolicy: avoids reconnection delay and ensures at most one reconnection per host+shard.
  • NoConcurrentShardReconnectionPolicy: limits concurrent reconnections to 1 per scope (Cluster or Host) using a backoff policy.

This feature enables finer control over shard reconnection behavior, helping prevent reconnection storms.
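
For context, here is a rough usage sketch of how such a policy might be plugged into a Cluster. It is only an illustration: the `shard_reconnection_policy` keyword on Cluster and the exact constructor arguments are assumptions, not the final API.

  # Hypothetical sketch; constructor arguments and the Cluster keyword are assumed.
  from cassandra.cluster import Cluster
  from cassandra.policies import (
      ConstantReconnectionPolicy,
      NoConcurrentShardReconnectionPolicy,   # introduced by this PR (assumed import path)
      ShardReconnectionPolicyScope,          # introduced by this PR (assumed import path)
  )

  # Allow at most one in-flight shard reconnection per host, with a constant
  # 1-second backoff between reconnections inside that host's scope.
  policy = NoConcurrentShardReconnectionPolicy(
      shard_reconnection_scope=ShardReconnectionPolicyScope.Host,
      reconnection_policy=ConstantReconnectionPolicy(delay=1.0),
  )

  cluster = Cluster(
      contact_points=["127.0.0.1"],
      shard_reconnection_policy=policy,  # assumed parameter name
  )
  session = cluster.connect()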

Fixes: #483

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass tests.
  • PR description sums up the changes and reasons why they should be introduced.
  • I have provided docstrings for the public items that I want to introduce.
  • I have adjusted the documentation in ./docs/source/.
  • I added appropriate Fixes: annotations to the PR description.

@dkropachev dkropachev force-pushed the dk/add-connection-pool-delay branch 4 times, most recently from 0b80886 to f62dfa3 on June 3, 2025 03:42
@dkropachev dkropachev changed the title 1 Add shard-aware reconnection policies with support for scheduling constraints Jun 3, 2025
@dkropachev dkropachev requested a review from Lorak-mmk June 3, 2025 03:45
@dkropachev dkropachev marked this pull request as ready for review June 3, 2025 03:45
@dkropachev dkropachev force-pushed the dk/add-connection-pool-delay branch 2 times, most recently from dbb3ad1 to cbb4719 on June 4, 2025 17:53
@mykaul

mykaul commented Jun 5, 2025

Shouldn't we have some warning / info level log when backoff is taking place?

@dkropachev
Collaborator Author

dkropachev commented Jun 5, 2025

Shouldn't we have some warning / info level log when backoff is taking place?

I would rather not do it; it is not useful and can potentially pollute the log.

@Lorak-mmk

Do you know what caused the test failure?

  =================================== FAILURES ===================================
  ___________________________ TypeTests.test_datetype ____________________________
  
  self = <tests.unit.test_types.TypeTests testMethod=test_datetype>
  
      def test_datetype(self):
          now_time_seconds = time.time()
          now_datetime = datetime.datetime.fromtimestamp(now_time_seconds, tz=datetime.timezone.utc)
      
          # Cassandra timestamps in millis
          now_timestamp = now_time_seconds * 1e3
      
          # same results serialized
  >       self.assertEqual(DateType.serialize(now_datetime, 0), DateType.serialize(now_timestamp, 0))
  E       AssertionError: b'\x00\x00\x01\x97<\x17\xda\xf9' != b'\x00\x00\x01\x97<\x17\xda\xf8'

It is a unit test that at first glance should be fully deterministic, so the failure is unexpected.
From the assertion it looks like some off-by-one error.

@dkropachev
Collaborator Author

Do you know what caused the test failure?

  =================================== FAILURES ===================================
  ___________________________ TypeTests.test_datetype ____________________________
  
  self = <tests.unit.test_types.TypeTests testMethod=test_datetype>
  
      def test_datetype(self):
          now_time_seconds = time.time()
          now_datetime = datetime.datetime.fromtimestamp(now_time_seconds, tz=datetime.timezone.utc)
      
          # Cassandra timestamps in millis
          now_timestamp = now_time_seconds * 1e3
      
          # same results serialized
  >       self.assertEqual(DateType.serialize(now_datetime, 0), DateType.serialize(now_timestamp, 0))
  E       AssertionError: b'\x00\x00\x01\x97<\x17\xda\xf9' != b'\x00\x00\x01\x97<\x17\xda\xf8'

It is a unit test that at first glance should be fully deterministic, so the failure is unexpected. From the assertion it looks like some off-by-one error.

It is a known issue; the conversion goes wrong somewhere.

@dkropachev dkropachev force-pushed the dk/add-connection-pool-delay branch 4 times, most recently from a43ccd1 to b0fd069 on June 7, 2025 04:47
@dkropachev dkropachev requested a review from Lorak-mmk June 7, 2025 04:48
@dkropachev dkropachev force-pushed the dk/add-connection-pool-delay branch 2 times, most recently from f47313f to 9dfd9ec on June 13, 2025 06:20

@Lorak-mmk Lorak-mmk left a comment

General comment: integration tests for new policies are definitely needed here.

Comment on lines 989 to 1053
session: Session
reconnection_policy: ReconnectionPolicy
lock = threading.Lock
schedule: Optional[Iterator[float]]

It should be lock: threading.Lock
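
To illustrate the distinction (the class name below is a placeholder, not the PR's code):

  import threading

  class ScopeBucketSketch:
      # Buggy form from the excerpt above: this binds the Lock *class* itself as a
      # class attribute instead of declaring the attribute's type.
      # lock = threading.Lock

      # Suggested form: a type annotation, with the actual lock created in __init__.
      lock: threading.Lock

      def __init__(self) -> None:
          self.lock = threading.Lock()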

Collaborator Author
fixed

Comment on lines 1080 to 1191
if self.shard_reconnection_scope == ShardReconnectionPolicyScope.Cluster:
    scope_hash = "global-cluster-scope"
else:
    scope_hash = host_id

When operating on enums, it is usually good to perform exhaustiveness checks.
If in the future someone adds a new variant to this enum, your code would (incorrectly) treat it as Host scope. Instead, add an elif branch for Host and then an else that raises an error.
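
A minimal self-contained sketch of that exhaustive check; the enum and helper below are illustrative stand-ins for the PR's types:

  from enum import Enum

  class ShardReconnectionPolicyScope(Enum):
      Cluster = 0
      Host = 1

  def scope_hash_for(scope: ShardReconnectionPolicyScope, host_id: str) -> str:
      if scope == ShardReconnectionPolicyScope.Cluster:
          return "global-cluster-scope"
      elif scope == ShardReconnectionPolicyScope.Host:
          return host_id
      else:
          # Fails loudly if a new variant is added but not handled here.
          raise ValueError(f"Unhandled scope: {scope!r}")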

Collaborator Author
fixed

Comment on lines 1090 to 1207

scope_info = self.scopes.get(scope_hash, 0)
if not scope_info:
    scope_info = _ScopeBucket(self.session, self.reconnection_policy)
    self.scopes[scope_hash] = scope_info
scope_info.add(self._execute, scheduled_key, method, *args, **kwargs)
return True

So scope_info here is at first either a _ScopeBucket or an int. I think it would be more idiomatic to use None.
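
A generic illustration of the None-based lookup (and the setdefault shortcut), independent of the PR's classes:

  scopes: dict[str, list] = {}

  # Using None as the "missing" sentinel instead of 0:
  bucket = scopes.get("host-1")
  if bucket is None:
      bucket = []
      scopes["host-1"] = bucket
  bucket.append("task")

  # Equivalent shortcut when constructing the default is cheap:
  scopes.setdefault("host-2", []).append("task")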

Collaborator Author
fixed

Comment on lines +932 to +962

def schedule(
    self,
    host_id: str,
    shard_id: int,
    method: Callable[..., None],

shard_id is int here, interesting. What is going to be passed for Cassandra? 0? Or maybe None and the type hint is just wrong?

Collaborator Author
For Cassandra this code is not used; it is only used when the host has shard info.

Comment on lines 957 to 966
class NoConcurrentShardReconnectionPolicy(ShardReconnectionPolicy):
    """
    A shard reconnection policy that allows only one pending connection per scope, where scope could be `Host`, `Cluster`
    For backoff it uses `ReconnectionPolicy`, when there is no more reconnections to scheduled backoff policy is reminded
    For all scopes does not allow schedule multiple reconnections for same host+shard, it silently ignores attempts to do that.

    On `new_scheduler` instantiate a scheduler that behaves according to the policy
    """
    shard_reconnection_scope: ShardReconnectionPolicyScope
    reconnection_policy: ReconnectionPolicy

Ok I really tried to get the hang of the code here, but failed.
What I thought before:

  • ReconnectionPolicy, according to its comments, defines the schedules used when trying to reconnect to a DOWN node.
  • For some reason (I don't know if it is a good one, as there is no discussion about it in the PR), instead of extending the driver to use it for populating the connection pool too, you decided to introduce a new mechanism for that, totally separate from ReconnectionPolicy.

But now I see ReconnectionPolicy used inside ShardReconnectionPolicy?! So a policy that steers reconnections to a failed node is now used inside a policy that re-fills the connection pool. I cannot make sense of it.

This PR needs a thorough explanation of the newly introduced interfaces.

  • What are the things that are passed to schedule? What is this method, and when and how many times are we supposed to call it? Which APIs can block and which cannot? How about thread safety: what can implementations assume? (A possible shape is sketched after this list.)
  • How is ReconnectionPolicy different from ShardReconnectionPolicy? The names differ only in "Shard", so initially I thought it is a shard-aware version of ReconnectionPolicy, but that does not seem to be the case.
  • What are the pros and cons of the taken approach, and what other approaches did you consider?
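
For concreteness, this is roughly the shape of documented interface those questions point at; the wording below is illustrative only, not the PR's final docstring:

  from abc import ABC, abstractmethod
  from typing import Any, Callable

  class ShardReconnectionScheduler(ABC):
      """Sketch of the kind of documented contract the questions above ask for."""

      @abstractmethod
      def schedule(self, host_id: str, shard_id: int,
                   method: Callable[..., None], *args: Any, **kwargs: Any) -> None:
          """Ask the scheduler to invoke ``method(*args, **kwargs)``, which opens a
          connection to the given host/shard, at a time chosen by the policy.

          A complete docstring would also state whether the call may block, whether
          at most one invocation per host+shard is guaranteed, and which threads
          may call it.
          """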

Collaborator Author
Ok I really tried to get the hang of the code here, but failed. What I thought before:

  • ReconnectionPolicy, according to its comments, defines the schedules used when trying to reconnect to a DOWN node.
  • For some reason (I don't know if it is a good one, as there is no discussion about it in the PR), instead of extending the driver to use it for populating the connection pool too, you decided to introduce a new mechanism for that, totally separate from ReconnectionPolicy.

But now I see ReconnectionPolicy used inside ShardReconnectionPolicy?! So a policy that steers reconnections to a failed node is now used inside a policy that re-fills the connection pool. I cannot make sense of it.

This PR needs a thorough explanation of the newly introduced interfaces.

  • What are the things that are passed to schedule? What is this method, and when and how many times are we supposed to call it? Which APIs can block and which cannot? How about thread safety: what can implementations assume?
  • How is ReconnectionPolicy different from ShardReconnectionPolicy? The names differ only in "Shard", so initially I thought it is a shard-aware version of ReconnectionPolicy, but that does not seem to be the case.

I have changed the name for ReconnectionPolicy, added another accepted type, and added a description of why it accepts both types.
I have also added documentation to the interfaces and implementations.
I have also renamed all the classes and abstracts involved.

  • What are the pros and cons of the taken approach, and what other approaches did you consider?

I will add this information to the PR description.

Comment on lines 987 to 1054
"""
items: List[Tuple[Callable[..., None], Tuple[Any, ...], dict[str, Any]]]
session: Session

When I see such a complicated type, I immediately think that it should be simplified.
Here, if I understand this code well, you could introduce a Callback type that has fields callable, args, kwargs.
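
A sketch of the suggested Callback type; the field names follow the comment above:

  from dataclasses import dataclass, field
  from typing import Any, Callable, Dict, Tuple

  @dataclass
  class Callback:
      callable: Callable[..., None]
      args: Tuple[Any, ...] = ()
      kwargs: Dict[str, Any] = field(default_factory=dict)

      def run(self) -> None:
          self.callable(*self.args, **self.kwargs)

  # The attribute annotation then shrinks to:
  # items: List[Callback]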

Collaborator Author
done

Comment on lines 44 to 53
class MockLock:
    def __init__(self):
        self.acquire_calls = 0
        self.release_calls = 0

    def __enter__(self):
        self.acquire_calls += 1

    def __exit__(self, exc_type, exc_value, traceback):
        self.release_calls += 1

I don't see this used anywhere. Why is it here?

Collaborator Author
A leftover from an old test; removed.

Comment on lines 940 to 972
scheduled_key = f'{host_id}-{shard_id}'
if self.already_scheduled.get(scheduled_key):
    return

self.already_scheduled[scheduled_key] = True
if not self.session.is_shutdown:

For example here, in _NoDelayShardReconnectionScheduler: it performs the check in an obviously non-thread-safe way. So if it can be called concurrently, then multiple schedules for the same key are possible, despite already_scheduled trying to prevent that. So now I'm thinking that maybe it can't be called concurrently?

OTOH already_scheduled uses a lock, which is an extremely strong signal that concurrency is at play here. And now I have no idea what to think, because nothing is explained anywhere.

Collaborator Author
Yeah, the assumption was that it is not a big deal, since _open_connection_to_missing_shard will take care of the second connection.
But after looking at it I realised that it will close the old one, which can lead to lost responses.
Added a lock here.
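
A rough sketch of such a lock-protected check-and-set; illustrative only, not the PR's actual code:

  import threading

  class DedupSchedulerSketch:
      def __init__(self) -> None:
          self._lock = threading.Lock()
          self._already_scheduled: dict[str, bool] = {}

      def try_claim(self, host_id: str, shard_id: int) -> bool:
          key = f"{host_id}-{shard_id}"
          with self._lock:
              # Check and mark atomically so two threads cannot both schedule
              # a reconnection for the same host+shard.
              if self._already_scheduled.get(key):
                  return False
              self._already_scheduled[key] = True
              return True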

Comment on lines 1007 to 1014
def _get_delay(self) -> float:
    if self.schedule is None:
        self.schedule = self.reconnection_policy.new_schedule()
    try:
        return next(self.schedule)
    except StopIteration:
        self.schedule = self.reconnection_policy.new_schedule()
        return next(self.schedule)

Is there a situation where self.schedule can really be None here, or is it just a precautionary condition that should never really be entered? If it is a precaution, it is fine to have it, but there should be a comment explaining that.

I thought that self.schedule can only be None when running is false (btw the opposite is not true: running is initialized to False, but schedule is initialized to non-None), and I only see calls to `_get_delay` when `running` should be `True`.

Collaborator Author
got rid of None case completely.
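
A self-contained sketch of what dropping the None case can look like; the backoff class below is a stand-in, not the PR's code:

  from typing import Iterator

  class ConstantBackoffSketch:
      def new_schedule(self) -> Iterator[float]:
          # A finite schedule, so the StopIteration branch is actually reachable.
          return iter([0.5, 1.0, 2.0])

  class ScopeBucketSketch:
      def __init__(self, policy: ConstantBackoffSketch) -> None:
          self.reconnection_policy = policy
          # Always initialized, so there is no None case to handle later.
          self.schedule = policy.new_schedule()

      def _get_delay(self) -> float:
          try:
              return next(self.schedule)
          except StopIteration:
              # Restart the schedule once it is exhausted.
              self.schedule = self.reconnection_policy.new_schedule()
              return next(self.schedule)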

Comment on lines 4474 to 4476
def empty(self):
    return len(self._scheduled_tasks) == 0 and self._queue.empty()

Where is this used?

Collaborator Author
It used to be part of the tests; now it is unused, so I removed it.

@dkropachev dkropachev force-pushed the dk/add-connection-pool-delay branch from 9dfd9ec to aebc540 on June 13, 2025 17:36
Add abstract classes: `ShardReconnectionPolicy` and `ShardReconnectionScheduler`
And implementations:
`NoDelayShardReconnectionPolicy` - policy that represents old behavior
of having no delay and no concurrency restriction.
`NoConcurrentShardReconnectionPolicy` - policy that limits concurrent
reconnections to 1 per scope and introduces delay between reconnections
within the scope.
Inject shard reconnection policy into cluster, session, connection and host pool.
Drop pending connections tracking logic, since policy does that.
Fix some tests that mock Cluster, session, connection or host pool.
@dkropachev dkropachev force-pushed the dk/add-connection-pool-delay branch from aebc540 to 61668de on June 13, 2025 17:58
@dkropachev dkropachev requested a review from Lorak-mmk June 13, 2025 18:02
@dkropachev dkropachev self-assigned this Jun 13, 2025
@mykaul

mykaul commented Jun 15, 2025

The patchset lacks documentation, which would have helped to understand the feature and when/how to use it. Is the documentation in a separate repo / commit?

A scope for `ShardConnectionBackoffPolicy`, in particular ``LimitedConcurrencyShardConnectionBackoffPolicy``

Scope defines concurrency limitation scope, for instance :
``LimitedConcurrencyShardConnectionBackoffPolicy`` - allows only one pending connection per scope, if you set it to Cluster,

Was there any ask for 1 connection per cluster? What's the usefulness? I can understand 1 per host, 1 per rack, maybe even 1 per DC. 1 per cluster is not performant, not highly available.

Collaborator Author
I will update the description; it limits concurrency to `max_concurrency` per scope.
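
For illustration only, a hypothetical construction with the renamed classes; the import path and the parameter names (scope, backoff_policy, max_concurrent) are assumptions, not the confirmed signature:

  # Hypothetical sketch: class names come from the discussion above, but the
  # constructor parameters and the scope type are assumed.
  from cassandra.policies import (
      ConstantReconnectionPolicy,
      LimitedConcurrencyShardConnectionBackoffPolicy,
      ShardConnectionBackoffScope,
  )

  policy = LimitedConcurrencyShardConnectionBackoffPolicy(
      scope=ShardConnectionBackoffScope.Host,        # limit applies per host, not per cluster
      backoff_policy=ConstantReconnectionPolicy(delay=0.5),
      max_concurrent=1,                              # at most one pending connection per host
  )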

"""
A shard connection backoff policy that allows only ``max_concurrent`` concurrent connection per scope.
Scope could be ``Host``or ``Cluster``
For backoff calculation ir needs ``ShardConnectionBackoffSchedule`` or ``ReconnectionPolicy``, since both share same API.

typo: ir

@mykaul mykaul requested a review from Copilot June 15, 2025 11:33

@Copilot Copilot AI left a comment

Pull Request Overview

This PR adds shard‐aware reconnection policies with support for scheduling constraints. Key changes include new policy implementations and schedulers in cassandra/policies.py, modifications to connection management in cassandra/pool.py and cassandra/cluster.py, and comprehensive tests in both unit and integration suites to validate the new behavior.

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.

Summary per file:

  • tests/unit/test_shard_aware.py: Adds tests for both immediate and delayed reconnection behavior using new policies.
  • tests/unit/test_policies.py: Introduces extensive tests for scope bucket and scheduler behavior.
  • tests/unit/test_host_connection_pool.py: Updates HostConnectionPool tests to integrate the new scheduler.
  • tests/integration/long/test_policies.py: Validates backoff policies and correct connection formation across shards.
  • tests/integration/__init__.py: Adds a marker for tests designed for Scylla-specific behavior.
  • cassandra/pool.py: Refactors connection replacements to use the new scheduler instead of direct submission.
  • cassandra/policies.py: Implements new scheduler classes and backoff policies for shard connections.
  • cassandra/cluster.py: Exposes a new property and uses the scheduler for initializing shard connections.


Successfully merging this pull request may close these issues.

Delay for per-shard reconnection
3 participants