@sbawaska

In CAMS we saw that when cqlproxy scaled beyond a single instance, the original connections to the first cqlproxy did not rebalance. Setting a maximum lifetime on each connection ensures that connections are periodically re-created and therefore spread across all the cqlproxy pods.

ConnectObserver ConnectObserver

// FrameHeaderObserver will set the provided frame header observer on all frames' headers created from this session.

revert

Author

this was done by gofmt


// Check if connection has exceeded its max lifetime
if conn.IsExpired() {
	expiredConns = append(expiredConns, conn)

We may hit a case where all connections are expired, so the query fails with ErrNoConnections. Also, closing expired connections immediately may cause queries still running on them to fail. We need to change this as follows:

  1. Before picking a connection here, move the expired connections out to a separate array, replenish the pool, and only then pick the connection.
  2. Use a separate goroutine to disconnect expired connections once inuseStreams reaches 0.
  3. Probably add a limit on accumulated expired connections: once that many are already waiting to be closed, do not expire further connections even if their expiry time has been reached. This prevents opening too many connections to the server.

if conn.IsExpired() {
	// Only mark for expiration if we haven't hit the draining limit
	// Account for connections we're about to add in this Pick() call
	if len(pool.expiredConns)+len(expiredIndices) < maxDrainingConns {

It looks like we are opening up the possibility of returning no connection, and we are also doing a lot more work later in Pick(), which will increase query latency. Can we do the expiry management asynchronously? One idea is to maintain two ConnectionPools per host: one active and one draining. We expire the whole ConnectionPool at once, on the assumption that all of its connections were created at roughly the same time. We also allow only one ConnectionPool to be in the draining state at a time. This way the code changes would be much smaller, and less work would be done in the query execution path.
Let's discuss this in standup tomorrow and finalize the approach.


3 participants