Skip to content

cluster: tracing config options #199

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 4 commits into
base: master
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions scylla-rust-wrapper/src/cass_types.rs
Original file line number Diff line number Diff line change
Expand Up @@ -855,6 +855,8 @@ impl TryFrom<CassConsistency> for Consistency {
CassConsistency::CASS_CONSISTENCY_LOCAL_QUORUM => Ok(Consistency::LocalQuorum),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo in commit name: CassConsistency: adjust converion to serial variants of Consistency -> conversion

CassConsistency::CASS_CONSISTENCY_EACH_QUORUM => Ok(Consistency::EachQuorum),
CassConsistency::CASS_CONSISTENCY_LOCAL_ONE => Ok(Consistency::LocalOne),
CassConsistency::CASS_CONSISTENCY_LOCAL_SERIAL => Ok(Consistency::LocalSerial),
CassConsistency::CASS_CONSISTENCY_SERIAL => Ok(Consistency::Serial),
Comment on lines +858 to +859
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bug fix! Let's prioritise such things more, instead of packing them into PRs with new features. Bugs should be fixed ASAP!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a driver still in alpha packing a bugfix in a small PR like this is imo perfectly acceptable.

_ => Err(()),
}
}
Expand Down
65 changes: 65 additions & 0 deletions scylla-rust-wrapper/src/cluster.rs
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@ use scylla::{SessionBuilder, SessionConfig};
use std::collections::HashMap;
use std::convert::TryInto;
use std::future::Future;
use std::num::NonZeroU32;
use std::os::raw::{c_char, c_int, c_uint};
use std::sync::Arc;
use std::time::Duration;
Expand All @@ -43,6 +44,12 @@ const DEFAULT_CONNECT_TIMEOUT: Duration = Duration::from_millis(5000);
const DEFAULT_KEEPALIVE_INTERVAL: Duration = Duration::from_secs(30);
// - keepalive timeout is 60 secs
const DEFAULT_KEEPALIVE_TIMEOUT: Duration = Duration::from_secs(60);
// - tracing info fetch timeout is 15 millis
const DEFAULT_TRACING_INFO_FETCH_TIMEOUT: Duration = Duration::from_millis(15);
// - tracing info fetch interval is 3 millis
const DEFAULT_TRACING_INFO_FETCH_INTERVAL: Duration = Duration::from_millis(3);
// - tracing consistency is ONE
const DEFAULT_TRACING_CONSISTENCY: Consistency = Consistency::One;

const DRIVER_NAME: &str = "ScyllaDB Cpp-Rust Driver";
const DRIVER_VERSION: &str = env!("CARGO_PKG_VERSION");
Expand Down Expand Up @@ -109,6 +116,11 @@ pub struct CassCluster {
auth_password: Option<String>,

client_id: Option<uuid::Uuid>,

/// The default timeout for tracing info fetch.
/// Rust-driver only defines the number of retries.
/// However, this can be easily computed: `tracing_max_wait_time / tracing_retry_wait_time`.
tracing_max_wait_time: Duration,
}

impl CassCluster {
Expand Down Expand Up @@ -155,6 +167,19 @@ pub fn build_session_builder(
session_builder = session_builder.user(username, password)
}

// Compute the number of retries for tracing info fetch
// based on the timeout and interval provided by user.
let tracing_info_fetch_attemps = {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: attemps -> attempts

let attemps = cluster.tracing_max_wait_time.as_millis()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

/ session_builder
.config
.tracing_info_fetch_interval
.as_millis();

NonZeroU32::new(attemps as u32).unwrap_or_else(|| NonZeroU32::new(1).unwrap())
};
session_builder = session_builder.tracing_info_fetch_attempts(tracing_info_fetch_attemps);

Comment on lines +170 to +182
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❓ 🤔💭 What is the semantics of cass_cluster_set_tracing_max_wait_time in cpp-driver?
If it puts a time limit on whole tracing fetch, then this is very different to what you did.
Rust Driver, afaict, doesn't have a timeout for a single attempt - only a limit on number of attempts.
So a single attempt can take arbitrarily long time, and you assume they are instant.

Maybe a correct solution here is to wrap get_tracing_info calls into tokio::time::timeout?
Actually, what is the API in cpp-driver to retrieve tracing info? I don't see any call to get_tracing_info.
And if there are none, then what those parameters even do?

📌 As a side note, maybe we should add a timeout for single attempt to Rust Driver?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any call to get_tracing_info. And if there are none, then what those parameters even do?

Great observation - I was sure that there was already an implemented method which fetches the tracing info. There is no such method (even in cassandra.h) - cpp-driver docs say:

**Note**: The driver does not return the actual tracing data for the request. The
application itself must use the returned tracing identifier to query the tables.

## Configuration

By default, when tracing is enabled, the driver will wait for the query's tracing
data to become available in the server-side tables before setting the request's
future. The amount of time it will wait, retry, and the consistency level of the
tracing data can be controls by setting `CassCluster` configuration options.

So this is something we have to implement in cpp-rust-driver.

async move {
let load_balancing = load_balancing_config.clone().build().await;
execution_profile_builder = execution_profile_builder.load_balancing_policy(load_balancing);
Expand Down Expand Up @@ -183,6 +208,8 @@ pub unsafe extern "C" fn cass_cluster_new() -> *mut CassCluster {
.connection_timeout(DEFAULT_CONNECT_TIMEOUT)
.keepalive_interval(DEFAULT_KEEPALIVE_INTERVAL)
.keepalive_timeout(DEFAULT_KEEPALIVE_TIMEOUT)
.tracing_info_fetch_interval(DEFAULT_TRACING_INFO_FETCH_INTERVAL)
.tracing_info_fetch_consistency(DEFAULT_TRACING_CONSISTENCY)
};

Box::into_raw(Box::new(CassCluster {
Expand All @@ -198,6 +225,7 @@ pub unsafe extern "C" fn cass_cluster_new() -> *mut CassCluster {
execution_profile_map: Default::default(),
load_balancing_config: Default::default(),
client_id: None,
tracing_max_wait_time: DEFAULT_TRACING_INFO_FETCH_TIMEOUT,
}))
}

Expand Down Expand Up @@ -408,6 +436,43 @@ pub unsafe extern "C" fn cass_cluster_set_request_timeout(
})
}

#[no_mangle]
pub unsafe extern "C" fn cass_cluster_set_tracing_max_wait_time(
cluster_raw: *mut CassCluster,
max_wait_time_ms: c_uint,
) {
let cluster = ptr_to_ref_mut(cluster_raw);

cluster.tracing_max_wait_time = Duration::from_millis(max_wait_time_ms.into());
}

Comment on lines +439 to +448
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please add a unit test for setting all tracing parameters?

#[no_mangle]
pub unsafe extern "C" fn cass_cluster_set_tracing_retry_wait_time(
cluster_raw: *mut CassCluster,
retry_wait_time_ms: c_uint,
) {
let cluster = ptr_to_ref_mut(cluster_raw);

cluster.session_builder.config.tracing_info_fetch_interval =
Duration::from_millis(retry_wait_time_ms.into());
}

#[no_mangle]
pub unsafe extern "C" fn cass_cluster_set_tracing_consistency(
cluster_raw: *mut CassCluster,
consistency: CassConsistency,
) {
let cluster = ptr_to_ref_mut(cluster_raw);

let consistency = Consistency::try_from(consistency)
.expect("Invalid consistency passed to `cass_cluster_set_tracing_consistency`.");

cluster
.session_builder
.config
.tracing_info_fetch_consistency = consistency;
}

#[no_mangle]
pub unsafe extern "C" fn cass_cluster_set_port(
cluster_raw: *mut CassCluster,
Expand Down