diff --git a/doc/developer/design/20231027_refresh_mvs.md b/doc/developer/design/20231027_refresh_mvs.md index 5642262ee95be..e06142d83fcbf 100644 --- a/doc/developer/design/20231027_refresh_mvs.md +++ b/doc/developer/design/20231027_refresh_mvs.md @@ -13,7 +13,7 @@ - Slack: - [channel #wg-tuning-freshness](https://materializeinc.slack.com/archives/C06535JL58R/p1699395646085619) - [big design thread in #epd-sql-council](https://materializeinc.slack.com/archives/C063H5S7NKE/p1699543250405409) - - [`REHYDRATION TIME ESTIMATE` thread in #epd-sql-council](https://materializeinc.slack.com/archives/C063H5S7NKE/p1712165305916299) + - [`HYDRATION TIME ESTIMATE` thread in #epd-sql-council](https://materializeinc.slack.com/archives/C063H5S7NKE/p1712165305916299) - Notion: - [Tuning REFRESH on MVs: UX](https://www.notion.so/materialize/Tuning-REFRESH-on-MVs-UX-1abbf85683364a1d997d77d7022ccd4f) - [Compute meeting on automatic cluster scheduling](https://www.notion.so/materialize/Compute-meeting-on-automatic-cluster-scheduling-ce353b8af52e449d8784241c4a1c0585) @@ -183,7 +183,7 @@ There is a workaround for this until we properly fix it: If there is a specific A proper fix would be to start up the replica a bit before the exact moment of the refresh, so that it can rehydrate already. For example, let's say we have an MV that is to be updated at every midnight. If we know that a refresh will take approximately 1 hour, then we can start up the replica at, say, 10:50 PM, so that it will be rehydrated by about 11:50 PM. At this point, most of the Compute processing that is needed for the refresh has already happened. Now the replica just needs to process the last 10 minutes of input data until midnight at a normal pace. We let the replica run until the MV's upper passes midnight (and jumps to the next midnight), which should happen within a few seconds after midnight. 
Note that before midnight, queries against the MV will still read the old state (as they should), because the new data is written at timestamps rounded up to midnight. -How do we know how much earlier than the refresh time should we turn on the replica, that is, how much time the refresh will take? In the first version of this feature, we can let the user set this explicitly by something like `REFRESH EVERY EARLY `. Later, we should record the times the refreshes take, and infer the time requirement of the next refresh based on earlier ones. Note that this will be complicated by the fact that we have different instance types [that have wildly differing CPU performance](https://materializeinc.slack.com/archives/CM7ATT65S/p1697816482502819). Update: This is now set on the auto-scheduled cluster, with the `REHYDRATION TIME ESTIMATE` syntax. +How do we know how much earlier than the refresh time should we turn on the replica, that is, how much time the refresh will take? In the first version of this feature, we can let the user set this explicitly by something like `REFRESH EVERY EARLY `. Later, we should record the times the refreshes take, and infer the time requirement of the next refresh based on earlier ones. Note that this will be complicated by the fact that we have different instance types [that have wildly differing CPU performance](https://materializeinc.slack.com/archives/CM7ATT65S/p1697816482502819). Update: This is now set on the auto-scheduled cluster, with the `HYDRATION TIME ESTIMATE` syntax. ### Logical Times vs. 
Wall Clock Times @@ -253,14 +253,14 @@ As mentioned in the [scoping section](#out-of-scope), an alternative implementat E.g.: ``` -ALTER CLUSTER c1 SET (SCHEDULE = ON REFRESH (REHYDRATION TIME ESTIMATE = '1 hour')); +ALTER CLUSTER c1 SET (SCHEDULE = ON REFRESH (HYDRATION TIME ESTIMATE = '1 hour')); ``` Discussions: - [Original discussion](https://github.com/MaterializeInc/materialize/issues/25712) - [Overview](https://www.notion.so/materialize/REFRESH-user-docs-draft-4a8f30b737a94619ac9f645abc9f84ce?pvs=4#025fd5733fcd4f38b48ee967bc8fb763) - [Syntax discussion](https://materializeinc.slack.com/archives/C063H5S7NKE/p1710355545343079) -- [REHYDRATION TIME ESTIMATE discussion](https://materializeinc.slack.com/archives/C063H5S7NKE/p1712165305916299) +- [HYDRATION TIME ESTIMATE discussion](https://materializeinc.slack.com/archives/C063H5S7NKE/p1712165305916299) ## Introspection / Observability @@ -276,13 +276,13 @@ The values in `mz_materialized_view_refreshes` will be calculated as follows: For seeing whether the last refresh is being late, the user can run `EXPLAIN TIMESTAMP` in `STRICT SERIALIZABLE` mode, and look at can respond immediately. If it's false, then the last refresh's completion is overdue. Another way to get the same information would be to check if `mz_materialized_view_refreshes.next_refresh < now()`. -For showing cluster schedules, I'll create a table in `mz_internal` called `mz_cluster_schedules`. This will be similar to `mz_materialized_view_refresh_strategies` in that it will allow for multiple `SCHEDULE =` options on a cluster by having one row for each schedule option of each cluster. This would currently be only either `SCHEDULE = ON REFRESH` or `SCHEDULE = MANUAL`. Columns would be `(cluster_id text, type text, refresh_rehydration_time_estimate interval)`. In `type`, we would currently have either "manual" (the default), or "on-refresh". (Eventually, we'll probably also want a `next_scheduled_turn_on`, but this doesn't seem so urgent. 
It will get more important when we'll be choosing the warmup time automatically.) +For showing cluster schedules, I'll create a table in `mz_internal` called `mz_cluster_schedules`. This will be similar to `mz_materialized_view_refresh_strategies` in that it will allow for multiple `SCHEDULE =` options on a cluster by having one row for each schedule option of each cluster. This would currently be only either `SCHEDULE = ON REFRESH` or `SCHEDULE = MANUAL`. Columns would be `(cluster_id text, type text, refresh_hydration_time_estimate interval)`. In `type`, we would currently have either "manual" (the default), or "on-refresh". (Eventually, we'll probably also want a `next_scheduled_turn_on`, but this doesn't seem so urgent. It will get more important when we'll be choosing the warmup time automatically.) For seeing whether a cluster is currently turned on, the user can simply look at `mz_cluster_replicas`, because we currently turn clusters On/Off by just creating/dropping replicas. We might also add a builtin view for showing this information in a more focused way. For the automatic cluster scheduling history, the user can look at `mz_audit_events`. This has a `details` column, which is a JSON blob, where I'm planning to add the `reason` for turning on a cluster, i.e., which materialized views were in need of a refresh. (See Nikhil's comment [here](https://github.com/MaterializeInc/materialize/pull/26401#pullrequestreview-1981986544).) There is also the `mz_cluster_replica_history` view, which takes its info from `mz_audit_events`, and presents the info in a nicer form. I could add a new reason column to this view. Also note that the `reason` could also be prepared to show reasons from other policies: it could itself be a collection of key-value pairs, where the keys are policy names (e.g., refresh), and the values have policy-specific structures. For refresh, it could be a list of the materialized view IDs that made us turn the cluster on. 
-We'll also want to show rehydration times from the last several refreshes, to help users set the `REHYDRATION TIME ESTIMATE` of clusters. I'm thinking to create a new table `mz_internal.mz_compute_hydration_history (replica_id text, rehydration_time interval)`, which would have one row for each replica creation, and it would show the time it took to rehydrate the replica when it was created. (The user can join this with `mz_cluster_replica_history` to know which cluster the replica belonged to, replica size, etc.) Btw. this doesn't need to be constrained to clusters involving `REFRESH` MVs; this info seems generally useful for any compute cluster. If we want to make it even more useful generally, we might want to add one row not just for each replica creation, but also each replica restart, so that we'll show the rehydrations that happen at system upgrade restarts. In this case, we'll probably need also a `time` column, and then `(replica_id, time)` would be a composite key. For this general version, we might have to truncate the relation to keep it from growing too big. +We'll also want to show rehydration times from the last several refreshes, to help users set the `HYDRATION TIME ESTIMATE` of clusters. I'm thinking to create a new table `mz_internal.mz_compute_hydration_history (replica_id text, rehydration_time interval)`, which would have one row for each replica creation, and it would show the time it took to rehydrate the replica when it was created. (The user can join this with `mz_cluster_replica_history` to know which cluster the replica belonged to, replica size, etc.) Btw. this doesn't need to be constrained to clusters involving `REFRESH` MVs; this info seems generally useful for any compute cluster. If we want to make it even more useful generally, we might want to add one row not just for each replica creation, but also each replica restart, so that we'll show the rehydrations that happen at system upgrade restarts. 
In this case, we'll probably need also a `time` column, and then `(replica_id, time)` would be a composite key. For this general version, we might have to truncate the relation to keep it from growing too big. We might want to also track the time it takes to actually perform a refresh, assuming that the replica is already hydrated. This will often take <1 sec, but if the MV's storage is big and/or there are many changes, then it might take more. diff --git a/doc/user/content/sql/alter-cluster.md b/doc/user/content/sql/alter-cluster.md index 454c9793a7328..ae41748d1ebef 100644 --- a/doc/user/content/sql/alter-cluster.md +++ b/doc/user/content/sql/alter-cluster.md @@ -40,7 +40,7 @@ ALTER CLUSTER c1 SET (SIZE '100cc'); {{< private-preview />}} ```sql -ALTER CLUSTER c1 SET (SCHEDULE = ON REFRESH (REHYDRATION TIME ESTIMATE = '1 hour')); +ALTER CLUSTER c1 SET (SCHEDULE = ON REFRESH (HYDRATION TIME ESTIMATE = '1 hour')); ``` See the reference documentation for [`CREATE CLUSTER`](../create-cluster/#scheduling) diff --git a/doc/user/content/sql/create-cluster.md b/doc/user/content/sql/create-cluster.md index 1b3dd2fdc98bd..207c0701ef2b1 100644 --- a/doc/user/content/sql/create-cluster.md +++ b/doc/user/content/sql/create-cluster.md @@ -244,7 +244,7 @@ you can configure a cluster to automatically turn on and off using the ```mzsql CREATE CLUSTER my_scheduled_cluster ( SIZE = '3200cc', - SCHEDULE = ON REFRESH (REHYDRATION TIME ESTIMATE = '1 hour') + SCHEDULE = ON REFRESH (HYDRATION TIME ESTIMATE = '1 hour') ); ``` @@ -267,17 +267,17 @@ To re-enable scheduling: ```mzsql ALTER CLUSTER my_scheduled_cluster -SET (SCHEDULE = ON REFRESH (REHYDRATION TIME ESTIMATE = '1 hour')); +SET (SCHEDULE = ON REFRESH (HYDRATION TIME ESTIMATE = '1 hour')); ``` -#### Rehydration time estimate +#### Hydration time estimate -

Syntax: REHYDRATION TIME ESTIMATE interval
+Syntax: HYDRATION TIME ESTIMATE interval

By default, scheduled clusters will turn on at the scheduled refresh time. To avoid [unavailability of the objects scheduled for refresh](/sql/create-materialized-view/#querying-materialized-views-with-refresh-strategies) during the refresh operation, we recommend turning the cluster on ahead of the scheduled time to -allow rehydration to complete. This can be controlled using the `REHYDRATION +allow rehydration to complete. This can be controlled using the `HYDRATION TIME ESTIMATE` clause. #### Introspection @@ -290,7 +290,7 @@ system catalog table: SELECT c.id AS cluster_id, c.name AS cluster_name, cs.type AS schedule_type, - cs.refresh_rehydration_time_estimate + cs.refresh_hydration_time_estimate FROM mz_internal.mz_cluster_schedules cs JOIN mz_clusters c ON cs.cluster_id = c.id WHERE c.name = 'my_refresh_cluster'; diff --git a/doc/user/content/sql/create-materialized-view.md b/doc/user/content/sql/create-materialized-view.md index b477e497a7680..a529f8657f061 100644 --- a/doc/user/content/sql/create-materialized-view.md +++ b/doc/user/content/sql/create-materialized-view.md @@ -254,7 +254,7 @@ refresh times: ```mzsql CREATE CLUSTER my_scheduled_cluster ( SIZE = '3200cc', - SCHEDULE = ON REFRESH (REHYDRATION TIME ESTIMATE = '1 hour') + SCHEDULE = ON REFRESH (HYDRATION TIME ESTIMATE = '1 hour') ); ``` @@ -283,18 +283,18 @@ backfill the view with pre-existing data — a process known as [_hydration_](/t of the view** to just the duration of the refresh. If the cluster is **not** configured to turn on ahead of scheduled refreshes -(i.e., using the `REHYDRATION TIME ESTIMATE` option), the total unavailability +(i.e., using the `HYDRATION TIME ESTIMATE` option), the total unavailability window of the view will be a combination of the hydration time for all objects in the cluster (typically long) and the duration of the refresh for the materialized view (typically short). 
Depending on the actual time it takes to hydrate the view or set of views in the -cluster, you can later adjust the rehydration time estimate value for the +cluster, you can later adjust the hydration time estimate value for the cluster using [`ALTER CLUSTER`](../alter-cluster/#schedule): ```mzsql ALTER CLUSTER my_scheduled_cluster -SET (SCHEDULE = ON REFRESH (REHYDRATION TIME ESTIMATE = '30 minutes')); +SET (SCHEDULE = ON REFRESH (HYDRATION TIME ESTIMATE = '30 minutes')); ``` #### Introspection diff --git a/doc/user/content/sql/system-catalog/mz_internal.md b/doc/user/content/sql/system-catalog/mz_internal.md index 5bd868e65dee5..f4df5dae849cf 100644 --- a/doc/user/content/sql/system-catalog/mz_internal.md +++ b/doc/user/content/sql/system-catalog/mz_internal.md @@ -135,11 +135,11 @@ the most recent status for each AWS PrivateLink connection in the system. The `mz_cluster_schedules` table shows the `SCHEDULE` option specified for each cluster. -| Field | Type | Meaning | -|-------------------------------------|--------------|---------------------------------------------------------------| -| `cluster_id` | [`text`] | The ID of the cluster. Corresponds to [`mz_clusters.id`](../mz_catalog/#mz_clusters).| -| `type` | [`text`] | `on-refresh`, or `manual`. Default: `manual` | -| `refresh_rehydration_time_estimate` | [`interval`] | The interval given in the `REHYDRATION TIME ESTIMATE` option. | +| Field | Type | Meaning | +|-------------------------------------|--------------|----------------------------------------------------------------| +| `cluster_id` | [`text`] | The ID of the cluster. Corresponds to [`mz_clusters.id`](../mz_catalog/#mz_clusters). | +| `type` | [`text`] | `on-refresh`, or `manual`. Default: `manual` | +| `refresh_hydration_time_estimate` | [`interval`] | The interval given in the `HYDRATION TIME ESTIMATE` option. 
| ## `mz_cluster_replica_frontiers` diff --git a/misc/dbt-materialize/dbt/include/materialize/macros/ci/create_cluster.sql b/misc/dbt-materialize/dbt/include/materialize/macros/ci/create_cluster.sql index 599d84071d061..d3f66b832a755 100644 --- a/misc/dbt-materialize/dbt/include/materialize/macros/ci/create_cluster.sql +++ b/misc/dbt-materialize/dbt/include/materialize/macros/ci/create_cluster.sql @@ -21,20 +21,20 @@ This macro creates a cluster with the specified properties. - size (str): The size of the cluster. This parameter is required. - replication_factor (int, optional): The replication factor for the cluster. Only applicable when schedule_type is 'manual'. - schedule_type (str, optional): The type of schedule for the cluster. Accepts 'manual' or 'on-refresh'. - - refresh_rehydration_time_estimate (str, optional): The estimated rehydration time for the cluster. Only applicable when schedule_type is 'on-refresh'. + - refresh_hydration_time_estimate (str, optional): The estimated hydration time for the cluster. Only applicable when schedule_type is 'on-refresh'. - ignore_existing_objects (bool, optional): Whether to ignore existing objects in the cluster. Defaults to false. - force_deploy_suffix (bool, optional): Whether to forcefully add a deploy suffix to the cluster name. Defaults to false. Incompatibilities: - replication_factor is only applicable when schedule_type is 'manual'. - - refresh_rehydration_time_estimate is only applicable when schedule_type is 'on-refresh'. + - refresh_hydration_time_estimate is only applicable when schedule_type is 'on-refresh'. #} {% macro create_cluster( cluster_name, size, replication_factor=none, schedule_type=none, - refresh_rehydration_time_estimate=none, + refresh_hydration_time_estimate=none, ignore_existing_objects=false, force_deploy_suffix=false ) %} @@ -116,8 +116,8 @@ This macro creates a cluster with the specified properties. 
{% if replication_factor is not none and ( schedule_type == 'manual' or schedule_type is none ) %} , REPLICATION FACTOR = {{ replication_factor }} {% elif schedule_type == 'on-refresh' %} - {% if refresh_rehydration_time_estimate is not none %} - , SCHEDULE = ON REFRESH (REHYDRATION TIME ESTIMATE = {{ dbt.string_literal(refresh_rehydration_time_estimate) }}) + {% if refresh_hydration_time_estimate is not none %} + , SCHEDULE = ON REFRESH (HYDRATION TIME ESTIMATE = {{ dbt.string_literal(refresh_hydration_time_estimate) }}) {% else %} , SCHEDULE = ON REFRESH {% endif %} diff --git a/misc/dbt-materialize/dbt/include/materialize/macros/deploy/deploy_init.sql b/misc/dbt-materialize/dbt/include/materialize/macros/deploy/deploy_init.sql index ebdd51e2ceca6..480430460726d 100644 --- a/misc/dbt-materialize/dbt/include/materialize/macros/deploy/deploy_init.sql +++ b/misc/dbt-materialize/dbt/include/materialize/macros/deploy/deploy_init.sql @@ -115,7 +115,7 @@ c.id AS cluster_id, c.name AS cluster_name, cs.type AS schedule_type, - cs.refresh_rehydration_time_estimate + cs.refresh_hydration_time_estimate FROM mz_clusters c LEFT JOIN mz_internal.mz_cluster_schedules cs ON cs.cluster_id = c.id WHERE c.name = {{ dbt.string_literal(cluster) }} @@ -130,7 +130,7 @@ {% set size = results[1] %} {% set replication_factor = results[2] %} {% set schedule_type = results[5] %} - {% set refresh_rehydration_time_estimate = results[6] %} + {% set refresh_hydration_time_estimate = results[6] %} {% if not managed %} {{ exceptions.raise_compiler_error("Production cluster " ~ cluster ~ " is not managed") }} @@ -141,7 +141,7 @@ size=size, replication_factor=replication_factor, schedule_type=schedule_type, - refresh_rehydration_time_estimate=refresh_rehydration_time_estimate, + refresh_hydration_time_estimate=refresh_hydration_time_estimate, ignore_existing_objects=ignore_existing_objects, force_deploy_suffix=True ) %} diff --git a/misc/dbt-materialize/tests/adapter/test_ci.py 
b/misc/dbt-materialize/tests/adapter/test_ci.py index 2b8845642ec70..8c81c077ea92e 100644 --- a/misc/dbt-materialize/tests/adapter/test_ci.py +++ b/misc/dbt-materialize/tests/adapter/test_ci.py @@ -212,7 +212,7 @@ def test_create_cluster_with_on_refresh_schedule(self, project): "run-operation", "create_cluster", "--args", - '{"cluster_name": "test_on_refresh_schedule", "size": "1", "schedule_type": "on-refresh", "refresh_rehydration_time_estimate": "10m", "ignore_existing_objects": true, "force_deploy_suffix": true}', + '{"cluster_name": "test_on_refresh_schedule", "size": "1", "schedule_type": "on-refresh", "refresh_hydration_time_estimate": "10m", "ignore_existing_objects": true, "force_deploy_suffix": true}', ] ) @@ -324,7 +324,7 @@ def get_cluster_properties(project, cluster_name): c.id AS cluster_id, c.name AS cluster_name, cs.type AS schedule_type, - cs.refresh_rehydration_time_estimate + cs.refresh_hydration_time_estimate FROM mz_clusters c LEFT JOIN mz_internal.mz_cluster_schedules cs ON cs.cluster_id = c.id WHERE c.name = '{cluster_name}' diff --git a/misc/dbt-materialize/tests/adapter/test_deploy.py b/misc/dbt-materialize/tests/adapter/test_deploy.py index 74593df08ad10..251ee29272ffe 100644 --- a/misc/dbt-materialize/tests/adapter/test_deploy.py +++ b/misc/dbt-materialize/tests/adapter/test_deploy.py @@ -507,9 +507,9 @@ def test_fails_on_unmanaged_cluster(self, project): run_dbt(["run-operation", "deploy_init"], expect_pass=False) - def test_dbt_deploy_init_with_refresh_rehydration_time(self, project): + def test_dbt_deploy_init_with_refresh_hydration_time(self, project): project.run_sql( - "CREATE CLUSTER prod (SIZE = '1', SCHEDULE = ON REFRESH (REHYDRATION TIME ESTIMATE = '1 hour'))" + "CREATE CLUSTER prod (SIZE = '1', SCHEDULE = ON REFRESH (HYDRATION TIME ESTIMATE = '1 hour'))" ) project.run_sql("CREATE SCHEMA prod") diff --git a/src/adapter/src/catalog/builtin_table_updates.rs b/src/adapter/src/catalog/builtin_table_updates.rs index 
d151e48951b63..3ae5097190c87 100644 --- a/src/adapter/src/catalog/builtin_table_updates.rs +++ b/src/adapter/src/catalog/builtin_table_updates.rs @@ -318,12 +318,12 @@ impl CatalogState { Datum::Null, ]), ClusterSchedule::Refresh { - rehydration_time_estimate, + hydration_time_estimate, } => Row::pack_slice(&[ Datum::String(&id.to_string()), Datum::String("on-refresh"), Datum::Interval( - Interval::from_duration(&rehydration_time_estimate) + Interval::from_duration(&hydration_time_estimate) .expect("planning ensured that this is convertible back to Interval"), ), ]), diff --git a/src/adapter/src/coord/cluster_scheduling.rs b/src/adapter/src/coord/cluster_scheduling.rs index 236affb9c9218..52670f5fd1f27 100644 --- a/src/adapter/src/coord/cluster_scheduling.rs +++ b/src/adapter/src/coord/cluster_scheduling.rs @@ -50,8 +50,8 @@ pub struct RefreshDecision { /// Objects that currently need a refresh on the cluster (taking into account the rehydration /// time estimate). objects_needing_refresh: Vec, - /// The REHYDRATION TIME ESTIMATE setting of the cluster. - rehydration_time_estimate: Duration, + /// The HYDRATION TIME ESTIMATE setting of the cluster. 
+ hydration_time_estimate: Duration, } impl SchedulingDecision { @@ -66,12 +66,12 @@ impl SchedulingDecision { SchedulingDecision::Refresh(RefreshDecision { cluster_on, objects_needing_refresh: mvs_needing_refresh, - rehydration_time_estimate, + hydration_time_estimate, }) => { - let mut rehydration_time_estimate_str = String::new(); + let mut hydration_time_estimate_str = String::new(); mz_repr::strconv::format_interval( - &mut rehydration_time_estimate_str, - Interval::from_duration(rehydration_time_estimate).expect( + &mut hydration_time_estimate_str, + Interval::from_duration(hydration_time_estimate).expect( "planning ensured that this is convertible back to Interval", ), ); @@ -81,7 +81,7 @@ impl SchedulingDecision { .iter() .map(|id| id.to_string()) .collect(), - rehydration_time_estimate: rehydration_time_estimate_str, + hydration_time_estimate: hydration_time_estimate_str, }) } }) @@ -114,7 +114,7 @@ impl Coordinator { // Nothing to do, user manages this cluster manually. } ClusterSchedule::Refresh { - rehydration_time_estimate, + hydration_time_estimate, } => { let mvs = cluster .bound_objects() @@ -141,7 +141,7 @@ impl Coordinator { debug!(%cluster.id, ?refresh_mv_write_frontiers, "check_refresh_policy"); refresh_mv_write_frontiers.push(( cluster.id, - rehydration_time_estimate, + hydration_time_estimate, mvs, )); } @@ -163,14 +163,14 @@ impl Coordinator { let decisions = refresh_mv_write_frontiers .into_iter() .map( - |(cluster_id, rehydration_time_estimate, refresh_mv_write_frontiers)| { + |(cluster_id, hydration_time_estimate, refresh_mv_write_frontiers)| { // We are just checking that - // write_frontier < local_read_ts + rehydration_time_estimate - let rehydration_estimate = &rehydration_time_estimate + // write_frontier < local_read_ts + hydration_time_estimate + let hydration_estimate = &hydration_time_estimate .try_into() .expect("checked during planning"); let local_read_ts_adjusted = - local_read_ts.step_forward_by(rehydration_estimate); + 
local_read_ts.step_forward_by(hydration_estimate); let mvs_needing_refresh = refresh_mv_write_frontiers .into_iter() .filter_map(|(id, frontier)| { @@ -187,7 +187,7 @@ impl Coordinator { SchedulingDecision::Refresh(RefreshDecision { cluster_on, objects_needing_refresh: mvs_needing_refresh, - rehydration_time_estimate, + hydration_time_estimate, }), ) }, diff --git a/src/audit-log/src/lib.rs b/src/audit-log/src/lib.rs index c299963718448..fe300647704d9 100644 --- a/src/audit-log/src/lib.rs +++ b/src/audit-log/src/lib.rs @@ -295,8 +295,8 @@ pub struct RefreshDecisionWithReasonV1 { /// Objects that currently need a refresh on the cluster (taking into account the rehydration /// time estimate). pub objects_needing_refresh: Vec, - /// The REHYDRATION TIME ESTIMATE setting of the cluster. - pub rehydration_time_estimate: String, + /// The HYDRATION TIME ESTIMATE setting of the cluster. + pub hydration_time_estimate: String, } #[derive(Clone, Debug, Serialize, Deserialize, PartialOrd, PartialEq, Eq, Ord, Hash, Arbitrary)] diff --git a/src/catalog/src/builtin.rs b/src/catalog/src/builtin.rs index 6438577dd472f..e55ec1627f6e7 100644 --- a/src/catalog/src/builtin.rs +++ b/src/catalog/src/builtin.rs @@ -2510,7 +2510,7 @@ pub static MZ_CLUSTER_SCHEDULES: Lazy = Lazy::new(|| BuiltinTable .with_column("cluster_id", ScalarType::String.nullable(false)) .with_column("type", ScalarType::String.nullable(false)) .with_column( - "refresh_rehydration_time_estimate", + "refresh_hydration_time_estimate", ScalarType::Interval.nullable(true), ), is_retained_metrics_object: false, diff --git a/src/catalog/src/durable/objects/serialization.rs b/src/catalog/src/durable/objects/serialization.rs index d150ff7e56ff7..2ce51f0fef137 100644 --- a/src/catalog/src/durable/objects/serialization.rs +++ b/src/catalog/src/durable/objects/serialization.rs @@ -92,11 +92,11 @@ impl RustType for ClusterSchedule { value: Some(cluster_schedule::Value::Manual(Empty {})), }, ClusterSchedule::Refresh { - 
rehydration_time_estimate, + hydration_time_estimate, } => proto::ClusterSchedule { value: Some(cluster_schedule::Value::Refresh( ClusterScheduleRefreshOptions { - rehydration_time_estimate: Some(rehydration_time_estimate.into_proto()), + rehydration_time_estimate: Some(hydration_time_estimate.into_proto()), }, )), }, @@ -108,7 +108,7 @@ impl RustType for ClusterSchedule { None => Ok(Default::default()), Some(cluster_schedule::Value::Manual(Empty {})) => Ok(ClusterSchedule::Manual), Some(cluster_schedule::Value::Refresh(csro)) => Ok(ClusterSchedule::Refresh { - rehydration_time_estimate: csro + hydration_time_estimate: csro .rehydration_time_estimate .into_rust_if_some("rehydration_time_estimate")?, }), @@ -1841,7 +1841,7 @@ impl RustType proto::audit_log_event_v1::RefreshDecisionWithReasonV1 { decision: Some(decision), objects_needing_refresh: self.objects_needing_refresh.clone(), - rehydration_time_estimate: self.rehydration_time_estimate.clone(), + rehydration_time_estimate: self.hydration_time_estimate.clone(), } } @@ -1864,7 +1864,7 @@ impl RustType Ok(RefreshDecisionWithReasonV1 { decision, objects_needing_refresh: proto.objects_needing_refresh, - rehydration_time_estimate: proto.rehydration_time_estimate, + hydration_time_estimate: proto.rehydration_time_estimate, }) } } diff --git a/src/catalog/tests/read-write.rs b/src/catalog/tests/read-write.rs index 28403308a2f04..369ad6319a9ae 100644 --- a/src/catalog/tests/read-write.rs +++ b/src/catalog/tests/read-write.rs @@ -248,7 +248,7 @@ async fn test_audit_logs(openable_state: Box) { on_refresh: mz_audit_log::RefreshDecisionWithReasonV1 { decision: mz_audit_log::SchedulingDecisionV1::On, objects_needing_refresh: vec!["u42".to_string(), "u90".to_string()], - rehydration_time_estimate: "1000s".to_string(), + hydration_time_estimate: "1000s".to_string(), }, }), }), diff --git a/src/sql-lexer/src/keywords.txt b/src/sql-lexer/src/keywords.txt index 2ba519c09bbff..872847eb871e6 100644 --- 
a/src/sql-lexer/src/keywords.txt +++ b/src/sql-lexer/src/keywords.txt @@ -190,6 +190,7 @@ Host Hour Hours Humanized +Hydration Id Identifiers Ids diff --git a/src/sql-parser/src/ast/defs/statement.rs b/src/sql-parser/src/ast/defs/statement.rs index 7ddd45dead910..3ef656bc0ca37 100644 --- a/src/sql-parser/src/ast/defs/statement.rs +++ b/src/sql-parser/src/ast/defs/statement.rs @@ -3729,7 +3729,7 @@ impl AstDisplay for RefreshOptionValue { pub enum ClusterScheduleOptionValue { Manual, Refresh { - rehydration_time_estimate: Option, + hydration_time_estimate: Option, }, } @@ -3747,12 +3747,12 @@ impl AstDisplay for ClusterScheduleOptionValue { f.write_str("MANUAL"); } ClusterScheduleOptionValue::Refresh { - rehydration_time_estimate, + hydration_time_estimate, } => { f.write_str("ON REFRESH"); - if let Some(rehydration_time_estimate) = rehydration_time_estimate { - f.write_str(" (REHYDRATION TIME ESTIMATE = '"); - f.write_node(rehydration_time_estimate); + if let Some(hydration_time_estimate) = hydration_time_estimate { + f.write_str(" (HYDRATION TIME ESTIMATE = '"); + f.write_node(hydration_time_estimate); f.write_str(")"); } } diff --git a/src/sql-parser/src/parser.rs b/src/sql-parser/src/parser.rs index 00949656b7ee9..73c0d25fca833 100644 --- a/src/sql-parser/src/parser.rs +++ b/src/sql-parser/src/parser.rs @@ -3881,9 +3881,12 @@ impl<'a> Parser<'a> { MANUAL => ClusterScheduleOptionValue::Manual, ON => { self.expect_keyword(REFRESH)?; - // Parse optional `(REHYDRATION TIME ESTIMATE ...)` - let rehydration_time_estimate = if self.consume_token(&Token::LParen) { - self.expect_keywords(&[REHYDRATION, TIME, ESTIMATE])?; + // Parse optional `(HYDRATION TIME ESTIMATE ...)` + let hydration_time_estimate = if self.consume_token(&Token::LParen) { + // `REHYDRATION` is the legacy way of writing this. We'd like to eventually + // remove this, and allow only `HYDRATION`. (Dbt needs to be updated for this.) 
+ self.expect_one_of_keywords(&[HYDRATION, REHYDRATION])?; + self.expect_keywords(&[TIME, ESTIMATE])?; let _ = self.consume_token(&Token::Eq); let interval = self.parse_interval_value()?; self.expect_token(&Token::RParen)?; @@ -3892,7 +3895,7 @@ impl<'a> Parser<'a> { None }; ClusterScheduleOptionValue::Refresh { - rehydration_time_estimate, + hydration_time_estimate, } } _ => unreachable!(), diff --git a/src/sql-parser/tests/testdata/ddl b/src/sql-parser/tests/testdata/ddl index 0d9abb42ca25f..6d3e0c79ab27a 100644 --- a/src/sql-parser/tests/testdata/ddl +++ b/src/sql-parser/tests/testdata/ddl @@ -1685,48 +1685,56 @@ CREATE CLUSTER cluster (SIZE = '1', SCHEDULE = ON REFRESH) ---- CREATE CLUSTER cluster (SIZE = '1', SCHEDULE = ON REFRESH) => -CreateCluster(CreateClusterStatement { name: Ident("cluster"), options: [ClusterOption { name: Size, value: Some(Value(String("1"))) }, ClusterOption { name: Schedule, value: Some(ClusterScheduleOptionValue(Refresh { rehydration_time_estimate: None })) }], features: [] }) +CreateCluster(CreateClusterStatement { name: Ident("cluster"), options: [ClusterOption { name: Size, value: Some(Value(String("1"))) }, ClusterOption { name: Schedule, value: Some(ClusterScheduleOptionValue(Refresh { hydration_time_estimate: None })) }], features: [] }) +parse-statement +CREATE CLUSTER cluster (SIZE = '1', SCHEDULE = ON REFRESH (HYDRATION TIME ESTIMATE = '1 hour')) +---- +CREATE CLUSTER cluster (SIZE = '1', SCHEDULE = ON REFRESH (HYDRATION TIME ESTIMATE = '1 hour')) +=> +CreateCluster(CreateClusterStatement { name: Ident("cluster"), options: [ClusterOption { name: Size, value: Some(Value(String("1"))) }, ClusterOption { name: Schedule, value: Some(ClusterScheduleOptionValue(Refresh { hydration_time_estimate: Some(IntervalValue { value: "1 hour", precision_high: Year, precision_low: Second, fsec_max_precision: None }) })) }], features: [] }) + +# Legacy version: `REHYDRATION TIME ESTIMATE` parse-statement CREATE CLUSTER cluster (SIZE = '1', 
SCHEDULE = ON REFRESH (REHYDRATION TIME ESTIMATE = '1 hour')) ---- CREATE CLUSTER cluster (SIZE = '1', SCHEDULE = ON REFRESH (REHYDRATION TIME ESTIMATE = '1 hour')) => -CreateCluster(CreateClusterStatement { name: Ident("cluster"), options: [ClusterOption { name: Size, value: Some(Value(String("1"))) }, ClusterOption { name: Schedule, value: Some(ClusterScheduleOptionValue(Refresh { rehydration_time_estimate: Some(IntervalValue { value: "1 hour", precision_high: Year, precision_low: Second, fsec_max_precision: None }) })) }], features: [] }) +CreateCluster(CreateClusterStatement { name: Ident("cluster"), options: [ClusterOption { name: Size, value: Some(Value(String("1"))) }, ClusterOption { name: Schedule, value: Some(ClusterScheduleOptionValue(Refresh { hydration_time_estimate: Some(IntervalValue { value: "1 hour", precision_high: Year, precision_low: Second, fsec_max_precision: None }) })) }], features: [] }) parse-statement -CREATE CLUSTER cluster (SIZE = '1', SCHEDULE = ON REFRESH (REHYDRATION TIME ESTIMATE '1 hour')) +CREATE CLUSTER cluster (SIZE = '1', SCHEDULE = ON REFRESH (HYDRATION TIME ESTIMATE '1 hour')) ---- -CREATE CLUSTER cluster (SIZE = '1', SCHEDULE = ON REFRESH (REHYDRATION TIME ESTIMATE = '1 hour')) +CREATE CLUSTER cluster (SIZE = '1', SCHEDULE = ON REFRESH (HYDRATION TIME ESTIMATE = '1 hour')) => -CreateCluster(CreateClusterStatement { name: Ident("cluster"), options: [ClusterOption { name: Size, value: Some(Value(String("1"))) }, ClusterOption { name: Schedule, value: Some(ClusterScheduleOptionValue(Refresh { rehydration_time_estimate: Some(IntervalValue { value: "1 hour", precision_high: Year, precision_low: Second, fsec_max_precision: None }) })) }], features: [] }) +CreateCluster(CreateClusterStatement { name: Ident("cluster"), options: [ClusterOption { name: Size, value: Some(Value(String("1"))) }, ClusterOption { name: Schedule, value: Some(ClusterScheduleOptionValue(Refresh { hydration_time_estimate: Some(IntervalValue { value: "1 hour", 
precision_high: Year, precision_low: Second, fsec_max_precision: None }) })) }], features: [] }) parse-statement -CREATE CLUSTER cluster (SIZE = '1', SCHEDULE = ON REFRESH (REHYDRATION TIME ESTIMATE)) +CREATE CLUSTER cluster (SIZE = '1', SCHEDULE = ON REFRESH (HYDRATION TIME ESTIMATE)) ---- error: Expected literal string, found right parenthesis -CREATE CLUSTER cluster (SIZE = '1', SCHEDULE = ON REFRESH (REHYDRATION TIME ESTIMATE)) +CREATE CLUSTER cluster (SIZE = '1', SCHEDULE = ON REFRESH (HYDRATION TIME ESTIMATE)) ^ parse-statement -CREATE CLUSTER cluster (SIZE = '1', SCHEDULE = ON REFRESH (REHYDRATION TIME ESTIMATE = )) +CREATE CLUSTER cluster (SIZE = '1', SCHEDULE = ON REFRESH (HYDRATION TIME ESTIMATE = )) ---- error: Expected literal string, found right parenthesis -CREATE CLUSTER cluster (SIZE = '1', SCHEDULE = ON REFRESH (REHYDRATION TIME ESTIMATE = )) +CREATE CLUSTER cluster (SIZE = '1', SCHEDULE = ON REFRESH (HYDRATION TIME ESTIMATE = )) ^ parse-statement -CREATE CLUSTER cluster (SIZE = '1', SCHEDULE = ON REFRESH (REHYDRATION)) +CREATE CLUSTER cluster (SIZE = '1', SCHEDULE = ON REFRESH (HYDRATION)) ---- error: Expected TIME, found right parenthesis -CREATE CLUSTER cluster (SIZE = '1', SCHEDULE = ON REFRESH (REHYDRATION)) +CREATE CLUSTER cluster (SIZE = '1', SCHEDULE = ON REFRESH (HYDRATION)) ^ parse-statement -CREATE CLUSTER cluster (SIZE = '1', SCHEDULE = ON REFRESH (REHYDRATION TIME ESTIMATE = '1 hour') +CREATE CLUSTER cluster (SIZE = '1', SCHEDULE = ON REFRESH (HYDRATION TIME ESTIMATE = '1 hour') ---- error: Expected right parenthesis, found EOF -CREATE CLUSTER cluster (SIZE = '1', SCHEDULE = ON REFRESH (REHYDRATION TIME ESTIMATE = '1 hour') +CREATE CLUSTER cluster (SIZE = '1', SCHEDULE = ON REFRESH (HYDRATION TIME ESTIMATE = '1 hour') ^ parse-statement diff --git a/src/sql/src/plan.rs b/src/sql/src/plan.rs index f851e570f1502..2c79deb425d7a 100644 --- a/src/sql/src/plan.rs +++ b/src/sql/src/plan.rs @@ -590,9 +590,9 @@ pub enum ClusterSchedule { /// The 
system won't automatically turn the cluster On or Off. Manual, /// The cluster will be On when a REFRESH materialized view on it needs to refresh. - /// `rehydration_time_estimate` determines how much time before a refresh to turn the + /// `hydration_time_estimate` determines how much time before a refresh to turn the /// cluster On, so that it can rehydrate already before the refresh time. - Refresh { rehydration_time_estimate: Duration }, + Refresh { hydration_time_estimate: Duration }, } impl Default for ClusterSchedule { diff --git a/src/sql/src/plan/statement/ddl.rs b/src/sql/src/plan/statement/ddl.rs index 1ea50bfee10a1..0d02aa4bbc726 100644 --- a/src/sql/src/plan/statement/ddl.rs +++ b/src/sql/src/plan/statement/ddl.rs @@ -4023,20 +4023,20 @@ fn plan_cluster_schedule( ) -> Result { Ok(match schedule { ClusterScheduleOptionValue::Manual => ClusterSchedule::Manual, - // If `REHYDRATION TIME ESTIMATE` is not explicitly given, we default to 0. + // If `HYDRATION TIME ESTIMATE` is not explicitly given, we default to 0. ClusterScheduleOptionValue::Refresh { - rehydration_time_estimate: None, + hydration_time_estimate: None, } => ClusterSchedule::Refresh { - rehydration_time_estimate: Duration::from_millis(0), + hydration_time_estimate: Duration::from_millis(0), }, // Otherwise we convert the `IntervalValue` to a `Duration`. ClusterScheduleOptionValue::Refresh { - rehydration_time_estimate: Some(interval_value), + hydration_time_estimate: Some(interval_value), } => { let interval = Interval::try_from_value(Value::Interval(interval_value))?; if interval.as_microseconds() < 0 { sql_bail!( - "REHYDRATION TIME ESTIMATE must be non-negative; got: {}", + "HYDRATION TIME ESTIMATE must be non-negative; got: {}", interval ); } @@ -4044,16 +4044,16 @@ fn plan_cluster_schedule( // This limitation is because we want this interval to be cleanly convertable // to a unix epoch timestamp difference. 
When the interval involves months, then // this is not true anymore, because months have variable lengths. - sql_bail!("REHYDRATION TIME ESTIMATE must not involve units larger than days"); + sql_bail!("HYDRATION TIME ESTIMATE must not involve units larger than days"); } let duration = interval.duration()?; if u64::try_from(duration.as_millis()).is_err() || Interval::from_duration(&duration).is_err() { - sql_bail!("REHYDRATION TIME ESTIMATE too large"); + sql_bail!("HYDRATION TIME ESTIMATE too large"); } ClusterSchedule::Refresh { - rehydration_time_estimate: duration, + hydration_time_estimate: duration, } } }) @@ -4066,13 +4066,13 @@ fn unplan_cluster_schedule(schedule: ClusterSchedule) -> ClusterScheduleOptionVa match schedule { ClusterSchedule::Manual => ClusterScheduleOptionValue::Manual, ClusterSchedule::Refresh { - rehydration_time_estimate, + hydration_time_estimate, } => { - let interval = Interval::from_duration(&rehydration_time_estimate) + let interval = Interval::from_duration(&hydration_time_estimate) .expect("planning ensured that this is convertible back to Interval"); let interval_value = literal::unplan_interval(&interval); ClusterScheduleOptionValue::Refresh { - rehydration_time_estimate: Some(interval_value), + hydration_time_estimate: Some(interval_value), } } } diff --git a/test/sqllogictest/autogenerated/mz_internal.slt b/test/sqllogictest/autogenerated/mz_internal.slt index 2fe9b8f4fdcaa..a7a9769ed6659 100644 --- a/test/sqllogictest/autogenerated/mz_internal.slt +++ b/test/sqllogictest/autogenerated/mz_internal.slt @@ -99,7 +99,7 @@ SELECT position, name, type FROM objects WHERE schema = 'mz_internal' AND object ---- 1 cluster_id text 2 type text -3 refresh_rehydration_time_estimate interval +3 refresh_hydration_time_estimate interval query ITT SELECT position, name, type FROM objects WHERE schema = 'mz_internal' AND object = 'mz_cluster_replica_frontiers' ORDER BY position diff --git a/test/sqllogictest/materialized_views.slt 
b/test/sqllogictest/materialized_views.slt index a8c833757f4f4..7cc378bcc3733 100644 --- a/test/sqllogictest/materialized_views.slt +++ b/test/sqllogictest/materialized_views.slt @@ -1339,7 +1339,7 @@ statement ok ALTER CLUSTER c_schedule_1 SET (MANAGED = true, SIZE = '1'); statement ok -ALTER CLUSTER c_schedule_1 SET (SCHEDULE = ON REFRESH (REHYDRATION TIME ESTIMATE = '0 seconds')); +ALTER CLUSTER c_schedule_1 SET (SCHEDULE = ON REFRESH (HYDRATION TIME ESTIMATE = '0 seconds')); # Setting some other cluster option in ALTER CLUSTER shouldn't change the SCHEDULE. # (The sleep is needed so that if the following ALTER erroneously sets the SCHEDULE to MANUAL, then we should be in a @@ -1465,22 +1465,22 @@ SELECT replication_factor FROM mz_catalog.mz_clusters WHERE name = 'c_schedule_4 ---- 0 -## REHYDRATION TIME ESTIMATE +## HYDRATION TIME ESTIMATE statement error db error: ERROR: Expected literal string, found number "0" -CREATE CLUSTER c_bad (SIZE = '1', SCHEDULE = ON REFRESH (REHYDRATION TIME ESTIMATE = 0)); +CREATE CLUSTER c_bad (SIZE = '1', SCHEDULE = ON REFRESH (HYDRATION TIME ESTIMATE = 0)); -statement error db error: ERROR: REHYDRATION TIME ESTIMATE must be non\-negative; got: \-01:00:00 -CREATE CLUSTER c_bad (SIZE = '1', SCHEDULE = ON REFRESH (REHYDRATION TIME ESTIMATE = '-1 hour')); +statement error db error: ERROR: HYDRATION TIME ESTIMATE must be non\-negative; got: \-01:00:00 +CREATE CLUSTER c_bad (SIZE = '1', SCHEDULE = ON REFRESH (HYDRATION TIME ESTIMATE = '-1 hour')); statement error db error: ERROR: invalid input syntax for type interval: unknown units aaaa: "1 aaaa" -CREATE CLUSTER c_bad (SIZE = '1', SCHEDULE = ON REFRESH (REHYDRATION TIME ESTIMATE = '1 aaaa')); +CREATE CLUSTER c_bad (SIZE = '1', SCHEDULE = ON REFRESH (HYDRATION TIME ESTIMATE = '1 aaaa')); -statement error db error: ERROR: REHYDRATION TIME ESTIMATE must not involve units larger than days -CREATE CLUSTER c_bad (SIZE = '1', SCHEDULE = ON REFRESH (REHYDRATION TIME ESTIMATE = '1 month')); 
+statement error db error: ERROR: HYDRATION TIME ESTIMATE must not involve units larger than days +CREATE CLUSTER c_bad (SIZE = '1', SCHEDULE = ON REFRESH (HYDRATION TIME ESTIMATE = '1 month')); statement error db error: ERROR: invalid input syntax for type interval: Overflows maximum days; cannot exceed 2147483647/\-2147483648 days: "1000000000000 days" -CREATE CLUSTER c_bad (SIZE = '1', SCHEDULE = ON REFRESH (REHYDRATION TIME ESTIMATE = '1000000000000 days')); +CREATE CLUSTER c_bad (SIZE = '1', SCHEDULE = ON REFRESH (HYDRATION TIME ESTIMATE = '1000000000000 days')); # ---------------------------------------- # Introspection @@ -1533,16 +1533,16 @@ mvi3 at NULL NULL true mvi3 at NULL NULL true statement ok -CREATE CLUSTER c_schedule_rehydration_time_estimate (SIZE = '1', SCHEDULE = ON REFRESH (REHYDRATION TIME ESTIMATE = '995 seconds')); +CREATE CLUSTER c_schedule_hydration_time_estimate (SIZE = '1', SCHEDULE = ON REFRESH (HYDRATION TIME ESTIMATE = '995 seconds')); -# Make the above cluster turn on, so that we can test how the REHYDRATION TIME ESTIMATE looks in `mz_audit_events`. +# Make the above cluster turn on, so that we can test how the HYDRATION TIME ESTIMATE looks in `mz_audit_events`. 
statement ok CREATE MATERIALIZED VIEW mv_rte -IN CLUSTER c_schedule_rehydration_time_estimate +IN CLUSTER c_schedule_hydration_time_estimate WITH (REFRESH EVERY '1 sec') AS SELECT * FROM t2; query TTT -SELECT name, cs.type, cs.refresh_rehydration_time_estimate +SELECT name, cs.type, cs.refresh_hydration_time_estimate FROM mz_internal.mz_cluster_schedules cs, mz_catalog.mz_clusters c WHERE c.id = cs.cluster_id ORDER BY name; @@ -1552,7 +1552,7 @@ c_schedule_2 manual NULL c_schedule_3 on-refresh 00:00:00 c_schedule_4 manual NULL c_schedule_5 manual NULL -c_schedule_rehydration_time_estimate on-refresh 00:16:35 +c_schedule_hydration_time_estimate on-refresh 00:16:35 mz_catalog_server manual NULL mz_probe manual NULL mz_support manual NULL @@ -1582,12 +1582,12 @@ drop cluster-replica "c_schedule_4" "manual" true NULL create cluster-replica "c_schedule_1" "manual" true NULL create cluster-replica "c_schedule_2" "manual" true NULL create cluster-replica "c_schedule_5" "manual" true NULL -drop cluster-replica "c_schedule_1" "schedule" false {"decision":"off","objects_needing_refresh":[],"rehydration_time_estimate":"00:00:00"} -drop cluster-replica "c_schedule_3" "schedule" false {"decision":"off","objects_needing_refresh":[],"rehydration_time_estimate":"00:00:00"} -create cluster-replica "c_schedule_1" "schedule" false {"decision":"on","objects_needing_refresh":["uXXX"],"rehydration_time_estimate":"00:00:00"} -create cluster-replica "c_schedule_3" "schedule" false {"decision":"on","objects_needing_refresh":["uXXX"],"rehydration_time_estimate":"00:00:00"} -create cluster-replica "c_schedule_4" "schedule" false {"decision":"on","objects_needing_refresh":["uXXX"],"rehydration_time_estimate":"00:00:00"} -create cluster-replica "c_schedule_rehydration_time_estimate" "schedule" false {"decision":"on","objects_needing_refresh":["uXXX"],"rehydration_time_estimate":"00:16:35"} +drop cluster-replica "c_schedule_1" "schedule" false 
{"decision":"off","objects_needing_refresh":[],"hydration_time_estimate":"00:00:00"} +drop cluster-replica "c_schedule_3" "schedule" false {"decision":"off","objects_needing_refresh":[],"hydration_time_estimate":"00:00:00"} +create cluster-replica "c_schedule_1" "schedule" false {"decision":"on","objects_needing_refresh":["uXXX"],"hydration_time_estimate":"00:00:00"} +create cluster-replica "c_schedule_3" "schedule" false {"decision":"on","objects_needing_refresh":["uXXX"],"hydration_time_estimate":"00:00:00"} +create cluster-replica "c_schedule_4" "schedule" false {"decision":"on","objects_needing_refresh":["uXXX"],"hydration_time_estimate":"00:00:00"} +create cluster-replica "c_schedule_hydration_time_estimate" "schedule" false {"decision":"on","objects_needing_refresh":["uXXX"],"hydration_time_estimate":"00:16:35"} # Materialized views in this file can be explained diff --git a/test/testdrive/materialized-view-refresh-options.td b/test/testdrive/materialized-view-refresh-options.td index 1822b906ef64a..b84b39c66bfc0 100644 --- a/test/testdrive/materialized-view-refresh-options.td +++ b/test/testdrive/materialized-view-refresh-options.td @@ -516,11 +516,11 @@ DROP CLUSTER REPLICA scheduled_cluster.unbilled; > SELECT * FROM mv11; 3 -## REHYDRATION TIME ESTIMATE +## HYDRATION TIME ESTIMATE -> CREATE CLUSTER c_schedule_6 (SIZE = '1', SCHEDULE = ON REFRESH (REHYDRATION TIME ESTIMATE = '995 seconds')); +> CREATE CLUSTER c_schedule_6 (SIZE = '1', SCHEDULE = ON REFRESH (HYDRATION TIME ESTIMATE = '995 seconds')); > CREATE CLUSTER c_schedule_7 (SIZE = '1'); -> ALTER CLUSTER c_schedule_7 SET (SCHEDULE = ON REFRESH (REHYDRATION TIME ESTIMATE = '995 seconds')); +> ALTER CLUSTER c_schedule_7 SET (SCHEDULE = ON REFRESH (HYDRATION TIME ESTIMATE = '995 seconds')); # Create MVs whose first refresh is 1000 seconds from now. 
> CREATE MATERIALIZED VIEW mv13 @@ -532,7 +532,7 @@ DROP CLUSTER REPLICA scheduled_cluster.unbilled; WITH (REFRESH AT mz_now()::string::int8 + 1000 * 1000) AS SELECT count(*) FROM t2; -# Should be turned on soon due to the REHYDRATION TIME ESTIMATE. +# Should be turned on soon due to the HYDRATION TIME ESTIMATE. > SELECT replication_factor FROM mz_catalog.mz_clusters WHERE name = 'c_schedule_6'; 1 > SELECT replication_factor FROM mz_catalog.mz_clusters WHERE name = 'c_schedule_7'; @@ -635,7 +635,7 @@ true SELECT mz_unsafe.mz_sleep(a) FROM t6; -# Wait for the first rehydration to complete +# Wait for the first hydration to complete > SELECT * FROM mv_long_hydration; @@ -644,10 +644,10 @@ true WHERE name = 'mv_long_hydration'; true -# Make the next rehydration take 1000000 ms. +# Make the next hydration take 1000000 ms. > INSERT INTO t6 VALUES (1000000); -# Restart the cluster to force a rehydration. +# Restart the cluster to force a hydration. > ALTER CLUSTER cluster_to_be_bricked SET (REPLICATION FACTOR 0); > ALTER CLUSTER cluster_to_be_bricked SET (REPLICATION FACTOR 1);