
Commit abc9e96

Test full upgrade path of self managed releases (#33985)
<!--
Describe the contents of the PR briefly but completely.

If you write detailed commit messages, it is acceptable to copy/paste them
here, or write "see commit messages for details." If there is only one commit
in the PR, GitHub will have already added its commit message above.
-->

### Motivation

Partially addresses MaterializeInc/database-issues#9861, specifically the "Paths" part.

The plan is to have this cover all releases (including patches) from 25.0.0 to 26.0.0.

I bumped the parallelism, but I am looking for guidance on whether I should do this and whether we should keep this in the nightly.

<!--
Which of the following best describes the motivation behind this PR?

  * This PR fixes a recognized bug.
    [Ensure issue is linked somewhere.]
  * This PR adds a known-desirable feature.
    [Ensure issue is linked somewhere.]
  * This PR fixes a previously unreported bug.
    [Describe the bug in detail, as if you were filing a bug report.]
  * This PR adds a feature that has not yet been specified.
    [Write a brief specification for the feature, including justification for its inclusion in Materialize, as if you were writing the original feature specification.]
  * This PR refactors existing code.
    [Describe what was wrong with the existing code, if it is not obvious.]
-->

### Tips for reviewer

<!--
Leave some tips for your reviewer, like:

  * The diff is much smaller if viewed with whitespace hidden.
  * [Some function/module/file] deserves extra attention.
  * [Some function/module/file] is pure code movement and only needs a skim.

Delete this section if no tips.
-->

### Checklist

- [ ] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/))
- [ ] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. -->
- [ ] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label.
- [ ] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](MaterializeInc/cloud#5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. -->
- [ ] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post.
1 parent af20486 commit abc9e96

3 files changed, +157 -163 lines changed


ci/nightly/pipeline.template.yml

Lines changed: 8 additions & 8 deletions
@@ -1110,29 +1110,29 @@ steps:
               args: [-m=long, test/cloudtest/test_upgrade.py, --no-test-parallelism]
         sanitizer: skip
         skip: "TODO(def-) Reenable in one version when labels are fixed in old version"
-      - id: checks-self-managed-25-2-upgrade-manipulate-before-upgrade
-        label: "Checks Self-Managed upgrade across all v25.2 patch releases, manipulate before upgrade"
+      - id: checks-self-managed-linear-upgrade-path-manipulate-before-upgrade
+        label: "Checks Self-Managed upgrade across all supported patch releases, manipulate before upgrade"
         depends_on: build-x86_64
         timeout_in_minutes: 180
-        parallelism: 3
+        parallelism: 6
         agents:
           queue: hetzner-x86-64-8cpu-16gb
         plugins:
           - ./ci/plugins/mzcompose:
               composition: platform-checks
-              args: [--scenario=SelfManagedv25Point2_Upgrade_ManipulateBeforeUpgrade, "--seed=$BUILDKITE_JOB_ID"]
+              args: [--scenario=SelfManagedLinearUpgradePathManipulateBeforeUpgrade, "--seed=$BUILDKITE_JOB_ID"]

-      - id: checks-self-managed-25-2-upgrade-manipulate-during-upgrade
-        label: "Checks Self-Managed upgrade across all v25.2 patch releases, manipulate during upgrade"
+      - id: checks-self-managed-linear-upgrade-path-manipulate-during-upgrade
+        label: "Checks Self-Managed upgrade across all supported patch releases, manipulate during upgrade"
         depends_on: build-x86_64
         timeout_in_minutes: 180
-        parallelism: 3
+        parallelism: 6
         agents:
           queue: hetzner-x86-64-8cpu-16gb
         plugins:
           - ./ci/plugins/mzcompose:
               composition: platform-checks
-              args: [--scenario=SelfManagedv25Point2_Upgrade_ManipulateDuringUpgrade, "--seed=$BUILDKITE_JOB_ID"]
+              args: [--scenario=SelfManagedLinearUpgradePathManipulateDuringUpgrade, "--seed=$BUILDKITE_JOB_ID"]

   - group: "K8s node recovery cloudtest"
     key: k8s-node-recovery
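
Note on the `parallelism` bump: Buildkite runs a step with `parallelism: N` as N jobs and exposes `BUILDKITE_PARALLEL_JOB` and `BUILDKITE_PARALLEL_JOB_COUNT` to each of them. How the platform-checks composition actually divides its checks across those jobs is not shown in this diff. The following is only a minimal sketch of one possible round-robin split, and `select_shard` and the check names are hypothetical.

```python
import os


def select_shard(items: list[str], job_index: int, job_count: int) -> list[str]:
    # Round-robin over a sorted list so every job computes the same partition:
    # job k takes items k, k + job_count, k + 2 * job_count, ...
    return [
        item for i, item in enumerate(sorted(items)) if i % job_count == job_index
    ]


# Buildkite sets these variables for steps that declare `parallelism`;
# fall back to a single job when running outside of CI.
job_index = int(os.environ.get("BUILDKITE_PARALLEL_JOB", "0"))
job_count = int(os.environ.get("BUILDKITE_PARALLEL_JOB_COUNT", "1"))

# Hypothetical check names, for illustration only.
checks = [f"check_{n}" for n in range(12)]
print(select_shard(checks, job_index, job_count))
```

Under this assumed scheme, going from 3 to 6 jobs roughly halves the number of checks each job carries through the longer upgrade path.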

misc/python/materialize/checks/scenarios_upgrade.py

Lines changed: 113 additions & 129 deletions
@@ -28,9 +28,9 @@
 from materialize.mzcompose import get_default_system_parameters
 from materialize.mzcompose.services.materialized import LEADER_STATUS_HEALTHCHECK
 from materialize.version_list import (
-    fetch_self_managed_versions,
     get_published_minor_mz_versions,
     get_self_managed_versions,
+    get_supported_self_managed_versions,
 )

 # late initialization
@@ -515,48 +515,62 @@ def actions(self) -> list[Action]:
         ]


-def upgrade_actions(
+@dataclass
+class MzServiceUpgradeInfo:
+    # Version of the MZ instance
+    version: MzVersion | None
+    # Name of the docker service
+    service_name: str
+    # Generation of the MZ instance
+    deploy_generation: int
+    system_parameter_defaults: dict[str, str]
+
+
+def create_mz_service_upgrade_info_list(
+    versions: list[MzVersion | None],
+) -> list[MzServiceUpgradeInfo]:
+    # We use the first version to get the system parameters since the defaults for
+    # newer versions include cutting edge features than can break backwards compatibility.
+    # TODO (multiversion1): Get minimal system parameters by default to avoid cutting edge features.
+    system_parameter_defaults = get_default_system_parameters(versions[0])
+    return [
+        MzServiceUpgradeInfo(
+            version=version,
+            service_name=f"mz_{(i % 2) + 1}",
+            deploy_generation=i,
+            system_parameter_defaults=system_parameter_defaults,
+        )
+        for i, version in enumerate(versions)
+    ]
+
+
+def upgrade_service_actions(
     scenario: Scenario,
-    version: MzVersion | None,
-    deploy_generation: int,
-    service_name: str,
-    previous_service_name: str,
-    system_parameter_defaults: dict[str, str],
-    should_validate: bool = True,
+    service_info: MzServiceUpgradeInfo,
+    previous_service_info: MzServiceUpgradeInfo,
 ) -> list[Action]:
     return [
         start_mz_read_only(
             scenario,
-            tag=version,
-            deploy_generation=deploy_generation,
-            mz_service=service_name,
-            system_parameter_defaults=system_parameter_defaults,
+            tag=service_info.version,
+            deploy_generation=service_info.deploy_generation,
+            mz_service=service_info.service_name,
+            system_parameter_defaults=service_info.system_parameter_defaults,
         ),
-        WaitReadyMz(service_name),
-        PromoteMz(service_name),
-        *([Validate(scenario, mz_service=service_name)] if should_validate else []),
-        KillMz(capture_logs=True, mz_service=previous_service_name),
+        WaitReadyMz(service_info.service_name),
+        PromoteMz(service_info.service_name),
+        # Cleanup the previous service
+        KillMz(capture_logs=True, mz_service=previous_service_info.service_name),
     ]


-@dataclass
-class MzServiceUpgradeInfo:
-    version: MzVersion | None
-    service_name: str
-
-
-def get_self_managed_v25_2_versions() -> list[MzVersion]:
-    self_managed_versions = fetch_self_managed_versions()
-    return sorted(
-        [
-            v.version
-            for v in self_managed_versions
-            if v.helm_version.major == 25 and v.helm_version.minor == 2
-        ]
+def print_upgrade_path(versions: list[MzVersion | None]):
+    print(
+        f"Upgrading through versions {[str(version) if version is not None else "current" for version in versions]}"
     )


-class SelfManagedv25Point2_Upgrade_ManipulateBeforeUpgrade(Scenario):
+class SelfManagedLinearUpgradePathManipulateBeforeUpgrade(Scenario):
     """
     Upgrade from the oldest v25.2 patch release to the latest v25.2 patch release to main.
     Run all manipulation phases before any upgrades.
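
Note on `create_mz_service_upgrade_info_list`: the list comprehension alternates the docker service name between `mz_1` and `mz_2` and uses the list index as the deploy generation. A minimal standalone sketch of that pattern follows; `ServiceSlot` and `plan_slots` are hypothetical stand-ins for `MzServiceUpgradeInfo` and the real helper, the version labels are placeholders rather than real releases, and the system parameter defaults are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServiceSlot:
    # Hypothetical stand-in for MzServiceUpgradeInfo; versions are plain strings.
    version: Optional[str]
    service_name: str
    deploy_generation: int


def plan_slots(versions: list[Optional[str]]) -> list[ServiceSlot]:
    # Index i doubles as the deploy generation, and (i % 2) + 1 flips the
    # service name between mz_1 and mz_2 on every step of the upgrade path.
    return [
        ServiceSlot(
            version=version,
            service_name=f"mz_{(i % 2) + 1}",
            deploy_generation=i,
        )
        for i, version in enumerate(versions)
    ]


# Placeholder labels only; None stands for the current build, as in the diff.
slots = plan_slots(["vA", "vB", "vC", None])
for slot in slots:
    print(slot.service_name, slot.deploy_generation, slot.version or "current")
```

Because consecutive entries never share a service name, the previous generation can keep running under one name while the next generation starts read-only under the other.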
@@ -569,63 +583,52 @@ def __init__(
         features: Features,
         seed: str | None = None,
     ):
-        self.v25_2_versions = get_self_managed_v25_2_versions()
+        self.self_managed_versions = get_supported_self_managed_versions()
         super().__init__(checks, executor, features, seed)

     def base_version(self) -> MzVersion:
-        return self.v25_2_versions[0]
+        return self.self_managed_versions[0]

     def actions(self) -> list[Action]:
-        print(f"Upgrading from tag {self.base_version()}")
-        system_parameter_defaults = get_default_system_parameters(self.base_version())
-        actions = []
-        versions = self.v25_2_versions + [None]
-
-        mz_services = [
-            MzServiceUpgradeInfo(
-                version=version,
-                # We alternate between mz_1 and mz_2
-                service_name=f"mz_{(i % 2) + 1}",
-            )
-            for i, version in enumerate(versions)
+        versions = self.self_managed_versions + [None]
+
+        print_upgrade_path(versions)
+
+        mz_services = create_mz_service_upgrade_info_list(versions)
+
+        actions = [
+            StartMz(
+                self,
+                tag=mz_services[0].version,
+                mz_service=mz_services[0].service_name,
+                system_parameter_defaults=mz_services[0].system_parameter_defaults,
+            ),
+            Initialize(self, mz_service=mz_services[0].service_name),
+            Manipulate(self, phase=1, mz_service=mz_services[0].service_name),
+            Manipulate(self, phase=2, mz_service=mz_services[0].service_name),
+            Validate(self, mz_service=mz_services[0].service_name),
         ]

-        for i, service in enumerate(mz_services):
-            if i == 0:
-                actions.extend(
-                    [
-                        StartMz(
-                            self,
-                            tag=service.version,
-                            mz_service=service.service_name,
-                            system_parameter_defaults=system_parameter_defaults,
-                        ),
-                        Initialize(self, mz_service=service.service_name),
-                        Manipulate(self, phase=1, mz_service=service.service_name),
-                        Manipulate(self, phase=2, mz_service=service.service_name),
-                        Validate(self, mz_service=service.service_name),
-                    ]
-                )
-            else:
-                # Because upgrades start on generation 1, we can set it to the index
-                deploy_generation = i
-                actions.extend(
-                    upgrade_actions(
-                        self,
-                        service.version,
-                        deploy_generation,
-                        service.service_name,
-                        mz_services[i - 1].service_name,
-                        system_parameter_defaults=system_parameter_defaults,
-                    )
+        for i, service_info in enumerate[MzServiceUpgradeInfo](
+            mz_services[1:], start=1
+        ):
+            actions.extend(
+                upgrade_service_actions(
+                    self,
+                    service_info=service_info,
+                    previous_service_info=mz_services[i - 1],
                 )
+                + [
+                    Validate(self, mz_service=service_info.service_name),
+                ]
+            )

         return actions


-class SelfManagedv25Point2_Upgrade_ManipulateDuringUpgrade(Scenario):
+class SelfManagedLinearUpgradePathManipulateDuringUpgrade(Scenario):
     """
-    Upgrade from the oldest v25.2 patch release to the latest v25.2 patch release to main.
+    Upgrade from the oldest Self-Managed version to the latest Self-Managed version to main.
     Run the first manipulation phase before all upgrades and the second during the upgrade.
     """

@@ -636,73 +639,54 @@ def __init__(
         features: Features,
         seed: str | None = None,
     ):
-        self.v25_2_versions = get_self_managed_v25_2_versions()
+        self.self_managed_versions = get_supported_self_managed_versions()
         super().__init__(checks, executor, features, seed)

     def base_version(self) -> MzVersion:
-        return self.v25_2_versions[0]
+        return self.self_managed_versions[0]

     def actions(self) -> list[Action]:
-        print(f"Upgrading from tag {self.base_version()}")
-        system_parameter_defaults = get_default_system_parameters(self.base_version())
+        versions = self.self_managed_versions + [None]

-        actions = []
-        versions = self.v25_2_versions + [None]
+        print_upgrade_path(versions)

-        mz_services = [
-            MzServiceUpgradeInfo(
-                version=version,
-                # We alternate between mz_1 and mz_2
-                service_name=f"mz_{(i % 2) + 1}",
-            )
-            for i, version in enumerate(versions)
+        mz_services = create_mz_service_upgrade_info_list(
+            versions,
+        )
+
+        actions = [
+            StartMz(
+                self,
+                tag=mz_services[0].version,
+                mz_service=mz_services[0].service_name,
+                system_parameter_defaults=mz_services[0].system_parameter_defaults,
+            ),
+            Initialize(self, mz_service=mz_services[0].service_name),
+            Manipulate(self, phase=1, mz_service=mz_services[0].service_name),
         ]

-        for i, service in enumerate(mz_services):
-            if i == 0:
+        for i, service_info in enumerate[MzServiceUpgradeInfo](
+            mz_services[1:], start=1
+        ):
+            actions.extend(
+                upgrade_service_actions(
+                    self,
+                    service_info=service_info,
+                    previous_service_info=mz_services[i - 1],
+                )
+            )
+
+            if i == 1:
+                # Manipulate the MZ instance after the first upgrade
                 actions.extend(
                     [
-                        StartMz(
-                            self,
-                            tag=service.version,
-                            mz_service=service.service_name,
-                            system_parameter_defaults=system_parameter_defaults,
-                        ),
-                        Initialize(self, mz_service=service.service_name),
-                        Manipulate(self, phase=1, mz_service=service.service_name),
+                        Manipulate(self, phase=2, mz_service=service_info.service_name),
+                        Validate(self, mz_service=service_info.service_name),
                     ]
                 )
             else:
-                # Because upgrades start on generation 1, we can set it to the index
-                deploy_generation = i
-
-                if i == 1:
-                    # Manipulate the MZ instance after the first upgrade.
-                    actions.extend(
-                        upgrade_actions(
-                            self,
-                            service.version,
-                            deploy_generation,
-                            service.service_name,
-                            mz_services[i - 1].service_name,
-                            system_parameter_defaults=system_parameter_defaults,
-                            # We can't validate on the first upgrade since all manipulations haven't completed yet.
-                            should_validate=False,
-                        )
-                        + [
-                            Manipulate(self, phase=2, mz_service=service.service_name),
-                        ]
-                    )
-                else:
-                    actions.extend(
-                        upgrade_actions(
-                            self,
-                            service.version,
-                            deploy_generation,
-                            service.service_name,
-                            mz_services[i - 1].service_name,
-                            system_parameter_defaults=system_parameter_defaults,
-                        )
-                    )
+                actions.append(
+                    Validate(self, mz_service=service_info.service_name),
+                )

         return actions
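
Note on the two scenarios: after this change they differ only in where the second manipulation phase runs, and every upgrade step is now followed by a validation outside of `upgrade_service_actions`. The sketch below models the resulting order with plain strings so the two plans can be compared side by side. `plan` and `upgrade_steps` are hypothetical helpers, the `gen0`/`gen1`/`gen2` labels are placeholders, and nothing here calls the real `Action` classes.

```python
def upgrade_steps(new: str, old: str) -> list[str]:
    # Same shape as upgrade_service_actions in the diff: start the new
    # generation read-only, wait for it, promote it, then kill the old one.
    return [
        f"start_read_only({new})",
        f"wait_ready({new})",
        f"promote({new})",
        f"kill({old})",
    ]


def plan(services: list[str], manipulate_during_upgrade: bool) -> list[str]:
    first = services[0]
    steps = [f"start({first})", f"initialize({first})", f"manipulate_1({first})"]
    if not manipulate_during_upgrade:
        # "Before upgrade" variant: finish both phases on the oldest version.
        steps += [f"manipulate_2({first})", f"validate({first})"]
    for i, name in enumerate(services[1:], start=1):
        steps += upgrade_steps(name, services[i - 1])
        if manipulate_during_upgrade and i == 1:
            # "During upgrade" variant: phase 2 runs right after the first upgrade.
            steps.append(f"manipulate_2({name})")
        steps.append(f"validate({name})")
    return steps


for step in plan(["gen0", "gen1", "gen2"], manipulate_during_upgrade=True):
    print(step)
```

In this model both variants validate after every upgrade; the "during" variant simply defers `manipulate_2` and the first validation until the first upgraded generation is promoted.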
