(7/N) Use nexus_generation, update it #8936
base: main
Conversation
* only placed 0/3 desired internal_dns zones
* only placed 0/3 desired nexus zones

error: generating blueprint: could not find active nexus zone in parent blueprint
TODO(me) insert this into input, so we don't have as much churn
Okay, found the problem here. The reconfigurator-cli explicitly creates an "ExampleSystem" with "no_zones" or "no_disks" enabled, then tries to create a blueprint.
In a model where we need an old Nexus zone to make a new Nexus zone, this breaks.
I'm updating the planner logic to allow this case. With that, the reconfigurator-cli diff is significantly smaller.
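A minimal sketch of what "allowing this case" could mean, using a hypothetical helper and an assumed initial-generation fallback; the real planner logic may differ:

// Hypothetical sketch: when placing a new Nexus zone, derive its generation
// from an existing in-service Nexus zone if one exists, and otherwise fall
// back to an initial generation instead of failing (as an ExampleSystem
// built with no_zones requires). Names are illustrative, not the real API.
fn generation_for_new_nexus(parent_nexus_generations: &[u64]) -> u64 {
    match parent_nexus_generations.iter().max() {
        Some(generation) => *generation,
        // No Nexus zones exist in the parent blueprint: start from 1
        // rather than returning an error.
        None => 1,
    }
}

fn main() {
    assert_eq!(generation_for_new_nexus(&[1, 1, 2]), 2);
    assert_eq!(generation_for_new_nexus(&[]), 1);
}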
}
}

// Confirm that we have new nexuses at the desired generation number
I think we also need to confirm that the new db_metadata_nexus records have hit the DB - otherwise, the "old Nexuses" could quiesce without giving the "new Nexuses" enough context to do a handoff.
This is done.
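As a rough illustration of the check being discussed, using hypothetical types and names rather than the actual datastore API: the idea is to verify that every expected new Nexus already has a committed db_metadata_nexus record before the old Nexuses quiesce.

use std::collections::HashSet;

// Hypothetical stand-in for a db_metadata_nexus row.
struct DbMetadataNexus {
    nexus_id: u64,
}

// It's only safe for the old Nexuses to quiesce once every expected "new"
// Nexus has a record committed to the database; otherwise the new Nexuses
// would lack the context needed to complete the handoff.
fn handoff_records_committed(
    expected_new_nexuses: &[u64],
    records_in_db: &[DbMetadataNexus],
) -> bool {
    let present: HashSet<u64> =
        records_in_db.iter().map(|r| r.nexus_id).collect();
    expected_new_nexuses.iter().all(|id| present.contains(id))
}

fn main() {
    let records = vec![DbMetadataNexus { nexus_id: 7 }];
    // Nexus 8's record hasn't landed yet, so quiescing would be premature.
    assert!(!handoff_records_committed(&[7, 8], &records));
    assert!(handoff_records_committed(&[7], &records));
}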
Thanks for this. Sorry for all my questions. The details here seem very tricky.
let mut active = vec![];
let mut not_yet = vec![];
for (_, zone) in
    blueprint.all_omicron_zones(BlueprintZoneDisposition::is_in_service)
A nit, but it feels to me like this logic belongs in the caller. Maybe the caller could pass in the list of Nexus instances that should be "active" vs. "not_yet"?
Doing this moves all the complexity of parsing the blueprint into deploy_db_metadata_nexus_records in the reconfigurator executor, so I've moved quite a few tests there too.
Done in 70d4ab9
for (_, zone) in
    blueprint.all_omicron_zones(BlueprintZoneDisposition::is_in_service)
{
    if let BlueprintZoneType::Nexus(ref nexus) = zone.zone_type {
I don't think this logic for determining whether each Nexus zone is active or not_yet is quite right. Suppose:
- the planner decides it's time to hand off and bumps nexus_generation
- the "old" Nexus goes to execute this blueprint -- it can continue executing blueprints for a while in this state (if there are sagas still running)
- but this will cause us to write "new" Nexus instances with state active instead of not_yet

Right?
Great catch. I fixed this in 9514deb, and added a test specifically for this "add-Nexus-after-quiesce-started" behavior.
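A simplified sketch of the classification the fix implies, with hypothetical stand-in types. The exact comparison in the real executor may differ; the key point is that active vs. not_yet is derived from each zone's nexus_generation relative to the blueprint's top-level nexus_generation, rather than from which Nexus happens to be executing the blueprint.

// Hypothetical stand-ins for the real blueprint types.
struct NexusZone {
    id: u64,
    nexus_generation: u64,
}

struct Blueprint {
    // Top-level generation recorded in the blueprint.
    nexus_generation: u64,
    nexus_zones: Vec<NexusZone>,
}

// Partition in-service Nexus zones into "active" and "not_yet". The <=
// comparison here is illustrative; the real rule lives in the executor.
fn classify_nexuses(blueprint: &Blueprint) -> (Vec<u64>, Vec<u64>) {
    let mut active = Vec::new();
    let mut not_yet = Vec::new();
    for zone in &blueprint.nexus_zones {
        if zone.nexus_generation <= blueprint.nexus_generation {
            active.push(zone.id);
        } else {
            not_yet.push(zone.id);
        }
    }
    (active, not_yet)
}

fn main() {
    let bp = Blueprint {
        nexus_generation: 1,
        nexus_zones: vec![
            NexusZone { id: 1, nexus_generation: 1 },
            NexusZone { id: 2, nexus_generation: 2 },
        ],
    };
    assert_eq!(classify_nexuses(&bp), (vec![1], vec![2]));
}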
let active_nexus_zones = datastore
    .get_active_db_metadata_nexus(opctx)
    .await
    .internal_context("fetching active nexuses")?
    .into_iter()
    .map(|z| z.nexus_id())
    .collect::<Vec<_>>();
let not_yet_nexus_zones = datastore
    .get_not_yet_db_metadata_nexus(opctx)
    .await
    .internal_context("fetching 'not yet' nexuses")?
    .into_iter()
    .map(|z| z.nexus_id())
    .collect::<Vec<_>>();
Feels like we could combine these into one database query? Not a big deal.
Done in 690ea16
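A sketch of the combined shape, assuming a hypothetical record type with a state column; the actual query and types in 690ea16 may differ.

// Hypothetical stand-ins: fetch all db_metadata_nexus records in one query
// and partition them by state in memory, rather than issuing one query for
// "active" and another for "not_yet".
#[derive(PartialEq)]
enum DbMetadataNexusState {
    Active,
    NotYet,
}

struct DbMetadataNexus {
    nexus_id: u64,
    state: DbMetadataNexusState,
}

fn partition_nexuses(records: Vec<DbMetadataNexus>) -> (Vec<u64>, Vec<u64>) {
    let (active, not_yet): (Vec<_>, Vec<_>) = records
        .into_iter()
        .partition(|r| r.state == DbMetadataNexusState::Active);
    (
        active.into_iter().map(|r| r.nexus_id).collect(),
        not_yet.into_iter().map(|r| r.nexus_id).collect(),
    )
}

fn main() {
    let records = vec![
        DbMetadataNexus { nexus_id: 1, state: DbMetadataNexusState::Active },
        DbMetadataNexus { nexus_id: 2, state: DbMetadataNexusState::NotYet },
    ];
    assert_eq!(partition_nexuses(records), (vec![1], vec![2]));
}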
/// This is used to identify which Nexus is currently executing the planning
/// operation, which is needed for safe shutdown decisions during handoff.
Suggested change:
- /// This is used to identify which Nexus is currently executing the planning
- /// operation, which is needed for safe shutdown decisions during handoff.
+ /// This is used to determine which Nexus instances are currently in control,
+ /// which is needed for safe shutdown decisions during handoff.
Done in f15fa4b
}
}

pub fn add_active_nexuses(
Suggested change:
- pub fn add_active_nexuses(
+ pub fn set_active_nexuses(
("add" to me would imply that we're appending this set, but we're not)
Done in f15fa4b
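For illustration, the semantic distinction the rename captures, using a hypothetical builder type:

// Hypothetical sketch: "set" replaces the whole collection, whereas "add"
// would suggest appending to whatever was there before.
struct PlanningInputBuilder {
    active_nexuses: Vec<u64>,
}

impl PlanningInputBuilder {
    // Replaces any previously recorded set of active Nexus instances.
    pub fn set_active_nexuses(&mut self, nexuses: Vec<u64>) {
        self.active_nexuses = nexuses;
    }
}

fn main() {
    let mut builder = PlanningInputBuilder { active_nexuses: vec![1] };
    builder.set_active_nexuses(vec![2, 3]);
    assert_eq!(builder.active_nexuses, vec![2, 3]);
}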
fn is_zone_ready_for_update(
    &self,
    zone_kind: ZoneKind,
    mgs_updates: &PlanningMgsUpdatesStepReport,
) -> Result<bool, TufRepoContentsError> {
I think this could just return bool now, except I'm worried you do still need some logic here to avoid deploying new Nexus zones before the rest of the system has been updated. I'm not sure that's necessary? But it seems safer and I assumed it was what we'd keep doing.
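A minimal sketch of the gating idea raised here, with hypothetical types; whether the real planner keeps such a rule is exactly the open question.

// Hypothetical sketch: treat Nexus as updatable only once every other zone
// kind has finished updating, so new Nexus zones aren't deployed before the
// rest of the system. The types and the readiness rule are illustrative.
#[derive(PartialEq, Eq, Clone, Copy)]
enum ZoneKind {
    Nexus,
    InternalDns,
    CockroachDb,
}

fn is_zone_ready_for_update(kind: ZoneKind, pending_updates: &[ZoneKind]) -> bool {
    match kind {
        // Nexus waits until no non-Nexus zones still need an update.
        ZoneKind::Nexus => pending_updates.iter().all(|k| *k == ZoneKind::Nexus),
        // Other zone kinds have no such ordering constraint in this sketch.
        _ => true,
    }
}

fn main() {
    assert!(!is_zone_ready_for_update(
        ZoneKind::Nexus,
        &[ZoneKind::InternalDns, ZoneKind::Nexus],
    ));
    assert!(is_zone_ready_for_update(ZoneKind::Nexus, &[ZoneKind::Nexus]));
    assert!(is_zone_ready_for_update(ZoneKind::CockroachDb, &[ZoneKind::Nexus]));
}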
    Ok(true)
}

fn lookup_current_nexus_image(&self) -> Option<BlueprintZoneImageSource> {
In practice, when would this function ever return None?
    != new_repo.zone_image_source(kind)?
{
    return Ok(false);

fn lookup_current_nexus_generation(&self) -> Option<Generation> {
In practice, when would this function ever return None?
else {
    // If we don't know the current Nexus zone ID, or its
    // generation, we can't perform the handoff safety check.
    report.unsafe_zone(
        zone,
        Nexus {
            zone_generation: zone_nexus_generation,
            current_nexus_generation: None,
        },
    );
    return false;
};
I feel like this case should be impossible now. That would imply there was literally no Nexus zone in the blueprint at the current generation?
// We need to prevent old Nexus zones from shutting themselves
// down. In other words: it's only safe to shut down if handoff
// has occurred.
//
// That only happens when the current generation of Nexus (the
// one running right now) is greater than the zone we're
// considering expunging.
if current_gen <= zone_nexus_generation {
    report.unsafe_zone(
        zone,
        Nexus {
            zone_generation: zone_nexus_generation,
            current_nexus_generation: Some(current_gen),
        },
    );
    return false;
}
Is all of the logic thus far asking: is this one of the "active" Nexus zones? (Could we just check the planning input?)
Following up on this thread from the older PR: #8863 (comment)
I don't follow why we need to check this. It actually seems wrong. It means we can never shut down any Nexus zone except if it's post-handoff. But there's nothing unsafe about shutting down a single Nexus zone, right? And at some point we're going to want SP updates to use this function to check whether it's safe to shut down all the zones on the host whose SP is being bounced (#8482). Won't this check then prevent us from doing any SP updates on a host that's hosting Nexus?
The only place this is called is from do_plan_zone_updates(), but I feel like maybe that function just needs to ignore Nexus zones altogether since they're updated specially.
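A sketch of the alternative suggested here, with hypothetical stand-in types: the general zone-update pass simply skips Nexus, leaving Nexus updates to the generation-based handoff path.

#[derive(PartialEq, Clone, Copy)]
enum ZoneKind {
    Nexus,
    InternalDns,
}

struct Zone {
    id: u64,
    kind: ZoneKind,
}

// Hypothetical sketch of a zone-update pass ignoring Nexus zones entirely,
// since Nexus is updated via the nexus_generation handoff rather than the
// generic in-place zone-update path.
fn plan_zone_updates(out_of_date_zones: &[Zone]) -> Vec<u64> {
    out_of_date_zones
        .iter()
        .filter(|z| z.kind != ZoneKind::Nexus)
        .map(|z| z.id)
        .collect()
}

fn main() {
    let zones = vec![
        Zone { id: 1, kind: ZoneKind::Nexus },
        Zone { id: 2, kind: ZoneKind::InternalDns },
    ];
    assert_eq!(plan_zone_updates(&zones), vec![2]);
}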
This change helps for zones like Nexus, which may have multiple deployments using distinct images (see: #8936)

Blueprint Preparation & System Description
- SystemDescription for tests

Blueprint Planner

Blueprint Execution
- db_metadata_nexus records: previously, this created active records for all in-service Nexuses. Now, it creates active and not_yet records, depending on the value of the nexus_generation set in the zone record compared to the top-level nexus_generation.

Blippy

Fixes #8843, #8854