8 changes: 8 additions & 0 deletions rs/tests/message_routing/global_reboot_test.rs
@@ -57,6 +57,14 @@ const RESPONSES_RETRY_TIMEOUT_SEC: u64 = 120;

fn main() -> Result<()> {
    SystemTestGroup::new()
        // When this test reboots a node, the orchestrator can restart the replica at the
        // same instant the node is shutting down, while the dedicated `/var/lib/ic/crypto`
Comment on lines +60 to +61

Contributor

This sounds generic enough to concern any system test: why can't this happen in any system test when the nodes are shutting down at the end of the test?

Collaborator Author

Yeah, good point. We could add the exception by default, but that could hide other panics in that module, which we shouldn't ignore. Should we move this panic to a dedicated module so that we only ignore that specific panic?

Contributor

Could we make the system test driver kill the orchestrator before it triggers VM shut-down? This should avoid any such issues by design.

        // volume is being unmounted. The replica then transiently sees the bare mount point
        // with default `0o755` permissions and panics in `setup_crypto_provider` with
        // "Crypto state directory ... allowing general access". This panic is benign: the
        // node completes the reboot and recovers. Exempt it from the unallowed-log-pattern
        // check so it doesn't cause spurious flakiness.
        .add_unallowed_log_pattern_except("panicked", "rs/replica/src/setup.rs")
        .with_setup(setup)
        .add_test(systest!(test))
        .execute_from_args()?;
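
To make the trade-off in the thread above concrete, here is a minimal sketch of the two mechanisms involved. It is not the actual replica or test-driver code. Both helper names (`allows_general_access`, `is_flagged`) are hypothetical, and the substring-matching semantics assumed for `add_unallowed_log_pattern_except` are a guess at its behavior rather than something confirmed by the diff.

```rust
/// Hypothetical stand-in for the permission check described in the code comment.
/// Assumption: any group or other permission bit counts as "general access".
fn allows_general_access(mode: u32) -> bool {
    mode & 0o077 != 0
}

/// Assumed semantics of an unallowed-pattern rule with an exception:
/// a log line is flagged when it contains `pattern`, unless it also contains `except`.
/// The real driver may match differently (for example with regular expressions).
fn is_flagged(line: &str, pattern: &str, except: &str) -> bool {
    line.contains(pattern) && !line.contains(except)
}
```

Under these assumptions, a bare mount point with mode `0o755` trips the check while a `0o700` directory does not, and any log line mentioning both `panicked` and `rs/replica/src/setup.rs` is ignored regardless of which panic in that file produced it. That second property is what the suggestion to move the panic into a dedicated module is meant to narrow.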