Commit c496683

Prevent region allocation from filling pools (#7912)
Recent customer issues have highlighted problems with storage accounting: while there are quotas and reservations for individual Crucible regions, nothing is set for the Crucible dataset as a whole. Crucible _could_ end up using the whole disk, or some large fraction of it, such that other users of the same U2 are starved out.

This commit adds a buffer to each zpool that the Crucible region allocation query will not allocate into. This buffer will be set to 250G initially (see #7875 for reasoning) but can also be modified with omdb.

Part of this commit's changes include using a CTE with `regions_hard_delete`, which is much more efficient than the previous for loop but has the effect of overwriting `size_used` for all datasets. That overwrite undoes any case where this column value was manually set to prevent allocation for particular datasets / pools. Because of this, this commit also adds a `no_provision` flag for a Crucible dataset: if it is set, the region allocation query will not allocate into that dataset. This flag can be toggled with omdb.

Part of the upgrade to R14 will include a support procedure for the case where adding the 250G control plane storage buffer causes a Crucible dataset to be "overprovisioned", which requires manually requested region replacements to reduce the size allocated for that Crucible dataset.

This commit adds an omdb command to show all overprovisioned Crucible datasets, and changes the region listing command so it can list regions for a particular dataset.

Fixes #3480
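The eligibility rule described above can be sketched in a few lines of bash. This is only an illustration: the real check lives inside the region allocation query, and the function names, the argument order, the GiB unit, and the exact definition of "overprovisioned" used here are all assumptions made for the example, not taken from the commit.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: models the buffer and no_provision rules from the
# commit message. The actual logic is part of a SQL query, not a script.

GIB=$((1024 * 1024 * 1024))  # assumed unit; the commit only says "250G"

# can_allocate POOL_TOTAL SIZE_USED BUFFER REQUESTED NO_PROVISION
# Returns 0 when the dataset could accept a region of REQUESTED bytes.
can_allocate() {
    local pool_total=$1 size_used=$2 buffer=$3 requested=$4 no_provision=$5

    # A dataset flagged no_provision is never a candidate.
    if [[ "${no_provision}" == "true" ]]; then
        return 1
    fi

    # The new region must fit without dipping into the reserved buffer.
    (( size_used + requested + buffer <= pool_total ))
}

# is_overprovisioned POOL_TOTAL SIZE_USED BUFFER
# Returns 0 when existing allocations already intrude on the buffer
# (assumed meaning of "overprovisioned").
is_overprovisioned() {
    local pool_total=$1 size_used=$2 buffer=$3
    (( size_used + buffer > pool_total ))
}

# Example: a 3000 GiB pool with 2700 GiB used and a 250 GiB buffer.
if can_allocate $((3000 * GIB)) $((2700 * GIB)) $((250 * GIB)) $((100 * GIB)) false; then
    echo "eligible"
else
    echo "not eligible"
fi
```

In this model, setting the buffer to 0 (as the CI script in this commit does for its 20G virtual pools) reduces the check to a plain "does the region fit" test.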
1 parent 5e27bde commit c496683

32 files changed, +918 -36 lines changed
.github/buildomat/jobs/deploy.sh

+44
@@ -270,6 +270,9 @@ tar xf out/omicron-sled-agent.tar pkg/config-rss.toml pkg/config.toml
 sed -E -i~ "s/(m2|u2)(.*\.vdev)/\/scratch\/\1\2/g" pkg/config.toml
 diff -u pkg/config.toml{~,} || true
 
+EXPECTED_ZPOOL_COUNT=$(grep -c -E 'u2.*\.vdev' pkg/config.toml)
+echo "expected number of zpools is ${EXPECTED_ZPOOL_COUNT}"
+
 SILO_NAME="$(sed -n 's/silo_name = "\(.*\)"/\1/p' pkg/config-rss.toml)"
 EXTERNAL_DNS_DOMAIN="$(sed -n 's/external_dns_zone_name = "\(.*\)"/\1/p' pkg/config-rss.toml)"
 
@@ -397,6 +400,47 @@ until zoneadm list | grep nexus; do
 done
 echo "Waited for nexus: ${retry}s"
 
+# Wait for handoff, as zpools are inserted into the database during
+# `rack_initialize`, and the next omdb command requires them to exist in the
+# db.
+retry=0
+until grep "Handoff to Nexus is complete" /var/svc/log/oxide-sled-agent:default.log; do
+    if [[ $retry -gt 300 ]]; then
+        echo "Failed to handoff to Nexus after 300 seconds"
+        exit 1
+    fi
+    sleep 1
+    retry=$((retry + 1))
+done
+echo "Waited for handoff: ${retry}s"
+
+# Wait for the number of expected U2 zpools
+retry=0
+ACTUAL_ZPOOL_COUNT=$(pfexec zlogin oxz_switch /opt/oxide/omdb/bin/omdb db zpool list -i | wc -l)
+until [[ "${ACTUAL_ZPOOL_COUNT}" -eq "${EXPECTED_ZPOOL_COUNT}" ]];
+do
+    pfexec zlogin oxz_switch /opt/oxide/omdb/bin/omdb db zpool list
+    if [[ $retry -gt 300 ]]; then
+        echo "Failed to wait for ${EXPECTED_ZPOOL_COUNT} zpools after 300 seconds"
+        exit 1
+    fi
+    sleep 1
+    retry=$((retry + 1))
+    ACTUAL_ZPOOL_COUNT=$(pfexec zlogin oxz_switch /opt/oxide/omdb/bin/omdb db zpool list -i | wc -l)
+done
+
+# The bootstrap command creates a disk, so before that: adjust the control plane
+# storage buffer to 0 as the virtual hardware only creates 20G pools
+
+pfexec zlogin oxz_switch /opt/oxide/omdb/bin/omdb db zpool list
+
+for ZPOOL in $(pfexec zlogin oxz_switch /opt/oxide/omdb/bin/omdb db zpool list -i);
+do
+    pfexec zlogin oxz_switch /opt/oxide/omdb/bin/omdb -w db zpool set-storage-buffer "${ZPOOL}" 0
+done
+
+pfexec zlogin oxz_switch /opt/oxide/omdb/bin/omdb db zpool list
+
 export RUST_BACKTRACE=1
 export E2E_TLS_CERT IPPOOL_START IPPOOL_END
 eval "$(./target/debug/bootstrap)"
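The two wait loops added in this hunk share the same shape: poll a condition once per second and give up after a bounded number of retries. One way to factor that out is sketched below. The `wait_until` helper and the `WAIT_INTERVAL` variable are hypothetical names introduced for this example and are not part of the commit.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the commit: a bounded polling loop in
# the same style as the handoff and zpool-count waits in deploy.sh.

# wait_until MAX_RETRIES DESCRIPTION COMMAND [ARGS...]
# Runs COMMAND repeatedly until it succeeds. Returns 1 once the retry
# counter exceeds MAX_RETRIES (same -gt comparison as the original loops).
wait_until() {
    local max_retries=$1 description=$2
    shift 2
    local retry=0

    until "$@"; do
        if [[ $retry -gt $max_retries ]]; then
            echo "Failed to wait for ${description} after ${max_retries} retries" >&2
            return 1
        fi
        # Poll once per second unless WAIT_INTERVAL overrides it.
        sleep "${WAIT_INTERVAL:-1}"
        retry=$((retry + 1))
    done

    echo "Waited for ${description}: ${retry} retries"
}

# Possible usage, reusing the handoff check from the diff above:
#   wait_until 300 "handoff" \
#       grep "Handoff to Nexus is complete" /var/svc/log/oxide-sled-agent:default.log || exit 1
```

Returning a status instead of calling `exit 1` inside the helper leaves the decision to abort with the caller, which is why the usage comment appends `|| exit 1`.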
