Skip to content
Open
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
69 changes: 51 additions & 18 deletions must-gather/collection-scripts/gather_namespaced_resources
Original file line number Diff line number Diff line change
Expand Up @@ -11,8 +11,25 @@ if [ "${BASE_COLLECTION_PATH}" = "" ]; then
BASE_COLLECTION_PATH=$(pwd)
fi

# Use a global variable for namespace
INSTALL_NAMESPACE=openshift-lvm-storage

# There might be cases where the operator might be installed in a different namespace
# The INSTALL_NAMESPACE parameter does not account for old installations when it was running
# on project "openshift-storage" by default.
# This causes the must-gather to not collect data properly
# INSTALL_NAMESPACE=openshift-lvm-storage
# There might be different ways to sort the namespace installation namespace
#
# We filter based on the CSV object for "lvms-operator"
# oc get csv -A | grep lvms-operator | cut -f1 -d " "
# Give a list of project where the operator is detected, example output:
# ~~~
# $ oc get csv -A | grep lvms-operator | cut -f1 -d " "
# lvm-operator-for-everyone
# openshift-storage
# ~~~
# This will return an array of namespaces
mapfile -t INSTALL_NAMESPACE < <(oc get csv --all-namespaces | grep lvms-operator | cut -f1 -d " ")


# Add general resources to list if necessary

Expand Down Expand Up @@ -55,41 +72,57 @@ oc_yamls+=("csv")
oc_yamls+=("installplan")

echo "collecting dump of namespace" | tee -a "${BASE_COLLECTION_PATH}"/gather-debug.log
oc adm --dest-dir="${BASE_COLLECTION_PATH}" inspect ns/"${INSTALL_NAMESPACE}" ${log_collection_args} >>"${BASE_COLLECTION_PATH}"/gather-debug.log 2>&1

for NAMESPACE in "${INSTALL_NAMESPACE[@]}"; do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of having many for loops for namespaces throughout the file, we can have the one here which ends right before the collection of cluster-wide objects.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I totally agree that single run for each project would be a better approach, we can change it, I was trying to avoid a major change while maintaining current structure when the data is collected.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'll just make the code shorter but it's not a big deal. I'm ok for both approaches.

Copy link
Contributor

@qJkee qJkee Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd like to see that we are checking only one namespace to avoid confuses in future
Also, because of this, we have to have for loop everywhere

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd like to see that we are checking only one namespace to avoid confuses in future

Can you elaborate a bit more on this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we do not allow multiple LVM clusters across namespaces. So, fetching namespaces into array "just in case" and having to iterate over them all over the code is redundant. We must fetch single namespace and use it everywhere

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be better to have it in case a user disables the webhook and tries to create another LVMCluster, or if we ever decide on supporting multiple LVMClusters in the future?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we do not allow multiple LVM clusters across namespaces.

As of now, we are not collecting the LVMClusters appropiately to detect this case(cleanup pending): #1843 (comment)

So, fetching namespaces into array "just in case" and having to iterate over them all over the code is redundant.

With this version we only cover projects where a lvm-operator CSV object is detected in the namespace.
That may been one, or many projects, we cannot control how many times an operator is installed in different projects, but we can collect the data to understand better each environment.

Not only the above, the point of this PR is to have an automated discover of where the CSV is configured and collect from that namespace, if for any reason we collect from other projects we might have to analyze and understand why the user has the operator multiple times installed.

Also, the current state doesn't cover a more common use-case that is a cluster upgrade, we were deploying this operator by default in openshift-storage this was changed recently to openshift-lvm-storage, which makes the must-gather a tool that only works in an opinionated scenario.

We must fetch single namespace and use it everywhere

I don't think this would work, as I could install the operator in two different namespaces in my cluster and only use one. You could make the case that we can infer which one to pick based on where the LVMCluster exist, but then again we could enter the case that I create a LVMCluster everywhere with a bogus configuration.

Bottom line: I think this approach is more adaptable to whatever configuration a cluster might be running, and by collecting as much as possible we might reduce data requests each time they are required.

oc adm --dest-dir="${BASE_COLLECTION_PATH}" inspect ns/"$NAMESPACE" ${log_collection_args} >>"${BASE_COLLECTION_PATH}"/gather-debug.log 2>&1
done

echo "collecting dump of clusterresourceversion" | tee -a "${BASE_COLLECTION_PATH}"/gather-debug.log
for oc_yaml in "${oc_yamls[@]}"; do
# shellcheck disable=SC2129
oc adm --dest-dir="${BASE_COLLECTION_PATH}" inspect "${oc_yaml}" -n "${INSTALL_NAMESPACE}" ${log_collection_args} >>"${BASE_COLLECTION_PATH}"/gather-debug.log 2>&1
for NAMESPACE in "${INSTALL_NAMESPACE[@]}"; do
for oc_yaml in "${oc_yamls[@]}"; do
# shellcheck disable=SC2129
oc adm --dest-dir="${BASE_COLLECTION_PATH}" inspect "${oc_yaml}" -n "$NAMESPACE" ${log_collection_args} >>"${BASE_COLLECTION_PATH}"/gather-debug.log 2>&1
done
done

# Create the dir for oc_output
mkdir -p "${BASE_COLLECTION_PATH}/namespaces/${INSTALL_NAMESPACE}/oc_output/"
for NAMESPACE in "${INSTALL_NAMESPACE[@]}"; do
mkdir -p "${BASE_COLLECTION_PATH}/namespaces/${NAMESPACE}/oc_output/"
done

# Run the Collection of OC get commands
for command_get in "${commands_get[@]}"; do
echo "collecting oc command ${command_get}" | tee -a "${BASE_COLLECTION_PATH}/gather-debug.log"
COMMAND_OUTPUT_FILE=${BASE_COLLECTION_PATH}/namespaces/${INSTALL_NAMESPACE}/oc_output/${command_get// /_}
# shellcheck disable=SC2086
{ oc get ${command_get} -n ${INSTALL_NAMESPACE}; } >>"${COMMAND_OUTPUT_FILE}"
for NAMESPACE in "${INSTALL_NAMESPACE[@]}"; do
for command_get in "${commands_get[@]}"; do
echo "collecting oc command ${command_get}" | tee -a "${BASE_COLLECTION_PATH}/gather-debug.log"
COMMAND_OUTPUT_FILE=${BASE_COLLECTION_PATH}/namespaces/${NAMESPACE}/oc_output/${command_get// /_}
# shellcheck disable=SC2086
{ oc get ${command_get} -n ${NAMESPACE}; } >>"${COMMAND_OUTPUT_FILE}"
done
done

# Run the Collection of OC desc commands
for command_desc in "${commands_desc[@]}"; do
for NAMESPACE in "${INSTALL_NAMESPACE[@]}"; do
for command_desc in "${commands_desc[@]}"; do
echo "collecting oc describe command ${command_desc}" | tee -a "${BASE_COLLECTION_PATH}/gather-debug.log"
COMMAND_OUTPUT_FILE=${BASE_COLLECTION_PATH}/namespaces/${INSTALL_NAMESPACE}/oc_output/${command_desc// /_}
COMMAND_OUTPUT_FILE=${BASE_COLLECTION_PATH}/namespaces/${ns}/oc_output/${command_desc// /_}
# shellcheck disable=SC2086
{ oc describe ${command_desc} -n ${INSTALL_NAMESPACE}; } >>"${COMMAND_OUTPUT_FILE}"
{ oc describe ${command_desc} -n ${NAMESPACE}; } >>"${COMMAND_OUTPUT_FILE}"
done
done

{ oc get lvmclusters -n ${INSTALL_NAMESPACE} -o yaml; } >"$BASE_COLLECTION_PATH/namespaces/${INSTALL_NAMESPACE}/oc_output/lvmcluster.yaml" 2>&1
for NAMESPACE in "${INSTALL_NAMESPACE[@]}"; do
{ oc get lvmclusters -n ${NAMESPACE} -o yaml; } >"$BASE_COLLECTION_PATH/namespaces/${NAMESPACE}/oc_output/lvmcluster.yaml" 2>&1
done

# Create the dir for data from all namespaces
mkdir -p "${BASE_COLLECTION_PATH}/namespaces/all/"

# Run the Collection of Resources using must-gather
for resource in "${resources[@]}"; do
for NAMESPACE in "${INSTALL_NAMESPACE[@]}"; do
for resource in "${resources[@]}"; do
echo "collecting dump of ${resource}" | tee -a "${BASE_COLLECTION_PATH}/gather-debug.log"
{ oc adm --dest-dir="${BASE_COLLECTION_PATH}/namespaces/all/" inspect "${resource}" --all-namespaces ${log_collection_args}; } >>"${BASE_COLLECTION_PATH}/gather-debug.log" 2>&1
{ oc adm --dest-dir="${BASE_COLLECTION_PATH}/namespaces/${NAMESPACE}/" inspect "${resource}" --all-namespaces ${log_collection_args}; } >>"${BASE_COLLECTION_PATH}/gather-debug.log" 2>&1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This runs on all namespaces to fetch all the LVMCluster objects in the cluster. Is there a reason to change this to namespace-only?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure that is working as expected.
Based on this:

# Resource List
resources=()
# collect lvmcluster resources
#resources+=(lvmclusters)

It looks like it's never initialized properly, hence when executing that part it will not collecting anything.
I've tested this recently:

$ cat lvm-test.logs  | grep "collecting dump of"
[must-gather-6pxtp] POD 2025-12-25T10:31:45.252310962Z collecting dump of namespace
[must-gather-6pxtp] POD 2025-12-25T10:31:46.730431061Z collecting dump of clusterresourceversion
[must-gather-6pxtp] POD 2025-12-25T10:31:52.458643433Z collecting dump of oc get pvc all namespaces
[must-gather-6pxtp] POD 2025-12-25T10:31:53.030236440Z collecting dump of snapshot information for all namespaces
$ oc get lvmcluster -n openshift-storage
NAME            STATUS
my-lvmcluster   Failed

And, unless I'm missing something, it's not getting triggered as the resource list is empty.

Regardless of the above, with this change, the script should be able to collect lvmcluster objects in every namespace that they are available(considering that a CSV object exist in them,plus any other resource that might have been configured).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see. Would you be so kind to remove it all together then?

done
done

# For pvc of all namespaces
Expand Down