-
Notifications
You must be signed in to change notification settings - Fork 52
OCPBUGS-69436: Fix for must-gather scripting #1843
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from 4 commits
9a2cce3
9bc922e
e2b6eaa
587c392
2aa1b3a
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -11,8 +11,25 @@ if [ "${BASE_COLLECTION_PATH}" = "" ]; then | |
| BASE_COLLECTION_PATH=$(pwd) | ||
| fi | ||
|
|
||
| # Use a global variable for namespace | ||
| INSTALL_NAMESPACE=openshift-lvm-storage | ||
|
|
||
| # There might be cases where the operator might be installed in a different namespace | ||
| # The INSTALL_NAMESPACE parameter does not account for old installations when it was running | ||
| # on project "openshift-storage" by default. | ||
| # This causes the must-gather to not collect data properly | ||
| # INSTALL_NAMESPACE=openshift-lvm-storage | ||
| # There might be different ways to sort the namespace installation namespace | ||
| # | ||
| # We filter based on the CSV object for "lvms-operator" | ||
| # oc get csv -A | grep lvms-operator | cut -f1 -d " " | ||
| # Give a list of project where the operator is detected, example output: | ||
| # ~~~ | ||
| # $ oc get csv -A | grep lvms-operator | cut -f1 -d " " | ||
| # lvm-operator-for-everyone | ||
| # openshift-storage | ||
| # ~~~ | ||
| # This will return an array of namespaces | ||
| mapfile -t INSTALL_NAMESPACE < <(oc get csv --all-namespaces | grep lvms-operator | cut -f1 -d " ") | ||
|
|
||
|
|
||
| # Add general resources to list if necessary | ||
|
|
||
|
|
@@ -55,41 +72,57 @@ oc_yamls+=("csv") | |
| oc_yamls+=("installplan") | ||
|
|
||
| echo "collecting dump of namespace" | tee -a "${BASE_COLLECTION_PATH}"/gather-debug.log | ||
| oc adm --dest-dir="${BASE_COLLECTION_PATH}" inspect ns/"${INSTALL_NAMESPACE}" ${log_collection_args} >>"${BASE_COLLECTION_PATH}"/gather-debug.log 2>&1 | ||
|
|
||
| for NAMESPACE in "${INSTALL_NAMESPACE[@]}"; do | ||
| oc adm --dest-dir="${BASE_COLLECTION_PATH}" inspect ns/"$NAMESPACE" ${log_collection_args} >>"${BASE_COLLECTION_PATH}"/gather-debug.log 2>&1 | ||
| done | ||
|
|
||
| echo "collecting dump of clusterresourceversion" | tee -a "${BASE_COLLECTION_PATH}"/gather-debug.log | ||
| for oc_yaml in "${oc_yamls[@]}"; do | ||
| # shellcheck disable=SC2129 | ||
| oc adm --dest-dir="${BASE_COLLECTION_PATH}" inspect "${oc_yaml}" -n "${INSTALL_NAMESPACE}" ${log_collection_args} >>"${BASE_COLLECTION_PATH}"/gather-debug.log 2>&1 | ||
| for NAMESPACE in "${INSTALL_NAMESPACE[@]}"; do | ||
| for oc_yaml in "${oc_yamls[@]}"; do | ||
| # shellcheck disable=SC2129 | ||
| oc adm --dest-dir="${BASE_COLLECTION_PATH}" inspect "${oc_yaml}" -n "$NAMESPACE" ${log_collection_args} >>"${BASE_COLLECTION_PATH}"/gather-debug.log 2>&1 | ||
| done | ||
| done | ||
|
|
||
| # Create the dir for oc_output | ||
| mkdir -p "${BASE_COLLECTION_PATH}/namespaces/${INSTALL_NAMESPACE}/oc_output/" | ||
| for NAMESPACE in "${INSTALL_NAMESPACE[@]}"; do | ||
| mkdir -p "${BASE_COLLECTION_PATH}/namespaces/${NAMESPACE}/oc_output/" | ||
| done | ||
|
|
||
| # Run the Collection of OC get commands | ||
| for command_get in "${commands_get[@]}"; do | ||
| echo "collecting oc command ${command_get}" | tee -a "${BASE_COLLECTION_PATH}/gather-debug.log" | ||
| COMMAND_OUTPUT_FILE=${BASE_COLLECTION_PATH}/namespaces/${INSTALL_NAMESPACE}/oc_output/${command_get// /_} | ||
| # shellcheck disable=SC2086 | ||
| { oc get ${command_get} -n ${INSTALL_NAMESPACE}; } >>"${COMMAND_OUTPUT_FILE}" | ||
| for NAMESPACE in "${INSTALL_NAMESPACE[@]}"; do | ||
| for command_get in "${commands_get[@]}"; do | ||
| echo "collecting oc command ${command_get}" | tee -a "${BASE_COLLECTION_PATH}/gather-debug.log" | ||
| COMMAND_OUTPUT_FILE=${BASE_COLLECTION_PATH}/namespaces/${NAMESPACE}/oc_output/${command_get// /_} | ||
| # shellcheck disable=SC2086 | ||
| { oc get ${command_get} -n ${NAMESPACE}; } >>"${COMMAND_OUTPUT_FILE}" | ||
| done | ||
| done | ||
|
|
||
| # Run the Collection of OC desc commands | ||
| for command_desc in "${commands_desc[@]}"; do | ||
| for NAMESPACE in "${INSTALL_NAMESPACE[@]}"; do | ||
| for command_desc in "${commands_desc[@]}"; do | ||
| echo "collecting oc describe command ${command_desc}" | tee -a "${BASE_COLLECTION_PATH}/gather-debug.log" | ||
| COMMAND_OUTPUT_FILE=${BASE_COLLECTION_PATH}/namespaces/${INSTALL_NAMESPACE}/oc_output/${command_desc// /_} | ||
| COMMAND_OUTPUT_FILE=${BASE_COLLECTION_PATH}/namespaces/${ns}/oc_output/${command_desc// /_} | ||
deik0 marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
| # shellcheck disable=SC2086 | ||
| { oc describe ${command_desc} -n ${INSTALL_NAMESPACE}; } >>"${COMMAND_OUTPUT_FILE}" | ||
| { oc describe ${command_desc} -n ${NAMESPACE}; } >>"${COMMAND_OUTPUT_FILE}" | ||
| done | ||
| done | ||
|
|
||
| { oc get lvmclusters -n ${INSTALL_NAMESPACE} -o yaml; } >"$BASE_COLLECTION_PATH/namespaces/${INSTALL_NAMESPACE}/oc_output/lvmcluster.yaml" 2>&1 | ||
| for NAMESPACE in "${INSTALL_NAMESPACE[@]}"; do | ||
| { oc get lvmclusters -n ${NAMESPACE} -o yaml; } >"$BASE_COLLECTION_PATH/namespaces/${NAMESPACE}/oc_output/lvmcluster.yaml" 2>&1 | ||
| done | ||
|
|
||
| # Create the dir for data from all namespaces | ||
| mkdir -p "${BASE_COLLECTION_PATH}/namespaces/all/" | ||
|
|
||
| # Run the Collection of Resources using must-gather | ||
| for resource in "${resources[@]}"; do | ||
| for NAMESPACE in "${INSTALL_NAMESPACE[@]}"; do | ||
| for resource in "${resources[@]}"; do | ||
| echo "collecting dump of ${resource}" | tee -a "${BASE_COLLECTION_PATH}/gather-debug.log" | ||
| { oc adm --dest-dir="${BASE_COLLECTION_PATH}/namespaces/all/" inspect "${resource}" --all-namespaces ${log_collection_args}; } >>"${BASE_COLLECTION_PATH}/gather-debug.log" 2>&1 | ||
| { oc adm --dest-dir="${BASE_COLLECTION_PATH}/namespaces/${NAMESPACE}/" inspect "${resource}" --all-namespaces ${log_collection_args}; } >>"${BASE_COLLECTION_PATH}/gather-debug.log" 2>&1 | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This runs on all namespaces to fetch all the LVMCluster objects in the cluster. Is there a reason to change this to namespace-only?
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I am not sure that is working as expected. It looks like it's never initialized properly, hence when executing that part it will not collecting anything. And, unless I'm missing something, it's not getting triggered as the resource list is empty. Regardless of the above, with this change, the script should be able to collect lvmcluster objects in every namespace that they are available(considering that a CSV object exist in them,plus any other resource that might have been configured).
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh, I see. Would you be so kind to remove it all together then? |
||
| done | ||
| done | ||
|
|
||
| # For pvc of all namespaces | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of having many for loops for namespaces throughout the file, we can have the one here which ends right before the collection of cluster-wide objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I totally agree that single run for each project would be a better approach, we can change it, I was trying to avoid a major change while maintaining current structure when the data is collected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'll just make the code shorter but it's not a big deal. I'm ok for both approaches.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'd like to see that we are checking only one namespace to avoid confuses in future
Also, because of this, we have to have for loop everywhere
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you elaborate a bit more on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we do not allow multiple LVM clusters across namespaces. So, fetching namespaces into array "just in case" and having to iterate over them all over the code is redundant. We must fetch single namespace and use it everywhere
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be better to have it in case a user disables the webhook and tries to create another LVMCluster, or if we ever decide on supporting multiple LVMClusters in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As of now, we are not collecting the LVMClusters appropiately to detect this case(cleanup pending): #1843 (comment)
With this version we only cover projects where a lvm-operator CSV object is detected in the namespace.
That may been one, or many projects, we cannot control how many times an operator is installed in different projects, but we can collect the data to understand better each environment.
Not only the above, the point of this PR is to have an automated discover of where the CSV is configured and collect from that namespace, if for any reason we collect from other projects we might have to analyze and understand why the user has the operator multiple times installed.
Also, the current state doesn't cover a more common use-case that is a cluster upgrade, we were deploying this operator by default in
openshift-storagethis was changed recently toopenshift-lvm-storage, which makes the must-gather a tool that only works in an opinionated scenario.I don't think this would work, as I could install the operator in two different namespaces in my cluster and only use one. You could make the case that we can infer which one to pick based on where the LVMCluster exist, but then again we could enter the case that I create a LVMCluster everywhere with a bogus configuration.
Bottom line: I think this approach is more adaptable to whatever configuration a cluster might be running, and by collecting as much as possible we might reduce data requests each time they are required.