Workflow to add PR comment with reason for doc build failure#3955
Merged
samsrabin merged 50 commits intoESCOMP:b4b-devfrom Apr 24, 2026
Merged
Workflow to add PR comment with reason for doc build failure#3955samsrabin merged 50 commits intoESCOMP:b4b-devfrom
samsrabin merged 50 commits intoESCOMP:b4b-devfrom
Conversation
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
…ssaging Simplify messaging during docs build
…ssaging-changelog ctsm5.4.032: Simplify messaging during docs build [update Changelog/Changesum]
This still needs test results updated
ctsm5.4.033: Update FATES tag with fix for LUH `no_nan` data set issue
ctsm5.4.034: bug fix to the FATES land use driver input code
bug fix to the FATES land use driver input code This PR includes a cherry-picked fix from NorESMhub#209 to address swap in the order of rangeland and pasture in the reading of the landuse drivers
The workflow looked for artifact 'docs-build-failed', but docs.yml uploads it as 'test-build-docs-container_failed'. The artifact was never found, so no comment was ever posted. Match the consumer to the producer's name. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The job previously ran on every completed docs workflow run (including successes), did an artifacts API call, and then exited. Gate on workflow_run.conclusion == 'failure' so successful runs don't spin up a runner at all. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous approach inlined log contents inside a triple-backtick fence via bash command substitution. A line in the log containing triple-backticks would break the outer fence and mangle rendering. Writing to a file side-steps re-expansion concerns and tightens quoting on the PR number and repo vars. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
actions/download-artifact@v4 extracts single-artifact downloads directly
into the current working directory when `path:` is omitted, so the post-
comment step's reference to \${DOCS_FAILURE_ARTIFACT}/build.log pointed at
a path that would not exist. Give the download step an explicit path
matching the env var so the downstream references resolve.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
For PRs opened from forks, the runs.../pull_requests API returns an empty array, so the previous PR_NUMBER lookup resolved to literal "null" and the subsequent gh pr comment call failed. Have the producer write the PR number into the failure artifact and the consumer read it from there. The upload path widens from build-logs/build.log to build-logs/ so that pr_number.txt rides along; artifact layout is preserved on download. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The 'Upload logs on failure' step has been uploading the path build-logs/build.log since it was added, but nothing in the repo ever wrote that file — ./build_docs prints to stdout only. The artifact was effectively empty, which only became visible once the consumer workflow actually ran. Tee the build step's combined output into the log file. Actions' default Linux shell runs with -eo pipefail, so the build's non-zero exit still propagates through the pipe. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous commit assumed Actions' default Linux shell ran with
-o pipefail, but the runner logs show 'bash -e {0}' (pipefail off). As a
result, teeing build_docs output caused tee's zero exit to mask the build
failure, so the job was marked as passing even when the docs build failed.
Set pipefail explicitly in the run block.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Piping through tee switches Python's stdout from line-buffered to block- buffered, while stderr remains unbuffered. Merging 2>&1 after the fact doesn't fix it: stderr flushes immediately, stdout dumps only at process exit, so the teed log showed warnings before the "Cleaning..." / "Building..." header lines they followed in the original output. Set PYTHONUNBUFFERED=1 to disable Python stdio buffering for the build_docs invocation and any Python children it spawns. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
GitHub Actions sets FORCE_COLOR in the environment, so Sphinx and other Python tooling emit ANSI escapes even when stdout is a pipe. The PR comment renders those as literal garbage (an ESC glyph followed by sequences like [91m). Set NO_COLOR=1 to ask well-behaved tools to drop color output across the build. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
NO_COLOR=1 wasn't honored by every tool in the build chain, so escape sequences still reached the log and rendered as garbage in the PR comment. Replace it with a process-substituted sed that strips ANSI before writing build.log, which keeps colored output intact on the live Actions console while giving the PR comment a clean log. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
GitHub's HTML sanitizer strips the bare boolean attribute in <details open>, leaving the element collapsed. Use the explicit <details open="true"> form, which the sanitizer preserves. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
818e2cc to
1fbe07c
Compare
Contributor
|
Just looking this over so I understand what it's doing. The docs publish step was changed so that if it fails it keeps a failure artifact. Which then will be used by a new action that will publish a summary of what happened to the PR where the failure happened. The other thing was a nice change to add a verbose option to a couple scripts so that some of the logging output only happens when verbose is on. This kind of thing is always helpful. @samsrabin the one thing I'm wondering about is if these artifacts then should be removed by the new failure check action? Or does that happen automatically or maybe with some kind of automatic deletion of old artifcats? |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description of changes
When a PR has the "build docs" tests fail, it's hard to figure out why. This PR will make it so that the doc build log is posted in a PR comment after the test failure. Example here.
This is b4b but I think it should come straight to master, and we should not bother making a tag. It's purely a devops thing with the only changes being made to files in
.github/. Coming straight to master means we won't need to wait 2+ weeks to start benefiting from this, and time is a factor since people will be submitting many docs update PRs soon.Specific notes
Contributors other than yourself, if any: Claude Sonnet 4.6 and Opus 4.7 1M
CTSM Issues Fixed (include github issue #): None
Are answers expected to change (and if so in what way)? No
Any User Interface Changes (namelist or namelist defaults changes)? No
Does this create a need to change or add documentation? Did you do so? No
Testing performed, if any: See samsrabin#17.