Skip to content

Workflow to add PR comment with reason for doc build failure#3955

Merged
samsrabin merged 50 commits intoESCOMP:b4b-devfrom
samsrabin:comment-on-docs-test-fail
Apr 24, 2026
Merged

Workflow to add PR comment with reason for doc build failure#3955
samsrabin merged 50 commits intoESCOMP:b4b-devfrom
samsrabin:comment-on-docs-test-fail

Conversation

@samsrabin
Copy link
Copy Markdown
Member

@samsrabin samsrabin commented Apr 23, 2026

Description of changes

When a PR has the "build docs" tests fail, it's hard to figure out why. This PR will make it so that the doc build log is posted in a PR comment after the test failure. Example here.

This is b4b but I think it should come straight to master, and we should not bother making a tag. It's purely a devops thing with the only changes being made to files in .github/. Coming straight to master means we won't need to wait 2+ weeks to start benefiting from this, and time is a factor since people will be submitting many docs update PRs soon.

Specific notes

Contributors other than yourself, if any: Claude Sonnet 4.6 and Opus 4.7 1M

CTSM Issues Fixed (include github issue #): None

Are answers expected to change (and if so in what way)? No

Any User Interface Changes (namelist or namelist defaults changes)? No

Does this create a need to change or add documentation? Did you do so? No

Testing performed, if any: See samsrabin#17.

rosiealice and others added 23 commits April 1, 2026 11:39
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
…ssaging

Simplify messaging during docs build
…ssaging-changelog

ctsm5.4.032: Simplify messaging during docs build [update Changelog/Changesum]
This still needs test results updated
ctsm5.4.033: Update FATES tag with fix for LUH `no_nan` data set issue
ctsm5.4.034: bug fix to the FATES land use driver input code
bug fix to the FATES land use driver input code

This PR includes a cherry-picked fix from NorESMhub#209 to address swap in the
order of rangeland and pasture in the reading of the landuse drivers
@samsrabin samsrabin self-assigned this Apr 23, 2026
@samsrabin samsrabin added documentation additions or edits to user-facing documentation or its infrastructure b4b bit-for-bit devops Development Operations to improve development throughput, E.g., adding GitHub Workflows docs:infrastructure New features or bug fixes in the documentation infrastructure docs-loc:infrastructure Relates to documentation infrastructure labels Apr 23, 2026
@samsrabin samsrabin marked this pull request as ready for review April 23, 2026 23:40
samsrabin and others added 25 commits April 24, 2026 09:02
The workflow looked for artifact 'docs-build-failed', but docs.yml uploads
it as 'test-build-docs-container_failed'. The artifact was never found,
so no comment was ever posted. Match the consumer to the producer's name.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The job previously ran on every completed docs workflow run (including
successes), did an artifacts API call, and then exited. Gate on
workflow_run.conclusion == 'failure' so successful runs don't spin up
a runner at all.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous approach inlined log contents inside a triple-backtick fence
via bash command substitution. A line in the log containing triple-backticks
would break the outer fence and mangle rendering. Writing to a file side-steps
re-expansion concerns and tightens quoting on the PR number and repo vars.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
actions/download-artifact@v4 extracts single-artifact downloads directly
into the current working directory when `path:` is omitted, so the post-
comment step's reference to \${DOCS_FAILURE_ARTIFACT}/build.log pointed at
a path that would not exist. Give the download step an explicit path
matching the env var so the downstream references resolve.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
For PRs opened from forks, the runs.../pull_requests API returns an empty
array, so the previous PR_NUMBER lookup resolved to literal "null" and the
subsequent gh pr comment call failed. Have the producer write the PR number
into the failure artifact and the consumer read it from there.

The upload path widens from build-logs/build.log to build-logs/ so that
pr_number.txt rides along; artifact layout is preserved on download.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The 'Upload logs on failure' step has been uploading the path
build-logs/build.log since it was added, but nothing in the repo ever
wrote that file — ./build_docs prints to stdout only. The artifact was
effectively empty, which only became visible once the consumer workflow
actually ran.

Tee the build step's combined output into the log file. Actions' default
Linux shell runs with -eo pipefail, so the build's non-zero exit still
propagates through the pipe.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous commit assumed Actions' default Linux shell ran with
-o pipefail, but the runner logs show 'bash -e {0}' (pipefail off). As a
result, teeing build_docs output caused tee's zero exit to mask the build
failure, so the job was marked as passing even when the docs build failed.
Set pipefail explicitly in the run block.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Piping through tee switches Python's stdout from line-buffered to block-
buffered, while stderr remains unbuffered. Merging 2>&1 after the fact
doesn't fix it: stderr flushes immediately, stdout dumps only at process
exit, so the teed log showed warnings before the "Cleaning..." /
"Building..." header lines they followed in the original output. Set
PYTHONUNBUFFERED=1 to disable Python stdio buffering for the build_docs
invocation and any Python children it spawns.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
GitHub Actions sets FORCE_COLOR in the environment, so Sphinx and other
Python tooling emit ANSI escapes even when stdout is a pipe. The PR
comment renders those as literal garbage (an ESC glyph followed by
sequences like [91m). Set NO_COLOR=1 to ask well-behaved tools to drop
color output across the build.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
NO_COLOR=1 wasn't honored by every tool in the build chain, so escape
sequences still reached the log and rendered as garbage in the PR
comment. Replace it with a process-substituted sed that strips ANSI
before writing build.log, which keeps colored output intact on the
live Actions console while giving the PR comment a clean log.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
GitHub's HTML sanitizer strips the bare boolean attribute in
<details open>, leaving the element collapsed. Use the explicit
<details open="true"> form, which the sanitizer preserves.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@samsrabin samsrabin force-pushed the comment-on-docs-test-fail branch from 818e2cc to 1fbe07c Compare April 24, 2026 15:03
@samsrabin samsrabin changed the base branch from master to b4b-dev April 24, 2026 15:03
@samsrabin samsrabin merged commit b081ad6 into ESCOMP:b4b-dev Apr 24, 2026
3 checks passed
@ekluzek
Copy link
Copy Markdown
Contributor

ekluzek commented Apr 24, 2026

Just looking this over so I understand what it's doing. The docs publish step was changed so that if it fails it keeps a failure artifact. Which then will be used by a new action that will publish a summary of what happened to the PR where the failure happened.

The other thing was a nice change to add a verbose option to a couple scripts so that some of the logging output only happens when verbose is on. This kind of thing is always helpful.

@samsrabin the one thing I'm wondering about is if these artifacts then should be removed by the new failure check action? Or does that happen automatically or maybe with some kind of automatic deletion of old artifcats?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

b4b bit-for-bit devops Development Operations to improve development throughput, E.g., adding GitHub Workflows docs:infrastructure New features or bug fixes in the documentation infrastructure docs-loc:infrastructure Relates to documentation infrastructure documentation additions or edits to user-facing documentation or its infrastructure

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants