Use artifacts to cache the build #51
Also need a job to create the artifact in the action repo when necessary.
Use artifact from action repo only
Ensure LIBCMARKDIR uses WORKDIR
Should we create a distinct action for this? e.g. infrastructure-actions/build-gfm? ... It seems there is a fair bit of logic around this, which could be pulled out into a separate action for clarity of maintenance. ... I don't have a particular opinion either way. The current action.yml isn't so large as to be confusing. All that said, I do think that caching the artifact is the right approach. No need to keep rebuilding it for $N sites and runs.
AFAICT, actions don't have a way to exit cleanly early. Also, an action can only be called as a separate step; it cannot be invoked as part of a step that has a run clause. So I cannot see any way to extract an action that would simplify the code. However, I think the build sections could be simplified slightly by moving the pushd $WORKDIR/popd statements into build-cmark.sh. Note also that the build-artifacts job is only intended for use on the actions repo, so I'm not sure it helps to extract some of the logic into a separate action.
build-cmark.sh is fair game, as it is only used within this repository/action. Use your best judgement. Thanks!
Offhand, let the Action define the LIBCMARKDIR, and hand that to build-cmark.sh. Is that your thinking?
That would require other changes to build-cmark.sh, as it would potentially have to move the lib directory to the input location. However it now occurs to me that it would simplify the Docker build if it could specify LIBCMARKDIR without having to know the directory structure of the tar file. Only the build file should know that. I might change it again...
At a minimum, build-cmark.sh could simply
I think it may need the whole of the lib directory, but that is no harder to move.
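The idea discussed above (the action chooses LIBCMARKDIR and build-cmark.sh places the whole lib directory there) could look roughly like the sketch below. It is only an illustration: `install_lib` is a hypothetical helper name, and the layout of the built tree is an assumption, since the real build-cmark.sh is not shown in this conversation.

```shell
#!/usr/bin/env bash
# Hypothetical helper for build-cmark.sh (not taken from the PR).
# Copies the entire built lib directory to the caller-chosen LIBCMARKDIR,
# so that only the build script needs to know the build tree layout.
# Usage: install_lib <built-lib-dir> <LIBCMARKDIR>
install_lib() {
  local built=$1 target=$2
  # Refuse to run without a real source directory and a target path.
  [[ -d "$built" && -n "$target" ]] || return 1
  mkdir -p "$target"
  # "dir/." copies the directory contents, including subdirectories.
  cp -R "$built"/. "$target"/
}
```

With something like this in place, both the action and the Docker build would only pass a target path and would not need to know where the library ends up inside the build tree.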
# Is there a saved artifact for the GFM build?
echo "Check for GFM build artifact in action repo: $GITHUB_ACTION_REPO"
gh run download --dir ${LIBCMARKDIR} --name ${GFM_ARTIFACT_KEY} --repo $GITHUB_ACTION_REPO || true
if [[ -f $LIBCMARKDIR/libcmark-gfm.so ]]
then
  echo "Downloaded to ${LIBCMARKDIR} from $GITHUB_ACTION_REPO, nothing more to do!"
  exit 0 # nothing more to do in this step
fi

# GFM binary not found, need to build it
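The snippet above follows a "try to fetch, tolerate failure, then check for the file" pattern. A minimal sketch of that pattern as a reusable function is shown here; `artifact_present` is a hypothetical name, not something from the PR, and the fetch command is passed in so the logic can be exercised without network access.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): run a fetch command, ignore
# its failure, then report whether the expected library file exists.
# Usage: artifact_present <libdir> <fetch-command> [args...]
artifact_present() {
  local libdir=$1
  shift
  # A failed download is not an error; it just means a build is needed.
  "$@" || true
  [[ -f "$libdir/libcmark-gfm.so" ]]
}

# Possible use in the step, with the variables from the snippet above:
#   if artifact_present "$LIBCMARKDIR" gh run download --dir "$LIBCMARKDIR" \
#        --name "$GFM_ARTIFACT_KEY" --repo "$GITHUB_ACTION_REPO"; then
#     exit 0 # nothing more to do in this step
#   fi
```

Checking for the file rather than the exit status of the download means a partial or empty download still falls through to the build.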
In contrast to caching compile caches (see #32), this is actually a good/intended use case for https://github.com/actions/cache because you have the fixed key of the gfm version and don't need any incremental updates.
The cache action also handles scoping of the cache so that a run of the action on the default branch won't pull in an artifact from a (user supplied, thus untrusted) fork PR CI run. You can then simply do this in an earlier step:
- name: Cache GFM
  id: gfm-cache
  uses: actions/cache@v4
  with:
    path: ${{ env.LIBCMARKDIR }}
    key: gfm-lib-${{ env.GFM_VERSION }}
and use the output steps.gfm-cache.outputs.cache-hit in an ${{ }} expression to check if the lib was restored. If it was not, the action will automatically cache path: at the end of the action. And the cache doesn't expire; it will only be LRU evicted if the 10GB/repo limit is reached, so #56 would also no longer be necessary.
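As a rough sketch of how a later run step might act on that output: the cache-hit value can be handed to the shell through an environment variable and compared against the string "true". The variable name `GFM_CACHE_HIT` and the function `gfm_needs_build` are hypothetical, chosen only for this illustration.

```shell
#!/usr/bin/env bash
# Assumed wiring on the build step in action.yml (hypothetical env name):
#   env:
#     GFM_CACHE_HIT: ${{ steps.gfm-cache.outputs.cache-hit }}
#
# Succeeds (returns 0) when GFM still has to be built, i.e. whenever the
# cache step did not report an exact hit. Anything other than the string
# "true" (including an empty value) is treated as a miss.
gfm_needs_build() {
  [[ "${1:-}" != "true" ]]
}

# Possible use inside the step:
#   if gfm_needs_build "${GFM_CACHE_HIT:-}"; then
#     echo "GFM not restored from cache, building it"
#     # run the build here
#   fi
```

Comparing against "true" only, rather than testing for "false", keeps the check safe if the output is empty on a miss.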
However this would potentially create artifacts in lots of repos, as well as being an unexpected side effect of the build action
Actions making use of the gh provided cache functionality is pretty much expected imo and shouldn't be an issue, especially with such a tiny cache, but an option to disable caching could be provided.
(some people recommend pinning even actions/* actions to specific SHAs in jobs with elevated permissions, but that is currently not required by the GHA policy)
Sorry for the notification spam. Is pelican not compatible with libcmark-gfm 0.29.0, which is available via apt in both 20.04 and 22.04?
When I tested it, I found differences in the output.
Also, changing from BuildBot to GH CI is a big change, and the fewer other changes that are made, the easier it is to debug problems. Updates to GFM and Pelican versions can be made later.
AFAICT the apt version of GFM is designed as a stand-alone executable. I think this is a job for later, once the action is known to be working OK. It also appears to be a bit slower than fetching an artifact. (Later) I tried installing it and setting up the expected links.
One more consideration: Using actions/cache doesn't help with the first build in a new repo (not sure if it helps with the first build when a new branch is cloned), whereas the cached artifacts are always available to new branches and new repos.
👍 makes sense, was just wondering :)
True, but they also add a considerable amount of code/process to maintain, and with the build only taking 15 seconds I'd lean towards 'no code is the best code' ^^ If the build was more substantial, this would be a great setup, but for 15 seconds it seems a bit overkill. An alternative would be to use a Docker image in the step to run Pelican, or even a complete Docker action, but I don't really have any experience with Pelican so 🤷 ghcr.io caching + a minimized image makes that pretty fast too.
If there is a cache on the base branch then yes.
I tried using a Docker image and that was slower.
To clarify before I go down a rabbit hole to test this: did you use the Dockerfile and build the image in the workflow, or did you pre-build a Docker image, host it on ghcr.io, and use that in the workflow?
I think I used ghcr.io. This took quite a while to load.
The PR looks for a GFM build artifact in the repo housing the action.
If one is found, it is downloaded and the build is skipped.
It can take 15+ seconds to build GFM, which is a large proportion of the run-time, so it seems worth doing.
[A previous version of this PR also checked the calling repo for the artifact, and would create an artifact if none was found there either. However, this would potentially create artifacts in lots of repos, as well as being an unexpected side effect of the build action.]