Use artifacts to cache the build #51

Open: wants to merge 12 commits into base: main
20 changes: 15 additions & 5 deletions pelican/Dockerfile
@@ -1,11 +1,16 @@
# Settings
# ========
# Use the Python version as installed on CI pelican builders (2023-06-02)
ARG PYTHON_VERSION=3.8.10
ARG GFM_VERSION=0.28.3.gfm.12 # must agree with copy below
ARG PYTHON_VERSION=3.8.10 # must agree with the copy below

# Note: ARG scope ends at the FROM statement, so must be repeated if necessary

# Build cmake-gfm
FROM python:${PYTHON_VERSION}-slim-buster as pelican-asf
#=======================================================

ARG PYTHON_VERSION=3.8.10 # must agree with the copy above
ARG GFM_VERSION=0.28.3.gfm.12 # must agree with copy below

RUN apt update && apt upgrade -y
RUN apt install curl cmake build-essential -y
@@ -15,19 +20,24 @@ WORKDIR /opt/pelican-asf
# Copy only what we need to build cmark-gfm
COPY build-cmark.sh bin/build-cmark.sh

# build it
RUN bash bin/build-cmark.sh ${GFM_VERSION}
# build GFM
# Must agree with definition below
ENV LIBCMARKDIR /opt/gfm-${GFM_VERSION}/lib
RUN mkdir -p ${LIBCMARKDIR}
RUN bash bin/build-cmark.sh ${GFM_VERSION} ${LIBCMARKDIR}

# rebase the image to save on image size
FROM python:${PYTHON_VERSION}-slim-buster
#=======================================================

# Use the Pelican version as installed on CI pelican builders (2023-06-02)
ARG PELICAN_VERSION=4.5.4
ARG GFM_VERSION=0.28.3.gfm.12 # must agree with copy above

# Where we put GFM and the plugins
ENV WORKDIR /opt/pelican-asf
ENV LIBCMARKDIR ${WORKDIR}/cmark-gfm-${GFM_VERSION}/lib
# Must agree with definition above
ENV LIBCMARKDIR /opt/gfm-${GFM_VERSION}/lib

RUN apt update && apt upgrade -y

43 changes: 32 additions & 11 deletions pelican/action.yml
@@ -51,34 +51,55 @@ runs:
# If the site uses Github Flavored Markdown, use this build branch
- name: fetch and build libcmark-gfm.so
if: ${{ inputs.gfm == 'true' }}
id: build_gfm
shell: bash
env:
WORKDIR: /opt/pelican-asf # where to build GFM
GFM_VERSION: '0.28.3.gfm.12' # ensure we agree with build-cmark.sh script
# action_repository only works in the env context; empty for local action call
# it is always empty for local invocation, in which case use the current repo
GITHUB_ACTION_REPO: ${{ github.action_repository || github.repository }}
GH_TOKEN: ${{ github.token }} # needed by gh
run: |
# The key needs to include the GFM version, but is otherwise arbitrary.
# It must agree with the definition in build-actions.yml
export GFM_ARTIFACT_KEY=gfm-lib-${GFM_VERSION}

if [[ -z $LIBCMARKDIR ]] # define LIBCMARKDIR if it is not already
then
# set up the GFM environment
export LIBCMARKDIR=/opt/pelican-asf/gfm-${GFM_VERSION} # arbitrary, but should contain version
mkdir -p $LIBCMARKDIR
echo "LIBCMARKDIR=${LIBCMARKDIR}" >>$GITHUB_ENV # needed for the build step
fi

# Does the GFM build already exist?
if [[ -n $LIBCMARKDIR && -d $LIBCMARKDIR ]]
if [[ -f $LIBCMARKDIR/libcmark-gfm.so ]]
then
echo "Already have GFM binary at $LIBCMARKDIR, skipping build"
exit 0 # nothing more to do in this step
fi

# Is there a saved artifact for the GFM build?
echo "Check for GFM build artifact in action repo: $GITHUB_ACTION_REPO"
gh run download --dir ${LIBCMARKDIR} --name ${GFM_ARTIFACT_KEY} --repo $GITHUB_ACTION_REPO || true
if [[ -f $LIBCMARKDIR/libcmark-gfm.so ]]
then
echo "Downloaded to ${LIBCMARKDIR} from $GITHUB_ACTION_REPO, nothing more to do!"
exit 0 # nothing more to do in this step
fi

# GFM binary not found, need to build it
Comment on lines +81 to +91
assignUser (Member), Jun 18, 2024:

In contrast to caching compile caches (see #32), this is actually a good, intended use case for https://github.com/actions/cache, because you have the fixed key of the GFM version and don't need any incremental updates.

The cache action also handles scoping of the cache so that a run of the action on the default branch won't pull in an artifact from a (user supplied, thus untrusted) fork PR CI run. You can then simply do this in an earlier step:

- name: Cache GFM
  id: gfm-cache
  uses: actions/cache@v4
  with:
    path: ${{ env.LIBCMARKDIR }}
    key: gfm-lib-${{ env.GFM_VERSION }}

Then use the output steps.gfm-cache.outputs.cache-hit in an ${{ }} expression to check whether the lib was restored. If it was not, the cache action will automatically save path: at the end of the job. The cache also doesn't expire; it will only be LRU evicted if the 10GB/repo limit is reached, so #56 would no longer be necessary.

> However this would potentially create artifacts in lots of repos, as well as being an unexpected side effect of the build action

Actions making use of the GH-provided cache functionality is pretty much expected imo and shouldn't be an issue, especially with such a tiny cache, but an option to disable caching could be provided.
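
For illustration only, the cache-hit check suggested in this comment could be consumed from the existing bash build step roughly as follows. This is a sketch under assumptions, not code from the PR: the CACHE_HIT variable and the need_gfm_build function are hypothetical names, and CACHE_HIT would have to be passed in through the step's env: block from steps.gfm-cache.outputs.cache-hit. The check also confirms that libcmark-gfm.so is really present, mirroring the file test already used in the PR.

```shell
#!/bin/bash
# Sketch only: CACHE_HIT and need_gfm_build are hypothetical names.
# In action.yml the step would set, under env:
#   CACHE_HIT: ${{ steps.gfm-cache.outputs.cache-hit }}

# Succeeds (returns 0) when the GFM library still has to be built.
need_gfm_build() {
  local cache_hit=$1 libdir=$2
  # Skip the build only if the cache reported an exact hit
  # AND the library file is actually present.
  if [[ $cache_hit == 'true' && -f $libdir/libcmark-gfm.so ]]
  then
    return 1
  fi
  return 0
}

if need_gfm_build "${CACHE_HIT:-}" "${LIBCMARKDIR:-/nonexistent}"
then
  echo "GFM library not restored from cache, building"
else
  echo "GFM library restored from cache, skipping build"
fi
```

Comparing against the literal string 'true' treats an empty or missing cache-hit value the same as a miss, so the build is the default whenever the restore result is unclear.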

Member:

(Some people recommend pinning even actions/* actions to specific SHAs in jobs with elevated permissions, but that is currently not required by the GHA policy.)

Member:

Sorry for the notification spam. Is pelican not compatible with libcmark-gfm 0.29.0 which is available via apt in both 20.04 and 22.04?

Contributor Author:

> Sorry for the notification spam. Is pelican not compatible with libcmark-gfm 0.29.0 which is available via apt in both 20.04 and 22.04?

When I tested it, I found differences in the output.
Also, changing from BuildBot to GH CI is a big change, and the fewer other changes that are made, the easier it is to debug problems. Updates to GFM and Pelican versions can be made later.

{
echo "Creating GFM binary in ${LIBCMARKDIR}"
# disable stdout unless debug is on
if [ "${{ inputs.debug }}" == 'true' ]
then
DEBUG_STEPS=1; export DEBUG_STEPS
else
exec >/dev/null
fi
# Don't pollute site checkout
mkdir -p $WORKDIR
pushd $WORKDIR
# build the code and define LIBCMARKDIR
bash ${{ github.action_path }}/build-cmark.sh $GFM_VERSION | grep "export LIBCMARKDIR" >/tmp/libcmarkdir.$$
source /tmp/libcmarkdir.$$
popd
# ensure LIBCMARKDIR is defined for subsequent steps
echo "LIBCMARKDIR=${LIBCMARKDIR}" >> $GITHUB_ENV
# build the code and define LIBCMARKDIR under $WORKDIR
bash ${{ github.action_path }}/build-cmark.sh $GFM_VERSION $LIBCMARKDIR
}

- name: Generate website from markdown
61 changes: 25 additions & 36 deletions pelican/build-cmark.sh
@@ -1,49 +1,46 @@
#!/bin/bash
#
# Build the cmark-gfm library and extensions within CURRENT DIRECTORY.
# Build the cmark-gfm library and extensions in a temporary directory
#
# The binary output will be under: cmark-gfm-$VERSION/lib
# The binary output will be under: LIBCMARKDIR
#
# USAGE:
# $ build-cmark.sh [ VERSION [ TARDIR ] ]
# $ build-cmark.sh VERSION LIBCMARKDIR [TARFILE]
#
# VERSION: defaults to 0.28.3.gfm.12
# TARDIR: where to find a downloaded/cached tarball of the cmark
# code, or where to place a tarball
# VERSION: e.g. 0.28.3.gfm.12
# LIBCMARKDIR: where to put the binary library files
# TARFILE: local copy of the tarfile; must be for the correct version! (optional)
#

# Echo all of our steps if DEBUG_STEPS is set
test -n "$DEBUG_STEPS" && set -x

set -e # early exit if any step fails

#VERSION=0.28.3.gfm.20 ### not yet
VERSION=0.28.3.gfm.12
if [ "$1" != "" ]; then VERSION="$1"; fi

# The tarball exists here, or will be downloaded here.
TARDIR="."
if [ "$2" != "" ]; then TARDIR="$2"; fi
VERSION=${1:?version}
LIBCMARKDIR=${2:?library output}
TARFILE=$3

ARCHIVES="https://github.com/github/cmark-gfm/archive/refs/tags"
LOCAL="${TARDIR}/cmark-gfm.$VERSION.orig.tar.gz"
TARNAME="cmark-gfm.$VERSION.orig.tar.gz"
TARDIR="cmark-gfm-$VERSION"

# WARNING: this must agree with the parent directory in the tar file or the build will fail
EXTRACTED_AS="cmark-gfm-$VERSION"
# Work in a temporary directory
TEMP=$(mktemp -d)

# Follow redirects, and place the result into known name $LOCAL
if [ -f "$LOCAL" ]; then
echo "Using cached tarball: ${LOCAL}" >&2
if [[ -f $TARFILE ]]
then
echo "Found tar!"
cp $TARFILE $TEMP # do this before cd to allow for relative paths
cd $TEMP
else
echo "Fetching $VERSION from cmark archives" >&2
curl -sSL --fail -o "$LOCAL" "$ARCHIVES/$VERSION.tar.gz"
cd $TEMP
echo "Fetching $VERSION from cmark archives" >&2
curl -sSL --fail -o "$TARNAME" "$ARCHIVES/$VERSION.tar.gz"
fi

# Clean anything old, then extract and build.
### somebody smart could peek into the .tgz. ... MEH
if [ -d "$EXTRACTED_AS" ]; then rm -r "$EXTRACTED_AS"; fi
tar xzf "$LOCAL"
pushd "$EXTRACTED_AS" >/dev/null
tar xzf "$TARNAME"
pushd "$TARDIR" >/dev/null
mkdir build
pushd build >/dev/null
cmake --version >&2
@@ -53,14 +50,6 @@ pushd "$EXTRACTED_AS" >/dev/null
} > build.log
popd >/dev/null

mkdir lib
cp -Pp build/src/lib* lib/
cp -Pp build/extensions/lib* lib/
cp -Pp build/src/lib* ${LIBCMARKDIR}/
cp -Pp build/extensions/lib* ${LIBCMARKDIR}/
popd >/dev/null

# These files/dir may need a reference with LD_LIBRARY_PATH.
# gfm.py wants this lib/ in LIBCMARKDIR.
# ls -laF "$EXTRACTED_AS/lib/"

# Provide a handy line for copy/paste.
echo "export LIBCMARKDIR='$(pwd)/$EXTRACTED_AS/lib'"
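
A note on the temporary-directory handling in the rewritten build-cmark.sh, offered as a sketch rather than as part of this PR. The script copies a supplied TARFILE into the temporary directory under its own basename, but later extracts "$TARNAME", so a local tarball with a different name would presumably not be found. The temporary directory is also not removed afterwards. The stage_tarball helper, the trap cleanup, and the default VERSION below are assumptions for illustration; the PR itself makes VERSION mandatory.

```shell
#!/bin/bash
# Sketch only: stage_tarball is a hypothetical helper, not part of the PR.
set -e

# Copy a local tarball into the work directory under the expected name.
# Returns non-zero when no usable local tarball was supplied.
stage_tarball() {
  local tarfile=$1 workdir=$2 tarname=$3
  [[ -n $tarfile && -f $tarfile ]] || return 1
  cp "$tarfile" "$workdir/$tarname"
}

VERSION=${1:-0.28.3.gfm.12}   # default only so that the sketch runs without arguments
TARFILE=${2:-}
TARNAME="cmark-gfm.$VERSION.orig.tar.gz"

TEMP=$(mktemp -d)
trap 'rm -rf "$TEMP"' EXIT    # remove the work directory however the script ends

if stage_tarball "$TARFILE" "$TEMP" "$TARNAME"
then
  echo "Using local tarball as $TEMP/$TARNAME" >&2
else
  echo "No local tarball; $TARNAME would be fetched into $TEMP" >&2
fi
```

Copying to "$workdir/$tarname" before any cd keeps relative TARFILE paths working, which is the same concern the PR's own comment on the cp line raises.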