[DevX bootstrap] Benchmark the speed download of the devx closure #22

Closed
yvan-sraka opened this issue Mar 1, 2023 · 11 comments

@yvan-sraka
Contributor

When using GitHub Actions (GHA) to turn iohk/devx into a shell in which to run e.g. cabal update or cabal build, we spend a lot of time downloading stuff.

A lot of this time comes down to nix sequentially downloading a lot of data. We should build the store and export/import it instead to speed this up.

nix path-info --closure-size --human-readable $(nix print-dev-env --json .#ghc8107-static-minimal | jq -r .variables.out.value)

… will give us something like 2.5G. That's a lot.
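If the measurement ends up inside a script, the jq step can be done in Python instead. This is only a minimal sketch: it assumes the JSON printed by nix print-dev-env --json has the shape implied by the jq filter above, and the sample document and store path below are made up for illustration.

```python
import json


def dev_env_out_path(dev_env_json: str) -> str:
    """Return variables.out.value, i.e. what `jq -r .variables.out.value` prints."""
    return json.loads(dev_env_json)["variables"]["out"]["value"]


# Hypothetical sample, shaped after the jq filter above (not real nix output).
sample = '{"variables": {"out": {"type": "exported", "value": "/nix/store/HASH-example-shell"}}}'
print(dev_env_out_path(sample))
```

The returned path is what would then be handed to nix path-info --closure-size.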

We can also enter the shell using

$ nix print-dev-env .#ghc8107-static-minimal > env.sh
$ bash --rcfile env.sh

(i.e. instead of nix develop).

We could pre-build the closure (e.g. from result), and store that as a zstd compressed archive:

nix-store --export $(nix-store -qR result) | zstd -z8T8 > out.zstd

And then re-import this as the first step in GHAs after setting up nix.

See for example this GHA: https://github.com/angerman/x/blob/c559ae0429bb69829a9c9cae8c21ab777461aaf2/.github/workflows/main.yml#L23-L66, which doesn't work properly yet (nix still ends up downloading stuff when trying to enter the shell; maybe this can be eliminated with the env.sh idea from above).

@yvan-sraka
Contributor Author

I wanted to measure the latency improvement of such a hack, so I wrote a dumb Python script:

import os
import timeit

DEV_SHELLS = [
    "ghc8107",
    "ghc902",
    "ghc925",
    "ghc8107-minimal",
    "ghc902-minimal",
    "ghc925-minimal",
    "ghc8107-static-minimal",
    "ghc902-static-minimal",
    "ghc925-static-minimal",
]

T = {}
flake = "input-output-hk/devx" # vs. yvan-sraka/static-closure
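for devShell in DEV_SHELLS:
    # Garbage-collect the store so that "bootstrap" has to download the closure again.
    os.system("nix-collect-garbage -d")
    # Total wall-clock time (in seconds) of `number` consecutive shell entries.
    x = lambda number: round(timeit.timeit(lambda: os.system(
        f'nix develop "github:{flake}#{devShell}"\
        --no-write-lock-file --refresh --command true'
    ), number=number), 2)
    # "bootstrap" is the first (cold) entry, "reload" the total of 10 re-entries.
    T[devShell] = {"bootstrap": x(1), "reload": x(10)}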
print(T)

I currently list exhaustively the working versions of the input-output-hk/devx devShells, see issues #23 and #24. My machine (an iMac 24" M1) uses the zw3rk.com cache (but I disabled the remote builder), as I wanted to match what I imagine to be the “default” user settings.

n.b. I blindly chose Python because there may be a future where I want to perform some basic statistics with numpy or display graphs with matplotlib!

The hack lives in this flake.nix shellHook.

I'll post the benchmark results I get in this thread :)

@angerman
Collaborator

angerman commented Mar 6, 2023

This is good, looking forward to the benchmarks!

@yvan-sraka
Contributor Author

yvan-sraka commented Mar 7, 2023

The awaited benchmarks, which ran last night on my machine (values are in seconds):

Without the speed download hack (meaning the actual nix develop github:input-output-hk/devx#$key flake):

ghc8107:
  bootstrap: 2155.54
  reload: 4.57
ghc902: broken
ghc925:
  bootstrap: 2005.21
  reload: 4.43
ghc8107-minimal:
  bootstrap: 1820.64
  reload: 3.59
ghc902-minimal:
  bootstrap: 1788.87
  reload: 3.33
ghc925-minimal:
  bootstrap: 1826.22
  reload: 3.76
ghc8107-static-minimal:
  bootstrap: 1774.84
  reload: 3.46
ghc902-static-minimal:
  bootstrap: 1780.89
  reload: 4.75
ghc925-static-minimal:
  bootstrap: 1871.04
  reload: 3.4

With the speed download hack of version 70e3884 (that could be summed up as: curl https://s3.zw3rk.com/devx/$arch.$key.zstd | zstd -d | nix-store --import and then the env trick. It does cache the download, but unconditionally does the nix-store import for each reload…):

ghc8107: broken
ghc902: broken
ghc925: broken
ghc8107-minimal:
  bootstrap: 95.71
  reload: 37.74
ghc902-minimal:
  bootstrap: 100.58
  reload: 37.6
ghc925-minimal:
  bootstrap: 101.05
  reload: 33.25
ghc8107-static-minimal:
  bootstrap: 82.32
  reload: 30.61
ghc902-static-minimal:
  bootstrap: 91.1
  reload: 31.1
ghc925-static-minimal:
  bootstrap: 88.54
  reload: 27.21

First, a few configurations are “broken” and I should investigate why … Then, as you can see, it's a big improvement in bootstrap speed (I have quite a slow internet connection, so that surely inflates the numbers) …

… but there is more work to do, as @angerman made me realize: the wrapper derivation should not have to rely on minio-client! And the reload time here is bad (the measure is 10x re-entering the shell): I should fix that, so it behaves at least like the “without the speed download hack” nix develop, and I even believe I can potentially shave those numbers a bit. :)
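To put the two tables side by side, here is a small sketch that computes, for each flavor measured in both runs, how much faster the bootstrap gets and how much slower the reload gets. The timings are copied from the tables above; the helper name ratios is only illustrative.

```python
# (bootstrap, reload) timings in seconds, copied from the two tables above.
# "reload" is the total for 10 shell re-entries.
baseline = {
    "ghc8107-minimal": (1820.64, 3.59),
    "ghc902-minimal": (1788.87, 3.33),
    "ghc925-minimal": (1826.22, 3.76),
    "ghc8107-static-minimal": (1774.84, 3.46),
    "ghc902-static-minimal": (1780.89, 4.75),
    "ghc925-static-minimal": (1871.04, 3.4),
}
hack = {
    "ghc8107-minimal": (95.71, 37.74),
    "ghc902-minimal": (100.58, 37.6),
    "ghc925-minimal": (101.05, 33.25),
    "ghc8107-static-minimal": (82.32, 30.61),
    "ghc902-static-minimal": (91.1, 31.1),
    "ghc925-static-minimal": (88.54, 27.21),
}


def ratios(shell):
    """Return (bootstrap speedup, reload slowdown) of the hack vs. plain nix develop."""
    base_boot, base_reload = baseline[shell]
    hack_boot, hack_reload = hack[shell]
    return base_boot / hack_boot, hack_reload / base_reload


for shell in hack:
    speedup, slowdown = ratios(shell)
    print(f"{shell}: bootstrap {speedup:.1f}x faster, reload {slowdown:.1f}x slower")
```

The flavors marked broken in either run are left out, since there is nothing to compare.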

@yvan-sraka yvan-sraka self-assigned this Mar 7, 2023
@angerman
Collaborator

@yvan-sraka can you please update the comment above with the following remarks:

  • what are the values? seconds?
  • And be a bit more explicit in the comment that the first set is nix develop github:input-output-hk/devx#$key, and the second is effectively curl https://s3.zw3rk.com/devx/$arch.$key.zstd | zstd -d | nix-store --import, followed by the env trick. You do cache the download through fetchurl in nix, but unconditionally do the import for each run (and reload).

If we had a canary derivation (e.g. the root of the imported closure), we could validate the existence of the closure in the store, by checking for the existence of that file; and skip the import?

@yvan-sraka
Contributor Author

@yvan-sraka can you please update the comment above with the following remarks:

Edited :)

If we had a canary derivation (e.g. the root of the imported closure), we could validate the existence of the closure in the store, by checking for the existence of that file; and skip the import?

Yes! That's precisely what I had in mind, and I implemented it in a new flake version that should also have fixed the broken builds. I should indeed re-run the benchmarks against this new flake version, which is currently --impure.
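The canary idea can be sketched roughly as follows. This is not the code from the flake, only an illustration under assumptions: the canary would be a store path known to belong to the imported closure (e.g. its root), the archive is assumed to be already downloaded, and the names ensure_closure and import_command are hypothetical.

```python
import os
import shlex
import subprocess


def import_command(archive: str) -> str:
    """Shell pipeline used in this thread: decompress the archive, then import it."""
    return f"zstd -d < {shlex.quote(archive)} | nix-store --import"


def ensure_closure(canary: str, archive: str, run=subprocess.run) -> bool:
    """Import the closure only if the canary store path is missing.

    Returns True if an import was triggered, False if it was skipped.
    `run` is injectable so the logic can be exercised without nix installed.
    """
    if os.path.exists(canary):
        return False
    run(import_command(archive), shell=True, check=True)
    return True
```

With such a check, a reload would only pay for an existence test instead of a full nix-store --import.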

@yvan-sraka
Contributor Author

On aarch64-darwin, the flavors ghc8107 and ghc902 are failing because of:

@nix { "action": "setPhase", "phase": "unpackPhase" }
unpacking sources
unpacking source archive /nix/store/9pqv84n4fxaadafjx32wi4c7d044xb0z-hlint-3.5-src
source root is hlint-3.5-src
@nix { "action": "setPhase", "phase": "patchPhase" }
patching sources
@nix { "action": "setPhase", "phase": "updateAutotoolsGnuConfigScriptsPhase" }
updateAutotoolsGnuConfigScriptsPhase
@nix { "action": "setPhase", "phase": "configurePhase" }
configuring
Configure flags:
--prefix=/nix/store/rvb3z44kwnwni719lndy9qz2dp84qxmw-hlint-exe-hlint-3.5 exe:hlint --package-db=clear --package-db=/nix/store/bvjs7g3g5i10h7pl360kpsbbh39m9y2s-hlint-exe-hlint-3.5-config/lib/ghc-9.0.2/package.conf.d --flags=ghc-lib --flags=gpl --flags=-hsyaml --flags=threaded --exact-configuration --dependency=hlint=hlint-3.5-DVvFAeGfGhl4cGfHK851Zv --dependency=array=array-0.5.4.0 --dependency=base=base-4.15.1.0 --dependency=deepseq=deepseq-1.4.5.0 --dependency=ghc-bignum=ghc-bignum-1.1 --dependency=ghc-boot-th=ghc-boot-th-9.0.2 --dependency=ghc-prim=ghc-prim-0.7.0 --dependency=integer-gmp=integer-gmp-1.1 --dependency=pretty=pretty-1.1.3.6 --dependency=rts=rts --dependency=template-haskell=template-haskell-2.17.0.0 --with-ghc=ghc --with-ghc-pkg=ghc-pkg --with-hsc2hs=hsc2hs --with-gcc=cc --with-ld=ld --with-ar=ar --with-strip=strip --disable-executable-stripping --disable-library-stripping --disable-library-profiling --disable-profiling --enable-static --enable-shared --disable-coverage --enable-library-for-ghci --datadir=/nix/store/44xd104h90xxkjbvg6sdriq471mpzir2-hlint-exe-hlint-3.5-data/share/ghc-9.0.2 --ghc-option=-fPIC --gcc-option=-fPIC 
Configuring executable 'hlint' for hlint-3.5..
@nix { "action": "setPhase", "phase": "buildPhase" }
building
Preprocessing executable 'hlint' for hlint-3.5..
Building executable 'hlint' for hlint-3.5..
[1 of 1] Compiling Main             ( src/Main.hs, dist/build/hlint/hlint-tmp/Main.o )
'apple-a12' is not a recognized processor for this target (ignoring processor)
'apple-a12' is not a recognized processor for this target (ignoring processor)
'apple-a12' is not a recognized processor for this target (ignoring processor)
'apple-a12' is not a recognized processor for this target (ignoring processor)
'apple-a12' is not a recognized processor for this target (ignoring processor)
'apple-a12' is not a recognized processor for this target (ignoring processor)
Linking dist/build/hlint/hlint ...
/nix/store/48py6zrawzim9ghrnkqwm36jl4j1l23x-clang-wrapper-11.1.0/bin/ld: line 256: 26817 Segmentation fault: 11  /nix/store/5wvlj00dr22ivh210b18ccv1i60h6c1q-cctools-binutils-darwin-949.0.1/bin/ld ${extraBefore+"${extraBefore[@]}"} ${params+"${params[@]}"} ${extraAfter+"${extraAfter[@]}"}
clang-11: error: linker command failed with exit code 139 (use -v to see invocation)
`clang' failed in phase `Linker'. (Exit code: 139)

@yvan-sraka
Contributor Author

I will re-run the benchmarks in a GitHub Actions context to get some consistency, since my personal internet connection is currently not reliable enough to avoid skewing the results …

@angerman
Collaborator

I don't think speed is the primary issue, as long as it's consistent. E.g. if you always get the same speed reliably that's going to provide good numbers. And most users won't be having 1G or 10G lines, but some XXX Mbit most likely.

If we find out that for fast lines, it's even worse though, that would also be good to know.

@yvan-sraka
Contributor Author

I don't think speed is the primary issue, as long as it's consistent. E.g. if you always get the same speed reliably that's going to provide good numbers. And most users won't be having 1G or 10G lines, but some XXX Mbit most likely.

If we find out that for fast lines, it's even worse though, that would also be good to know.

Yes! My current connection issues are effectively more about consistency than speed :)

@yvan-sraka yvan-sraka removed their assignment Apr 11, 2023
@yvan-sraka yvan-sraka self-assigned this Aug 2, 2023
@yvan-sraka
Contributor Author

yvan-sraka commented Aug 2, 2023

I've run new benchmarks of @hamishmack's fetch-docker.sh, in order to include them in the engineering blog post, using this new script:

#! /usr/bin/env python
import os

def Dockerfile(shell, fast):
    if fast:
        cmd = f'./fetch-docker.sh input-output-hk/devx x86_64-linux.{shell}-env | zstd -d | nix-store --import'
    else:
        cmd = f'nix develop "github:input-output-hk/devx#{shell}" --command true'
    return f"""FROM nixos/nix
RUN nix-channel --update
RUN echo "experimental-features = nix-command flakes" >> /etc/nix/nix.conf
RUN echo "accept-flake-config = true" >> /etc/nix/nix.conf
RUN ln -s $(which bash) /bin/bash
RUN nix profile install "nixpkgs#jq" "nixpkgs#zstd"
RUN curl -L https://raw.githubusercontent.com/input-output-hk/actions/latest/devx/support/fetch-docker.sh -o fetch-docker.sh
RUN chmod +x fetch-docker.sh
RUN time {cmd}"""

for shell in ["ghc8107-iog", "ghc962-iog", "ghc8107-static-minimal", "ghc962-static-minimal"]:
    for fast in [True, False]:
        with open('Dockerfile', 'w') as file:
            file.write(Dockerfile(shell, fast))
        os.system(f'docker build . | tee {shell}-{"fast" if fast else "slow"}.log')

… and I retrieve the results with:

#! /usr/bin/env bash
for file in *.log; do
    grep '^real' "$file" | awk -v file="$file" '{print file " -> " $2}'
done

… I ran it on my legacy ThinkPad X230 over a French countryside internet connection, to show how it changes loading times in a context where bandwidth and computing power are scarce resources, as in a heavy CI:

ghc8107-iog-fast.log -> 8m28.296s
ghc8107-iog-slow.log -> 11m28.658s
ghc8107-static-minimal-fast.log -> 2m35.712s
ghc8107-static-minimal-slow.log -> 5m22.060s
ghc962-iog-fast.log -> 7m26.853s
ghc962-iog-slow.log -> 11m9.047s
ghc962-static-minimal-fast.log -> 1m35.069s
ghc962-static-minimal-slow.log -> 4m48.812s

… should I run these benchmarks with other devx closure flavors or in other execution environments?
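To compare the fast and slow runs more easily, the real times can be converted to seconds and expressed as a share of the slow path. A minimal sketch, with the values copied from the results above; the helper name seconds is only illustrative.

```python
import re

# `real` times copied from the results above, as (fast, slow) pairs.
results = {
    "ghc8107-iog": ("8m28.296s", "11m28.658s"),
    "ghc8107-static-minimal": ("2m35.712s", "5m22.060s"),
    "ghc962-iog": ("7m26.853s", "11m9.047s"),
    "ghc962-static-minimal": ("1m35.069s", "4m48.812s"),
}


def seconds(real: str) -> float:
    """Convert a `time` value such as '8m28.296s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", real)
    if match is None:
        raise ValueError(f"unexpected time format: {real!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


for shell, (fast, slow) in results.items():
    share = seconds(fast) / seconds(slow)
    print(f"{shell}: fast path takes {share:.0%} of the slow path")
```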

@yvan-sraka yvan-sraka changed the title from "[DevX bootstrap] speed download a closure" to "[DevX bootstrap] Benchmark the speed download of the devx closure" on Aug 2, 2023
@angerman
Collaborator

angerman commented Sep 9, 2023

So it's about 50% of the total time. That's good! Thanks for running the benchmarks!

@angerman angerman closed this as completed Sep 9, 2023