Add prebuilt devcontainer (#369)
* Use vcpkg.json only

This is necessary to be able to build ParquetSharpNative offline with
a populated vcpkg cache.

* Ensure vcpkg builtin registry is up-to-date in CI runners

If we reference a baseline that is too recent, some CI runner images
will not have it. With this step we ensure it won't be the case.

This used to be done via vcpkg-configuration.json, but we couldn't rely
on it for offline devcontainer usage.

* Improve PowerShell build script

Format via the PowerShell VS Code extension and fix
linting issues.

* Build native lib in both Debug and Release by default (unless in the CI)

* Add devcontainer

* Add devcontainer workflow

* Update documentation about building ParquetSharp

* Add solution to .gitignore
jgiannuzzi authored Sep 7, 2023
1 parent 05e4a7f commit ea865a4
Showing 12 changed files with 362 additions and 37 deletions.
74 changes: 74 additions & 0 deletions .devcontainer/Dockerfile
@@ -0,0 +1,74 @@
FROM mcr.microsoft.com/devcontainers/dotnet:0-7.0-bullseye-slim AS dotnet

#====================================================================

FROM dotnet AS nuget

USER vscode

# Copy our projects
COPY --chown=vscode:vscode . /tmp/build/

# Populate the nuget cache with all of our dependencies
RUN for project in /tmp/build/csharp*; do \
        dotnet restore $project; \
    done

#====================================================================

FROM dotnet AS cpp

# Install the C++ dev tools
RUN echo "deb http://deb.debian.org/debian bullseye-backports main" >> /etc/apt/sources.list \
    && apt-get update \
    && export DEBIAN_FRONTEND=noninteractive \
    && apt-get -y install --no-install-recommends \
        bison \
        build-essential \
        cmake/bullseye-backports \
        cppcheck \
        flex \
        gdb \
        ninja-build \
        pkg-config \
        valgrind \
    && apt-get autoremove -y \
    && apt-get clean -y \
    && rm -rf /var/lib/apt/lists/*

# Set vcpkg environment variables
ENV VCPKG_ROOT=/opt/vcpkg \
    VCPKG_FORCE_SYSTEM_BINARIES=1

#====================================================================

FROM cpp AS vcpkg

USER vscode

# Install vcpkg
RUN sudo mkdir -p $VCPKG_ROOT \
    && sudo chown vscode:vscode $VCPKG_ROOT \
    && git clone https://github.com/microsoft/vcpkg.git $VCPKG_ROOT \
    && cd $VCPKG_ROOT \
    && ./bootstrap-vcpkg.sh -disableMetrics

# Copy our vcpkg manifest
COPY --chown=vscode:vscode vcpkg.json /tmp/build/

# Populate the vcpkg binary cache with all of our dependencies
RUN cd /tmp/build \
    && $VCPKG_ROOT/vcpkg install --clean-after-build \
    && bash -c 'rm -rf $VCPKG_ROOT/{buildtrees,downloads/temp,packages}' \
    && rm -rf *

#====================================================================

FROM cpp AS devcontainer

# Copy the nuget cache
COPY --from=nuget --chown=vscode:vscode /home/vscode/.nuget/packages /home/vscode/.nuget/packages

# Copy the installed vcpkg and its binary cache
COPY --from=vcpkg --chown=vscode:vscode $VCPKG_ROOT $VCPKG_ROOT
COPY --from=vcpkg --chown=vscode:vscode /home/vscode/.cache/vcpkg /home/vscode/.cache/vcpkg
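
The cleanup step in the `vcpkg` stage wraps `rm -rf` in `bash -c` because the `{buildtrees,downloads/temp,packages}` brace list is a bash feature, while a Dockerfile `RUN` line in shell form runs under `/bin/sh` (dash on Debian, which does not expand braces). The following is a minimal sketch of what that brace list expands to. `cleanup_targets` is a hypothetical helper name, and it prints the paths instead of deleting them.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the commit): print the paths that the
# Dockerfile cleanup step would remove, without deleting anything.
cleanup_targets() {
    # bash expands the brace list into three words first and only then
    # substitutes $VCPKG_ROOT in each of them.
    VCPKG_ROOT="$1" bash -c 'printf "%s\n" $VCPKG_ROOT/{buildtrees,downloads/temp,packages}'
}

cleanup_targets /opt/vcpkg
```

With `/opt/vcpkg` as the root, this prints the `buildtrees`, `downloads/temp` and `packages` directories under it, one per line.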
72 changes: 72 additions & 0 deletions .devcontainer/devcontainer.json
@@ -0,0 +1,72 @@
// For format details, see https://aka.ms/devcontainer.json.
{
  "name": "ParquetSharp",

  // Use the prebuilt image. Comment this out if you want to make changes to it.
  "image": "ghcr.io/g-research/parquetsharp/devcontainer:latest",

  // Uncomment the following lines to build the container locally. You will also need
  // to comment out the "image" line above.
  // "build": {
  //   "dockerfile": "./Dockerfile",
  //   "context": ".."
  // },

  // Necessary for C++ debugger to work.
  "capAdd": [
    "SYS_PTRACE"
  ],
  "securityOpt": [
    "seccomp=unconfined"
  ],

  // Configure tool-specific properties.
  "customizations": {
    // Configure properties specific to VS Code.
    "vscode": {
      // Set *default* container specific settings.json values on container create.
      "settings": {
        // Use vcpkg.
        "cmake.configureEnvironment": {
          "CMAKE_TOOLCHAIN_FILE": "/opt/vcpkg/scripts/buildsystems/vcpkg.cmake"
        },

        // Run cmake configure on open.
        "cmake.configureOnOpen": true,

        // Remove some cmake elements from the status bar.
        "cmake.statusbar.advanced": {
          "buildTarget": {
            "visibility": "hidden"
          },
          "kit": {
            "visibility": "hidden"
          },
          "ctest": {
            "visibility": "hidden"
          }
        }
      },

      // Add the IDs of extensions you want installed when the container is created.
      "extensions": [
        "ms-dotnettools.csdevkit",
        "ms-dotnettools.csharp",
        "ms-vscode.cpptools",
        "ms-vscode.cmake-tools"
      ]
    }
  },

  // Features to add to the dev container. More info: https://containers.dev/features.
  // "features": {},

  // Use 'forwardPorts' to make a list of ports inside the container available locally.
  // "forwardPorts": [],

  // Use 'postCreateCommand' to run commands after the container is created.
  // "postCreateCommand": "",

  // Set `remoteUser` to `root` to connect as root instead. More info: https://aka.ms/vscode-remote/containers/non-root.
  // "remoteUser": "vscode"
}
3 changes: 3 additions & 0 deletions .dockerignore
@@ -0,0 +1,3 @@
**
!vcpkg.json
!*/*.csproj
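
These three rules exclude everything from the Docker build context and then re-include only `vcpkg.json` and the project files exactly one directory deep. As a result, the `COPY . /tmp/build/` line in the `nuget` stage copies just those files, which is enough for `dotnet restore`. The sketch below approximates that selection with `find` so the effect can be checked locally. `list_build_context` is a hypothetical helper, and `find` patterns only approximate Docker's own matching rules.

```bash
#!/usr/bin/env bash
# Approximate which files survive the .dockerignore rules:
#   **            -> exclude everything
#   !vcpkg.json   -> keep the manifest at the repository root
#   !*/*.csproj   -> keep project files exactly one directory deep
# Assumption: `find` is used as a stand-in for Docker's pattern matching.
list_build_context() {
    local root="$1"
    (
        cd "$root" || exit 1
        # -maxdepth 2 keeps the search to the root and its direct subdirectories.
        find . -mindepth 1 -maxdepth 2 -type f \
            \( -path './vcpkg.json' -o -path './*/*.csproj' \) | LC_ALL=C sort
    )
}
```

Running `list_build_context .` from the repository root lists the files that would be sent to the Docker daemon under this approximation.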
7 changes: 7 additions & 0 deletions .github/workflows/ci.yml
@@ -117,6 +117,13 @@ jobs:
            vcpkg-${{ steps.vcpkg-info.outputs.triplet }}-cmake:${{ steps.cmake-info.outputs.version }}
            vcpkg-${{ steps.vcpkg-info.outputs.triplet }}
      # Ensure vcpkg builtin registry is up-to-date
      - name: Update vcpkg builtin registry
        working-directory: ${{ steps.vcpkg-info.outputs.root }}
        run: |
          git reset --hard
          git pull
      # Setup a CentOS 7 container to build on Linux x64 for backwards compatibility.
      - name: Start CentOS container and install toolchain
        if: runner.os == 'Linux' && matrix.arch == 'x64'
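
The new step exists because a runner image may ship a vcpkg checkout that is older than the baseline commit the manifest refers to (presumably through the `builtin-baseline` field of `vcpkg.json`; the manifest itself is not shown in this diff). A rough sketch of how one could test for that condition locally follows. `has_baseline` and `ensure_baseline` are hypothetical helpers, and the conditional refresh is an illustrative variation: the workflow step itself always runs `git reset --hard` and `git pull`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed if the vcpkg checkout at $1 contains commit $2.
has_baseline() {
    local vcpkg_root="$1" baseline="$2"
    # `git cat-file -e` exits non-zero when the object does not exist.
    git -C "$vcpkg_root" cat-file -e "${baseline}^{commit}" 2>/dev/null
}

# Same commands as the CI step, but only run when the baseline is missing.
ensure_baseline() {
    local vcpkg_root="$1" baseline="$2"
    if ! has_baseline "$vcpkg_root" "$baseline"; then
        git -C "$vcpkg_root" reset --hard
        git -C "$vcpkg_root" pull
    fi
    # Report whether the baseline is available after the refresh.
    has_baseline "$vcpkg_root" "$baseline"
}
```

A call such as `ensure_baseline "$VCPKG_ROOT" <commit>` would then fail only if the commit is still unknown after pulling.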
113 changes: 113 additions & 0 deletions .github/workflows/devcontainer.yml
@@ -0,0 +1,113 @@
name: Build devcontainer image

on:
  push:
    branches: [master]
    paths:
      - ".devcontainer/**"
      - ".dockerignore"
      - "vcpkg.json"
      - "*/*.csproj"
      - ".github/workflows/devcontainer.yml"
  pull_request:
    branches: [master]
    paths:
      - ".devcontainer/**"
      - ".dockerignore"
      - "vcpkg.json"
      - "*/*.csproj"
      - ".github/workflows/devcontainer.yml"
  # Run once a week
  schedule:
    - cron: "34 2 * * 2"

permissions:
  contents: read
  packages: write

jobs:
  build:
    name: Build devcontainer image
    strategy:
      fail-fast: false
      matrix:
        runner: [ubuntu-latest, ubuntu-20.04-arm64]
    runs-on: ${{ matrix.runner }}
    steps:
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
      - name: Compute image info
        id: image
        run: |
          echo "name=ghcr.io/$(echo ${{ github.repository }} | tr A-Z a-z)/devcontainer" >> "$GITHUB_OUTPUT"
          echo "push=${{ (github.event_name == 'push' && github.ref == 'refs/heads/master') || github.event_name == 'schedule' }}" >> "$GITHUB_OUTPUT"
      - name: Compute image labels
        id: meta
        uses: docker/metadata-action@v4
        with:
          images: ${{ steps.image.outputs.name }}
          tags: latest
          labels: |
            org.opencontainers.image.title=ParquetSharp devcontainer
            org.opencontainers.image.description=devcontainer for ParquetSharp
      - if: fromJson(steps.image.outputs.push)
        name: Login to GitHub Container Registry
        uses: docker/login-action@v2
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Build image${{ fromJson(steps.image.outputs.push) && ' and push it by digest' || ''}}
        id: build
        uses: docker/build-push-action@v4
        with:
          file: .devcontainer/Dockerfile
          labels: ${{ steps.meta.outputs.labels }}
          outputs: type=image,name=${{ steps.image.outputs.name }},push-by-digest=true,name-canonical=true,push=${{ steps.image.outputs.push }}
          cache-from: type=gha,scope=${{ github.ref }}-${{ matrix.runner }}
          cache-to: type=gha,scope=${{ github.ref }}-${{ matrix.runner }},mode=max
      - if: fromJson(steps.image.outputs.push)
        name: Export digest
        run: |
          mkdir -p /tmp/digests
          digest="${{ steps.build.outputs.digest }}"
          touch "/tmp/digests/${digest#sha256:}"
      - if: fromJson(steps.image.outputs.push)
        name: Upload digest
        uses: actions/upload-artifact@v3
        with:
          name: digests
          path: /tmp/digests/*
          if-no-files-found: error
          retention-days: 1
    outputs:
      image_name: ${{ steps.image.outputs.name }}
      image_push: ${{ steps.image.outputs.push }}

  merge:
    name: Merge platforms
    if: fromJson(needs.build.outputs.image_push)
    runs-on: ubuntu-latest
    needs:
      - build
    steps:
      - name: Download digests
        uses: actions/download-artifact@v3
        with:
          name: digests
          path: /tmp/digests
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
      - name: Login to GitHub Container Registry
        uses: docker/login-action@v2
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Create manifest list and push
        working-directory: /tmp/digests
        run: |
          docker buildx imagetools create -t ${{ needs.build.outputs.image_name }}:latest \
            $(printf '${{ needs.build.outputs.image_name }}@sha256:%s ' *)
      - name: Inspect image
        run: docker buildx imagetools inspect ${{ needs.build.outputs.image_name }}:latest
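
The workflow above builds one image per runner architecture, pushes each by digest, records every digest as an empty file named after its hex part, and lets the `merge` job turn those file names into a single manifest list tagged `latest`. The shell logic embedded in its `run` steps can be sketched as standalone functions. The function names and the sample inputs are hypothetical, and `merge_command` only prints the `docker buildx imagetools create` invocation instead of running it.

```bash
#!/usr/bin/env bash
# Sketch of the shell logic in the workflow; all names here are illustrative.

# Docker repository names must be lowercase, hence the `tr`.
image_name() {
    printf 'ghcr.io/%s/devcontainer\n' "$(printf '%s' "$1" | tr 'A-Z' 'a-z')"
}

# Push only for pushes to master or for scheduled runs.
should_push() {
    local event="$1" ref="$2"
    if { [ "$event" = push ] && [ "$ref" = refs/heads/master ]; } || [ "$event" = schedule ]; then
        echo true
    else
        echo false
    fi
}

# Build job: record a digest as an empty file named after its hex part.
export_digest() {
    local dir="$1" digest="$2"
    mkdir -p "$dir"
    touch "$dir/${digest#sha256:}"
}

# Merge job: turn every recorded digest into an image@sha256:<hex> reference
# and print the command that would create the multi-platform manifest list.
merge_command() {
    local dir="$1" image="$2" hex
    (
        cd "$dir" || exit 1
        printf 'docker buildx imagetools create -t %s:latest' "$image"
        for hex in *; do
            printf ' %s@sha256:%s' "$image" "$hex"
        done
        printf '\n'
    )
}
```

Passing digests through file names keeps the artifact tiny and lets the `merge` job collect them with a plain glob, whatever the number of platforms in the build matrix.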
2 changes: 1 addition & 1 deletion .github/workflows/nudge.yml
@@ -2,7 +2,7 @@ name: Nudge

 on:
   workflow_run:
-    workflows: [CI]
+    workflows: ["CI", "Build devcontainer image"]
     types: [completed]
     branches: [master]

4 changes: 4 additions & 0 deletions .gitignore
@@ -3,3 +3,7 @@ build
obj
nuget
BenchmarkDotNet.Artifacts

# The solution files get generated by vcpkg on Windows
# and by the C# Dev Kit within a dev container.
*.sln
73 changes: 57 additions & 16 deletions README.md
@@ -123,35 +123,76 @@ As only 64-bit runtimes are available, ParquetSharp cannot be referenced by a 32

## Building

-Building ParquetSharp for Windows requires the following dependencies:
-- Visual Studio 2022 (17.0 or higher)
-- Apache Arrow (13.0.0)
### Dev Container

-For building Arrow (including Parquet) and its dependencies, we recommend using Microsoft's [vcpkg](https://github.com/Microsoft/vcpkg).
-The build scripts will use an existing vcpkg installation if either of the `VCPKG_INSTALLATION_ROOT` or `VCPKG_ROOT` environment variables are defined,
-otherwise vcpkg will be downloaded into the build directory.
-Note that the Windows build needs to be done in a Visual Studio Developer PowerShell for the build script to succeed.
ParquetSharp can be built and tested within a [dev container](https://containers.dev). This is probably the easiest way to get started, as all the C++ dependencies are prebuilt into the container image.

-**Windows (Visual Studio 2022 Win64 solution)**
-```
-> build_windows.ps1
-> dotnet build csharp.test --configuration=Release
#### GitHub Codespaces

If you have a GitHub account, you can simply open ParquetSharp in a new GitHub Codespace by clicking on the green "Code" button at the top of this page.

Choose the "unspecified" CMake kit when prompted and let the C++ configuration run. Once done, you can build the C++ code via the "Build" button in the status bar at the bottom.

You can then build the C# code by right-clicking the ParquetSharp solution in the Solution Explorer on the left and choosing "Build". The Test Explorer will then get populated with all the C# tests too.

#### Visual Studio Code

If you want to work locally in [Visual Studio Code](https://code.visualstudio.com), all you need is to have [Docker](https://docs.docker.com/get-docker/) and the [Dev Containers extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers) installed.

Simply open up your copy of ParquetSharp in VS Code and click "Reopen in container" when prompted. Once the project has been opened, you can follow the GitHub Codespaces instructions above.

#### CLI

If the CLI is how you roll, then you can install the [Dev Container CLI](https://github.com/devcontainers/cli) tool and issue the following command in your copy of ParquetSharp to get up and running:

```bash
devcontainer up
```
-**Linux and macOS (Makefile)**

Build the C++ code and run the C# tests with:

```bash
devcontainer exec ./build_unix.sh
devcontainer exec dotnet test csharp.test
```
-> ./build_unix.sh
-> dotnet build csharp.test --configuration=Release

### Native

Building ParquetSharp natively requires the following dependencies:
- A modern C++ compiler toolchain
- .NET SDK 7.0
- Apache Arrow (13.0.0)

For building Arrow (including Parquet) and its dependencies, we recommend using Microsoft's [vcpkg](https://vcpkg.io).
The build scripts will use an existing vcpkg installation if either of the `VCPKG_INSTALLATION_ROOT` or `VCPKG_ROOT` environment variables are defined, otherwise vcpkg will be downloaded into the build directory.
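
The lookup described above can be sketched in a few lines of shell. This is not the code from the build scripts: `resolve_vcpkg_root` is a hypothetical helper, and both the precedence between the two variables and the `<build dir>/vcpkg` fallback path are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the vcpkg lookup. Assumptions: VCPKG_INSTALLATION_ROOT
# is checked before VCPKG_ROOT, and the fallback lives in "<build dir>/vcpkg".
resolve_vcpkg_root() {
    local build_dir="$1"
    if [ -n "${VCPKG_INSTALLATION_ROOT:-}" ]; then
        printf '%s\n' "$VCPKG_INSTALLATION_ROOT"
    elif [ -n "${VCPKG_ROOT:-}" ]; then
        printf '%s\n' "$VCPKG_ROOT"
    else
        # Neither variable is set: a fresh copy would be downloaded here.
        printf '%s\n' "$build_dir/vcpkg"
    fi
}
```

Inside the dev container, `VCPKG_ROOT` is set to `/opt/vcpkg` by the Dockerfile, so a lookup of this kind would pick up the prebuilt installation.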

#### Windows

Building ParquetSharp on Windows requires Visual Studio 2022 (17.0 or higher).

Open a Visual Studio Developer PowerShell and run the following commands to build the C++ code and run the C# tests:

```pwsh
build_windows.ps1
dotnet test csharp.test
```

-We have had to write our own `FindPackage` macros for most of the dependencies to get us going - it clearly needs more love and attention and is likely to be redundant with some vcpkg helper tools.
#### Unix

Build the C++ code and run the C# tests with:

```bash
./build_unix.sh
dotnet test csharp.test
```

## Contributing

We welcome new contributors! We will happily receive PRs for bug fixes or small changes. If you're contemplating something larger please get in touch first by opening a GitHub Issue describing the problem and how you propose to solve it.

## License

-Copyright 2018-2021 G-Research
+Copyright 2018-2023 G-Research

Licensed under the Apache License, Version 2.0 (the "License"); you may not use these files except in compliance with the License.
You may obtain a copy of the License at