now waiting for init container termination
Signed-off-by: Surax98 <[email protected]>
Surax98 committed Jun 18, 2024
1 parent 7fc494b commit e6c5118
Showing 4 changed files with 141 additions and 168 deletions.
152 changes: 121 additions & 31 deletions README.md
@@ -1,43 +1,133 @@
# Template for interTwin repositories
# :information_source: Overview

This repository is to be used as a repository template for creating a new interTwin
repository, and is aiming at being a clean basis promoting currently accepted
good practices.
![Interlink logo](./docs/static/img/interlink_logo.png)

It includes:
## Introduction

- License information
- Copyright and author information
- Code of conduct and contribution guidelines
- Templates for PR and issues
- Code owners file for automatic assignment of PR reviewers
- [GitHub actions](https://github.com/features/actions) workflows for linting
and checking links
InterLink aims to provide an abstraction for the execution of a Kubernetes pod on any remote resource capable of managing
a container execution lifecycle. We aim to facilitate the development of provider-specific plugins, so that resource
providers can leverage the power of Virtual Kubelet without a black belt in Kubernetes internals.

Content is based on:
The project consists of two main components:

- [Contributor Covenant](http://contributor-covenant.org)
- [Semantic Versioning](https://semver.org/)
- [Chef Cookbook Contributing Guide](https://github.com/chef-cookbooks/community_cookbook_documentation/blob/master/CONTRIBUTING.MD)
- __A Kubernetes Virtual Node:__ based on the [VirtualKubelet](https://virtual-kubelet.io/) technology.
It translates requests for a Kubernetes pod execution into remote calls to the interLink API server.
- __The interLink API server:__ a modular and pluggable REST server where you can create your own container manager plugins
(called sidecars), or use the existing ones: remote Docker execution on a remote host, Singularity containers on
a remote SLURM batch system. This repository aims to maintain the SLURM sidecar as a standalone plugin.

## GitHub repository management rules
The project was inspired by the [KNoC](https://github.com/CARV-ICS-FORTH/knoc) and
[Liqo](https://github.com/liqotech/liqo/tree/master) projects, and extends them with the implementation of a generic API
layer between the virtual kubelet component and the provider logic for container lifecycle management.

All changes should go through Pull Requests.
## :electron: Usage

### Merge management
### :bangbang: Requirements

- Only squash should be enforced in the repository settings.
- Update commit message for the squashed commits as needed.
- __[Our Kubernetes Virtual Node and the interLink API server](https://github.com/interTwin-eu/interLink)__

### Protection on main branch
- __[The Go programming language](https://go.dev/doc/install)__ (to build binaries)

To be configured on the repository settings.
- __[Docker Engine](https://docs.docker.com/engine/)__ (optional)

- Require pull request reviews before merging
- Dismiss stale pull request approvals when new commits are pushed
- Require review from Code Owners
- Require status checks to pass before merging
- GitHub actions if available
- Other checks as available and relevant
- Require branches to be up to date before merging
- Include administrators
Note: if you only want the quick-start setup (using a Docker container), Go is not necessary.

### :warning: Remember to set CPU and Memory limits in your Pod/Deployment YAML, otherwise default resources will be applied: without a CPU limit, only 1 CPU will be used for each task, and without a Memory limit, only 1MB will be used for each task. :warning:

### :fast_forward: Quick Start

Just run:

```bash
cd docker && docker compose up -d
```

This starts a Docker container with a full SLURM environment. You can find the SLURM configuration it uses
in /docker/SlurmConfig.yaml. If you want to update the config after you have already started the container once, run:

```bash
docker compose up -d --build --force-recreate
```

This deletes the old container, rebuilds the image, and deploys a new container with the updated config.

### :hammer: Building binaries

It is also possible to use the binaries as a standalone application. Just run

```bash
make all
```

and you will find the built slurm-sd binary inside the bin directory. Before executing it, check
that the configuration file is set according to your needs. You can find an example under examples/config/SlurmConfig.yaml.
Do not forget to set the SLURMCONFIGPATH environment variable to point to your config.

### :gear: A SLURM config example

```yaml
SidecarPort: "4000"
SbatchPath: "/usr/bin/sbatch"
ScancelPath: "/usr/bin/scancel"
SqueuePath: "/usr/bin/squeue"
CommandPrefix: ""
SingularityPrefix: ""
ExportPodData: true
DataRootFolder: ".local/interlink/jobs/"
Namespace: "vk"
Tsocks: false
TsocksPath: "$WORK/tsocks-1.8beta5+ds1/libtsocks.so"
TsocksLoginNode: "login01"
BashPath: /bin/bash
VerboseLogging: true
ErrorsOnlyLogging: false
```
### :pencil2: Annotations

It is possible to specify Annotations when submitting Pods to the K8S cluster. A list of all Annotations follows:

| Annotation | Description|
|--------------|------------|
| slurm-job.vk.io/singularity-commands | Used to add specific commands to be executed before the actual SLURM job starts. It adds the commands on the Singularity execution line, in the SLURM batch file |
| slurm-job.vk.io/pre-exec | Used to add commands to be executed before the job starts. It adds a command in the SLURM batch file after the #SBATCH directives |
| slurm-job.vk.io/singularity-mounts | Used to add mountpoints to the Singularity containers |
| slurm-job.vk.io/singularity-options | Used to specify Singularity arguments |
| slurm-job.vk.io/image-root | Used to specify the root path of the Singularity image |
| slurm-job.vk.io/flags | Used to specify SLURM flags. These flags will be added to the SLURM script in the form of #SBATCH flag1, #SBATCH flag2, etc |
| slurm-job.vk.io/mpi-flags | Used to prepend "mpiexec -np $SLURM_NTASKS \*flags\*" to the Singularity execution |
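
As an illustration of the `slurm-job.vk.io/flags` row, the following minimal Go sketch turns an annotation value into `#SBATCH` lines. It is not the plugin's actual code: the helper name `sbatchDirectives` is hypothetical, and splitting the value on whitespace is an assumption, since the table does not say how multiple flags are separated.

```go
package main

import (
	"fmt"
	"strings"
)

// sbatchDirectives converts the value of the slurm-job.vk.io/flags annotation
// into one "#SBATCH <flag>" line per flag.
// Assumption: flags are separated by whitespace; the real plugin may differ.
func sbatchDirectives(annotations map[string]string) []string {
	value, ok := annotations["slurm-job.vk.io/flags"]
	if !ok {
		// No annotation set: nothing to add to the batch script.
		return nil
	}
	var lines []string
	for _, flag := range strings.Fields(value) {
		lines = append(lines, "#SBATCH "+flag)
	}
	return lines
}

func main() {
	annotations := map[string]string{
		"slurm-job.vk.io/flags": "--partition=debug --time=00:10:00",
	}
	for _, line := range sbatchDirectives(annotations) {
		fmt.Println(line)
	}
}
```

The lookup uses the same `value, ok := annotations[key]` idiom that `SubmitHandler` uses for `slurm-job.vk.io/singularity-commands`.
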
### :gear: Explanation of the SLURM Config file

Detailed explanation of the SLURM config file key values. Edit the config file before running the binary or before
building the docker image (`docker compose up -d --build --force-recreate` will recreate and re-run the updated image)

| Key | Value |
|--------------|-----------|
| SidecarPort | the sidecar listening port. Sidecar and Interlink will communicate on this port. Set $SIDECARPORT environment variable to specify a custom one |
| SbatchPath | path to your Slurm's sbatch binary |
| ScancelPath | path to your Slurm's scancel binary |
| CommandPrefix | here you can specify a prefix for the programmatically generated script (for the slurm plugin). Basically, if you want to run anything before the script itself, put it here. |
| ExportPodData | Set it to true if you want to export Pod's ConfigMaps and Secrets as mountpoints in your Singularity Container |
| DataRootFolder | Specify where to store the exported ConfigMaps/Secrets locally |
| Namespace | Namespace where Pods in your K8S will be registered |
| Tsocks | true or false values only. Enables or Disables the use of tsocks library to allow proxy networking. Only implemented for the Slurm sidecar at the moment. |
| TsocksPath | path to your tsocks library. |
| TsocksLoginNode | specify an existing node to ssh to. It will be your "window to the external world" |
| BashPath | Path to your Bash shell |
| VerboseLogging | Enable or disable Debug messages on logs. True or False values only |
| ErrorsOnlyLogging | Specify if you want to get errors only on logs. True or false values only |

### :wrench: Environment Variables list

Here's the complete list of customizable environment variables. When set, a variable overrides the corresponding key
in the SLURM config file.

| Env | Value |
|--------------|-----------|
| SLURMCONFIGPATH | your SLURM config file path. Default is `/etc/interlink/SlurmConfig.yaml` |
| SIDECARPORT | the Sidecar listening port. Docker default is 4000, Slurm default is 4001. |
| SBATCHPATH | path to your Slurm's sbatch binary. Overwrites SbatchPath. |
| SCANCELPATH | path to your Slurm's scancel binary. Overwrites ScancelPath. |
| SHARED_FS | set this env to "true" to save configmaps values inside files directly mounted to Singularity containers instead of using ENVS to create them later |
| CUSTOMKUBECONF | path to a service account kubeconfig |
| TSOCKS | true or false, to use tsocks library allowing proxy networking. Working on Slurm sidecar at the moment. Overwrites Tsocks. |
| TSOCKSPATH | path to your tsocks library. Overwrites TsocksPath. |
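
The override behaviour described in this table can be sketched in Go as follows. This is a minimal illustration under assumptions, not the sidecar's real loader: the `SlurmConfig` struct is a hypothetical, trimmed-down stand-in for the actual config type, and the helper names `applyEnvOverrides` and `configPath` are invented for the example. Only the variable names and the default config path come from the table.

```go
package main

import (
	"fmt"
	"os"
)

// SlurmConfig is a hypothetical subset of the keys in SlurmConfig.yaml.
type SlurmConfig struct {
	SidecarPort string
	SbatchPath  string
	ScancelPath string
}

// applyEnvOverrides returns a copy of cfg in which every key that has a
// non-empty environment variable set is replaced by that variable's value.
// lookup has the same signature as os.LookupEnv so it can be faked in tests.
func applyEnvOverrides(cfg SlurmConfig, lookup func(string) (string, bool)) SlurmConfig {
	overrides := map[string]*string{
		"SIDECARPORT": &cfg.SidecarPort,
		"SBATCHPATH":  &cfg.SbatchPath,
		"SCANCELPATH": &cfg.ScancelPath,
	}
	for env, target := range overrides {
		if value, ok := lookup(env); ok && value != "" {
			*target = value
		}
	}
	return cfg
}

// configPath returns SLURMCONFIGPATH when set, else the documented default.
func configPath(lookup func(string) (string, bool)) string {
	if p, ok := lookup("SLURMCONFIGPATH"); ok && p != "" {
		return p
	}
	return "/etc/interlink/SlurmConfig.yaml"
}

func main() {
	cfg := SlurmConfig{SidecarPort: "4000", SbatchPath: "/usr/bin/sbatch", ScancelPath: "/usr/bin/scancel"}
	cfg = applyEnvOverrides(cfg, os.LookupEnv)
	fmt.Println("config file:", configPath(os.LookupEnv))
	fmt.Printf("%+v\n", cfg)
}
```

Passing the lookup function as a parameter keeps the override logic testable without touching the process environment.
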
131 changes: 0 additions & 131 deletions Readme.md

This file was deleted.

10 changes: 8 additions & 2 deletions pkg/slurm/Create.go
@@ -45,7 +45,7 @@ func (h *SidecarHandler) SubmitHandler(w http.ResponseWriter, r *http.Request) {
var singularity_command_pod []SingularityCommand
var resourceLimits ResourceLimits

for _, container := range containers {
for i, container := range containers {
log.G(h.Ctx).Info("- Beginning script generation for container " + container.Name)
singularityPrefix := commonIL.InterLinkConfigInst.SingularityPrefix
if singularityAnnotation, ok := metadata.Annotations["slurm-job.vk.io/singularity-commands"]; ok {
@@ -105,7 +105,13 @@ func (h *SidecarHandler) SubmitHandler(w http.ResponseWriter, r *http.Request) {
singularity_command = append(singularity_command, mounts)
singularity_command = append(singularity_command, image)

singularity_command_pod = append(singularity_command_pod, SingularityCommand{singularityCommand: singularity_command, containerName: container.Name, containerArgs: container.Args, containerCommand: container.Command})
isInit := false

if i < len(data.Pod.Spec.InitContainers) {
isInit = true
}

singularity_command_pod = append(singularity_command_pod, SingularityCommand{singularityCommand: singularity_command, containerName: container.Name, containerArgs: container.Args, containerCommand: container.Command, isInitContainer: isInit})
}

path, err := produceSLURMScript(h.Ctx, h.Config, string(data.Pod.UID), filesPath, metadata, singularity_command_pod, resourceLimits)
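
Review note: the `i < len(data.Pod.Spec.InitContainers)` check in this hunk only identifies init containers if `containers` lists the init containers first, followed by the regular ones. That ordering is not visible in this diff, so it is treated as an assumption here. Under that assumption, the index test can be sketched in isolation (the helper `markInitContainers` is hypothetical and uses plain names instead of the Kubernetes types):

```go
package main

import "fmt"

// markInitContainers reports, for each entry of a combined container list,
// whether it is an init container.
// Assumption: the list holds the initCount init containers first, then the
// regular containers, which is what the index comparison in the hunk relies on.
func markInitContainers(names []string, initCount int) []bool {
	flags := make([]bool, len(names))
	for i := range names {
		// Same test as in SubmitHandler, written as a single boolean expression.
		flags[i] = i < initCount
	}
	return flags
}

func main() {
	names := []string{"init-setup", "app", "log-shipper"}
	for i, isInit := range markInitContainers(names, 1) {
		fmt.Printf("%s isInitContainer=%t\n", names[i], isInit)
	}
}
```

The same boolean expression would let the hunk drop the `isInit := false` / `if` pair in favour of `isInit := i < len(data.Pod.Spec.InitContainers)`.
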
16 changes: 12 additions & 4 deletions pkg/slurm/aux.go
@@ -48,6 +48,7 @@ type ResourceLimits struct {

type SingularityCommand struct {
containerName string
isInitContainer bool
singularityCommand []string
containerCommand []string
containerArgs []string
@@ -460,10 +461,17 @@ func produceSLURMScript(
os.Chmod(f2.Name(), 0777|os.ModePerm)
os.Chmod(f3.Name(), 0777|os.ModePerm)

stringToBeWritten += "\n" + strings.Join(singularityCommand.singularityCommand[:], " ") + " " +
"/bin/sh" + " /tmp/" + "command_" + singularityCommand.containerName + ".sh" +
" &> " + path + "/" + singularityCommand.containerName + ".out; " +
"echo $? > " + path + "/" + singularityCommand.containerName + ".status &"
if singularityCommand.isInitContainer {
stringToBeWritten += "\n" + strings.Join(singularityCommand.singularityCommand[:], " ") + " " +
"/bin/sh" + " /tmp/" + "command_" + singularityCommand.containerName + ".sh" +
" &> " + path + "/" + singularityCommand.containerName + ".out; " +
"echo $? > " + path + "/" + singularityCommand.containerName + ".status"
} else {
stringToBeWritten += "\n" + strings.Join(singularityCommand.singularityCommand[:], " ") + " " +
"/bin/sh" + " /tmp/" + "command_" + singularityCommand.containerName + ".sh" +
" &> " + path + "/" + singularityCommand.containerName + ".out; " +
"echo $? > " + path + "/" + singularityCommand.containerName + ".status &"
}
}

stringToBeWritten += "\n" + postfix
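
Review note: the two branches in this hunk build the same string and differ only in the trailing ` &`, which is omitted for init containers. A minimal sketch of the same line construction with the shared part factored out is shown below. The function name `scriptLine` and its parameters are hypothetical, not part of the commit.

```go
package main

import (
	"fmt"
	"strings"
)

// scriptLine builds the batch-script line for one container, mirroring the
// concatenation in produceSLURMScript. For init containers the trailing " &"
// is left out, as in the isInitContainer branch of the hunk.
func scriptLine(singularityCmd []string, name, path string, isInit bool) string {
	line := "\n" + strings.Join(singularityCmd, " ") +
		" /bin/sh /tmp/command_" + name + ".sh" +
		" &> " + path + "/" + name + ".out; " +
		"echo $? > " + path + "/" + name + ".status"
	if !isInit {
		line += " &"
	}
	return line
}

func main() {
	cmd := []string{"singularity", "exec", "image.sif"}
	fmt.Print(scriptLine(cmd, "init-setup", "/jobs/1", true))
	fmt.Println(scriptLine(cmd, "app", "/jobs/1", false))
}
```
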