E4-Computer-Engineering · GlassOfWhiskey · Mar 11, 2025 · Mar 7, 2025 · Mar 20, 2025 · Mar 24, 2025
diff --git a/.gitignore b/.gitignore
@@ -55,3 +55,8 @@ Thumbs.db
 
 # Build directory
 build/
+
+# StreamFlow
+.streamflow
+report.html
+output.txt
diff --git a/.gitmodules b/.gitmodules
@@ -0,0 +1,3 @@
+[submodule "SimulatedAnnealing"]
+	path = SimulatedAnnealing
+	url = git@github.com:E4-Computer-Engineering/SimulatedAnnealing.git
diff --git a/README.md b/README.md
@@ -3,24 +3,33 @@
 This work is based upon the approach from [A clustering aggregation algorithm on neutral-atoms and annealing quantum processors](https://arxiv.org/pdf/2412.07558).
 
 ## How to run
-Make sure to have a working MPI installation available. Its include path should either be added to $INCLUDE or $MPI_INC.
+
+Make sure to have a working MPI installation available. Its include path should either be added to `$INCLUDE` or `$MPI_INC`.
 
 The code can be compiled using `make`. The newly built executable will be under the build/bin directory.
 
 The code can be run as follows:
+
 ```bash
 mpirun -n 3 build/bin/clustering data/input/cluster_points_article.csv
 ```
+
 You can optionally add another argument to save the output matrix to file:
+
 ```bash
 mpirun -n 3 build/bin/clustering data/input/cluster_points_article.csv example_output.txt
 ```
+
 You can add one more optional argument to save the indices of points that comprise each cluster:
+
 ```bash
 mpirun -n 3 build/bin/clustering data/input/cluster_points_article.csv example_output.txt cluster_indices.txt
 ```
+
 ### Expected output
+
 Running the clustering executable will create an overlap matrix in the following form:
+
 ```
 -1 8 8 8 0 0 0 0
 0 -1 0 8 0 0 0 0
@@ -31,33 +40,75 @@ Running the clustering executable will create an overlap matrix in the following
 0 0 0 0 0 0 -1 8
 0 0 0 0 0 0 0 -1
 ```
+
 Each column/row represents a possible cluster. The diagonal terms are equal to -1; the off-diagonal ones are either 0 or a positive integer $\lambda$. Positive values denote overlaps between clusters. The value of $\lambda$ is set to the number of clusters, in this case 8, in order to prevent the selection of overlapping clusters.
 
 If you choose to also save the points of each cluster, they will be in this form:
+
 ```
 0,1,3,4
 2,5,7
 6,8,9
 ```
+
 Each line corresponds to a different cluster. Each of its comma-separated values corresponds to a point from the original input file.
+
+## Workflow run
+
+It is also possible to run the whole Classical-Quantum pipeline (clustering + simulated annealing) as a workflow using the [StreamFlow](https://streamflow.di.unito.it) WMS. To do that, you need to clone this repository and all the included submodules, as follows:
+
+```bash
+git clone --recurse-submodules git@github.com:E4-Computer-Engineering/clustering-mis.git
+```
+
+The StreamFlow WMS requires Python 3.9 or newer. It can easily be installed as a Python package using the following commands:
+
+```bash
+python -m venv venv
+source venv/bin/activate
+pip install "streamflow[report]==0.2.0.dev12"
+```
+
+The workflow configuration is expressed in a declarative `streamflow.yml` file. An [example](workflow/streamflow.yml) targeting the [CINECA@Leonardo](https://leonardo-supercomputer.cineca.eu/) HPC facility is included in this repository. Modify it by adding your credentials (`username` and `sshKey`) and a path to a working directory in a shared portion of the Leonardo filesystem (e.g., in your `$HOME` folder).
+
+At this point, simply run the workflow using this command:
+
+```bash
+streamflow run --name smart-hpc-qc workflow/streamflow.yml
+```
+
+When the workflow completes successfully, you should find an `output.txt` file containing the results of the simulated annealing phase. In addition, the following command generates a report of the workflow run:
+
+```bash
+streamflow report --file workflow/streamflow.yml smart-hpc-qc
+```
+
 ## TODO
+
 - [ ] Add brief description with images
 
 ## Suggested dev setup
+
 It is recommended to use [VS Code](https://code.visualstudio.com/).
+
 ### Linting and autocompletion
+
 IntelliSense from the Microsoft-provided C++ and Makefile extensions reports errors even if the code compiles.
 It is recommended to use the [clangd extension](https://marketplace.visualstudio.com/items?itemName=llvm-vs-code-extensions.vscode-clangd) instead.
 
 Install the clangd extension and allow it to disable IntelliSense. Install the clangd language server if prompted.
 
 Then install [bear](https://github.com/rizsotto/Bear) and, from the project root directory, run the following:
+
 ```bash
 make clean; bear -- make
 ```
+
 This will create a `compile_commands.json` file that is used by clangd to correctly inspect code.
 Run "clangd: Restart language server" from the Command Palette (Ctrl+Shift+P) to read the newly created file.
 
-You need to execute again the commands from above and restart the language server after each Makefile change 
+You need to rerun the commands above and restart the language server after each Makefile change.
+
 ### Formatting
+
 The clangd extension from the previous section can also format C/C++ code. Invoke it from Command Palette -> Format Document.
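
As a rough illustration of the overlap matrix described under "Expected output" in the README, the sketch below parses a matrix in that text format and evaluates it as an upper-triangular QUBO, i.e. $E(x) = \sum_{i \le j} Q_{ij} x_i x_j$ for a binary selection vector $x$. Reading the matrix this way is an assumption. The 3x3 matrix, the function names, and the brute-force search are hypothetical and not part of this repository; the actual minimization is performed by the simulated annealing code.

```python
from itertools import product


def parse_matrix(text):
    """Parse a whitespace-separated integer matrix, one row per line."""
    return [[int(v) for v in line.split()] for line in text.strip().splitlines()]


def parse_clusters(text):
    """Parse the cluster file: one cluster per line, comma-separated point indices."""
    return [[int(v) for v in line.split(",")] for line in text.strip().splitlines()]


def energy(q, x):
    """Evaluate the sum over i <= j of q[i][j] * x[i] * x[j] for a 0/1 vector x."""
    n = len(q)
    return sum(q[i][j] * x[i] * x[j] for i in range(n) for j in range(i, n))


def best_selection(q):
    """Brute-force a lowest-energy selection (only viable for tiny matrices)."""
    return min(product((0, 1), repeat=len(q)), key=lambda x: energy(q, x))


# Hypothetical 3-cluster example: clusters 0 and 1 overlap, lambda = 3.
Q = parse_matrix("""
-1 3 0
0 -1 0
0 0 -1
""")

print(energy(Q, (1, 0, 1)))  # two non-overlapping clusters selected
print(energy(Q, (1, 1, 1)))  # the overlap penalty cancels the three rewards
print(parse_clusters("0,1,3,4\n2,5,7\n6,8,9"))
```

In this toy case, selecting two non-overlapping clusters yields a lower energy than selecting all three, which is the effect the $\lambda$ penalty is meant to have.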
diff --git a/SimulatedAnnealing b/SimulatedAnnealing
diff --git a/workflow/cwl/clt/annealing.cwl b/workflow/cwl/clt/annealing.cwl
@@ -0,0 +1,22 @@
+cwlVersion: v1.2
+class: CommandLineTool
+requirements:
+  ToolTimeLimit:
+    timelimit: 300
+arguments:
+  - position: 3
+    valueFrom: output.txt
+inputs:
+  annealing:
+    type: File
+    inputBinding:
+      position: 1
+  qubo:
+    type: File
+    inputBinding:
+      position: 2
+outputs:
+  output:
+    type: File
+    outputBinding:
+      glob: output.txt
diff --git a/workflow/cwl/clt/build.cwl b/workflow/cwl/clt/build.cwl
@@ -0,0 +1,18 @@
+cwlVersion: v1.2
+class: CommandLineTool
+requirements:
+  InitialWorkDirRequirement:
+    listing: $(inputs.src)
+baseCommand: [make]
+inputs:
+  src:
+    type:
+      type: array
+      items: [File, Directory]
+  output_path: string
+outputs:
+  output:
+    type: File
+    outputBinding:
+      glob: $(inputs.output_path)
+
diff --git a/workflow/cwl/clt/clustering.cwl b/workflow/cwl/clt/clustering.cwl
@@ -0,0 +1,34 @@
+cwlVersion: v1.2
+class: CommandLineTool
+requirements:
+  ToolTimeLimit:
+    timelimit: 300
+baseCommand: [mpirun, --bind-to, core:overload-allowed]
+arguments:
+  - position: 4
+    valueFrom: output.txt
+  - position: 5
+    valueFrom: indices.txt
+inputs:
+  clustering:
+    type: File
+    inputBinding:
+      position: 2
+  points:
+    type: File
+    inputBinding:
+      position: 3
+  processes:
+    type: int
+    inputBinding:
+      position: 1
+      prefix: -n
+outputs:
+  indices:
+    type: File
+    outputBinding:
+      glob: indices.txt
+  output:
+    type: File
+    outputBinding:
+      glob: output.txt
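
To make the effect of the `position` fields in `clustering.cwl` easier to follow, here is a minimal sketch of how the bound values end up ordered on the command line. It is a simplification of the CWL binding rules (it only sorts by position and handles a separate prefix). The `build_command` helper and the file paths are hypothetical, since a real runner passes the staged paths of the `File` inputs.

```python
def build_command(base_command, bindings):
    """Order bound values by position and append them to the base command.

    Each binding is a (position, prefix_or_None, value) tuple.
    Simplification: real CWL applies further tie-breaking and binding rules.
    """
    argv = list(base_command)
    for _, prefix, value in sorted(bindings, key=lambda b: b[0]):
        if prefix is not None:
            argv.append(prefix)
        argv.append(str(value))
    return argv


# Positions mirror clustering.cwl; the file paths are placeholders.
bindings = [
    (4, None, "output.txt"),
    (5, None, "indices.txt"),
    (2, None, "build/bin/clustering"),
    (3, None, "cluster_points_article.csv"),
    (1, "-n", 3),
]
command = build_command(["mpirun", "--bind-to", "core:overload-allowed"], bindings)
print(" ".join(command))
```

The resulting argument order matches the three-argument `mpirun` invocation shown in the README, with the output and indices file names fixed by the tool.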
diff --git a/workflow/cwl/config.yml b/workflow/cwl/config.yml
@@ -0,0 +1,19 @@
+annealing_src:
+  - class: File
+    path: ../../SimulatedAnnealing/Makefile
+  - class: Directory
+    path: ../../SimulatedAnnealing/src
+clustering_src:
+  - class: File
+    path: ../../Makefile
+  - class: File
+    path: ../../clustering.cpp
+  - class: File
+    path: ../../points.cpp
+  - class: Directory
+    path: ../../include
+  - class: Directory
+    path: ../../vendor
+points:
+  class: File
+  path: ../../data/input/cluster_points_article.csv
diff --git a/workflow/cwl/main.cwl b/workflow/cwl/main.cwl
@@ -0,0 +1,51 @@
+cwlVersion: v1.2
+class: Workflow
+requirements:
+  StepInputExpressionRequirement: {}
+  ToolTimeLimit:
+    timelimit: 300
+inputs:
+  annealing_src:
+    type:
+      type: array
+      items: [File, Directory]
+  clustering_src:
+    type:
+      type: array
+      items: [File, Directory]
+  points: File
+  processes:
+    type: int
+    default: 3
+outputs:
+  annealing:
+    type: File
+    outputSource: annealing/output
+steps:
+  build-clustering:
+    run: clt/build.cwl
+    in:
+      src: clustering_src
+      output_path:
+        valueFrom: build/bin/clustering
+    out: [output]
+  clustering:
+    run: clt/clustering.cwl
+    in:
+      clustering: build-clustering/output
+      points: points
+      processes: processes
+    out: [indices, output]
+  build-annealing:
+    run: clt/build.cwl
+    in:
+      src: annealing_src
+      output_path:
+        valueFrom: build/bin/simAnnSingle.out
+    out: [output]
+  annealing:
+    run: clt/annealing.cwl
+    in:
+      annealing: build-annealing/output
+      qubo: clustering/output
+    out: [output]
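
The data dependencies between the four steps of `main.cwl` form a small DAG. The sketch below writes them out by hand and derives one valid execution order with the standard-library `graphlib` module (Python 3.9+). The dictionary is a manual transcription of the `in:` sources above, not something produced by StreamFlow or a CWL parser.

```python
from graphlib import TopologicalSorter

# Each step maps to the steps whose outputs it consumes in main.cwl.
dependencies = {
    "build-clustering": set(),
    "build-annealing": set(),
    "clustering": {"build-clustering"},
    "annealing": {"build-annealing", "clustering"},
}

# static_order() yields every step after all of its predecessors.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

The two build steps have no mutual dependency, so a workflow engine is free to run them concurrently; only `annealing` has to wait for both branches.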
diff --git a/workflow/slurm/leonardo.sh b/workflow/slurm/leonardo.sh
@@ -0,0 +1,5 @@
+#!/bin/bash
+
+module load openmpi/4.1.6--gcc--12.2.0
+
+{{streamflow_command}}
diff --git a/workflow/streamflow.yml b/workflow/streamflow.yml
@@ -0,0 +1,55 @@
+version: v1.0
+workflows:
+  smart-hpc-qc:
+    type: cwl
+    config:
+      file: cwl/main.cwl
+      settings: cwl/config.yml
+    bindings:
+      - step: /build-clustering
+        target:
+          deployment: leonardo
+          service: dcgp
+      - step: /clustering
+        target:
+          deployment: leonardo
+          service: dcgp
+      - step: /build-annealing
+        target:
+          deployment: leonardo
+          service: booster
+      - step: /annealing
+        target:
+          deployment: leonardo
+          service: booster
+database:
+  type: sqlite
+  config:
+    connection: .streamflow/sqlite.db
+deployments:
+  leonardo-ssh:
+    type: ssh
+    config:
+      nodes:
+        - hostname: login.leonardo.cineca.it
+          checkHostKey: false
+          sshKey: </path/to/ssh/key>
+          username: <username>
+    workdir: </path/to/shared/workdir>
+  leonardo:
+    type: slurm
+    config:
+      maxConcurrentJobs: 2
+      services:
+        booster:
+          account: IscrC_SHPC-QC
+          file: slurm/leonardo.sh
+          partition: boost_usr_prod
+        dcgp:
+          account: IscrC_SHPC-QC_0
+          file: slurm/leonardo.sh
+          gres: tmpfs:1g
+          partition: dcgp_usr_prod
+          ntasks: 3
+    wraps: leonardo-ssh
+
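
Since `streamflow.yml` ships with `<...>` placeholders for `sshKey`, `username`, and `workdir`, a quick pre-flight check can catch a configuration that has not been filled in yet. The helper below is a hypothetical sketch using only the standard library; it scans plain text rather than parsing YAML, and the sample input is made up for illustration.

```python
import re

# Matches tokens such as <username> or </path/to/ssh/key>.
PLACEHOLDER = re.compile(r"<[^<>\s]+>")


def find_placeholders(text):
    """Return (line_number, placeholder) pairs for every <...> token left in text."""
    found = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in PLACEHOLDER.finditer(line):
            found.append((number, match.group()))
    return found


# Made-up sample: two unfilled values and one filled value.
sample = """\
sshKey: </path/to/ssh/key>
username: <username>
workdir: /home/user/workdir
"""
print(find_placeholders(sample))
```

An empty result suggests that all placeholders have been replaced; it does not validate the credentials or paths themselves.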