Commit

deploy: 96bbc49
ErinWeisbart committed May 16, 2024
1 parent fc9cbdf commit 6c83b9e
Showing 51 changed files with 873 additions and 670 deletions.
2 changes: 1 addition & 1 deletion .buildinfo
@@ -1,4 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: 43910ce4514717c925ed4a9dd8e274c7
config: 19dbb8ba01cd8bc55be71fdd575819b7
tags: 645f666f9bcd5a90fca523b33c5a78b7
59 changes: 34 additions & 25 deletions SQS_QUEUE_information.html
@@ -3,7 +3,7 @@
<!DOCTYPE html>


<html lang="en" >
<html lang="en" data-content_root="" >

<head>
<meta charset="utf-8" />
@@ -19,12 +19,12 @@
</script>

<!-- Loaded before other Sphinx assets -->
<link href="_static/styles/theme.css?digest=e353d410970836974a52" rel="stylesheet" />
<link href="_static/styles/bootstrap.css?digest=e353d410970836974a52" rel="stylesheet" />
<link href="_static/styles/pydata-sphinx-theme.css?digest=e353d410970836974a52" rel="stylesheet" />
<link href="_static/styles/theme.css?digest=5b4479735964841361fd" rel="stylesheet" />
<link href="_static/styles/bootstrap.css?digest=5b4479735964841361fd" rel="stylesheet" />
<link href="_static/styles/pydata-sphinx-theme.css?digest=5b4479735964841361fd" rel="stylesheet" />


<link href="_static/vendor/fontawesome/6.1.2/css/all.min.css?digest=e353d410970836974a52" rel="stylesheet" />
<link href="_static/vendor/fontawesome/6.1.2/css/all.min.css?digest=5b4479735964841361fd" rel="stylesheet" />
<link rel="preload" as="font" type="font/woff2" crossorigin href="_static/vendor/fontawesome/6.1.2/webfonts/fa-solid-900.woff2" />
<link rel="preload" as="font" type="font/woff2" crossorigin href="_static/vendor/fontawesome/6.1.2/webfonts/fa-brands-400.woff2" />
<link rel="preload" as="font" type="font/woff2" crossorigin href="_static/vendor/fontawesome/6.1.2/webfonts/fa-regular-400.woff2" />
@@ -38,8 +38,9 @@
<link rel="stylesheet" type="text/css" href="_static/design-style.4045f2051d55cab465a707391d5b2007.min.css" />

<!-- Pre-loaded scripts that we'll load fully later -->
<link rel="preload" as="script" href="_static/scripts/bootstrap.js?digest=e353d410970836974a52" />
<link rel="preload" as="script" href="_static/scripts/pydata-sphinx-theme.js?digest=e353d410970836974a52" />
<link rel="preload" as="script" href="_static/scripts/bootstrap.js?digest=5b4479735964841361fd" />
<link rel="preload" as="script" href="_static/scripts/pydata-sphinx-theme.js?digest=5b4479735964841361fd" />
<script src="_static/vendor/fontawesome/6.1.2/js/all.min.js?digest=5b4479735964841361fd"></script>

<script data-url_root="./" id="documentation_options" src="_static/documentation_options.js"></script>
<script src="_static/jquery.js"></script>
@@ -79,6 +80,15 @@

<a class="skip-link" href="#main-content">Skip to main content</a>

<div id="pst-scroll-pixel-helper"></div>


<button type="button" class="btn rounded-pill" id="pst-back-to-top">
<i class="fa-solid fa-arrow-up"></i>
Back to top
</button>


<input type="checkbox"
class="sidebar-toggle"
name="__primary"
@@ -131,20 +141,22 @@

<div class="sidebar-primary-items__start sidebar-primary__section">
<div class="sidebar-primary-item">



<a class="navbar-brand logo" href="introduction.html">










<img src="_static/Distributed_Something_Logo_only.png" class="logo__image only-light" alt="Logo image"/>
<script>document.write(`<img src="_static/Distributed_Something_Logo_only.png" class="logo__image only-dark" alt="Logo image"/>`);</script>
<img src="_static/Distributed_Something_Logo_only.png" class="logo__image only-light" alt="DS Documentation - Home"/>
<script>document.write(`<img src="_static/Distributed_Something_Logo_only.png" class="logo__image only-dark" alt="DS Documentation - Home"/>`);</script>


</a></div>
@@ -185,11 +197,9 @@
<p aria-level="2" class="caption" role="heading"><span class="caption-text">Technical guides</span></p>
<ul class="nav bd-sidenav">
<li class="toctree-l1"><a class="reference internal" href="dashboard.html">AWS Cloudwatch Dashboard</a></li>

<li class="toctree-l1"><a class="reference internal" href="troubleshooting_runs.html">Troubleshooting</a></li>
<li class="toctree-l1"><a class="reference internal" href="hygiene.html">AWS Hygiene Scripts</a></li>
<li class="toctree-l1"><a class="reference internal" href="versions.html">Versions</a></li>

</ul>

</div>
@@ -360,20 +370,22 @@
</button>



<script>
document.write(`
<button class="theme-switch-button btn btn-sm btn-outline-primary navbar-btn rounded-circle" title="light/dark" aria-label="light/dark" data-bs-placement="bottom" data-bs-toggle="tooltip">
<span class="theme-switch" data-mode="light"><i class="fa-solid fa-sun"></i></span>
<span class="theme-switch" data-mode="dark"><i class="fa-solid fa-moon"></i></span>
<span class="theme-switch" data-mode="auto"><i class="fa-solid fa-circle-half-stroke"></i></span>
<button class="btn btn-sm navbar-btn theme-switch-button" title="light/dark" aria-label="light/dark" data-bs-placement="bottom" data-bs-toggle="tooltip">
<span class="theme-switch nav-link" data-mode="light"><i class="fa-solid fa-sun fa-lg"></i></span>
<span class="theme-switch nav-link" data-mode="dark"><i class="fa-solid fa-moon fa-lg"></i></span>
<span class="theme-switch nav-link" data-mode="auto"><i class="fa-solid fa-circle-half-stroke fa-lg"></i></span>
</button>
`);
</script>


<script>
document.write(`
<button class="btn btn-sm navbar-btn search-button search-button__button" title="Search" aria-label="Search" data-bs-placement="bottom" data-bs-toggle="tooltip">
<i class="fa-solid fa-magnifying-glass"></i>
<i class="fa-solid fa-magnifying-glass fa-lg"></i>
</button>
`);
</script>
@@ -472,11 +484,10 @@ <h2>Example SQS Queue<a class="headerlink" href="#example-sqs-queue" title="Perm



<footer class="bd-footer-article">


<footer class="prev-next-footer">

<div class="footer-article-items footer-article__inner">

<div class="footer-article-item"><!-- Previous / next buttons -->
<div class="prev-next-area">
<a class="left-prev"
href="step_1_configuration.html"
@@ -496,10 +507,7 @@ <h2>Example SQS Queue<a class="headerlink" href="#example-sqs-queue" title="Perm
</div>
<i class="fa-solid fa-angle-right"></i>
</a>
</div></div>

</div>

</footer>

</div>
@@ -539,6 +547,7 @@ <h2>Example SQS Queue<a class="headerlink" href="#example-sqs-queue" title="Perm

<div class="footer-item">


<p class="copyright">

© Copyright 2022.
@@ -565,8 +574,8 @@ <h2>Example SQS Queue<a class="headerlink" href="#example-sqs-queue" title="Perm
</div>

<!-- Scripts loaded after <body> so the DOM is not blocked -->
<script src="_static/scripts/bootstrap.js?digest=e353d410970836974a52"></script>
<script src="_static/scripts/pydata-sphinx-theme.js?digest=e353d410970836974a52"></script>
<script src="_static/scripts/bootstrap.js?digest=5b4479735964841361fd"></script>
<script src="_static/scripts/pydata-sphinx-theme.js?digest=5b4479735964841361fd"></script>

<footer class="bd-footer">
</footer>
15 changes: 9 additions & 6 deletions _sources/costs.md
@@ -2,18 +2,19 @@

Distributed-Something is run by a series of three commands, only one of which incurs costs at typical scale of usage:

[`setup`](step_1_configuration.md) creates a queue in SQS and a cluster, service, and task definition in ECS.
ECS is entirely free.
[`setup`](step_1_configuration.md) creates a queue in SQS and a cluster, service, and task definition in ECS.
ECS is entirely free.
SQS queues are free to create and use up to 1 million requests/month.

[`submitJobs`](step_2_submit_jobs.md) places messages in the SQS queue which is free (under 1 million requests/month).

[`startCluster`](step_3_start_cluster.md) is the only command that incurs costs with initiation of your spot fleet request, creating machine alarms, and optionally creating a run dashboard.
[`startCluster`](step_3_start_cluster.md) is the only command that incurs costs with initiation of your spot fleet request, creating machine alarms, and optionally creating a run dashboard.

The spot fleet is the major cost of running Distributed-Something, exact pricing of which depends on the number of machines, type of machines, and duration of use.
The spot fleet is the major cost of running Distributed-Something, exact pricing of which depends on the number of machines, type of machines, and duration of use.
Your bid is configured in the [config file](step_1_configuration.md).

Spot fleet costs can be minimized/stopped in multiple ways:

1) We encourage the use of [`monitor`](step_4_monitor.md) during your job to help minimize the spot fleet cost as it automatically scales down your spot fleet request as your job queue empties and cancels your spot fleet request when you have no more jobs in the queue.
Note that you can also perform a more aggressive downscaling of your fleet by monitor by engaging Cheapest mode (see [`more information here`](step_4_monitor.md)).
2) If your job is finished, you can still initiate [`monitor`](step_4_monitor.md) to perform the same cleanup (without the automatic scaling).
@@ -23,14 +24,16 @@ Note that you can also perform a more aggressive downscaling of your fleet by mo
After the spot fleet has started, a Cloudwatch instance alarm is automatically placed on each instance in the fleet.
Cloudwatch instance alarms [are currently $0.10/alarm/month](https://aws.amazon.com/cloudwatch/pricing/).
Cloudwatch instance alarm costs can be minimized/stopped in multiple ways:

1) If you run monitor during your job, it will automatically delete Cloudwatch alarms for any instance that is no longer in use once an hour while running and at the end of a run.
2) If your job is finished, you can still initiate [`monitor`](step_4_monitor.md) to delete Cloudwatch alarms for any instance that is no longer in use.
3) In [AWS Cloudwatch console](https://console.aws.amazon.com/cloudwatch/) you can select unused alarms by going to Alarms => All alarms. Change Any State to Insufficient Data, select all alarms, and then Actions => Delete.
4) We provide a [hygiene script](hygiene.md) that will clean up old alarms for you.
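
To get a feel for what leftover alarms can cost, the per-alarm price quoted above can be turned into a rough estimate. This is only a sketch: the function name `estimate_alarm_cost` is hypothetical and not part of Distributed-Something, the price is the figure quoted in this document and may change, and the simple linear proration assumed here may not match how AWS actually bills.

```python
# Illustrative estimate only. The price is the figure quoted in the text
# above and may change; check the Cloudwatch pricing page for current values.
ALARM_PRICE_PER_MONTH = 0.10  # USD per alarm per month


def estimate_alarm_cost(num_instances: int, months_left_in_place: float) -> float:
    """Approximate cost of one alarm per instance left in place for some months.

    Assumes one alarm per instance and simple linear proration,
    which may differ from actual AWS billing.
    """
    if num_instances < 0 or months_left_in_place < 0:
        raise ValueError("inputs must be non-negative")
    return round(num_instances * ALARM_PRICE_PER_MONTH * months_left_in_place, 2)


if __name__ == "__main__":
    # Alarms for 200 instances that are never cleaned up, left for 3 months.
    print(estimate_alarm_cost(200, 3))
```

The cost per alarm is small, but it accumulates across large fleets and repeated runs, which is why regular cleanup is worthwhile.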

Cloudwatch Dashboards [are currently free](https://aws.amazon.com/cloudwatch/pricing/) for 3 Dashboards with up to 50 metrics per month and are $3 per dashboard per month after that.
Cloudwatch Dashboards [are currently free](https://aws.amazon.com/cloudwatch/pricing/) for 3 Dashboards with up to 50 metrics per month and are $3 per dashboard per month after that.
Cloudwatch Dashboard costs can be minimized/prevented in multiple ways:

1) You can choose not to have Distributed-Something create a Dashboard by setting `CREATE_DASHBOARD = 'False'` in your [config file](step_1_configuration.md).
2) We encourage the use of [`monitor`](step_4_monitor.md) during your job: if you have set `CLEAN_DASHBOARD = 'True'` in your [config file](step_1_configuration.md), it will automatically delete your Dashboard when your job is done.
3) If your job is finished, you can still initiate [`monitor`](step_4_monitor.md) to perform the same cleanup (without the automatic scaling).
4) You can manually delete Dashboards in the [Cloudwatch Console](https://console.aws.amazon.com/cloudwatch/) by going to Dashboards, selecting your Dashboard, and selecting Delete.
4) You can manually delete Dashboards in the [Cloudwatch Console](https://console.aws.amazon.com/cloudwatch/) by going to Dashboards, selecting your Dashboard, and selecting Delete.
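
As a rough illustration of the Dashboard settings and pricing described above, the sketch below shows the two config values as quoted strings, followed by a simple cost estimate. Only the setting names `CREATE_DASHBOARD` and `CLEAN_DASHBOARD` and the prices come from this document. The function `estimate_dashboard_cost` is hypothetical, and the estimate ignores the 50-metric limit on free Dashboards.

```python
# Hypothetical excerpt of a config file. Only the two setting names come
# from the text; note that the values are written as quoted strings.
CREATE_DASHBOARD = 'True'  # set to 'False' to skip Dashboard creation
CLEAN_DASHBOARD = 'True'   # lets monitor delete the Dashboard when the job is done

# Prices as quoted in the text above; they may change over time.
FREE_DASHBOARDS = 3
DASHBOARD_PRICE_PER_MONTH = 3.00  # USD per Dashboard beyond the free ones


def estimate_dashboard_cost(num_dashboards: int) -> float:
    """Approximate monthly cost of keeping `num_dashboards` Dashboards.

    Ignores the 50-metric limit that applies to the free Dashboards.
    """
    billable = max(0, num_dashboards - FREE_DASHBOARDS)
    return billable * DASHBOARD_PRICE_PER_MONTH


if __name__ == "__main__":
    # Five Dashboards left in place: two of them fall outside the free allowance.
    print(estimate_dashboard_cost(5))
```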
17 changes: 9 additions & 8 deletions _sources/customizing_DS.md
@@ -14,10 +14,10 @@ Before starting to customize Distributed-Something code, do some research on you
Distributed-Something only works on "perfectly parallel" tasks, or tasks that do not communicate with each other while running.
If the end product you envision cannot easily be split into perfectly parallel tasks, then it may not be a good fit for Distributed-Something.

Scale has a large impact on how splittable your function is.
For example, if you want to stitch together a set of images into one larger image, that set that you are stitching is the smallest unit you can make your job. Because jobs must be "perfectly parallel", you cannot distribute the images any further.
If you're generally working with datasets that only require a few stitching jobs, Distributed-Something may not be a good fit for your general use case.
However, if you often work with very large datasets where you need to stitch many sets of images, even though you cannot further parallelize your jobs, distributing stitching tasks with Distributed-Something may still provide a significant savings in time and compute cost.
Scale has a large impact on how splittable your function is.
For example, if you want to stitch together a set of images into one larger image, that set that you are stitching is the smallest unit you can make your job. Because jobs must be "perfectly parallel", you cannot distribute the images any further.
If you're generally working with datasets that only require a few stitching jobs, Distributed-Something may not be a good fit for your general use case.
However, if you often work with very large datasets where you need to stitch many sets of images, even though you cannot further parallelize your jobs, distributing stitching tasks with Distributed-Something may still provide a significant savings in time and compute cost.

2) **Make or find a Docker of the software you want to distribute.**
You can find over 1000 scientific software packages already Dockerized at [Biocontainers](http://biocontainers.pro), and many open-source projects provide Docker files within their GitHub repositories.
@@ -31,14 +31,15 @@ What is generic to how you like to run the application and what is different for
4) **Think about how you will set up/access your data so that it is batchable/parallelizeable.**
Because Distributed-Something is so application specific, there are many approaches one can take to parse a dataset into batches that can be parallelized.
Implemented examples you can reference are:
- In [Distributed-CellProfiler](https://github.com/DistributedScience/Distributed-CellProfiler), we use LoadData.csvs to pass to CellProfiler the exact list of files with their S3 file paths that we want it to access/download for processing.
- In [Distributed-FIJI](https://github.com/DistributedScience/Distributed-Fiji), we tell it what folder to access and pass upload and download filters for it to select specific files within that folder.

- In [Distributed-CellProfiler](https://github.com/DistributedScience/Distributed-CellProfiler), we use LoadData.csvs to pass to CellProfiler the exact list of files with their S3 file paths that we want it to access/download for processing.
- In [Distributed-FIJI](https://github.com/DistributedScience/Distributed-Fiji), we tell it what folder to access and pass upload and download filters for it to select specific files within that folder.
- In [Distributed-OMEZARRCreator](https://github.com/DistributedScience/Distributed-OMEZARRCreator), the job unit is always the same (one plate of images) so less flexibility is required and the S3 path and plate name passed in the job file is sufficient.
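
One way to think about batching is to group file paths by a shared prefix so that each group becomes one independent job. The sketch below assumes a layout where the first path component identifies the unit of work (for example, a plate). The function `group_into_jobs` and the `group`/`files` keys are hypothetical and are not taken from any of the implementations listed above.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_into_jobs(s3_keys, depth=1):
    """Group keys by their first `depth` path components.

    Each resulting group is an independent unit of work, so the groups
    can be processed in parallel without communicating with each other.
    """
    groups = defaultdict(list)
    for key in s3_keys:
        parts = PurePosixPath(key).parts
        groups["/".join(parts[:depth])].append(key)
    # Sort for a stable, reproducible job order.
    return [
        {"group": name, "files": sorted(files)}
        for name, files in sorted(groups.items())
    ]


if __name__ == "__main__":
    # Hypothetical keys: two plates, each processed as its own job.
    keys = ["plateA/img1.tif", "plateB/img1.tif", "plateA/img2.tif"]
    for job in group_into_jobs(keys):
        print(job)
```

Choosing a larger `depth` produces smaller jobs, which is only appropriate when the files within a group do not need to be processed together.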

## Using the Distributed-Something template

Distributed-Something is a template repository.
Read more about [Github template repositories](https://docs.github.com/en/repositories/creating-and-managing-repositories/creating-a-repository-from-a-template) and follow the instructions to create your own project repository from the template.
Read more about [Github template repositories](https://docs.github.com/en/repositories/creating-and-managing-repositories/creating-a-repository-from-a-template) and follow the instructions to create your own project repository from the template.

## Customization details

@@ -60,6 +61,7 @@ Each job contains the shared variables common to all jobs, listed in the example
These variables are passed to your worker as the `message` and should include any metadata that may possibly change between runs of your Distributed-Something implementation.

Some common variables used in Job files include:

- input location
- output location
- output structure
@@ -113,4 +115,3 @@ More configuration information is available in [Step 1: Configuration](step_1_co

You need to customize the Dashboard creation function by changing 'start run' to whatever your run command is.
If you have changed anything in config.py, you will need to edit the section on Task Definitions to match.
