Description

This module allows creating an instance of Distributed Asynchronous Object Storage (DAOS) on Google Cloud Platform (GCP).

For more information, please refer to the Google Cloud DAOS repo on GitHub.

For more information on this and other network storage options in the Cloud HPC Toolkit, see the extended Network Storage documentation.

NOTE: DAOS on GCP does not require an HPC Toolkit wrapper and is therefore sourced directly from GitHub. It will not work as a local or embedded module.

Examples

Working examples of a DAOS deployment and how it can be used in conjunction with Slurm can be found in the community examples folder.

A full list of server module parameters can be found at the DAOS Server module README.

DAOS Server Images

To use the DAOS server Terraform module, you must first create a DAOS server image as instructed in the images directory of the google-cloud-daos repository.

DAOS server images must be built from the same tagged version of the google-cloud-daos repository that is specified in the source: attribute for modules used in the community examples.

For example, in the following snippet taken from community/example/intel/daos-cluster.yml, the source: attribute specifies v0.2.1 of the daos_server Terraform module:

  - id: daos-server
    source: github.com/daos-stack/google-cloud-daos.git//terraform/modules/daos_server?ref=v0.2.1
    use: [network1]
    settings:
      number_of_instances: 2
      labels: {ghpc_role: file-system}

To use v0.2.1 of the daos_server module, you need to:

  1. Clone the google-cloud-daos repo and check out v0.2.1
  2. Follow the instructions in images/README.md to build a DAOS server image
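
The first step above can be sketched in shell as follows. This is a minimal sketch, not an official procedure: the repository URL is taken from the source: attribute shown earlier, while the variable names and the `clone_daos_repo` helper are hypothetical and exist only for illustration. The image build itself is not shown, since its commands are documented in images/README.md.

```shell
# Assumption: the tag checked out here must match the ?ref= value used in
# the blueprint's source: attribute (v0.2.1 in the examples in this README).
DAOS_REPO_URL="https://github.com/daos-stack/google-cloud-daos.git"
DAOS_REF="v0.2.1"

# The module source string the blueprint should use for the same tag.
DAOS_MODULE_SOURCE="github.com/daos-stack/google-cloud-daos.git//terraform/modules/daos_server?ref=${DAOS_REF}"

# Hypothetical helper (not part of either repository): clone the repo and
# check out the tagged version that the server image will be built from.
clone_daos_repo() {
  git clone "${DAOS_REPO_URL}" google-cloud-daos &&
    git -C google-cloud-daos checkout "${DAOS_REF}"
}
```

Running `clone_daos_repo` leaves a checkout in `google-cloud-daos`; from there, continue with the instructions in images/README.md. Keeping the tag in a single variable makes it harder for the image version and the module version to drift apart.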

Recommended settings

By default, the DAOS system is created with 4 servers configured for the best cost per GB (TCO, see below). The system is formatted on the server side using dmg format, but no pools or containers are created.

The following settings will configure this system for TCO (default):

  - id: daos-server
    source: github.com/daos-stack/google-cloud-daos.git//terraform/modules/daos_server?ref=v0.2.1
    use: [network1]
    settings:
      labels: {ghpc_role: file-system}
      number_of_instances : 4 # number of DAOS server instances
      machine_type        : "n2-custom-36-215040"
      os_disk_size_gb     : 20
      daos_disk_count     : 16
      daos_scm_size       : 180

The following settings will configure this system for best performance:

  - id: daos-server
    source: github.com/daos-stack/google-cloud-daos.git//terraform/modules/daos_server?ref=v0.2.1
    use: [network1]
    settings:
      labels: {ghpc_role: file-system}
      # The default DAOS settings are optimized for TCO
      # The following will tune this system for best perf
      machine_type        : "n2-standard-16"
      os_disk_size_gb     : 20
      daos_disk_count     : 4
      daos_scm_size       : 45

Support

Content in the google-cloud-daos repository is licensed under the Apache License Version 2.0 open-source license.

DAOS is distributed under the BSD-2-Clause-Patent open-source license.

Intel Corporation provides several ways for users to get technical support:

  1. Community support is available to everybody through Jira and via the DAOS channel for the Google Cloud users on Slack.

    To access Jira, please follow these steps:

    To access the Slack channel for DAOS on Google Cloud, please follow this link: https://daos-stack.slack.com/archives/C03GLTLHA59

    This type of support is provided on a best-effort basis, and it does not have any SLA attached.

  2. Commercial L3 support is available on an on-demand basis. Please get in touch with Intel Corporation to obtain more information.
