A {{ dataproc-name }} template is a special resource for rapid deployment of {{ dataproc-name }} clusters in {{ ml-platform-name }} projects. Templates define cluster configuration and can be used by {{ ml-platform-name }} to deploy the cluster multiple times.
{% include data-proc-template-presetting %}
The following information is stored about each template:
- Resource name.
- Resource creator.
- Cluster configuration.
- Template creation date in UTC format, such as `July 18, 2022, 14:23`.
You can view all {{ dataproc-name }} templates created in your project on the {{ dataproc-name }} resource page. The page also lists all {{ dataproc-name }} clusters available in the project: both temporary clusters based on {{ dataproc-name }} templates and connected clusters deployed in {{ dataproc-full-name }}. To view detailed information about a template or cluster, click it.
To create a cluster from a {{ dataproc-name }} template, activate the template in your project. When running a project in the IDE, {{ ml-platform-name }} creates a temporary cluster in the {{ yandex-cloud }} folder and subnet specified in the project settings.
{{ ml-platform-name }} tracks the cluster's lifetime and automatically deletes it if no computations have been performed on it within two hours. The cluster will also be deleted if you force stop the computations running in the project.
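
The two-hour idle rule can be illustrated with a minimal sketch. The `IDLE_LIMIT` constant and the `is_due_for_deletion` function are hypothetical names introduced for this example only and are not part of any {{ ml-platform-name }} API. The sketch also assumes the limit is reached at exactly two hours of inactivity, which the text does not specify.

```python
from datetime import datetime, timedelta, timezone

# Idle period after which a temporary cluster is deleted (from the text above).
IDLE_LIMIT = timedelta(hours=2)


def is_due_for_deletion(last_computation: datetime, now: datetime) -> bool:
    """Hypothetical check: True if the cluster has been idle for the full limit.

    Assumption: the boundary is inclusive (exactly two hours counts as idle).
    """
    return now - last_computation >= IDLE_LIMIT


if __name__ == "__main__":
    last_run = datetime(2022, 7, 18, 14, 23, tzinfo=timezone.utc)
    check_time = last_run + timedelta(hours=2, minutes=5)
    print(is_due_for_deletion(last_run, check_time))
```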
Automated {{ dataproc-name }} clusters are deployed on {{ compute-full-name }} VMs powered by Intel Cascade Lake (`standard-v2`).
You can calculate the total disk storage capacity, in GB, required for different cluster configurations using this formula:
`<number_of_Yandex_Data_Processing_hosts> × 256 + 128`
Cluster type | Number of hosts | Disk size | Host parameters |
---|---|---|---|
XS | 1 | 384 GB HDD | 4 vCPUs, 16 GB RAM |
S | 4 | 1152 GB SSD | 4 vCPUs, 16 GB RAM |
M | 8 | 2176 GB SSD | 16 vCPUs, 64 GB RAM |
L | 16 | 4224 GB SSD | 16 vCPUs, 64 GB RAM |
XL | 32 | 8320 GB SSD | 16 vCPUs, 64 GB RAM |
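
As a rough illustration, the formula and the table above can be reproduced with a short Python sketch. The names `total_disk_gb`, `fits_quota`, and `HOSTS_BY_CLUSTER_TYPE` are hypothetical and exist only in this example. The quota value passed to `fits_quota` is whatever limit you see in your own quota settings.

```python
# Constants taken from the formula in the text (values in GB).
PER_HOST_GB = 256
BASE_GB = 128

# Host counts per cluster type, copied from the table above.
HOSTS_BY_CLUSTER_TYPE = {"XS": 1, "S": 4, "M": 8, "L": 16, "XL": 32}


def total_disk_gb(hosts: int) -> int:
    """Return the total disk capacity in GB for a cluster with `hosts` hosts."""
    if hosts < 1:
        raise ValueError("a cluster needs at least one host")
    return hosts * PER_HOST_GB + BASE_GB


def fits_quota(cluster_type: str, quota_gb: int) -> bool:
    """Hypothetical helper: compare a cluster type's disk size with a quota in GB."""
    return total_disk_gb(HOSTS_BY_CLUSTER_TYPE[cluster_type]) <= quota_gb


if __name__ == "__main__":
    for name, hosts in HOSTS_BY_CLUSTER_TYPE.items():
        print(f"{name}: {total_disk_gb(hosts)} GB")
```

For example, `total_disk_gb(8)` gives 2176, which matches the M row of the table.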
{% note tip %}
Before running a project with an activated {{ dataproc-name }} template, make sure the [quotas]({{ link-console-quotas }}) for creating HDDs or SSDs allow you to create a disk of a sufficient size.
{% endnote %}
Temporary clusters created from {{ dataproc-name }} templates are billed separately according to the {{ dataproc-full-name }} pricing policy.
{{ ml-platform-name }} creates a temporary {{ dataproc-name }} cluster once you open your project in the IDE.
The created cluster appears in the list of available clusters on the {{ dataproc-name }} resource page. A temporary cluster can have one of the following statuses:
- `STARTING`: The cluster is being created.
- `UP`: The cluster has been created and is ready to run calculations.
- `DOWN`: There have been issues while creating the cluster.
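
If you handle these statuses in your own scripts, one possible way to model them is a small enumeration. This is only a sketch: `ClusterStatus` and `can_run_computations` are hypothetical names for this example, and the sketch assumes the status arrives as one of the three strings listed above.

```python
from enum import Enum


class ClusterStatus(Enum):
    """Statuses of a temporary cluster, as listed in the text above."""

    STARTING = "STARTING"  # The cluster is being created.
    UP = "UP"              # The cluster is ready to run calculations.
    DOWN = "DOWN"          # There were issues while creating the cluster.


def can_run_computations(status: str) -> bool:
    """Hypothetical helper: only a cluster in the UP status can run calculations.

    Raises ValueError if `status` is not one of the three known values.
    """
    return ClusterStatus(status) is ClusterStatus.UP


if __name__ == "__main__":
    for value in ("STARTING", "UP", "DOWN"):
        print(value, can_run_computations(value))
```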