* No-code checks in Soda Cloud
* add outline content
* Documentation for no-code checks
- includes references elsewhere in docs
* added link to refer
* Refinements and corrections based on feedback
* Clarified Soda Agent requirement
If you wish to run a scan immediately to see the scan results for the checks you included in your agreement, you can run an ad hoc scan from the scan schedule.
2
2
3
-
1. In Soda Cloud, navigate to **Scans**.
3
+
1. As an Admin in your Soda Cloud account, navigate to **Scans**.
4
4
2. In the list of **Scans**, click the one that is associated with your agreement. If you don't know which scan schedule your agreement uses, navigate to **Agreements**, select your agreement, then find the name of the scan schedule in the upper-left tile.
5
5
3. In the scan schedule page, click **Run Scan** to immediately execute all agreements and checks that use this scan schedule.
For a new rule, you define conditions for sending notifications including the severity of a check result and whom to notify when bad data triggers an alert.
2
+
3
+
In Soda Cloud, navigate to **your avatar** > **Notification Rules**, then click **New Notification Rule**. Follow the guided steps to complete the new rule. Use the table below for insight into the values to enter in the fields and editing panels.
4
+
5
+
| Field or Label | Guidance |
6
+
| ----------------- | ----------- |
7
+
| Name | Provide a unique identifier for your notification. |
8
+
| For | Select **All Checks**, or select **Selected Checks** to use conditions to identify specific checks to which you want the rule to apply. You can identify checks according to several attributes such as **Data Source Name**, **Dataset Name**, or **Check Name**.|
9
+
| Notify Recipient | Select the destination to which this rule sends its notifications. For example, you can send the rule's notifications to a channel in Slack. |
10
+
| Notify About | Identify the notifications this rule sends based on the severity of the check result: warn, fail, or both.|
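
To make the fields in the table above concrete, the following is a minimal Python sketch of how a rule's **For**, **Notify Recipient**, and **Notify About** values might combine to decide whether a check result triggers a notification. It is an illustration only, not Soda Cloud's implementation: the `NotificationRule` class, the `rule_applies` function, and the dictionary keys are hypothetical names, and the Slack channel and data source name are sample values.

```python
from dataclasses import dataclass, field


@dataclass
class NotificationRule:
    """Hypothetical model of the fields in the New Notification Rule form."""
    name: str                 # Name: unique identifier for the rule
    recipient: str            # Notify Recipient: for example, a Slack channel
    notify_about: set         # Notify About: subset of {"warn", "fail"}
    # For: an empty dict stands in for "All Checks"; otherwise each key/value
    # pair is a condition that a check must match ("Selected Checks").
    conditions: dict = field(default_factory=dict)


def rule_applies(rule, check):
    """Return True when the check result should trigger the rule's notification."""
    if check["outcome"] not in rule.notify_about:
        return False
    return all(check.get(attr) == wanted for attr, wanted in rule.conditions.items())


# Sample rule: notify a Slack channel when any check on one data source fails.
rule = NotificationRule(
    name="sales-failures",
    recipient="#sales-engineering",
    notify_about={"fail"},
    conditions={"data_source_name": "snowflake_sales"},
)

failed = {"data_source_name": "snowflake_sales", "check_name": "row_count > 0", "outcome": "fail"}
warned = {"data_source_name": "snowflake_sales", "check_name": "row_count > 0", "outcome": "warn"}
other = {"data_source_name": "postgres_hr", "check_name": "row_count > 0", "outcome": "fail"}

print(rule_applies(rule, failed))  # True
print(rule_applies(rule, warned))  # False: the rule only notifies about failures
print(rule_applies(rule, other))   # False: the data source does not match
```

In this sketch, leaving `conditions` empty corresponds to selecting **All Checks**, so only the severity in `notify_about` filters the results.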
soda-agent/basics.md
+1-1
Original file line number
Diff line number
Diff line change
@@ -12,7 +12,7 @@ redirect_from: /soda-agent/
12
12
13
13
The **Soda Agent** is a tool that empowers Soda Cloud users to securely access data sources to scan for data quality. Create a Kubernetes cluster in a cloud services provider environment, then use Helm to deploy a Soda Agent in the cluster.
14
14
15
-
This setup enables Soda Cloud users to securely connect to data sources (Snowflake, Amazon Athena, etc.) from within the Soda Cloud web application. Any user in your Soda Cloud account can add a new data source via the agent, then write their own agreements to check for data quality in the new data source.
15
+
This setup enables Soda Cloud users to securely connect to data sources (Snowflake, Amazon Athena, etc.) from within the Soda Cloud web application. Any user in your Soda Cloud account can add a new data source via the agent, then write their own no-code checks and agreements to check for data quality in the new data source.
16
16
17
17
What follows is an extremely abridged introduction to a few basic elements involved in the deployment and setup of a Soda Agent.
soda-agent/deploy.md
+1-1
@@ -21,7 +21,7 @@ redirect_from:
21
21
22
22
The **Soda Agent** is a tool that empowers Soda Cloud users to securely access data sources to scan for data quality. Create a Kubernetes cluster, then use Helm to deploy a Soda Agent in the cluster.
23
23
24
-
This setup enables Soda Cloud users to securely connect to data sources (BigQuery, Snowflake, etc.) from within the Soda Cloud web application. Any user in your Soda Cloud account can add a new data source via the agent, then write their own agreements to check for data quality in the new data source.
24
+
This setup enables Soda Cloud users to securely connect to data sources (BigQuery, Snowflake, etc.) from within the Soda Cloud web application. Any user in your Soda Cloud account can add a new data source via the agent, then write their own no-code checks and agreements to check for data quality in the new data source.
25
25
26
26
As a step in the **Get started roadmap**, this guide offers instructions to set up, install, and configure Soda in a [self-hosted agent deployment model]({% link soda/setup-guide.md %}#self-hosted-agent).
soda-cl/check-attributes.md
+3-3
@@ -66,9 +66,9 @@ Note that you can only define or edit check attributes as an [Admin]({% link sod
66
66
## Apply an attribute to one or more checks
67
67
68
68
While only a Soda Cloud Admin can define or revise check attributes, any Author user can apply attributes to new or existing checks when:
69
-
* writing or editing checks in an agreement in Soda Cloud, <br />
70
-
OR <br />
71
-
* writing or editing checks in a checks YAML file for Soda Library.
69
+
* writing or editing checks in an agreement in Soda Cloud
70
+
* creating or editing no-code checks in Soda Cloud
71
+
* writing or editing checks in a checks YAML file for Soda Library
72
72
73
73
Apply attributes to checks using key:value pairs, as in the following example which applies five Soda Cloud-created attributes to a new `row_count` check.
soda-cl/custom-check-examples.md
+1-1
@@ -13,7 +13,7 @@ Out of the box, Soda Checks Language (SodaCL) makes several built-in metrics and
13
13
14
14
**User-defined checks** and **failed rows checks** enable you to define your own metrics that you can use in a SodaCL check. You can also use these checks to simply define SQL queries or Common Table Expressions (CTE) that Soda executes during a scan, which is what most of these examples do.
15
15
16
-
The examples below offer examples of how you can define user-defined checks in your checks YAML file, if using Soda Library, or within an agreement, if using Soda Cloud, to extract more complex, customized, business-specific measurements from your data.
16
+
The examples below show how you can define user-defined checks to extract more complex, customized, business-specific measurements from your data. If using Soda Library, define them in your checks YAML file; if using Soda Cloud, define them within a no-code SQL Failed Rows check or an agreement.
17
17
18
18
[Set an acceptable threshold for row count delta](#set-an-acceptable-threshold-for-row-count-delta)<br />
19
19
[Find duplicates in a dataset without a unique ID column](#find-duplicates-in-a-dataset-without-a-unique-id-column)<br />
<label class="tab" id="two-tab" for="two">Use an agreement</label>
79
+
<label class="tab" id="three-tab" for="three">Use a YAML file</label>
78
80
</div>
79
81
<div class="panels">
80
82
<div class="panel" id="one-panel" markdown="1">
83
+
Create **no-code checks** for data quality directly in the Soda Cloud user interface. When you create a no-code check, you also set a schedule for Soda to execute your check when it runs a scan of your data source. <br />
* You, or an Admin on your Soda Cloud account, have [deployed a Soda Agent]({% link soda-agent/deploy.md %}) version 0.8.52 or greater, and connected it to your Soda Cloud account.
92
+
* You, or an Admin on your Soda Cloud account, have [added a new data source]({% link soda-agent/deploy.md %}#add-a-new-data-source) via the Soda Agent in your Soda Cloud account.
93
+
* You must have permission to edit the dataset as an Admin, Manager, or Editor; see [Roles and rights]({% link soda-cloud/roles-and-rights.md %}).
94
+
95
+
### Create a new check
96
+
97
+
SodaCL includes over 25 built-in metrics that you can use to write checks, a subset of which are accessible via no-code check creation. The table below lists the checks available to create via the no-code interface; access [SodaCL reference]({% link soda-cl/metrics-and-checks.md %}) for detailed information about each metric or check.
1. As an Admin, or Manager or Editor of a dataset to which you wish to add checks, navigate to the dataset, then click **Add Check**. You can only create a check via the no-code interface for datasets in data sources connected via a Soda Agent.
107
+
2. Select the type of check you wish to create, then complete the form to create the check. Refer to the table below for guidance on the values to enter.
108
+
3. Optionally, test your check, then save. Soda executes the check during the next scan according to the schedule you selected, or whenever a Soda Cloud user runs the scheduled scan manually.
109
+
4. Optionally, you can execute your check immediately. From the dataset's page, locate the check you just created and click the stacked dots, then select **Execute Check**. Soda executes *only* your check.
110
+
111
+
| Field or Label | Guidance |
112
+
| --------------- | -------- |
113
+
| Dataset | Select the dataset to which you want the check to apply. |
114
+
| Check Name | Provide a unique name for your check. |
115
+
| Schedule | Select the scan schedule to which you wish to add your check. Optionally, you can click **Create a New Schedule** if you want Soda to execute the check more or less frequently, or at a different time of day than existing schedules dictate. See [Manage scheduled scans]({% link soda-cloud/scan-mgmt.md %}) for details. |
116
+
| Filter fields | Optionally, add an [in-check filter]({% link soda-cl/optional-config.md %}#add-a-filter-to-a-check) to apply conditions that specify a portion of the data against which Soda executes the check. |
117
+
| Define Metric/Values/Column/SQL | As each metric or check requires different values, refer to [SodaCL reference]({% link soda-cl/metrics-and-checks.md %}) for detailed information about each metric or check. |
118
+
| Alert Level | Select the check result state(s) for which you wish to be notified: Fail, Warn, or Fail and Warn. See [View scan results]({% link soda-library/run-a-scan.md %}#view-scan-results) for details. <br />By default, alert notifications for your check go to the **Dataset Owner**. See [Define alert notification rules](#define-alert-notification-rules) to set up more alert notifications. |
119
+
| Fail Condition, Value, and Value Type | Set the values of these fields to specify the threshold that constitutes a fail or warn check result. <br /> For example, if you are creating a **Duplicate Check** and you want to make sure that no more than 5% of the rows in the column you identified contain duplicates, set <br />• **Fail Condition** to `>` <br />• **Value** to `5` <br />• **Value Type** to `Percent`|
120
+
| Attribute fields | Select from among the list of existing attributes to apply to your check so as to organize your checks and alert notifications in Soda Cloud. Refer to [Add check attributes]({% link soda-cl/check-attributes.md %}) for details. |
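
As a rough illustration of how the **Fail Condition**, **Value**, and **Value Type** fields in the table above combine into a threshold, the Python sketch below evaluates the Duplicate Check example. It is a simplified model under stated assumptions, not how Soda computes results: the `check_outcome` function, the `COMPARISONS` mapping, and the treatment of any non-`Percent` value type as an absolute row count are all hypothetical.

```python
import operator

# Hypothetical mapping from Fail Condition symbols to comparison functions.
COMPARISONS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def check_outcome(flagged_rows, total_rows, fail_condition, value, value_type):
    """Return "fail" or "pass" for an illustrative threshold check."""
    if value_type == "Percent":
        # Express the flagged rows as a percentage of all rows.
        measured = 100.0 * flagged_rows / total_rows if total_rows else 0.0
    else:
        # Assumption: any other value type is an absolute row count.
        measured = flagged_rows
    return "fail" if COMPARISONS[fail_condition](measured, value) else "pass"


# Fail Condition ">", Value 5, Value Type "Percent"
print(check_outcome(4, 100, ">", 5, "Percent"))  # pass: 4% duplicates
print(check_outcome(5, 100, ">", 5, "Percent"))  # pass: exactly 5% is not greater than 5
print(check_outcome(6, 100, ">", 5, "Percent"))  # fail: 6% duplicates
```

Note that with `>` as the condition, a result of exactly 5% passes; choosing `>=` instead would make 5% a failure.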
121
+
122
+
<br />
123
+
124
+
### Define alert notification rules
125
+
126
+
By default, alert notifications for your no-code check go to the **Dataset Owner** and **Check Owner**. If you wish to send alerts elsewhere, in addition to these owners, create a notification rule.
127
+
128
+
{% include notif-rule.md %}
129
+
130
+
<br />
131
+
132
+
### Edit an existing check
133
+
134
+
1. As an Admin, or Manager or Editor of a dataset in which the no-code check exists, navigate to the dataset.
135
+
2. To the right of the check you wish to edit, click the stacked dots, then select **Edit Check**. You can only edit a check via the no-code interface if it was first created as a no-code check, as indicated by the cloud icon in the **Origin** column of the table of checks.
136
+
3. Adjust the check as needed, test your check, then save. Soda executes the check during the next scan according to the schedule you selected.
137
+
4. Optionally, you can execute your check immediately. Locate the check you just edited and click the stacked dots, then select **Execute Check**. Soda executes *only* your check.
138
+
139
+
140
+
</div>
141
+
<div class="panel" id="two-panel" markdown="1">
81
142
You can write SodaCL checks directly in the Soda Cloud user interface within an **agreement**. An agreement is a contract between stakeholders that stipulates the expected and agreed-upon state of data quality in a data source.<br />
82
143
*Requires a Soda Agent*
83
144
@@ -163,7 +224,7 @@ Further, take into account the following tips and best practices when writing So
163
224
See also: [Tips and best practices for SodaCL]({% link soda/quick-start-sodacl.md %}#tips-and-best-practices-for-sodacl)
164
225
165
226
</div>
166
-
<div class="panel" id="two-panel" markdown="1">
227
+
<div class="panel" id="three-panel" markdown="1">
167
228
168
229
As a Data Engineer, you can write SodaCL checks directly in a `checks.yml` file, or leverage check suggestions in the Soda Library CLI to prepare a basic set of data quality checks for you. Alternatively, you can add SodaCL checks to a programmatic invocation of Soda Library.
In Soda Cloud, you can define where and when to send alert notifications when check results warn or fail. You can define these parameters for:
12
-
* **agreements** as you create or edit them; see [Define SodaCL checks]({% link soda-cl/soda-cl-overview.md %}#define-sodacl-checks) for business users
12
+
* **agreements** as you create or edit them; see the **Use an agreement** tab in [Define SodaCL checks]({% link soda-cl/soda-cl-overview.md %}#define-sodacl-checks).
13
+
* **no-code checks** after you have created them; see the **Use a no-code check** tab in [Define SodaCL checks]({% link soda-cl/soda-cl-overview.md %}#define-sodacl-checks).
13
14
* **multiple checks** by defining notification rules; read on!
14
15
15
16
For example, you can define a notification rule to instruct Soda Cloud to send an alert to your #sales-engineering Slack channel whenever a data quality check on the `snowflake_sales` data source fails.
@@ -31,16 +32,7 @@ Refer to [Data source, dataset, agreement, and check owners]({% link soda-cloud/
31
32
32
33
## Set new rules
33
34
34
-
For a new rule, you define conditions for sending notifications including the severity of a check result and whom to notify when bad data triggers an alert.
35
-
36
-
In Soda Cloud, navigate to **your avatar** > **Notification Rules**, then click **New Notification Rule**. Follow the guided steps to complete the new rule. Use the table below for insight into the values to enter in the fields and editing panels in the guided steps.
37
-
38
-
| Field or Label | Guidance |
39
-
| ----------------- | ----------- |
40
-
| Name | Provide a unique identifier for your notification. |
41
-
| For | Select **All Checks**, or select **Selected Checks** to use conditions to identify specific checks to which you want the rule to apply. You can identify checks according to several attributes such as **Data Source Name**, **Dataset Name**, or **Check Name**.|
42
-
| Notify Recipient | Select the destination to which this rule sends its notifications. For example, you can send the rule's notifications to a channel in Slack. |
43
-
| Notify About | Identify the notifications this rule sends based on the severity of the check result: warn, fail, or both.|
soda-cloud/roles-and-rights.md
+3
@@ -37,6 +37,7 @@ The following table outlines the rights of each role.
37
37
| Invite colleagues to join the organization's Soda Cloud account as members | ✓ | ✓ |
38
38
| Set and edit notification rules | ✓ | ✓ |
39
39
| Apply check attributes to checks | ✓ | ✓ |
40
+
| Create no-code checks | ✓ | ✓ |
40
41
| Create or edit check attributes | ✓ ||
41
42
| View Organization Settings for a Soda Cloud account | ✓ ||
42
43
| Change the name of the organization | ✓ ||
@@ -113,6 +114,7 @@ The following table outlines the rights of each role associated with each resour
113
114
| Approve and reject agreements as a stakeholder | ✓ | ✓ | ✓ | ✓ |
114
115
| Create a new agreement | ✓ | ✓ | ✓ ||
115
116
| Edit an existing agreement, including adding a new scan schedule | ✓ | ✓ | ✓ ||
117
+
| Create no-code checks | ✓ | ✓ | ✓ ||
116
118
| Add and edit dataset Attributes, such as Description or Tags | ✓ | ✓ | ✓ ||
117
119
| Control member access to a dataset and its checks (add or remove access) | ✓ | ✓ |||
118
120
| Change the roles of members with access to a dataset and its checks | ✓ | ✓ |||
@@ -210,6 +212,7 @@ There are four ownership roles in Soda Cloud that identify the member that owns
210
212
211
213
* By default, the member who added the data source becomes the **Data Source Owner** and **Dataset Owner** of all datasets in that data source. The default role that Soda Cloud assigns to the Dataset Owner is that of Manager.
212
214
* By default, the member who creates an agreement becomes the **Check Owner** of all checks defined in the agreement.
215
+
* By default, the member who creates a no-code check becomes its **Check Owner**.