Commit 7daacae

Added copy-to-clipboard button to most code snippets (#527)
1 parent 31d45b3 commit 7daacae

63 files changed: +360 -282 lines changed

_includes/agent-helm-command.md

+1
@@ -1,3 +1,4 @@
+{% include code-header.html %}
 ```shell
 helm install soda-agent soda-agent/soda-agent \
   --set soda.agent.target=azure-aks-virtualnodes \

_includes/agent-practice-datasource.md

+2
@@ -1,6 +1,7 @@
 If you wish to try creating a new data source in Soda Cloud using the agent you deployed, you can use the following command to create a PostgreSQL warehouse containing example data from the <a href="https://data.cityofnewyork.us/Transportation/Bus-Breakdown-and-Delays/ez4e-fazm" target="_blank">NYC Bus Breakdowns and Delay Dataset</a>.
 
 From the command-line, copy+paste and run the following to create the data source as a pod on your new cluster.
+{% include code-header.html %}
 ```shell
 cat <<EOF | kubectl apply -n soda-agent -f -
 ---
@@ -45,6 +46,7 @@ service/nybusbreakdowns created
 
 <br />
 Once the pod of practice data is running, you can use the following configuration details when you add a data source in Soda Cloud, in [step 2]({% link soda-cloud/add-datasource.md %}#2-connect-the-data-source), **Connect the Data Source**.
+{% include code-header.html %}
 ```yaml
 data_source your_datasource_name:
   type: postgres

_includes/expect-one-result.md

+2 -2
@@ -1,7 +1,7 @@
 Be aware that a check that contains one or more alert configurations only ever yields a *single* check result; one check yields one check result. If your check triggers both a `warn` and a `fail`, the check result only displays the more severe, failed check result.
 
 Using the following example, Soda Core, during a scan, discovers that the data in the dataset triggers both alerts, but the check result is still `Only 1 warning`. Nonetheless, the results in the CLI still display both alerts as having both triggered a `warn`.
-
+{% include code-header.html %}
 ```yaml
 checks for dim_employee:
   - schema:
@@ -23,7 +23,7 @@ Sending results to Soda Cloud
 ```
 
 Adding to the example check above, the check in the example below data triggers both `warn` alerts and the `fail` alert, but only returns a single check result, the more severe `Oops! 1 failures.`
-
+{% include code-header.html %}
 ```yaml
 checks for dim_employee:
   - schema:
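For orientation: both hunks above stop at `- schema:`, so the alert configurations themselves are not visible in this diff. The following is a minimal sketch of a schema check that carries `warn` and `fail` alerts at the same time. The column names are hypothetical placeholders and are not taken from the committed file.

```yaml
checks for dim_employee:
  - schema:
      warn:
        # hypothetical column names, for illustration only
        when required column missing: [not_so_important_column]
        when forbidden column present: [birth_date]
      fail:
        when required column missing: [very_important_column]
```

As the surrounding text states, a check like this yields a single check result even when several alert conditions trigger, and that result reflects the most severe outcome.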

_includes/foreach-config.md

+1 -1
@@ -4,7 +4,7 @@ Add a **for each** section to your checks YAML file to specify a list of checks
 2. Nested under the section header, add two nested keys, one for `datasets` and one for `checks`.
 3. Nested under `datasets`, add a list of datasets against which to run the checks. Refer to the example below that illustrates how to use `include` and `exclude` configurations and wildcard characters {% raw %} (%) {% endraw %}.
 4. Nested under `checks`, write the checks you wish to execute against all the datasets listed under `datasets`.
-
+{% include code-header.html %}
 ```yaml
 for each dataset T:
   datasets:
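For orientation: the hunk above ends at `datasets:`, before the list the numbered steps describe. The following is a minimal sketch of a complete **for each** section built from those steps. The dataset names and the `row_count > 0` check are assumptions chosen for illustration, not content from the committed file.

```yaml
for each dataset T:
  datasets:
    # hypothetical dataset names
    - dim_customers
    # wildcard: every dataset whose name starts with dim_products
    - dim_products%
    # the word "include" is optional and only aids readability
    - include dim_employee
    # exclude one dataset, then every dataset matching a wildcard
    - exclude fact_survey_response
    - exclude prospective_%
  checks:
    # runs against each dataset selected above
    - row_count > 0
```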

_includes/in-check-filters.md

+3 -3
@@ -1,7 +1,7 @@
 Add a filter to a check to apply conditions that specify a portion of the data against which Soda executes the check. For example, you may wish to use an in-check filter to support a use case in which "Column X must be filled in for all rows that have value Y in column Z".
 
 Add a filter as a nested key:value pair, as in the following example which filters the check results to display only those rows with a value of 81 or greater and which contain `11` in the `sales_territory_key` column. You cannot use a variable to specify an in-check filter.
-
+{% include code-header.html %}
 ```yaml
 checks for dim_employee:
   - max(vacation_hours) < 80:
@@ -10,7 +10,7 @@ checks for dim_employee:
 ```
 
 You can use `AND` or `OR` to add multiple filter conditions to a filter key:value pair to further refine your results, as in the following example.
-
+{% include code-header.html %}
 ```yaml
 checks for dim_employee:
   - max(vacation_hours) < 80:
@@ -19,7 +19,7 @@ checks for dim_employee:
 ```
 
 To improve the readability of multiple filters in a check, consider adding filters as separate line items, as per the following example.
-
+{% include code-header.html %}
 ```yaml
 checks for dim_employee:
   - max(vacation_hours) < 80:
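For orientation: each hunk above cuts the snippet off before the `filter` key appears. The following is a minimal sketch of an in-check filter that combines two conditions with `AND`. The check name and the `salaried_flag` column are assumptions made for illustration; only `vacation_hours` and `sales_territory_key` come from the text above.

```yaml
checks for dim_employee:
  - max(vacation_hours) < 80:
      # hypothetical check name
      name: Too many vacation hours for sales territory 11
      # the filter value is a SQL condition; salaried_flag is a hypothetical column
      filter: sales_territory_key = 11 AND salaried_flag = 1
```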

_includes/test-connection.md

+1 -1
@@ -1,7 +1,7 @@
 ## Test the data source connection
 
 To confirm that you have correctly configured the connection details for the data source(s) in your configuration YAML file, use the `test-connection` command. If you wish, add a `-V` option to the command to returns results in verbose mode in the CLI.
-
+{% include code-header.html %}
 ```shell
 soda test-connection -d my_datasource -c configuration.yml -V
 ```

soda-agent/secrets.md

+2 -2
@@ -31,7 +31,7 @@ OR
 * in a values YAML file which you store locally but reference in the `helm install` command; see below
 
 ### Values YAML file
-
+{% include code-header.html %}
 ```yaml
 soda:
   apikey:
@@ -42,7 +42,7 @@ soda:
 ```
 
 #### helm install command
-
+{% include code-header.html %}
 ```shell
 helm install soda-agent soda-agent/soda-agent \
   --values values.yml \

soda-cl/anomaly-score.md

+6 -5
@@ -12,7 +12,7 @@ redirect_from: /soda-cloud/anomaly-detection.html
 
 Use an anomaly score check to automatically discover anomalies in your time-series data. <br>
 *Requires Soda Cloud and Soda Core Scientific.*<br />
-
+{% include code-header.html %}
 ```yaml
 checks for dim_customer:
   - anomaly score for row_count < default
@@ -54,7 +54,7 @@ Refer to [Troubleshoot Soda Core Scientific installation](#troubleshoot-soda-cor
 ## Define an anomaly score check
 
 The following example demonstrates how to use the anomaly score for the `row_count` metric in a check. You can use any [numeric]({% link soda-cl/numeric-metrics.md %}), [missing]({% link soda-cl/missing-metrics.md %}), or [validity]({% link soda-cl/validity-metrics.md %}) metric in lieu of `row_count`.
-
+{% include code-header.html %}
 ```yaml
 checks for dim_customer:
   - anomaly score for row_count < default
@@ -66,13 +66,14 @@ checks for dim_customer:
 <br />
 You can use any [numeric]({% link soda-cl/numeric-metrics.md %}), [missing]({% link soda-cl/missing-metrics.md %}), or [validity]({% link soda-cl/validity-metrics.md %}) metric in anomaly score checks. The following example detects anomalies for the average of `order_price` in an `orders` dataset.
 
+{% include code-header.html %}
 ```yaml
 checks for orders:
   - anomaly score for avg(order_price) < default
 ```
 
 The following example detects anomalies for the count of missing values in the `id` column.
-
+{% include code-header.html %}
 ```yaml
 checks for orders:
   - anomaly score for missing_count(id) < default:
@@ -121,14 +122,14 @@ Consider using the Soda Core Python library to set up a [programmatic scan]({% l
 
 
 #### Example with quotes
-
+{% include code-header.html %}
 ```yaml
 checks for "dim_customer":
   - anomaly score for row_count < default
 ```
 
 #### Example with for each
-
+{% include code-header.html %}
 ```yaml
 for each dataset T:
   datasets:

soda-cl/automated-monitoring.md

+4 -4
@@ -11,7 +11,7 @@ parent: SodaCL
 
 Use automated monitoring checks to instruct Soda to automatically check for row count anomalies and schema changes in a dataset.<br />
 *Requires Soda Cloud*
-
+{% include code-header.html %}
 ```yaml
 automated monitoring:
   datasets:
@@ -100,7 +100,7 @@ Need help? Ask the team in the <a href="https://community.soda.io/slack" target=
 In the context of [SodaCL check types]({% link soda-cl/metrics-and-checks.md %}#check-types), automated monitoring checks are unique. This check employs the `anomaly score` and `schema` checks, but is limited in its syntax variation, with only a couple of mutable parts to specify which datasets to automatically apply the anomaly and schema checks.
 
 The example check below uses a wildcard character (`%`) to specify that Soda Core executes automated monitoring checks against all datasets with names that begin with `prod`, and *not* to execute the checks against any dataset with a name that begins with `test`.
-
+{% include code-header.html %}
 ```yaml
 automated monitoring:
   datasets:
@@ -111,7 +111,7 @@ automated monitoring:
 <br />
 
 You can also specify individual datasets to include or exclude, as in the following example.
-
+{% include code-header.html %}
 ```yaml
 automated monitoring:
   datasets:
@@ -137,7 +137,7 @@ To review the checks results for automated monitoring checks in Soda Cloud, navi
 
 
 #### Example with wildcards
-
+{% include code-header.html %}
 ```yaml
 automated monitoring:
   datasets:
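For orientation: every hunk in this file ends at `datasets:`, so the include and exclude entries are not visible in the diff. The following is a minimal sketch of the wildcard configuration the text describes, which monitors datasets whose names begin with `prod` and skips those that begin with `test`. It is an illustration of the described behavior, not a copy of the committed snippet.

```yaml
automated monitoring:
  datasets:
    # apply anomaly score and schema checks to datasets named prod...
    - include prod%
    # ...but not to datasets named test...
    - exclude test%
```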

soda-cl/check-attributes.md

+4 -4
@@ -9,7 +9,7 @@ parent: SodaCL
 *Last modified on {% last_modified_at %}*
 
 As a Soda Cloud Admin user, you can define **check attributes** that your team can apply to checks when they write them in an agreement or in a checks YAML file for Soda Core.
-
+{% include code-header.html %}
 ```yaml
 checks for dim_product:
   - missing_count(discount) < 10:
@@ -73,7 +73,7 @@ OR <br />
 * writing or editing checks in a checks YAML file for Soda Core.
 
 Apply attributes to checks using key:value pairs, as in the following example which applies five Soda Cloud-created attributes to a new `row_count` check.
-
+{% include code-header.html %}
 ```yaml
 checks for dim_product:
   - row_count = 10:
@@ -106,7 +106,7 @@ Note that users must use the attribute's **NAME** as the attribute's key in a ch
 ## Optional check attribute SodaCL configurations
 
 Using SodaCL, you can use variables to populate either the key or value of an existing attribute, as in the following example. Refer to [Configure variables in SodaCL]({% link soda-cl/filters.md %}#configure-variables-in-sodacl) for further details.
-
+{% include code-header.html %}
 ```yaml
 checks for dim_product:
   - row_count = 10:
@@ -116,7 +116,7 @@ checks for dim_product:
 ```
 
 You can use attributes in checks that Soda executes as part of a for each configuration, as in the following example. Refer to [Optional check configuration]({% link soda-cl/optional-config.md %}#apply-checks-to-multiple-datasets) for further details on for each.
-
+{% include code-header.html %}
 ```yaml
 for each dataset T:
   datasets:
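For orientation: the hunks above stop before the attribute key:value pairs. The following is a minimal sketch of a check with attributes nested under an `attributes` key. The attribute names and values are hypothetical; in practice each key would need to match the NAME of an attribute that an Admin has already defined in Soda Cloud.

```yaml
checks for dim_product:
  - row_count = 10:
      attributes:
        # hypothetical attribute names and values, for illustration only
        department: Marketing
        priority: 1
```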

soda-cl/compare.md

+7 -5
@@ -24,22 +24,22 @@ Have you got an idea or example of how to compare data that we haven't documente
 
 Use a [cross check]({% link soda-cl/cross-row-checks.md %}) to conduct a row count comparison between datasets in the same data source. <br />
 If you wish to compare datasets in different data sources, or datasets in the same data source but with different schemas, see [Compare data in different data sources or schemas](#compare-data-in-different-data-sources-or-schemas).
-
+{% include code-header.html %}
 ```yaml
 checks for dim_employee:
   - row_count same as dim_department_group
 ```
 
 Use a [reference check]({% link soda-cl/reference.md %}) to conduct a row-by-row comparison of values in two datasets _in the same data source_ and return a result that indicates the volume and samples of mismatched rows, as in the following example which ensures that the values in each of the two names columns are identical.<br />
 If you wish to compare datasets in the same data source but with different _schemas_, see [Compare data in different data sources or schemas](#compare-data-in-different-data-sources-or-schemas).
-
+{% include code-header.html %}
 ```yaml
 checks for dim_customers_dev:
   - values in (last_name, first_name) must exist in dim_customers_prod (last_name, first_name)
 ```
 
 Alternatively, you can use a [failed rows check]({% link soda-cl/failed-rows-checks.md %}) to customize a SQL query that compares the values of datasets.
-
+{% include code-header.html %}
 ```yaml
 - failed rows:
     name: Validate that the data is the same as retail customers
@@ -79,7 +79,7 @@ Alternatively, you can use a [failed rows check]({% link soda-cl/failed-rows-che
 
 Use a [cross check]({% link soda-cl/cross-row-checks.md %}) to conduct a simple row count comparison of datasets in two different data sources, as in the following example that compares the row counts of two datasets in different data sources. <br />
 Note that each data source involved in this check has been connected to data source either in the `configuration.yml` file with Soda Core, or in the **Add Data Source** workflow in Soda Cloud.
-
+{% include code-header.html %}
 ```yaml
 checks for dim_customer:
   - row_count same as dim_customer in aws_postgres_retail
@@ -88,6 +88,7 @@ checks for dim_customer:
 You can use a [reference check]({% link soda-cl/reference.md %}) to compare the values of different datasets in the _same_ data source (same data source, same schema), but if the datasets are in different schemas, as might happen when you have different environments like production, staging, development, etc., then Soda considers those datasets as _different data sources_. Where that is the case, you have a couple of options.
 
 You can use a cross check to compare the row count of datasets in the same data source, but with different schemas. First, you must add dataset + schema as a separate data source connection in your `configuration.yml`, as in the following example that uses the same connection details but provides different schemas:
+{% include code-header.html %}
 ```yaml
 data_source retail_customers_stage:
   type: postgres
@@ -108,6 +109,7 @@ data_source retail_customers_prod:
   schema: production
 ```
 Then, you can define a cross check that compares values across these data sources.
+{% include code-header.html %}
 ```yaml
 checks for dim_customer:
 # Check row count between datasets in different data sources
@@ -117,7 +119,7 @@ checks for dim_customer:
 Alternatively, depending on the type of data source you are using, you can use a [failed rows check]({% link soda-cl/failed-rows-checks.md %}) to write a custom SQL query that compares contents of datasets that you define by adding the schema before the dataset name, such as `prod.retail_customers` and `staging.retail_customers`.
 
 The following example accesses a single Snowflake data source and compares values between the same datasets but in different databases and schemas: `prod.staging.dmds_scores` and `prod.measurement.post_scores`.
-
+{% include code-header.html %}
 ```yaml
 - failed rows:
     fail query: |

soda-cl/cross-row-checks.md

+5 -5
@@ -13,7 +13,7 @@ redirect_from: /soda-cl/row-count.html
 Use a cross check to compare row counts between datasets within the same, or different, data sources.
 
 See also: [Compare data using SodaCL]({% link soda-cl/compare.md %})
-
+{% include code-header.html %}
 ```yaml
 checks for dim_customer:
 # Check row count between datasets in one data source
@@ -33,7 +33,7 @@ checks for dim_customer:
 In the context of [SodaCL check types]({% link soda-cl/metrics-and-checks.md %}#check-types), cross checks are unique. This check employs the `row_count` metric and is limited in its syntax variation, with only a few mutable parts to specify dataset and data source names.
 
 The example check below compares the volume of rows in two datasets in the same data source. If the row count in the `dim_department_group` is not the same as in `dim_customer`, the check fails.
-
+{% include code-header.html %}
 ```yaml
 checks for dim_customer:
   - row_count same as dim_department_group
@@ -44,7 +44,7 @@ checks for dim_customer:
 You can use cross checks to compare row counts between datasets in different data sources, as in the example below.
 
 In the example, `retail_customers` is the name of the other dataset, and `aws_postgres_retail` is the name of the data source in which `retail_customers` exists.
-
+{% include code-header.html %}
 ```yaml
 checks for dim_customer:
   - row_count same as retail_customers in aws_postgres_retail
@@ -67,15 +67,15 @@ checks for dim_customer:
 | | Apply a dataset filter to partition data during a scan; see [example](#example-with-dataset-filter). | - |
 
 #### Example with check name
-
+{% include code-header.html %}
 ```yaml
 checks for dim_customer:
   - row_count same as retail_customers in aws_postgres_retail:
       name: Cross check customer datasets
 ```
 
 #### Example with quotes
-
+{% include code-header.html %}
 ```yaml
 checks for dim_customer:
   - row_count same as "dim_department_group"
