Commit 46cd614

Update gettingstarted.md
1 parent 2c975b9 commit 46cd614

1 file changed: +30 -13 lines changed

doc/gettingstarted.md

@@ -2,7 +2,7 @@

 This page is a step-by-step introduction of how to write an omniparser schema (specifically tailor
 for the latest `"omni.2.1"` schema version) and how to ingest and transform inputs programmatically
-and by the cli tool.
+and by the CLI tool.

 ## Prerequisites and Notes

@@ -47,7 +47,6 @@ transform each of the data line into the following JSON output:
         "wind": "South East 4.97 mph"
     }
 ]
-
 ```
 As you can see, in the desired output, we'd like to standardize all the input temperatures into the
 same fahrenheit unit; we'd also like to do some translation such that the wind direction and wind
@@ -56,7 +55,7 @@ into [RFC-3339](https://tools.ietf.org/html/rfc3339) standard format.

 ## CLI (command line interface)

-Before we get into schema writing, let's first get familiar with omniparser cli so that we can easily
+Before we get into schema writing, let's first get familiar with omniparser CLI so that we can easily
 and incrementally test our schema writing.

 Assuming you have the git repo cloned at `~/dev/jf-tech/omniparser/`, simply run this bash script:
@@ -77,7 +76,7 @@ $ touch input.csv
 $ touch schema.json
 ```
 Use any editor to cut & paste the CSV content from [The Input](#the-input) into `input.csv`, and
-now run omniparser cli from `~/Downloads/omniparser/guide/`:
+now run omniparser CLI from `~/Downloads/omniparser/guide/`:
 ```
 $ ~/dev/jf-tech/omniparser/cli.sh transform -i input.csv -s schema.json
 Error: unable to perform schema validation: EOF
@@ -99,7 +98,7 @@ This is the common part of all omniparser schemas, the header `parser_settings`:
     }
 }
 ```
-It's self-explanatory. Now let's run the cli again:
+It's self-explanatory. Now let's run the CLI again:
 ```
 $ ~/dev/jf-tech/omniparser/cli.sh transform -i input.csv -s schema.json
 Error: schema 'schema.json' validation failed: (root): transform_declarations is required
@@ -121,7 +120,7 @@ transformation. Let's add an empty `transform_declarations` for now:
     "transform_declarations": {}
 }
 ```
-Run the cli we get another error:
+Run the CLI we get another error:
 ```
 $ ~/dev/jf-tech/omniparser/cli.sh transform -i input.csv -s schema.json
 Error: schema 'schema.json' validation failed: transform_declarations: FINAL_OUTPUT is required
@@ -143,7 +142,7 @@ the output. Given the section is called `transform_declarations` you might have
 multiple templates defined in it. Each template can reference other templates. There must be one
 and only one template called `FINAL_OUTPUT`.

-Run the cli we get a new error:
+Run the CLI we get a new error:
 ```
 $ ~/dev/jf-tech/omniparser/cli.sh transform -i input.csv -s schema.json
 Error: schema 'schema.json' validation failed: (root): file_declaration is required
@@ -193,7 +192,7 @@ Let's add these:
 }
 ```

-Run the cli again:
+Run the CLI again:
 ```
 $ ~/dev/jf-tech/omniparser/cli.sh transform -i input.csv -s schema.json
 [
@@ -279,7 +278,7 @@ Let's make small modifications to our schema:
 }
 ```

-Rerun the cli to ensure everything is still working. Now the IDR and its imaginary converted XML
+Rerun the CLI to ensure everything is still working. Now the IDR and its imaginary converted XML
 equivalent look like this:
 ```
 <>
@@ -339,7 +338,7 @@ Remember for the first data line, its corresponding IDR (or the IDR's equivalent
 Thus, an XPath query `"xpath": "DATE"` on the root of the IDR would return `01/31/2019 12:34:56-0800`, which is
 used as the value for the field `date`. So on and so forth for all other fields.

-Run the cli, we have:
+Run the CLI, we have:
 ```
 $ ~/dev/jf-tech/omniparser/cli.sh transform -i input.csv -s schema.json
 [
@@ -388,7 +387,7 @@ built-in function to achieve this:
 }
 ```

-Run cli we have:
+Run CLI we have:
 ```
 $ ~/dev/jf-tech/omniparser/cli.sh transform -i input.csv -s schema.json
 [
@@ -508,7 +507,7 @@ Here we introduce two new things: 1) template and 2) custom_func `javascript`.
 value `10.5`, `"type": "float"` is used. However when the script is done, the result is already
 in float, there is no need to specify `"type": "float"` for the `custom_func` directive.

-Now let's run cli:
+Now let's run CLI:
 ```
 $ ~/dev/jf-tech/omniparser/cli.sh transform -i input.csv -s schema.json
 [
@@ -562,7 +561,7 @@ numeric value. That should be an easy fix:
 Basically changing `"low_temperature_fahrenheit": { "xpath": "LOW_TEMP_F" }` to
 `"low_temperature_fahrenheit": { "xpath": "LOW_TEMP_F", "type": "float" }`.

-Run cli again, we have:
+Run CLI again, we have:
 ```
 $ ~/dev/jf-tech/omniparser/cli.sh transform -i input.csv -s schema.json
 [
@@ -804,3 +803,21 @@ code snippet of showing how to achieve this:
 // output contains a []byte of the ingested and transformed record.
 }
 ```
+
+### The Output
+```
+[
+    {
+        "date": "2019-01-31T12:34:56-08:00",
+        "high_temperature_fahrenheit": 50.9,
+        "low_temperature_fahrenheit": 30.2,
+        "wind": "North 20.5 mph"
+    },
+    {
+        "date": "2020-07-31T01:23:45-05:00",
+        "high_temperature_fahrenheit": 102.2,
+        "low_temperature_fahrenheit": 95,
+        "wind": "South East 4.97 mph"
+    }
+]
+```
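
The first record of the newly added output can be cross-checked by hand. The following is a minimal sketch using only the Go standard library, not omniparser's own implementation: the helper names `toRFC3339` and `celsiusToFahrenheit` and the `inputLayout` constant are hypothetical, and it assumes the `10.5` value mentioned in the guide is a Celsius reading for the first data line.

```go
package main

import (
	"fmt"
	"time"
)

// inputLayout is an assumed Go reference layout for timestamps shaped like
// "01/31/2019 12:34:56-0800"; it is not taken from omniparser.
const inputLayout = "01/02/2006 15:04:05-0700"

// toRFC3339 (hypothetical helper) parses a timestamp in inputLayout and
// renders it in RFC-3339 format, keeping the original UTC offset.
func toRFC3339(s string) (string, error) {
	t, err := time.Parse(inputLayout, s)
	if err != nil {
		return "", err
	}
	return t.Format(time.RFC3339), nil
}

// celsiusToFahrenheit (hypothetical helper) applies F = C*9/5 + 32.
func celsiusToFahrenheit(c float64) float64 {
	return c*9/5 + 32
}

func main() {
	date, err := toRFC3339("01/31/2019 12:34:56-0800")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(date) // 2019-01-31T12:34:56-08:00

	// Rounded to one decimal place for display; assumes 10.5 is in Celsius.
	fmt.Printf("%.1f\n", celsiusToFahrenheit(10.5)) // 50.9
}
```

Both printed values agree with the `date` and `high_temperature_fahrenheit` fields of the first record, which suggests the added output is consistent with the transformations the guide describes.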
