|
4 | 4 | [](https://goreportcard.com/report/github.com/jf-tech/omniparser) |
5 | 5 | [](https://pkg.go.dev/github.com/jf-tech/omniparser) |
6 | 6 |
|
7 | | -Omniparser is written in naive Golang that ingests input data of various formats (**CSV, txt, XML, EDI, JSON**, and custom formats) in streaming fashion |
8 | | -and transforms data into desired JSON output based on a schema written in JSON. |
| 7 | +Omniparser is written in native Golang that ingests input data of various formats (**CSV, txt, XML, EDI, JSON**, and |
| 8 | +custom formats) in streaming fashion and transforms data into desired JSON output based on a schema written in JSON. |
9 | 9 |
|
10 | | -Golang Version: 1.14.2 |
| 10 | +Golang Version: 1.14 |
11 | 11 |
|
12 | | -## Demo in Playground |
| 12 | +## Getting Started |
| 13 | + |
| 14 | +Follow the tutorial [Getting Started](./doc/gettingstarted.md) to write your first omniparser schema. |
| 15 | + |
| 16 | +## Online Playground |
13 | 17 |
|
14 | 18 | Use https://omniparser.herokuapp.com/ (may need to wait for a few seconds for heroku instance to wake up) |
15 | | -for trying out schemas and inputs, yours and from sample library, to see how transform works. |
| 19 | +for trying out schemas and inputs, yours or existing samples, to see how ingestion and transform work. |
16 | 20 |
|
17 | 21 |  |
18 | 22 |
|
19 | | -Take a detailed look at samples here: |
| 23 | +## More Examples |
20 | 24 | - [csv examples](extensions/omniv21/samples/csv) |
21 | 25 | - [fixed-length examples](extensions/omniv21/samples/fixedlength) |
22 | 26 | - [json examples](extensions/omniv21/samples/json) |
23 | 27 | - [xml examples](extensions/omniv21/samples/xml). |
24 | 28 | - [edi examples](extensions/omniv21/samples/edi). |
25 | 29 |
|
26 | | -## Simple Example (JSON -> JSON Transform) |
27 | | -- Input: |
28 | | - ``` |
29 | | - { |
30 | | - "order_id": "1234567", |
31 | | - "tracking_number": "1z9999999999999999", |
32 | | - "items": [ |
33 | | - { |
34 | | - "item_sku": "ab123", |
35 | | - "item_price": 12.34, |
36 | | - "number_purchased": 5 |
37 | | - }, |
38 | | - { |
39 | | - "item_sku": "ck763-23", |
40 | | - "item_price": 3.12, |
41 | | - "number_purchased": 2 |
42 | | - } |
43 | | - ] |
44 | | - } |
45 | | - ``` |
46 | | -- Schema: |
47 | | - ``` |
48 | | - { |
49 | | - "parser_settings": { |
50 | | - "version": "omni.2.1", |
51 | | - "file_format_type": "json" |
52 | | - }, |
53 | | - "transform_declarations": { |
54 | | - "FINAL_OUTPUT": { "object": { |
55 | | - "order_id": { "xpath": "order_id" }, |
56 | | - "tracking_number": { "custom_func": { |
57 | | - "name": "upper", |
58 | | - "args": [ { "xpath": "tracking_number" } ] |
59 | | - }}, |
60 | | - "items": { "array": [{ "xpath": "items/*", "object": { |
61 | | - "sku": { "custom_func": { |
62 | | - "name": "javascript", |
63 | | - "args": [ |
64 | | - { "const": "sku.toUpperCase().substring(0, 5)" }, |
65 | | - { "const": "sku" }, { "xpath": "item_sku" } |
66 | | - ] |
67 | | - }}, |
68 | | - "total_price": { "custom_func": { |
69 | | - "name": "javascript", |
70 | | - "args": [ |
71 | | - { "const": "num * price" }, |
72 | | - { "const": "num" }, { "xpath": "number_purchased", "type": "int" }, |
73 | | - { "const": "price" }, { "xpath": "item_price", "type": "float" } |
74 | | - ] |
75 | | - }} |
76 | | - }}]} |
77 | | - }} |
78 | | - } |
79 | | - } |
80 | | - ``` |
81 | | -- Code: |
82 | | - ``` |
83 | | - schema, err := omniparser.NewSchema("schema-name", strings.NewReader("...")) |
84 | | - if err != nil { ... } |
85 | | - transform, err := schema.NewTransform("input-name", strings.NewReader("..."), &transformctx.Ctx{}) |
86 | | - if err != nil { ... } |
87 | | - for { |
88 | | - b, err := transform.Read() |
89 | | - if err == io.EOF { break } |
90 | | - if err != nil { ... } |
91 | | - fmt.Println(string(b)) |
92 | | - } |
93 | | - ``` |
94 | | -- Output: |
95 | | - ``` |
96 | | - { |
97 | | - "order_id": "1234567", |
98 | | - "tracking_number": "1Z9999999999999999", |
99 | | - "items": [ |
100 | | - { |
101 | | - "sku": "AB123", |
102 | | - "total_price": 61.7 |
103 | | - }, |
104 | | - { |
105 | | - "sku": "CK763", |
106 | | - "total_price": 6.24 |
107 | | - } |
108 | | - ] |
109 | | - } |
110 | | - ``` |
111 | | -
|
112 | 30 | ## Why |
113 | 31 | - No good ETL transform/parser library exists in Golang. |
114 | 32 | - Even looking into Java and other languages, choices aren't many and all have limitations: |
115 | 33 | - [Smooks](https://www.smooks.org/) is dead, plus its EDI parsing/transform is too heavyweight, needing code-gen. |
116 | 34 | - [BeanIO](http://beanio.org/) can't deal with EDI input. |
117 | 35 | - [Jolt](https://github.com/bazaarvoice/jolt) can't deal with anything other than JSON input. |
118 | 36 | - [JSONata](https://jsonata.org/) still only JSON -> JSON transform. |
119 | | -- Many of the parsers/transforms don't support streaming read, loading entire input into memory - not acceptable in some situations. |
| 37 | +- Many of the parsers/transforms don't support streaming read, loading entire input into memory - not acceptable in some |
| 38 | +situations. |
120 | 39 |
|
121 | 40 | ## Requirements |
122 | 41 | - Golang 1.14 |
@@ -150,4 +69,5 @@ Take a detailed look at samples here: |
150 | 69 | - Ability to provide a new file format support to built-in omniv2 schema handler. |
151 | 70 |
|
152 | 71 | ## Footnotes |
153 | | -- omniparser is a collaboration effort of [jf-tech](https://github.com/jf-tech/), [Simon](https://github.com/liangxibing) and [Steven](http://github.com/wangjia007bond). |
| 72 | +- omniparser is a collaboration effort of [jf-tech](https://github.com/jf-tech/), [Simon](https://github.com/liangxibing) |
| 73 | +and [Steven](http://github.com/wangjia007bond). |