`docs/user-guide/logs/pipeline-config.md` (+51 -1)
@@ -11,10 +11,16 @@ These configurations are provided in YAML format, allowing the Pipeline to proce
## Overall structure
~~Pipeline consists of two parts: Processors and Transform, both of which are in array format. A Pipeline configuration can contain multiple Processors and multiple Transforms. The data type described by Transform determines the table structure when storing log data in the database.~~
Pipeline consists of four parts: Processors, Dispatcher, Transform, and Table suffix.
Processors pre-process the input log data.
Dispatcher forwards the pipeline execution context to a different subsequent pipeline.
Transform decides the final data types and the table structure in the database.
Table suffix allows storing the data in different tables.
- Processors are used for preprocessing log data, such as parsing time fields and replacing fields.
- Dispatcher (optional) is used for forwarding the context to another pipeline, so that the same batch of input data can be divided and processed by different pipelines based on the values of certain fields.
- Transform is used for converting data formats, such as converting string types to numeric types.
- Table suffix (optional) is used for storing data in different tables for later convenience.
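
Putting the four parts together, the top-level layout of a pipeline file looks roughly like the sketch below. The section bodies are omitted, the `dispatcher` key name is an assumption based on the Dispatcher section later in this document, and `table_suffix` reuses the `app_name` example from the Table suffix section.

```yaml
# rough top-level skeleton of a pipeline configuration (section bodies omitted)
processors:     # pre-process the input log data, e.g. parse time fields
  # ...
dispatcher:     # optional: forward the context to other pipelines based on a field value
  # ...
transform:      # decide the final data types and the table structure
  # ...
table_suffix: _${app_name}   # optional: append a suffix to the target table name
```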
Here is an example of a simple configuration that includes Processors and Transform:
@@ -45,6 +51,7 @@ transform:
```yaml
    # epoch is a special field type and must specify precision
    type: epoch, ms
    index: timestamp
table_suffix: _${string_field_a}
```
## Processor
@@ -770,3 +777,46 @@
matches the `http` rule, data is stored in `applogs_http`.
If no rules match, data is transformed by the current pipeline's transformations.
## Table suffix
:::warning Experimental Feature
This experimental feature may contain unexpected behavior, and its functionality may change in the future.
:::
There are cases where you want to split the log data and insert it into different target tables
based on certain values in the input data. For example, you may want to divide and store the log data
based on the application that produced it, adding an app name suffix to the target table.
A sample configuration looks like this:
```yaml
table_suffix: _${app_name}
```
The syntax is simple: use `${}` to reference a variable from the pipeline execution context.
The variable can come directly from the input data or be produced by an earlier processing step.
After the table suffix is formatted, the whole string is appended to the input table name.
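
For instance, assuming a hypothetical input table named `ngx_logs` and an `app_name` field that resolves to `payment`, the suffix is formatted and appended like this:
```yaml
# input table name:     ngx_logs
# pipeline context:     app_name = "payment"
table_suffix: _${app_name}
# formatted suffix:     "_payment"
# resulting table name: ngx_logs_payment
```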
Note:
1. The variable must be an integer or a string.
2. If any error occurs at runtime (e.g., the variable is missing or not of a valid type), the input table name is used.
Here is an example of how it works. The input data looks like the following:
```JSON
[
    {"type": "db"},
    {"type": "http"},
    {"t": "test"}
]
```
The input table name is `persist_app`, and the pipeline config is as follows:
```YAML
table_suffix: _${type}
```
These three lines of input logs will be inserted into three tables:
1. `persist_app_db`
2. `persist_app_http`
3. `persist_app`, because this line doesn't have a `type` field, so the default table name is used.
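
To recap, the routing for this example fits in one annotated snippet; the comments show where each input line ends up:
```YAML
# input table name: persist_app
table_suffix: _${type}
# {"type": "db"}   -> persist_app_db
# {"type": "http"} -> persist_app_http
# {"t": "test"}    -> persist_app   (no `type` field, so the input table name is used)
```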
0 commit comments