diff --git a/SUMMARY.md b/SUMMARY.md index 1dcb7877d..a44401c6c 100644 --- a/SUMMARY.md +++ b/SUMMARY.md @@ -14,13 +14,7 @@ * [Key concepts](concepts/key-concepts.md) * [Buffering](concepts/buffering.md) -* [Data pipeline](concepts/data-pipeline/README.md) - * [Input](concepts/data-pipeline/input.md) - * [Parser](concepts/data-pipeline/parser.md) - * [Filter](concepts/data-pipeline/filter.md) - * [Buffer](concepts/data-pipeline/buffer.md) - * [Router](concepts/data-pipeline/router.md) - * [Output](concepts/data-pipeline/output.md) +* [Data pipeline](concepts/data-pipeline.md) ## Installation @@ -91,7 +85,7 @@ ## Data pipeline * [Pipeline monitoring](pipeline/pipeline-monitoring.md) -* [Inputs](pipeline/inputs/README.md) +* [Inputs](pipeline/inputs.md) * [Collectd](pipeline/inputs/collectd.md) * [CPU metrics](pipeline/inputs/cpu-metrics.md) * [Disk I/O metrics](pipeline/inputs/disk-io-metrics.md) @@ -136,7 +130,7 @@ * [Windows Event logs (winevtlog)](pipeline/inputs/windows-event-log-winevtlog.md) * [Windows Event logs (winlog)](pipeline/inputs/windows-event-log.md) * [Windows exporter metrics](pipeline/inputs/windows-exporter-metrics.md) -* [Parsers](pipeline/parsers/README.md) +* [Parsers](pipeline/parsers.md) * [Configuring parsers](pipeline/parsers/configuring-parser.md) * [JSON](pipeline/parsers/json.md) * [Regular expression](pipeline/parsers/regular-expression.md) @@ -152,7 +146,7 @@ * [SQL](pipeline/processors/sql.md) * [Filters as processors](pipeline/processors/filters.md) * [Conditional processing](pipeline/processors/conditional-processing.md) -* [Filters](pipeline/filters/README.md) +* [Filters](pipeline/filters.md) * [AWS metadata](pipeline/filters/aws-metadata.md) * [CheckList](pipeline/filters/checklist.md) * [ECS metadata](pipeline/filters/ecs-metadata.md) @@ -175,7 +169,8 @@ * [Throttle](pipeline/filters/throttle.md) * [Type converter](pipeline/filters/type-converter.md) * [Wasm](pipeline/filters/wasm.md) -* [Outputs](pipeline/outputs/README.md) +* [Router](pipeline/router.md) +* [Outputs](pipeline/outputs.md) * [Amazon CloudWatch](pipeline/outputs/cloudwatch.md) * [Amazon Kinesis Data Firehose](pipeline/outputs/firehose.md) * [Amazon Kinesis Data Streams](pipeline/outputs/kinesis.md) diff --git a/concepts/buffering.md b/concepts/buffering.md index 5b55c3008..fdfb06f63 100644 --- a/concepts/buffering.md +++ b/concepts/buffering.md @@ -8,10 +8,30 @@ When [Fluent Bit](https://fluentbit.io) processes data, it uses the system memor Buffering is the ability to store the records, and continue storing incoming data while previous data is processed and delivered. Buffering in memory is the fastest mechanism, but there are scenarios requiring special strategies to deal with [backpressure](../administration/backpressure.md), data safety, or to reduce memory consumption by the service in constrained environments. +```mermaid +graph LR + accTitle: Fluent Bit data pipeline + accDescr: A diagram of the Fluent Bit data pipeline, which includes input, a parser, a filter, a buffer, routing, and various outputs. + A[Input] --> B[Parser] + B --> C[Filter] + C --> D[Buffer] + D --> E((Routing)) + E --> F[Output 1] + E --> G[Output 2] + E --> H[Output 3] + style D stroke:darkred,stroke-width:2px; +``` + Network failures or latency in third party service is common. When data can't be delivered fast enough and new data to process arrives, the system can face backpressure. Fluent Bit buffering strategies are designed to solve problems associated with backpressure and general delivery failures. 
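+
+One way to apply such a strategy is the `mem_buf_limit` input property, which caps how much memory an input instance can use for buffered records and pauses ingestion when the cap is reached. This is a minimal sketch; the plugin, path, and limit below are illustrative values only:
+
+```yaml
+pipeline:
+  inputs:
+    # Pause this input when its in-memory buffer reaches the limit.
+    - name: tail
+      path: /var/log/app/*.log
+      mem_buf_limit: 5MB
+```
+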
Fluent Bit offers a primary buffering mechanism in memory and an optional secondary one using the file system. With this hybrid solution you can accommodate any use case safely and keep high performance while processing your data. These mechanisms aren't mutually exclusive. When data is ready to be processed or delivered, it will always be in memory. Other data in the queue might be in the file system until it's ready to be processed and moved up to memory.

+The `buffer` phase contains the data in an immutable state, meaning that no other filter can be applied.
+
+Buffered data uses the Fluent Bit internal binary representation, which isn't raw text.
+
+To avoid data loss in case of system failures, Fluent Bit offers a buffering mechanism in the file system that acts as a backup system.
+
To learn more about the buffering configuration in Fluent Bit, see [Buffering and Storage](../administration/buffering-and-storage.md).
diff --git a/concepts/data-pipeline.md b/concepts/data-pipeline.md
new file mode 100644
index 000000000..2fe61fd5a
--- /dev/null
+++ b/concepts/data-pipeline.md
@@ -0,0 +1,27 @@
+# Data pipeline
+
+The Fluent Bit data pipeline incorporates several specific concepts. Data processing flows through the pipeline following these concepts in order.
+
+## Inputs
+
+Fluent Bit provides [input plugins](../pipeline/inputs.md) to gather information from different sources. Some plugins collect data from log files, and others gather metrics information from the operating system. There are many plugins to suit different needs.
+
+## Parsers
+
+[Parsers](../pipeline/parsers.md) convert unstructured data to structured data. Use a parser to set a structure to the incoming data by using input plugins as data is collected.
+
+## Filters
+
+[Filters](../pipeline/filters.md) let you alter the collected data before delivering it to a destination. In production environments you need full control of the data you're collecting, and filtering gives you that control.
+
+## Buffer
+
+The [`buffer`](./buffering.md) phase in the pipeline aims to provide a unified and persistent mechanism to store your data, using the primary in-memory model or the file system-based mode.
+
+## Router
+
+[Routing](../pipeline/router.md) is a core feature that lets you route your data through filters, and then to one or multiple destinations. The router relies on the concept of [tags](./key-concepts.md#tag) and [matching](./key-concepts.md#match) rules.
+
+## Outputs
+
+[Output plugins](../pipeline/outputs.md) let you define destinations for your data. Common destinations are remote services, local file systems, or other standard interfaces.
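+
+## Example
+
+As an illustrative sketch of how these concepts fit together (the plugin names, tag, path, and parser are example choices, not requirements), the following configuration collects a log file, parses it, filters it, and routes it to a destination:
+
+```yaml
+pipeline:
+  inputs:
+    # Input: collect records and tag them for routing.
+    - name: tail
+      tag: app.log
+      path: /var/log/app.log
+      # Parser: structure each raw line (assumes a `json` parser is defined in the parsers file).
+      parser: json
+  filters:
+    # Filter: keep only records whose `level` key matches `error`.
+    - name: grep
+      match: 'app.*'
+      regex: level error
+  outputs:
+    # Router and output: deliver records whose tag matches the rule.
+    - name: stdout
+      match: 'app.*'
+```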
diff --git a/concepts/data-pipeline/README.md b/concepts/data-pipeline/README.md
deleted file mode 100644
index 7ea3ea315..000000000
--- a/concepts/data-pipeline/README.md
+++ /dev/null
@@ -1 +0,0 @@
-# Data pipeline
diff --git a/concepts/data-pipeline/buffer.md b/concepts/data-pipeline/buffer.md
deleted file mode 100644
index e347985b6..000000000
--- a/concepts/data-pipeline/buffer.md
+++ /dev/null
@@ -1,27 +0,0 @@
----
-description: Data processing with reliability
----
-
-# Buffer
-
-The [`buffer`](../buffering.md) phase in the pipeline aims to provide a unified and persistent mechanism to store your data, using the primary in-memory model or the file system-based mode.
-
-The `buffer` phase contains the data in an immutable state, meaning that no other filter can be applied.
-
-```mermaid
-graph LR
-  accTitle: Fluent Bit data pipeline
-  accDescr: A diagram of the Fluent Bit data pipeline, which includes input, a parser, a filter, a buffer, routing, and various outputs.
-  A[Input] --> B[Parser]
-  B --> C[Filter]
-  C --> D[Buffer]
-  D --> E((Routing))
-  E --> F[Output 1]
-  E --> G[Output 2]
-  E --> H[Output 3]
-  style D stroke:darkred,stroke-width:2px;
-```
-
-Buffered data uses the Fluent Bit internal binary representation, which isn't raw text.
-
-Fluent Bit offers a buffering mechanism in the file system that acts as a backup system to avoid data loss in case of system failures.
diff --git a/concepts/key-concepts.md b/concepts/key-concepts.md
index a99c973c5..80f97c546 100644
--- a/concepts/key-concepts.md
+++ b/concepts/key-concepts.md
@@ -58,7 +58,7 @@
to represent events. This format is still supported for reading input event streams.

## Filtering

-You might need to perform modifications on an Event's content. The process to alter, append to, or drop Events is called [_filtering_](data-pipeline/filter.md).
+You might need to perform modifications on an event's content. The process to alter, append to, or drop events is called [_filtering_](../pipeline/filters.md).

Use filtering to:

@@ -68,15 +68,15 @@ Use filtering to:

## Tag

-Every Event ingested by Fluent Bit is assigned a Tag. This tag is an internal string used in a later stage by the Router to decide which Filter or [Output](data-pipeline/output.md) phase it must go through.
+Every Event ingested by Fluent Bit is assigned a Tag. This tag is an internal string used in a later stage by the Router to decide which Filter or [Output](../pipeline/outputs.md) phase it must go through.

-Most tags are assigned manually in the configuration. If a tag isn't specified, Fluent Bit assigns the name of the [Input](data-pipeline/input.md) plugin instance where that Event was generated from.
+Most tags are assigned manually in the configuration. If a tag isn't specified, Fluent Bit assigns the name of the [Input](../pipeline/inputs.md) plugin instance where that Event was generated from.

{% hint style="info" %}
The [Forward](../pipeline/inputs/forward.md) input plugin doesn't assign tags. This plugin speaks the Fluentd wire protocol called Forward where every Event already comes with a Tag associated. Fluent Bit will always use the incoming Tag set by the client.
{% endhint %}

-A tagged record must always have a Matching rule. To learn more about Tags and Matches, see [Routing](data-pipeline/router.md).
+A tagged record must always have a Matching rule. To learn more about Tags and Matches, see [Routing](../pipeline/router.md).

## Timestamp

@@ -97,7 +97,7 @@ where:

Fluent Bit lets you route your collected and processed Events to one or multiple destinations. A _Match_ represents a rule to select Events where a Tag matches a defined rule.

-To learn more about Tags and Matches, see [Routing](data-pipeline/router.md).
+To learn more about Tags and Matches, see [Routing](../pipeline/router.md).
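+
+As a short sketch of how a Tag pairs with a Match rule (the tag, path, and plugins are illustrative), an input tagged `app.log` is selected by an output whose Match rule covers that tag:
+
+```yaml
+pipeline:
+  inputs:
+    # Every Event ingested by this instance is tagged `app.log`.
+    - name: tail
+      tag: app.log
+      path: /var/log/app.log
+  outputs:
+    # The Match rule `app.*` selects Events tagged `app.log`.
+    - name: stdout
+      match: 'app.*'
+```
+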
## Structured messages

diff --git a/installation/sources/build-with-static-configuration.md b/installation/sources/build-with-static-configuration.md
index 824e6515e..9be722484 100644
--- a/installation/sources/build-with-static-configuration.md
+++ b/installation/sources/build-with-static-configuration.md
@@ -12,7 +12,7 @@ The following steps assume you are familiar with configuring Fluent Bit using te

#### Configuration directory

-In your file system, prepare a specific directory that will be used as an entry point for the build system to lookup and parse the configuration files. This directory must contain a minimum of one configuration file called `fluent-bit.conf` containing the required [`SERVICE`](/administration/configuring-fluent-bit/yaml/service-section.md), [`INPUT`](/concepts/data-pipeline/input.md), and [`OUTPUT`](/concepts/data-pipeline/output.md) sections.
+In your file system, prepare a specific directory that will be used as an entry point for the build system to look up and parse the configuration files. This directory must contain a minimum of one configuration file, called `fluent-bit.conf`, that contains the required [`SERVICE`](/administration/configuring-fluent-bit/yaml/service-section.md), [`INPUT`](../../pipeline/inputs.md), and [`OUTPUT`](../../pipeline/outputs.md) sections.

As an example, create a new `fluent-bit.yaml` file or `fluent-bit.conf` file:

@@ -84,4 +84,4 @@
$ bin/fluent-bit
...
[0] cpu.local: [1539984752.000347547, {"cpu_p"=>0.750000, "user_p"=>0.500000, "system_p"=>0.250000, "cpu0.p_cpu"=>1.000000, "cpu0.p_user"=>1.000000, "cpu0.p_system"=>0.000000, "cpu1.p_cpu"=>0.000000, "cpu1.p_user"=>0.000000, "cpu1.p_system"=>0.000000, "cpu2.p_cpu"=>0.000000, "cpu2.p_user"=>0.000000, "cpu2.p_system"=>0.000000, "cpu3.p_cpu"=>1.000000, "cpu3.p_user"=>1.000000, "cpu3.p_system"=>0.000000}]
-```
\ No newline at end of file
+```
diff --git a/concepts/data-pipeline/filter.md b/pipeline/filters.md
similarity index 83%
rename from concepts/data-pipeline/filter.md
rename to pipeline/filters.md
index b31d8e2a5..ff67d81f9 100644
--- a/concepts/data-pipeline/filter.md
+++ b/pipeline/filters.md
@@ -1,8 +1,4 @@
----
-description: Modify, enrich or drop your records
----
-
-# Filter
+# Filters

In production environments you need full control of the data you're collecting. Filtering lets you alter the collected data before delivering it to a destination.

@@ -25,5 +21,3 @@ Filtering is implemented through plugins. Each available filter can be used to m

Fluent Bit supports many filters. A common use case for filtering is Kubernetes deployments. Every pod log needs the proper metadata associated with it.

Like input plugins, filters run in an instance context, which has its own independent configuration. Configuration keys are often called _properties_.
-
-For more details about the Filters available and their usage, see [Filters](https://docs.fluentbit.io/manual/pipeline/filters).
diff --git a/pipeline/filters/README.md b/pipeline/filters/README.md
deleted file mode 100644
index fb3173267..000000000
--- a/pipeline/filters/README.md
+++ /dev/null
@@ -1,2 +0,0 @@
-# Filters
-
diff --git a/pipeline/filters/rewrite-tag.md b/pipeline/filters/rewrite-tag.md
index 54f844155..fb4272e54 100644
--- a/pipeline/filters/rewrite-tag.md
+++ b/pipeline/filters/rewrite-tag.md
@@ -4,9 +4,9 @@ description: Powerful and flexible routing

# Rewrite tag

-Tags make [routing](../../concepts/data-pipeline/router.md) possible. Tags are set in the configuration of the `INPUT` definitions where the records are generated. There are scenarios where you might want to modify the tag in the pipeline to perform more advanced and flexible routing.
+Tags make [routing](../../pipeline/router.md) possible. Tags are set in the configuration of the `INPUT` definitions where the records are generated. There are scenarios when you might want to modify the tag in the pipeline to perform more advanced and flexible routing.

-The _Rewrite Tag_ filter lets you re-emit a record under a new tag. Once a record has been re-emitted, the original record can be preserved or discarded.
+The _Rewrite Tag_ filter lets you re-emit a record under a new tag. After a record is re-emitted, the original record can be preserved or discarded.

The Rewrite Tag filter defines rules that match specific record key content against a regular expression. If a match exists, a new record with the defined tag will be emitted, entering from the beginning of the pipeline. Multiple rules can be specified and are processed in order until one of them matches.
diff --git a/concepts/data-pipeline/input.md b/pipeline/inputs.md
similarity index 51%
rename from concepts/data-pipeline/input.md
rename to pipeline/inputs.md
index 86edfaaad..ef83aa42d 100644
--- a/concepts/data-pipeline/input.md
+++ b/pipeline/inputs.md
@@ -1,10 +1,6 @@
----
-description: The way to gather data from your sources
----
+# Inputs

-# Input
-
-[Fluent Bit](http://fluentbit.io) provides input plugins to gather information from different sources. Some plugins collect data from log files, while others can gather metrics information from the operating system. There are many plugins to suit different needs.
+Input plugins gather information from different sources. Some plugins collect data from log files, and others gather metrics information from the operating system. There are many plugins to suit different needs.

```mermaid
graph LR
@@ -21,7 +17,3 @@ graph LR
```

When an input plugin loads, an internal _instance_ is created. Each instance has its own independent configuration. Configuration keys are often called _properties_.
-
-Every input plugin has its own documentation section that specifies how to use it and what properties are available.
-
-For more details, see [Input Plugins](https://docs.fluentbit.io/manual/pipeline/inputs).
diff --git a/pipeline/inputs/README.md b/pipeline/inputs/README.md
deleted file mode 100644
index 1bf013c99..000000000
--- a/pipeline/inputs/README.md
+++ /dev/null
@@ -1,2 +0,0 @@
-# Inputs
-
diff --git a/concepts/data-pipeline/output.md b/pipeline/outputs.md
similarity index 56%
rename from concepts/data-pipeline/output.md
rename to pipeline/outputs.md
index 68a8092ba..604271ed9 100644
--- a/concepts/data-pipeline/output.md
+++ b/pipeline/outputs.md
@@ -1,10 +1,6 @@
----
-description: Learn about destinations for your data, such as databases and cloud services.
----
+# Outputs

-# Output
-
-The output interface lets you define destinations for your data. Common destinations are remote services, local file systems, or other standard interfaces. Outputs are implemented as plugins.
+Outputs let you define destinations for your data. Common destinations are remote services, local file systems, or other standard interfaces. Outputs are implemented as plugins.

```mermaid
graph LR
@@ -23,7 +19,3 @@ graph LR
```

When an output plugin is loaded, an internal _instance_ is created. Every instance has its own independent configuration. Configuration keys are often called _properties_.
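+
+For example (a sketch; the `host` and `port` values are placeholders for a real endpoint), two output instances, each with its own properties:
+
+```yaml
+pipeline:
+  outputs:
+    # One instance of the `http` output with its own properties.
+    - name: http
+      match: 'app.*'
+      host: 192.168.2.3
+      port: 80
+    # A second, independent instance handling everything else.
+    - name: stdout
+      match: '*'
+```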
-
-Every output plugin has its own documentation section specifying how it can be used and what properties are available.
-
-For more details, see [Output Plugins](https://docs.fluentbit.io/manual/pipeline/outputs).
diff --git a/pipeline/outputs/README.md b/pipeline/outputs/README.md
deleted file mode 100644
index 7a73067ca..000000000
--- a/pipeline/outputs/README.md
+++ /dev/null
@@ -1,2 +0,0 @@
-# Outputs
-
diff --git a/concepts/data-pipeline/parser.md b/pipeline/parsers.md
similarity index 71%
rename from concepts/data-pipeline/parser.md
rename to pipeline/parsers.md
index 5792b6e9b..117cd18b9 100644
--- a/concepts/data-pipeline/parser.md
+++ b/pipeline/parsers.md
@@ -1,10 +1,8 @@
----
-description: Convert unstructured messages to structured messages
----
+# Parsers

-# Parser
+Dealing with raw strings or unstructured messages is difficult. Having a structure makes data more usable. Set a structure for the incoming data by using input plugins as data is collected.

-Dealing with raw strings or unstructured messages is difficult. Having a structure makes data more usable. Set a structure to the incoming data by using input plugins as data is collected:
+Parsers are fully configurable and are independently and optionally handled by each input plugin.

```mermaid
graph LR
@@ -26,7 +24,7 @@ The parser converts unstructured data to structured data. As an example, conside
192.168.2.20 - - [28/Jul/2006:10:27:10 -0300] "GET /cgi-bin/try/ HTTP/1.0" 200 3395
```

-This log line is a raw string without format. Structuring the log makes it easier to process the data later. If the [regular expression parser](../../pipeline/parsers/regular-expression.md) is used, the log entry could be converted to:
+This log line is a raw string without format. Structuring the log makes it easier to process the data later. If the [regular expression parser](./parsers/regular-expression.md) is used, the log entry could be converted to:

```javascript
{
  "host": "192.168.2.20",
  "user": "-",
  "method": "GET",
  "path": "/cgi-bin/try/",
  "code": "200",
  "size": "3395",
  "referer": "",
  "agent": ""
}
```
-
-Parsers are fully configurable and are independently and optionally handled by each input plugin. For more details, see [Parsers](https://docs.fluentbit.io/manual/pipeline/parsers).
diff --git a/pipeline/parsers/README.md b/pipeline/parsers/README.md
deleted file mode 100644
index e183f20a1..000000000
--- a/pipeline/parsers/README.md
+++ /dev/null
@@ -1,2 +0,0 @@
-# Parsers
-
diff --git a/concepts/data-pipeline/router.md b/pipeline/router.md
similarity index 95%
rename from concepts/data-pipeline/router.md
rename to pipeline/router.md
index e4e2a1fe5..57a3cb138 100644
--- a/concepts/data-pipeline/router.md
+++ b/pipeline/router.md
@@ -4,7 +4,7 @@ description: Create flexible routing rules

# Router

-Routing is a core feature that lets you route your data through filters and then to one or multiple destinations. The router relies on the concept of [Tags](../key-concepts.md) and [Matching](../key-concepts.md) rules.
+Routing is a core feature that lets you route your data through filters and then to one or multiple destinations. The router relies on the concept of [Tags](../concepts/key-concepts.md) and [Matching](../concepts/key-concepts.md) rules.

```mermaid
graph LR
@@ -161,4 +161,4 @@ pipeline:
{% endtab %}
{% endtabs %}

-In this configuration, the `Match_regex` rule is set to `.*_sensor_[AB]`. This regular expression matches any `Tag` that ends with `_sensor_A` or `_sensor_B`, regardless of what precedes it. This approach provides a more flexible and powerful way to handle different source tags with a single routing rule.
\ No newline at end of file
+In this configuration, the `Match_regex` rule is set to `.*_sensor_[AB]`. This regular expression matches any `Tag` that ends with `_sensor_A` or `_sensor_B`, regardless of what precedes it. This approach provides a more flexible and powerful way to handle different source tags with a single routing rule.