@@ -40,7 +40,7 @@

The following standard ClickHouse data types are currently supported in ClickPipes:

- Base numeric types - \[U\]Int8/16/32/64 and Float32/64
- Base numeric types - \[U\]Int8/16/32/64, Float32/64, and BFloat16
- Large integer types - \[U\]Int128/256
- Decimal Types
- Boolean
@@ -55,30 +55,22 @@
- all ClickHouse LowCardinality types
- Map with keys and values using any of the above types (including Nullables)
- Tuple and Array with elements using any of the above types (including Nullables, one level depth only)
- SimpleAggregateFunction types (for AggregatingMergeTree or SummingMergeTree destinations)

### Avro {#avro}

#### Supported Avro Data Types {#supported-avro-data-types}

ClickPipes supports all Avro Primitive and Complex types, and all Avro Logical types except `time-millis`, `time-micros`, `local-timestamp-millis`, `local-timestamp-micros`, and `duration`. Avro `record` types are converted to Tuple, `array` types to Array, and `map` to Map (string keys only). In general, the conversions listed [here](/interfaces/formats/Avro#data-type-mapping) are available. We recommend using exact type matching for Avro numeric types, as ClickPipes does not check for overflow or precision loss on type conversion.
Alternatively, all Avro types can be inserted into a `String` column, and are represented as a valid JSON string in that case.


#### Nullable types and Avro unions {#nullable-types-and-avro-unions}

Nullable types in Avro are defined by using a Union schema of `(T, null)` or `(null, T)` where T is the base Avro type. During schema inference, such unions are mapped to a ClickHouse "Nullable" column. Note that ClickHouse does not support
`Nullable(Array)`, `Nullable(Map)`, or `Nullable(Tuple)` types. Avro null unions for these types are mapped to non-nullable versions (Avro Record types are mapped to a ClickHouse named Tuple). Avro "nulls" for these types are inserted as:
- An empty Array for a null Avro array
- An empty Map for a null Avro Map
- A named Tuple with all default/zero values for a null Avro Record
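
The substitution rules above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not ClickPipes code: the function name `value_for_null` and the `record_defaults` argument are hypothetical, and a plain `dict` stands in for a named Tuple.

```python
def value_for_null(avro_type, record_defaults=None):
    """Return the value stored when an Avro null arrives for a given Avro type.

    Hypothetical helper: it only mirrors the three rules listed above.
    """
    if avro_type == "array":
        return []  # null Avro array -> empty Array
    if avro_type == "map":
        return {}  # null Avro map -> empty Map
    if avro_type == "record":
        # null Avro record -> named Tuple filled with default/zero values;
        # the caller supplies those defaults, modelled here as a dict
        return dict(record_defaults or {})
    return None  # other types land in a Nullable column and stay NULL


print(value_for_null("array"))
print(value_for_null("record", {"id": 0, "name": ""}))
```

A consequence worth keeping in mind: for these three types, a null in the source and a genuinely empty value in the source are indistinguishable after insertion.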

### Experimental {#experimental-types-support}

#### Variant type support {#variant-type-support}

<ExperimentalBadge/>

Variant type support is automatic if your Cloud service is running ClickHouse 25.3 or later. Otherwise, you will
have to submit a support ticket to enable it on your service.

ClickPipes supports the Variant type in the following circumstances:
- Avro Unions. If your Avro schema contains a union with multiple non-null types, ClickPipes will infer the
appropriate variant type. Variant types are not otherwise supported for Avro data.
@@ -87,12 +79,6 @@
type can be used in the Variant definition - for example, `Variant(Int64, UInt32)` is not supported.

#### JSON type support {#json-type-support}

<ExperimentalBadge/>

JSON type support is automatic if your Cloud service is running ClickHouse 25.3 or later. Otherwise, you will
have to submit a support ticket to enable it on your service.

ClickPipes supports the JSON type in the following circumstances:
- Avro Record types can always be assigned to a JSON column.
- Avro String and Bytes types can be assigned to a JSON column if the column actually holds JSON String objects.
@@ -99,18 +99,16 @@
```

### Custom Certificates {#custom-certificates}
ClickPipes for Kafka supports the upload of custom certificates for Kafka brokers with SASL & public SSL/TLS certificate. You can upload your certificate in the SSL Certificate section of the ClickPipe setup.
:::note
Please note that while we support uploading a single SSL certificate along with SASL for Kafka, SSL with Mutual TLS (mTLS) is not supported at this time.
:::
ClickPipes for Kafka supports uploading custom certificates for Kafka brokers that use non-public server certificates.
Client certificates and keys can also be uploaded for mutual TLS (mTLS) authentication.

## Performance {#performance}

### Batching {#batching}
ClickPipes inserts data into ClickHouse in batches. This avoids creating too many parts in the database, which can lead to performance issues in the cluster.

Batches are inserted when one of the following criteria has been met:
- The batch size has reached the maximum size (100,000 rows or 20MB)
- The batch size has reached the maximum size (100,000 rows or 32 MB per 1 GB of pod memory)

- The batch has been open for a maximum amount of time (5 seconds)
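
The flush criteria can be expressed as a single predicate. The following is a minimal sketch, not the ClickPipes implementation: the function and constant names are hypothetical, it uses the memory-scaled size limit from the list above, and it assumes 1 MB means 1,048,576 bytes.

```python
MAX_ROWS = 100_000
MAX_BYTES_PER_GB = 32 * 1024 * 1024  # assumption: 1 MB = 1,048,576 bytes
MAX_OPEN_SECONDS = 5.0


def should_flush(rows, size_bytes, open_seconds, pod_memory_gb):
    """Return True as soon as any one of the three batch limits is reached."""
    max_bytes = MAX_BYTES_PER_GB * pod_memory_gb  # size limit scales with memory
    return (
        rows >= MAX_ROWS
        or size_bytes >= max_bytes
        or open_seconds >= MAX_OPEN_SECONDS
    )


# A small batch on a 2 GB pod is held back until the 5-second limit is hit.
print(should_flush(rows=1_200, size_bytes=500_000, open_seconds=1.0, pod_memory_gb=2))
print(should_flush(rows=1_200, size_bytes=500_000, open_seconds=5.0, pod_memory_gb=2))
```

Because the conditions are combined with `or`, a low-volume stream is bounded by the time limit, while a high-volume stream is bounded by the row or size limit.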

### Latency {#latency}
15 changes: 5 additions & 10 deletions docs/integrations/data-ingestion/clickpipes/kinesis.md
@@ -92,7 +92,7 @@
### Standard types support {#standard-types-support}
The following ClickHouse data types are currently supported in ClickPipes:

- Base numeric types - \[U\]Int8/16/32/64 and Float32/64
- Base numeric types - \[U\]Int8/16/32/64, Float32/64, and BFloat16
- Large integer types - \[U\]Int128/256
- Decimal Types
- Boolean
@@ -107,19 +107,14 @@
- all ClickHouse LowCardinality types
- Map with keys and values using any of the above types (including Nullables)
- Tuple and Array with elements using any of the above types (including Nullables, one level depth only)
-
### Variant type support (experimental) {#variant-type-support}
Variant type support is automatic if your Cloud service is running ClickHouse 25.3 or later. Otherwise, you will
have to submit a support ticket to enable it on your service.
- SimpleAggregateFunction types (for AggregatingMergeTree or SummingMergeTree destinations)

### Variant type support {#variant-type-support}
You can manually specify a Variant type (such as `Variant(String, Int64, DateTime)`) for any JSON field
in the source data stream. Because of the way ClickPipes determines the correct variant subtype to use, only one integer or datetime
type can be used in the Variant definition - for example, `Variant(Int64, UInt32)` is not supported.
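
A definition can be checked against this restriction before a pipe is created. The sketch below is an assumption-laden illustration rather than ClickPipes logic: the helper names are hypothetical, and it reads the rule as "at most one integer type and at most one `DateTime`/`DateTime64` type", which is consistent with both examples in the text.

```python
import re

INTEGER_TYPE = re.compile(r"^U?Int(8|16|32|64|128|256)$")


def split_subtypes(variant_def):
    """Split 'Variant(A, B(…), C)' into its top-level subtypes."""
    text = variant_def.strip()
    if not (text.startswith("Variant(") and text.endswith(")")):
        raise ValueError("not a Variant definition")
    inner = text[len("Variant("):-1]
    parts, depth, current = [], 0, ""
    for ch in inner:
        if ch == "," and depth == 0:  # only split on commas outside parentheses
            parts.append(current.strip())
            current = ""
            continue
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        current += ch
    if current.strip():
        parts.append(current.strip())
    return parts


def is_supported(variant_def):
    """Apply the assumed rule: at most one integer and one datetime subtype."""
    subtypes = split_subtypes(variant_def)
    integers = [t for t in subtypes if INTEGER_TYPE.match(t)]
    datetimes = [t for t in subtypes if t.startswith("DateTime")]
    return len(integers) <= 1 and len(datetimes) <= 1


print(is_supported("Variant(String, Int64, DateTime)"))
print(is_supported("Variant(Int64, UInt32)"))
```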

### JSON type support (experimental) {#json-type-support}
JSON type support is automatic if your Cloud service is running ClickHouse 25.3 or later. Otherwise, you will
have to submit a support ticket to enable it on your service.

### JSON type support {#json-type-support}
JSON fields that are always a JSON object can be assigned to a JSON destination column. You must manually change the destination
column to the desired JSON type, including any fixed or skipped paths.

@@ -148,7 +143,7 @@
ClickPipes inserts data into ClickHouse in batches. This avoids creating too many parts in the database, which can lead to performance issues in the cluster.

Batches are inserted when one of the following criteria has been met:
- The batch size has reached the maximum size (100,000 rows or 20MB)
- The batch size has reached the maximum size (100,000 rows or 32 MB per 1 GB of replica memory)

- The batch has been open for a maximum amount of time (5 seconds)

### Latency {#latency}