New documentation on formatting settings for reading from s3 #15970

@@ -0,0 +1,7 @@
|Name|Description|Example|
|---|---|---|
|`POSIX`|String in `%Y-%m-%d %H:%M:%S` format|2001-03-26 16:10:00|
|`ISO`|Format corresponding to the ISO 8601 standard|2001-03-26 16:10:00Z|
|`UNIX_TIME_SECONDS`|Number of seconds that have elapsed since the start of the epoch|985623000|
|`UNIX_TIME_MILLISECONDS`|Number of milliseconds that have elapsed since the start of the epoch|985623000000|
|`UNIX_TIME_MICROSECONDS`|Number of microseconds that have elapsed since the start of the epoch|985623000000000|
@@ -0,0 +1,10 @@
|Setting name|Description|Possible values|
|----|----|---|
|`file_pattern`|File name template|File name template string. Wildcards `*` are supported.|
|`data.interval.unit`|Unit for parsing `Interval` type|`MICROSECONDS`, `MILLISECONDS`, `SECONDS`, `MINUTES`, `HOURS`, `DAYS`, `WEEKS`|
|`data.datetime.format_name`|Predefined format in which `Datetime` data is stored|`POSIX`, `ISO`|
|`data.datetime.format`|Strftime-like template which defines how `Datetime` data is stored|Formatting string, for example: `%Y-%m-%dT%H-%M`|
|`data.timestamp.format_name`|Predefined format in which `Timestamp` data is stored|`POSIX`, `ISO`, `UNIX_TIME_SECONDS`, `UNIX_TIME_MILLISECONDS`, `UNIX_TIME_MICROSECONDS`|
|`data.timestamp.format`|Strftime-like template which defines how `Timestamp` data is stored|Formatting string, for example: `%Y-%m-%dT%H-%M-%S`|
|`data.date.format`|The format in which `Date` data is stored|Formatting string, for example: `%Y-%m-%d`|
|`csv_delimiter`|Delimiter for the `csv_with_names` format|Any character|
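
To illustrate the text shape that the example templates in the table describe, here is a minimal Python sketch. It relies on Python's `datetime.strftime` and `datetime.strptime`, which accept the conversion specifiers used in these examples (`%Y`, `%m`, `%d`, `%H`, `%M`, `%S`). The sample value and the `TEMPLATES` dictionary are assumptions made for illustration only; the sketch does not show how {{ ydb-full-name }} itself parses data.

```python
from datetime import datetime

# Hypothetical sample value, chosen only for illustration.
sample = datetime(2001, 3, 26, 16, 10, 0)

# Example templates from the table above, keyed by setting name.
TEMPLATES = {
    "data.datetime.format": "%Y-%m-%dT%H-%M",
    "data.timestamp.format": "%Y-%m-%dT%H-%M-%S",
    "data.date.format": "%Y-%m-%d",
}

# Render the sample value with each template.
rendered = {name: sample.strftime(fmt) for name, fmt in TEMPLATES.items()}

for name, text in rendered.items():
    print(f"{name}: {text}")

# Round trip: parsing the rendered text with the same template
# recovers the original value when the template keeps all fields.
parsed = datetime.strptime(
    rendered["data.timestamp.format"], TEMPLATES["data.timestamp.format"]
)
```

Note that a template which omits fields, such as `%Y-%m-%dT%H-%M` without seconds, cannot round-trip the dropped fields.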
@@ -143,6 +143,16 @@ In {{ ydb-full-name }}, the following data paths are supported:

{% include [!](_includes/path_format.md) %}

### Format settings {#format_settings}

In {{ ydb-full-name }}, the following format settings are supported:

{% include [!](_includes/format_settings.md) %}

Formatting strings can use any conversion specifier supported by the `strftime` (C99) function. In {{ ydb-full-name }}, the following predefined `Datetime` and `Timestamp` formats are supported:

{% include [!](_includes/date_formats.md) %}
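
The example values listed for the named formats all describe the same instant, which can be checked with a short Python sketch. It assumes the values are interpreted as UTC (the tables do not state a time zone), and the variable names are hypothetical. It is a consistency check on the examples, not a description of how {{ ydb-full-name }} parses them.

```python
from datetime import datetime, timezone

# POSIX-style string parsed with the `%Y-%m-%d %H:%M:%S` template.
# Assumption: the value is interpreted as UTC.
posix_text = "2001-03-26 16:10:00"
moment = datetime.strptime(posix_text, "%Y-%m-%d %H:%M:%S").replace(
    tzinfo=timezone.utc
)

# Epoch-based representations of the same instant.
unix_seconds = int(moment.timestamp())
unix_milliseconds = unix_seconds * 1_000
unix_microseconds = unix_seconds * 1_000_000

# ISO-style rendering with a trailing "Z" marking UTC, as in the example column.
iso_text = moment.strftime("%Y-%m-%d %H:%M:%SZ")

print(unix_seconds, unix_milliseconds, unix_microseconds, iso_text)
```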

## Example {#read_example}

Example query to read data from S3 ({{ objstorage-full-name }}):
@@ -37,6 +37,8 @@ Where:
- `csv_with_names` - one of the [permitted data storage formats](formats.md);
- `gzip` - one of the [permitted compression algorithms](formats.md#compression).

You can also specify [format settings](external_data_source.md#format_settings).

## Data model {#data-model}

Reading data using external tables from S3 ({{ objstorage-name }}) is done with regular SQL queries as if querying a normal table.
@@ -0,0 +1,7 @@
|Name|Description|Example|
|---|---|---|
|`POSIX`|String in `%Y-%m-%d %H:%M:%S` format|2001-03-26 16:10:00|
|`ISO`|Format corresponding to the ISO 8601 standard|2001-03-26 16:10:00Z|
|`UNIX_TIME_SECONDS`|Number of seconds since the start of the epoch|985623000|
|`UNIX_TIME_MILLISECONDS`|Number of milliseconds since the start of the epoch|985623000000|
|`UNIX_TIME_MICROSECONDS`|Number of microseconds since the start of the epoch|985623000000000|
@@ -0,0 +1,10 @@
|Setting name|Description|Possible values|
|----|----|---|
|`file_pattern`|File name template|File name template string. Wildcards `*` are supported.|
|`data.interval.unit`|Unit for parsing the `Interval` type|`MICROSECONDS`, `MILLISECONDS`, `SECONDS`, `MINUTES`, `HOURS`, `DAYS`, `WEEKS`|
|`data.datetime.format_name`|Predefined format in which `Datetime` data is stored|`POSIX`, `ISO`|
|`data.datetime.format`|Template that defines how `Datetime` data is stored|Formatting string, for example: `%Y-%m-%dT%H-%M`|
|`data.timestamp.format_name`|Predefined format in which `Timestamp` data is stored|`POSIX`, `ISO`, `UNIX_TIME_SECONDS`, `UNIX_TIME_MILLISECONDS`, `UNIX_TIME_MICROSECONDS`|
|`data.timestamp.format`|Template that defines how `Timestamp` data is stored|Formatting string, for example: `%Y-%m-%dT%H-%M-%S`|
|`data.date.format`|Format in which `Date` data is stored|Formatting string, for example: `%Y-%m-%d`|
|`csv_delimiter`|Data delimiter for the `csv_with_names` format|Any character|
@@ -143,6 +143,16 @@ WHERE

{% include [!](_includes/path_format.md) %}

### Format settings {#format_settings}

In {{ ydb-full-name }}, the following format settings are supported:

{% include [!](_includes/format_settings.md) %}

Formatting strings can use any template variable supported by the `strftime` (C99) function. In {{ ydb-full-name }}, the following `Datetime` and `Timestamp` formats are supported:

{% include [!](_includes/date_formats.md) %}

## Example {#read_example}

Example query to read data from S3 ({{ objstorage-full-name }}):
@@ -37,6 +37,7 @@ CREATE EXTERNAL TABLE `s3_test_data` (
- `csv_with_names` - one of the [permitted data storage formats](formats.md);
- `gzip` - one of the [permitted compression algorithms](formats.md#compression).

[Format settings](external_data_source.md#format_settings) are also supported when creating external tables.

## Data model {#data-model}
