wr.s3.to_parquet() fails to parse explicit map of decimals type #750

Open
@johanLsp

Description


This issue occurs when passing a dtype argument to wr.s3.to_parquet() to coerce one of the columns to a map containing decimals, e.g. map<int, decimal(12,2)>. See the reproduction below.

The helper function used to split the map fields (_split_map in _data_types.py) doesn't take the parentheses into account and splits on every comma, resulting in three fields, int, decimal(12 and 2), instead of the expected two, int and decimal(12,2).
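To illustrate the difference, here is a minimal sketch of a bracket-aware split. The name split_top_level_fields is hypothetical and this is not the library's actual _split_map implementation; it only assumes that commas nested inside (), <> or [] should not act as field separators.

```python
from typing import List


def split_top_level_fields(s: str) -> List[str]:
    """Split on commas that are not nested inside (), <> or [].

    Hypothetical helper for illustration only, not awswrangler code.
    """
    parts: List[str] = []
    depth = 0  # current nesting level of brackets
    start = 0  # index where the current field begins
    for i, ch in enumerate(s):
        if ch in "(<[":
            depth += 1
        elif ch in ")>]":
            depth -= 1
        elif ch == "," and depth == 0:
            # Only a comma at nesting level 0 separates two fields.
            parts.append(s[start:i].strip())
            start = i + 1
    parts.append(s[start:].strip())
    return parts


# A naive split on every comma yields three pieces.
naive = [p.strip() for p in "int, decimal(12,2)".split(",")]
print(naive)  # ['int', 'decimal(12', '2)']

# The bracket-aware split keeps the decimal type intact.
fields = split_top_level_fields("int, decimal(12,2)")
print(fields)  # ['int', 'decimal(12,2)']
```

Tracking nesting depth rather than special-casing decimal also keeps nested types such as map<string, map<int, decimal(12,2)>> in one piece.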

Environment

awswrangler                   2.8.0

Reproduction

import awswrangler as wr
import pandas as pd
import decimal


df = pd.DataFrame({"map_col": [{"a": decimal.Decimal("1.23")}]})

wr.s3.to_parquet(
    df=df,
    dataset=True,
    path="dummy-location",
    database="dummy-db",
    table="dummy-table",
    dtype={"map_col": "map<int, decimal(12,2)>"},
)

Output:

Traceback (most recent call last):
  File "awswrangler_map_decimal.py", line 14, in <module>
    dtype={"map_col": "map<int, decimal(12,2)>"},
  File "/home/laspj/.local/lib/python3.6/site-packages/awswrangler/_config.py", line 417, in wrapper
    return function(**args)
  File "/home/laspj/.local/lib/python3.6/site-packages/awswrangler/s3/_write_parquet.py", line 537, in to_parquet
    df=df, index=index, ignore_cols=partition_cols, dtype=dtype
  File "/home/laspj/.local/lib/python3.6/site-packages/awswrangler/_data_types.py", line 581, in pyarrow_schema_from_pandas
    columns_types[k] = athena2pyarrow(dtype=v)
  File "/home/laspj/.local/lib/python3.6/site-packages/awswrangler/_data_types.py", line 291, in athena2pyarrow
    parts: List[str] = _split_map(s=orig_dtype[4:-1])
  File "/home/laspj/.local/lib/python3.6/site-packages/awswrangler/_data_types.py", line 250, in _split_map
    raise RuntimeError(f"Invalid map fields: {s}")
RuntimeError: Invalid map fields: int, decimal(12,2)

Labels

bug: Something isn't working
