
Parquet schema hint doesn't support integer types upcasting #6891

@gruuya

Description


Is your feature request related to a problem or challenge? Please describe what you are trying to do.
The current matching logic for overriding a Parquet schema with a hint doesn't support integer upcasting:

    fn apply_hint(parquet: DataType, hint: DataType) -> DataType {
        match (&parquet, &hint) {
            // Not all time units can be represented as LogicalType / ConvertedType
            (DataType::Int32 | DataType::Int64, DataType::Timestamp(_, _)) => hint,
            (DataType::Int32, DataType::Time32(_)) => hint,
            (DataType::Int64, DataType::Time64(_)) => hint,
            // Date64 doesn't have a corresponding LogicalType / ConvertedType
            (DataType::Int64, DataType::Date64) => hint,
            // ... (excerpt ends here)

Describe the solution you'd like
I'd like to be able to override any integer type, as long as the override avoids precision loss (though it could be argued that even this is too conservative).
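To make the "avoids precision loss" condition concrete, here is a minimal sketch of one possible rule. It is not the arrow-rs implementation: `IntType`, `is_lossless_upcast` and `apply_int_hint` are hypothetical names, and `IntType` is a simplified stand-in for the integer variants of Arrow's `DataType` so that the example compiles on its own.

```rust
/// Simplified stand-in for the integer variants of Arrow's `DataType`.
/// Hypothetical: the real enum lives in the `arrow_schema` crate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum IntType {
    Int8,
    Int16,
    Int32,
    Int64,
    UInt8,
    UInt16,
    UInt32,
    UInt64,
}

impl IntType {
    /// Returns `(is_signed, bit_width)` for the type.
    fn layout(self) -> (bool, u32) {
        match self {
            IntType::Int8 => (true, 8),
            IntType::Int16 => (true, 16),
            IntType::Int32 => (true, 32),
            IntType::Int64 => (true, 64),
            IntType::UInt8 => (false, 8),
            IntType::UInt16 => (false, 16),
            IntType::UInt32 => (false, 32),
            IntType::UInt64 => (false, 64),
        }
    }
}

/// True when every value of `from` is representable in `to`.
fn is_lossless_upcast(from: IntType, to: IntType) -> bool {
    let (from_signed, from_bits) = from.layout();
    let (to_signed, to_bits) = to.layout();
    match (from_signed, to_signed) {
        // Same signedness: the target only has to be at least as wide.
        (true, true) | (false, false) => to_bits >= from_bits,
        // Unsigned to signed: the target needs an extra bit for the sign.
        (false, true) => to_bits > from_bits,
        // Signed to unsigned: negative values can never be represented.
        (true, false) => false,
    }
}

/// Hypothetical hint rule: honor the hint only when the cast is lossless,
/// otherwise keep the type inferred from the Parquet file.
fn apply_int_hint(parquet: IntType, hint: IntType) -> IntType {
    if is_lossless_upcast(parquet, hint) {
        hint
    } else {
        parquet
    }
}
```

Under this rule an `Int32` column with an `Int64` hint would be read as `Int64`, while an `Int64` column with an `Int32` hint keeps its file type. Whether narrowing or sign-changing hints should also be accepted is the open question raised above; this sketch takes the conservative side.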

Describe alternatives you've considered
Apply some kind of schema adapter or mask at a higher level, e.g. via a DataFusion extension mechanism.

Additional context
Related to apache/iceberg-rust#813.

Labels: development-process, enhancement, parquet
