[log] Fix the 'Destination buffer is too small' error while decompress data in ZstdArrowCompressionCodec #464

swuferhong · 2025-02-21T09:23:26Z

Purpose

Linked issue: #462

Currently, if a table defines a String column and a stream write contains very large rows, for example a single row of about 5 KB, ZstdArrowCompressionCodec reliably throws a RuntimeException with the message 'Destination buffer is too small' during decompression.

This PR aims to fix this error. Through investigation, we found that the error no longer occurs if we replace the Zstd.compressUnsafe(long dst, long dstSize, long src, long srcSize, int level) call with a memory-copying method such as Zstd.compressDirectByteBuffer(ByteBuffer dst, int dstOffset, int dstSize, ByteBuffer src, int srcOffset, int srcSize, int level), or if we switch to ZstdOutputStreamNoFinalizer. A possible reason is that the compressUnsafe path has some memory operation anomalies that are not handled correctly.

We use Zstd.compressDirectByteBuffer() here rather than the streaming ZstdOutputStreamNoFinalizer because the streaming approach consumes more CPU than the previous approach.
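For reference, the offset-based direct-buffer call described above can be sketched as a standalone round trip. This is a minimal illustration and not the code changed in this PR: the class name ZstdDirectBufferRoundTrip, the compression level 3, and the helper roundTrip are hypothetical, and it assumes zstd-jni (com.github.luben.zstd) is on the classpath.

```java
import com.github.luben.zstd.Zstd;

import java.nio.ByteBuffer;

// Hypothetical demo class; not part of ZstdArrowCompressionCodec.
public class ZstdDirectBufferRoundTrip {

    /** Compresses and then decompresses {@code input} via direct ByteBuffers. */
    static byte[] roundTrip(byte[] input) {
        // The *DirectByteBuffer methods require direct (off-heap) buffers.
        ByteBuffer src = ByteBuffer.allocateDirect(input.length);
        src.put(input);
        src.flip();

        // compressBound returns the worst-case compressed size, so the
        // destination can never be too small for the compression step.
        int maxCompressed = (int) Zstd.compressBound(input.length);
        ByteBuffer compressed = ByteBuffer.allocateDirect(maxCompressed);

        // Level 3 is an assumption made for this sketch.
        long written = Zstd.compressDirectByteBuffer(
                compressed, 0, maxCompressed, src, 0, input.length, 3);
        if (Zstd.isError(written)) {
            throw new IllegalStateException(Zstd.getErrorName(written));
        }

        // The decompression destination must hold the full uncompressed
        // length; here it is known because we still have the input.
        ByteBuffer restored = ByteBuffer.allocateDirect(input.length);
        long read = Zstd.decompressDirectByteBuffer(
                restored, 0, input.length, compressed, 0, (int) written);
        if (Zstd.isError(read)) {
            throw new IllegalStateException(Zstd.getErrorName(read));
        }

        // The offset-based calls do not move the buffer position,
        // so a relative bulk get starts at index 0.
        byte[] out = new byte[(int) read];
        restored.get(out);
        return out;
    }
}
```

Calling roundTrip with a payload of roughly 5 KB, the row size mentioned above, should return a byte array equal to the input.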

Tests

API and Format

Documentation

[log] Fix the 'Destination buffer is too small' error while decompres…

a811e76

…s data in ZstdArrowCompressionCodec

swuferhong requested a review from wuchong February 21, 2025 09:56
