feat(java): zstd meta compressor #2042

Merged (16 commits) on Feb 7, 2025

@orisgarno (Contributor) commented Feb 6, 2025

What does this PR do?

Add a zstd-based MetaCompressor as an option. It lets zstd read directly from the source array (using the given offset and size) instead of first copying the data into a new array.

Related issues

Does this PR introduce any user-facing change?

  • Does this PR introduce any public API change?
  • Does this PR introduce any binary protocol compatibility change?

Benchmark

@orisgarno marked this pull request as ready for review February 6, 2025 10:34
@orisgarno marked this pull request as draft February 6, 2025 10:36
@orisgarno marked this pull request as ready for review February 6, 2025 10:42
@chaokunyang (Collaborator):

Could we add a new Fury Java module named fury-extension and move this code into it? That way we can reduce the dependencies of fury-core.

@chaokunyang closed this Feb 6, 2025
@chaokunyang reopened this Feb 6, 2025
<dependency>
  <groupId>com.github.luben</groupId>
  <artifactId>zstd-jni</artifactId>
</dependency>
Collaborator:

Could we use a provided-scope dependency?

@orisgarno (Contributor, Author), Feb 6, 2025:

I'm quite new to Maven. I've made the change in a new commit (d1a2930). Is that sufficient?

Collaborator:

Yes, d1a2930 is OK. Could you also remove zstd-jni from the fury-core module? It's used in fury-extensions.

Contributor, Author:

Oops, I forgot to remove that. Done.

import com.github.luben.zstd.Zstd;
import com.github.luben.zstd.ZstdException;

public class ZstdMetaCompressor implements MetaCompressor {
Collaborator:

Could we add a test in which org.apache.fury.config.FuryBuilder#withMetaCompressor is set to this compressor?

@orisgarno (Contributor, Author), Feb 6, 2025:

Do you mean this?
java/fury-extensions/src/test/java/org/apache/fury/meta/ClassDefEncoderTest.java
Or do you have something more specific in mind?
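
For illustration, a round-trip check of the kind requested above could look roughly like the sketch below. It is self-contained and deliberately uses neither Fury nor zstd-jni: `SliceCompressor` is a hypothetical interface that is only assumed to resemble a (data, offset, size) compressor contract, and `DeflateSliceCompressor` uses the JDK's Deflater/Inflater as a stand-in for zstd. A real test would pass the zstd compressor to FuryBuilder#withMetaCompressor instead.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

final class MetaCompressorRoundTrip {

  // Hypothetical contract for this sketch (assumed shape, not copied from Fury).
  interface SliceCompressor {
    byte[] compress(byte[] data, int offset, int size);

    byte[] decompress(byte[] data, int offset, int size);
  }

  // Stand-in implementation based on java.util.zip, NOT on zstd-jni.
  static final class DeflateSliceCompressor implements SliceCompressor {
    @Override
    public byte[] compress(byte[] data, int offset, int size) {
      Deflater deflater = new Deflater();
      try {
        // Read straight from the caller's array; no intermediate copy of the slice.
        deflater.setInput(data, offset, size);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[256];
        while (!deflater.finished()) {
          int n = deflater.deflate(chunk);
          out.write(chunk, 0, n);
        }
        return out.toByteArray();
      } finally {
        deflater.end();
      }
    }

    @Override
    public byte[] decompress(byte[] data, int offset, int size) {
      Inflater inflater = new Inflater();
      try {
        inflater.setInput(data, offset, size);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[256];
        while (!inflater.finished()) {
          int n = inflater.inflate(chunk);
          if (n == 0 && !inflater.finished()) {
            throw new IllegalStateException("truncated or unsupported input");
          }
          out.write(chunk, 0, n);
        }
        return out.toByteArray();
      } catch (DataFormatException e) {
        throw new IllegalStateException(e);
      } finally {
        inflater.end();
      }
    }
  }

  // Compresses buffer[offset, offset + size), embeds the result at a non-zero
  // offset inside a larger array, decompresses it, and compares with the input slice.
  static boolean roundTrips(SliceCompressor c, byte[] buffer, int offset, int size) {
    byte[] compressed = c.compress(buffer, offset, size);
    byte[] framed = new byte[compressed.length + 5];
    System.arraycopy(compressed, 0, framed, 3, compressed.length);
    byte[] restored = c.decompress(framed, 3, compressed.length);
    return Arrays.equals(restored, Arrays.copyOfRange(buffer, offset, offset + size));
  }
}
```

Using non-zero offsets on both the compress and decompress side is the point of the check: a compressor that silently ignores `offset` or `size` would still pass a test that only ever feeds it whole arrays.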

Comment on lines 36 to 45
    Zstd.compressByteArray(
        compressedData,
        0,
        (int) maxCompressedSize,
        data,
        offset,
        size,
        Zstd.defaultCompressionLevel());

    return compressedData;
Collaborator:

Suggested change

-    Zstd.compressByteArray(
-        compressedData,
-        0,
-        (int) maxCompressedSize,
-        data,
-        offset,
-        size,
-        Zstd.defaultCompressionLevel());
-    return compressedData;
+    long compressedSize =
+        Zstd.compressByteArray(
+            compressedData,
+            0,
+            (int) maxCompressedSize,
+            data,
+            offset,
+            size,
+            Zstd.defaultCompressionLevel());
+    return Arrays.copyOf(compressedData, (int) compressedSize);

Contributor, Author:

Can you share why it's better to return a copy? @chaokunyang

Collaborator:

I am not sure whether maxCompressedSize is exactly the compressed size for zstd, or whether it's an estimated maximum size.

Contributor, Author:

After reading the zstd code, I see why. Nice catch.
LGTM
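
To make the concern concrete, here is a minimal sketch of why the output needs trimming when the destination buffer is sized from a worst-case bound. It assumes maxCompressedSize comes from such a bound (for zstd-jni, something like Zstd.compressBound). To stay runnable on a plain JDK it swaps in java.util.zip.Deflater for zstd, and `worstCaseBound` is a made-up bound for this sketch only.

```java
import java.util.Arrays;
import java.util.zip.Deflater;

// Stand-in sketch: uses java.util.zip.Deflater, NOT zstd-jni.
final class TrimToCompressedSize {

  // Hypothetical worst-case output size for this sketch; not zstd's bound.
  static int worstCaseBound(int size) {
    return size + (size >> 3) + 64;
  }

  // Returns the whole scratch buffer, including the unused trailing bytes.
  static byte[] compressUntrimmed(byte[] data, int offset, int size) {
    byte[] scratch = new byte[worstCaseBound(size)];
    deflateInto(scratch, data, offset, size);
    return scratch;
  }

  // Returns only the bytes the compressor actually wrote.
  static byte[] compressTrimmed(byte[] data, int offset, int size) {
    byte[] scratch = new byte[worstCaseBound(size)];
    int written = deflateInto(scratch, data, offset, size);
    return Arrays.copyOf(scratch, written);
  }

  // Compresses src[offset, offset + size) into dst and returns the byte count written.
  private static int deflateInto(byte[] dst, byte[] src, int offset, int size) {
    Deflater deflater = new Deflater();
    try {
      deflater.setInput(src, offset, size);
      deflater.finish();
      int written = 0;
      while (!deflater.finished()) {
        int n = deflater.deflate(dst, written, dst.length - written);
        if (n == 0 && !deflater.finished()) {
          throw new IllegalStateException("scratch buffer too small");
        }
        written += n;
      }
      return written;
    } finally {
      deflater.end();
    }
  }
}
```

For compressible input, `compressUntrimmed` returns an array as long as the bound, padded with zero bytes after the real payload, while `compressTrimmed` returns just the payload. The same reasoning applies to the suggested change: the value returned by the compress call, not the bound, is the length that should be kept.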

Contributor, Author:

Let me check the decompress path as well.

Contributor, Author:

Updated.

java/pom.xml: 2 outdated review threads (resolved)
@chaokunyang (Collaborator) left a comment:

LGTM

@chaokunyang merged commit a80140a into apache:main Feb 7, 2025
43 checks passed
@orisgarno deleted the zstd-direct-array branch February 7, 2025 07:02
chaokunyang pushed a commit to chaokunyang/fury that referenced this pull request Feb 7, 2025