Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend FileIOFactory to Improve Customization #724

Merged
merged 16 commits into from
Jan 29, 2025

Conversation

XJDKC
Copy link
Member

@XJDKC XJDKC commented Jan 13, 2025

Context

Currently, we have three places where a FileIO is constructed:

  1. PolarisBaseCatalog::loadFileIO: Used as the default FileIO across the entire catalog during testing.
  2. PolarisBaseCatalog::refreshIOWithCredential: Refreshes storage credentials and constructs a new FileIO.
  3. TaskFileIOSupplier::apply: Fetches subscoped credentials and uses FileIOFactory to create a FileIO for cleanup tasks.

We’ve identified an issue where not all properties required by FileIO are generated by the FileIOFactory. For instance, subscoped credentials are retrieved outside the FileIOFactory and then passed to it. Additionally, the current FileIOFactory interface has limited access to contextual information.

To enhance the ability to customize FileIOFactory, we plan to extend it by passing additional information. This improvement will enable developers to implement custom logic such as auditing, proxy, and more.

Some code pointers:

@XJDKC XJDKC force-pushed the pass-internal-properties-to-fileio branch from 057bd17 to 836d3f1 Compare January 13, 2025 23:01
@XJDKC XJDKC changed the title Pass InternalProperties to FileIO Pass Internal Properties to FileIO Jan 13, 2025
@XJDKC XJDKC force-pushed the pass-internal-properties-to-fileio branch from 836d3f1 to 0a10632 Compare January 14, 2025 09:04
@snazy
Copy link
Member

snazy commented Jan 14, 2025

Thanks for your contribution @XJDKC!
I think this change blurs the line between "properties" and "internal properties" quite a bit. I'm also, in general, concerned about generic property bag per se containing all kinds of information - things that can be exposed (upstream/downstream) and things that must not - nobody can ensure that no "bad things" (think: logging secrets) happen to this information.

@XJDKC XJDKC force-pushed the pass-internal-properties-to-fileio branch from 88bc1af to 6ff8f2b Compare January 14, 2025 18:10
@XJDKC XJDKC changed the title Pass Internal Properties to FileIO Extend FileIOFactory to Improve Customization Jan 22, 2025
Copy link
Contributor

@dimas-b dimas-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update @XJDKC ! Overall, the new code looks good to me :) A few minor comments below.

@XJDKC XJDKC force-pushed the pass-internal-properties-to-fileio branch from 00a9ae4 to decc5b8 Compare January 22, 2025 17:09
Copy link
Contributor

@dennishuo dennishuo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@sfc-gh-dhuo
Copy link

@snazy @dimas-b Looks like all the major comments have been resolved; any final/remaining suggestions before merging this?

@XJDKC
Copy link
Member Author

XJDKC commented Jan 24, 2025

Hey @snazy @dimas-b May I get another review for this PR, thanks!

Copy link
Contributor

@dimas-b dimas-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about the delay. The latest changes LGTM with one remaining concern about WasbTranslatingFileIOFactory (which I'd like to clarity) and a few nit comments (which are optional).

@snazy
Copy link
Member

snazy commented Jan 27, 2025

Recent changes LGTM - deferring to @dimas-b

Copy link
Contributor

@dimas-b dimas-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe @XJDKC mentioned making a few minor follow-up changes... otherwise LGTM.

@XJDKC XJDKC requested a review from MonkeyCanCode as a code owner January 27, 2025 22:38
@XJDKC XJDKC force-pushed the pass-internal-properties-to-fileio branch from 9fd3482 to f3a5ef0 Compare January 27, 2025 22:59
@XJDKC
Copy link
Member Author

XJDKC commented Jan 28, 2025

Hi folks, I think this PR is ready to be merged into main branch.
I have addressed all the comments, there are still some followup tasks, but we can do them later:
e.g. remove test only loadFileIO interface from PolarisBaseCatalog.

Also I have updated the PR desc to clarify the purposes of this PR.
Can I get a final stamp for this PR? Thanks!
cc: @dimas-b @snazy @dennishuo @eric-maynard @collado-mike

@eric-maynard eric-maynard enabled auto-merge (squash) January 29, 2025 21:27
@eric-maynard eric-maynard merged commit 23385a0 into apache:main Jan 29, 2025
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants