Search before asking
What would you like to be improved?
For large tables written by Flink, each commit submits an EQ (equality) DELETE file associated with all previous data files. Most of the generated optimize tasks then repeatedly read this EQ DELETE file, incurring duplicate IO cost.
How should we improve?
Each JVM (TaskManager or executor) in the Optimizer maintains a cache for EQ DELETE files, so that optimize tasks running in the same process read each EQ DELETE file from storage only once.
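A minimal sketch of the idea, assuming a JVM-wide cache keyed by the delete file's path. The class name `EqDeleteCache` and the loader callback are hypothetical illustrations, not existing Amoro/Iceberg APIs:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: one cache per JVM (TaskManager/executor) so that
// optimize tasks in the same process read each EQ DELETE file only once.
public class EqDeleteCache {
    // Static, so the cache is shared by all tasks in this JVM.
    private static final Map<String, byte[]> CACHE = new ConcurrentHashMap<>();

    // Return cached contents; the loader hits storage only on the first request.
    public static byte[] get(String path, Function<String, byte[]> loader) {
        return CACHE.computeIfAbsent(path, loader);
    }

    public static void main(String[] args) {
        final int[] loads = {0};
        Function<String, byte[]> loader = p -> {
            loads[0]++;               // count real storage reads
            return new byte[]{1, 2, 3};
        };
        // Two optimize tasks in the same JVM ask for the same EQ DELETE file;
        // only the first call triggers IO, the second is served from the cache.
        byte[] a = EqDeleteCache.get("s3://bucket/eq-delete-001.parquet", loader);
        byte[] b = EqDeleteCache.get("s3://bucket/eq-delete-001.parquet", loader);
        System.out.println("loads=" + loads[0] + " same=" + (a == b));
    }
}
```

A production version would likely bound the cache size and evict entries (e.g. by bytes held or last access time), since EQ DELETE files can be large.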
Are you willing to submit PR?
Subtasks
No response
Code of Conduct