Skip to content

try to fix #2086 #2087

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

try to fix #2086 #2087

wants to merge 2 commits into from

Conversation

ijustyce
Copy link

@ijustyce ijustyce commented Feb 9, 2025

新增 StreamRead 可替代 OpenFile,用于真正流式读取 Excel,OpenFile 会调用 OpenReader,OpenReader 会调用 io.ReadAll, 而 io.ReadAll 会将整个文件加载到内存,使流式读取失效;

请注意,StreamRead 不支持打开加密的 Excel 2003

@xuri xuri added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Feb 9, 2025
Copy link
Member

@xuri xuri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your PR. I've left some comments.

@@ -217,6 +217,60 @@ func OpenReader(r io.Reader, opts ...Options) (*File, error) {
return f, err
}

// StreamRead 流式读取,不支持加密的 Excel 2003
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't introduce new exported function, just updated OpenReader function. Encrypted workbooks should be supported. Please also update unit tests for these changes.

// 不支持 加密的 Excel 2003 格式!
// openByName read data stream from io.Reader and return a populated
// spreadsheet file.
func openByName(name string, opts ...Options) (*File, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any perf data can be shared about these changes? So, we can compare perf before and after changes. For example: CPU, memory usage, disk I/O.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

我在1G内存的pod 中尝试读取10G大小的 Excel 文件,会直接 oom killed,因为 io.ReadAll 会将整个文件加载到内存里面来;

修改后,内存占用稳定在 54M

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants