TLDR: The FileRequired function will change its interface. If you only use the core lib or the CLI, no change is needed. If you use plugins directly or maintain your own plugins, you need to migrate.
before: FileRequired(path string, info FileInfo) bool
after: FileRequired(path string, api FileApi) bool
Why?
Currently Stat is called on every file. In most plugins FileRequired checks two things: (1) whether the path matches a certain pattern, and (2) whether the file size is within limits, as a safeguard against OOM and timeouts. Since most files don't match any plugin, Stat is called but its result is never used. Scalibr spends up to 50% of its runtime in Stat calls. We can fix this by making Stat lazy: FileRequired gets an API that lets the plugin call Stat() only when needed. The core library makes sure that Stat is called at most once per file.
History
FileRequired(path string) bool
Initially, FileRequired received only the path.
FileRequired(path string, mode FileMode) bool
For binary extractors (e.g. gobinaries), Scalibr needs to know whether the file has the executable bit set, so we added FileMode as a parameter. FileMode is available from the DirEntry we get from the walk function.
This was a bug: Walk does not set the executable bit in FileMode. Only type information, such as whether the file is a directory or a regular file, is set. As a result, the core lib started to call Stat on all files.
FileRequired(path string, info FileInfo) bool
For production applications we need to make sure Scalibr does not consume too many resources, especially RAM and CPU. Thus we added a file size limit to the plugins that are used in those environments.
FileMode does not contain the file size, but FileInfo contains both FileMode and the file size. Thus we changed the interface to accept FileInfo.
[alternative] FileRequired(path string, stat func() (fs.FileInfo, error)) bool
Scalibr now calls Stat on all files on the file system, but it is only interested in certain files (determined by file name) and limits those by size. These Stat calls are responsible for up to 50% of Scalibr's runtime. To fix this, FileRequired gets access to the stat function so that it can call Stat only when necessary. Additionally, the core lib caches the result, so that if multiple plugins need to Stat the same file, there is only one Stat call to the OS.
[this proposal] FileRequired(path string, api FileApi) bool
With growing adoption of Scalibr, these API changes get expensive. Thus we introduce an abstraction so that future changes do not break plugins.
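A plugin written against such an abstraction might look like the sketch below. The proposal names the `FileApi` type, but the method set shown here (a single `Stat()` method) is an assumption, as are the file name pattern, the size limit, and the `fakeFile` test double.

```go
package main

import (
	"fmt"
	"io/fs"
	"path/filepath"
	"time"
)

// FileApi is an assumed shape for the abstraction. New methods could be added
// to it later without changing the FileRequired signature.
type FileApi interface {
	Stat() (fs.FileInfo, error)
}

const maxFileSize = 1 << 20 // illustrative limit of 1 MiB

// FileRequired runs the cheap path check first and asks for Stat only if the
// path matches, so non-matching files never trigger a Stat call.
func FileRequired(path string, api FileApi) bool {
	if filepath.Base(path) != "package.json" {
		return false
	}
	info, err := api.Stat()
	if err != nil {
		return false
	}
	return info.Size() <= maxFileSize
}

// fakeFile is a test double that reports a fixed size and counts Stat calls.
type fakeFile struct {
	size  int64
	calls int
}

func (f *fakeFile) Stat() (fs.FileInfo, error) {
	f.calls++
	return fakeInfo{size: f.size}, nil
}

// fakeInfo is a minimal fs.FileInfo implementation for the test double.
type fakeInfo struct{ size int64 }

func (i fakeInfo) Name() string       { return "" }
func (i fakeInfo) Size() int64        { return i.size }
func (i fakeInfo) Mode() fs.FileMode  { return 0o644 }
func (i fakeInfo) ModTime() time.Time { return time.Time{} }
func (i fakeInfo) IsDir() bool        { return false }
func (i fakeInfo) Sys() any           { return nil }

func main() {
	f := &fakeFile{size: 512}

	required := FileRequired("src/main.go", f)
	fmt.Println(required, f.calls) // prints "false 0"

	required = FileRequired("app/package.json", f)
	fmt.Println(required, f.calls) // prints "true 1"
}
```

Because plugins depend only on the interface, the core lib can change how Stat is obtained or cached without another signature change.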