Cache rework - to avoid problems with pdf files and memory consumption #1546
I prepared a draft of an assetCache upgrade. The current implementation of assetCache, especially its handling of PDF files, does not work correctly.
My code is not very polished and contains several unnecessary and inelegant parts (I am not highly experienced in writing asynchronous applications). However, I hope it can serve as a useful starting point for considering how to adapt the current cache so that it works correctly and efficiently with PDF files.
I also tried to ensure that notes created in the current version of Saber can still be loaded. When reloading a note that contains a PDF and then saving it again, an asset is created for each page, each containing the entire PDF. This is why `AssetCacheAll.addSync` is particularly complicated and makes use of `previewHash` to identify identical files using their first 100 KB of bytes and their file size.

I tested this implementation on Windows and Android (Samsung Galaxy Tab A7 Lite). Both were able to display an 850-page PDF; the note loaded quite fast, and pages are displayed almost instantly while paging through them.
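The `previewHash` deduplication described above can be sketched roughly as follows. This is a Python illustration of the technique, not Saber's actual Dart code; the names `AssetCache`, `add_sync`, and `PREVIEW_SIZE` are hypothetical stand-ins:

```python
import hashlib

PREVIEW_SIZE = 100 * 1024  # fingerprint only the first 100 KB, as described above

def preview_hash(data: bytes) -> tuple[int, str]:
    """Cheap fingerprint: (file size, SHA-256 of the first 100 KB).

    Two assets with the same fingerprint are treated as the same file,
    so duplicated per-page copies of one PDF collapse into one asset
    without hashing the whole (possibly huge) file.
    """
    return (len(data), hashlib.sha256(data[:PREVIEW_SIZE]).hexdigest())

class AssetCache:
    def __init__(self) -> None:
        self._by_fingerprint: dict[tuple[int, str], int] = {}
        self.assets: list[bytes] = []

    def add_sync(self, data: bytes) -> int:
        """Return the index of an existing identical asset, or store a new one."""
        key = preview_hash(data)
        if key in self._by_fingerprint:
            return self._by_fingerprint[key]
        index = len(self.assets)
        self.assets.append(data)
        self._by_fingerprint[key] = index
        return index

# Usage: ten "page assets" that all contain the same whole PDF
cache = AssetCache()
pdf = b"%PDF-1.7 ..." * 10000
indices = [cache.add_sync(pdf) for _ in range(10)]
print(len(set(indices)), len(cache.assets))  # 1 1 -- stored only once
```

Note the trade-off: two files that agree in size and in their first 100 KB but differ later would wrongly collapse into one asset, which is presumably acceptable here because the duplicates being targeted are byte-identical copies of the same PDF.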
A short description of the current state of the cache and my approach follows.
Problems with the current cache and images:

`importPdfFromFilePath`:
1. `importPdfFromFilePath` creates one `File` (`final pdfFile = File(path);`), and this `File` is used to create the instances of all pages.
2. When saving, all pages become one asset, because the `File` is the same object!

Why: the `PdfEditorImage` class:
1. When reading `fromJson`, a `pdfFile = FileManager.getFile('$sbnPath${Editor.extension}.$assetIndex');` is created for each page (even if they refer to the same asset file).
2. The `PdfEditorImage` constructor is called with this `File`, so each page has its own `File`!
3. `OrderedAssetCache.add` adds each page as a new asset because each page is a different `File` (and this is slow, because all bytes must be compared with each asset already added).

Further problems:
1. `PdfEditorImage` keeps the bytes of the whole PDF (wasting memory) even though it renders only one page.
2. It creates its own `PdfDocument` renderer, so each PDF page gets a new renderer holding the whole PDF.
3. While saving a note, each page of the PDF is added to the `OrderedAssetCache` separately as a new asset.
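The core failure mode in the list above is deduplicating by object identity rather than by content: each page reconstructs its own `File` object for the same underlying asset, so an identity check never matches. A minimal Python sketch of the difference (`FileRef`, `add_by_identity`, and `add_by_path` are hypothetical names standing in for `dart:io`'s `File` and the cache's `add`):

```python
from dataclasses import dataclass

@dataclass
class FileRef:
    """Stand-in for dart:io File: two instances may point at the same path."""
    path: str

def add_by_identity(assets: list, f: FileRef) -> int:
    """Current behaviour: each distinct File *object* becomes a new asset."""
    for i, existing in enumerate(assets):
        if existing is f:          # object identity, never true across pages
            return i
    assets.append(f)
    return len(assets) - 1

def add_by_path(assets: list, f: FileRef) -> int:
    """Fixed behaviour: pages backed by the same file share one asset."""
    for i, existing in enumerate(assets):
        if existing.path == f.path:  # compare the underlying file instead
            return i
    assets.append(f)
    return len(assets) - 1

# Ten pages of one PDF, each constructed with its own File object (as fromJson does):
pages = [FileRef("note.sbn.0") for _ in range(10)]

identity_assets, path_assets = [], []
for page in pages:
    add_by_identity(identity_assets, page)
    add_by_path(path_assets, page)

print(len(identity_assets))  # 10 -- one duplicate asset per page
print(len(path_assets))      # 1  -- one shared asset
```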
New approach to the cache
This code is very rough; in fact only a cache of `File`s is needed (a lot of the implemented code is unnecessary).
The cache handles JPG, PNG, and PDF files (not SVG yet).
For each image item it provides a `ValueNotifier<ImageProvider?>`, so identical items share the same `ImageProvider`.
For each PDF item it provides a `PdfDocument`, so every page of a PDF uses the same renderer.
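The "one shared handle per file" idea above can be sketched in a few lines. This is a generic Python illustration (the names `DocumentCache` and `fake_open` are hypothetical), not the actual Dart implementation:

```python
class DocumentCache:
    """Hand out one shared document handle per underlying file.

    In Saber's case the handle would be a PdfDocument renderer (or a
    ValueNotifier<ImageProvider?> for images); here it is an opaque object.
    """
    def __init__(self, open_document):
        self._open = open_document              # factory, e.g. a PDF renderer
        self._documents: dict[str, object] = {}

    def get(self, path: str):
        if path not in self._documents:
            self._documents[path] = self._open(path)  # open at most once
        return self._documents[path]

# Usage: every page asks the cache instead of opening its own renderer
opened = []
def fake_open(path):
    opened.append(path)
    return object()

cache = DocumentCache(fake_open)
page1 = cache.get("book.pdf")
page2 = cache.get("book.pdf")   # second page of the same PDF
assert page1 is page2           # same renderer shared by all pages
assert len(opened) == 1         # the PDF was opened only once
```

This directly removes the per-page renderer problem described earlier: an 850-page PDF costs one open document instead of 850.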
While a note is being read into the Editor, new items are added using `addSync`, which is fast and synchronous.

`addSync` method: this is important only for compatibility.
In the Editor, the asynchronous `add` method is used when adding a new image.

`add` method: it compares paths first, then file sizes, and then the hashes of all cache items.
Calculating the hash is very time-consuming; for PDFs it would be better to extract the /Info dictionary and read the author, creation date, etc., and use that to distinguish different PDFs. It would take less time.
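A very rough sketch of that /Info idea, in Python: pull /Author and /CreationDate out of the raw bytes and combine them with the file size into a fingerprint. This naive regex only handles simple, uncompressed literal-string entries; a real implementation would follow the PDF trailer to locate the /Info dictionary. The name `pdf_info_fingerprint` is hypothetical:

```python
import re

def pdf_info_fingerprint(pdf_bytes: bytes) -> tuple:
    """Cheap PDF fingerprint from /Info metadata plus file size.

    Scans the raw bytes for literal-string /Author and /CreationDate
    entries; falls back to empty strings if they are absent.
    """
    author = re.search(rb"/Author\s*\(([^)]*)\)", pdf_bytes)
    created = re.search(rb"/CreationDate\s*\(([^)]*)\)", pdf_bytes)
    return (
        author.group(1) if author else b"",
        created.group(1) if created else b"",
        len(pdf_bytes),
    )

# Usage with a minimal hand-written PDF fragment:
fake_pdf = (b"%PDF-1.7\n1 0 obj\n"
            b"<< /Author (Alice) /CreationDate (D:20240101120000) >>\n"
            b"endobj\n%%EOF")
print(pdf_info_fingerprint(fake_pdf))
```

Two PDFs that share author, creation date, and size are then assumed identical without hashing their contents, which trades a small risk of false matches for a large speedup on big files.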
Cache properties:
If somebody wants to test it: simply create a new note from a PDF, save it, close the note, and open it again.
I believe the cache needs to be rewritten: many users work with PDFs, and the current version is not functional and cannot handle displaying PDFs with hundreds of pages.