Cache rework - to avoid problems with pdf files and memory consumption #1546

QubaB · 2025-09-28T18:33:07Z

I prepared a draft of an assetCache upgrade. The current implementation of assetCache does not work correctly, especially when handling PDF files.

My code is not very polished, and it contains several unnecessary and inelegant parts (as I am not highly experienced in programming asynchronous applications). However, I hope it may serve as a useful starting point for considering how to adapt the current cache to ensure it functions correctly and efficiently with PDF files.

I also tried to ensure that notes created in the current version of Saber can still be loaded. In the current version, when a note that contains a PDF is reloaded and then saved again, one asset is created per page, and each asset contains the entire PDF. This is why AssetCacheAll.addSync is particularly complicated: it uses previewHash, computed from the first 100 KB of the file and the file size, to identify these many identical files.

I tested this implementation on Windows and Android (Samsung Galaxy Tab A7 Lite). Both were able to display an 850-page PDF; the note loaded quite fast, and pages were displayed almost instantly while paging through the document.

A short description of the current state of the cache and of my approach follows.

Problems with the current cache and images:

There are two caches: assetCache (for working) and OrderedAssetCache (for writing).
The bytes of every asset are kept in memory. I do not know whether this is a problem in itself, but it is a problem when adding to the cache, because all bytes must be compared with those of each asset already added.
After the first save of a note created by importing a PDF, the PDF is saved as one asset because:
   1. importPdfFromFilePath creates one File, "final pdfFile = File(path);", and this File is used to create all page instances.
   2. When saving, all pages are one asset, because the File is the same object.
When the note is loaded again and saved, each page is treated as a different PDF. The reason is in the PdfEditorImage class:
   1. When reading, fromJson creates "pdfFile = FileManager.getFile('$sbnPath${Editor.extension}.$assetIndex');" for each page (even if the pages share the same asset file).
   2. The PdfEditorImage constructor is called with this File, so each page has its own File.
   3. OrderedCache.add then adds each page as a new asset, because each page is a different File.
Problems of PdfEditorImage:
   1. PdfEditorImage keeps the bytes of the whole PDF (wasting memory) even though it renders only one page.
   2. It creates its own pdfDocument renderer, so every PDF page gets a new renderer that holds the whole PDF.
   3. While saving a note, each page of the PDF is added to OrderedAssetCache separately as a new asset.
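
To illustrate the File identity issue described above, here is a minimal sketch in plain Dart. It is not Saber's code: countByIdentity and countByPath are hypothetical helpers, and the file name is made up. It only shows that two File objects created for the same path are distinct objects, so a cache that distinguishes assets by object sees one asset after import but one asset per page after reload.

```dart
import 'dart:io';

/// Hypothetical helper: counts "distinct assets" when pages are told apart
/// by File object identity.
int countByIdentity(List<File> pageFiles) {
  final seen = Set<File>.identity();
  seen.addAll(pageFiles);
  return seen.length;
}

/// Hypothetical helper: counts distinct assets when pages are told apart
/// by their path string (used as written, without normalization).
int countByPath(List<File> pageFiles) {
  return pageFiles.map((f) => f.path).toSet().length;
}

/// After import: one File object is reused for every page.
List<File> pagesAfterImport(String path, int pageCount) {
  final shared = File(path);
  return List.filled(pageCount, shared);
}

/// After reload: a new File object is created for every page.
List<File> pagesAfterReload(String path, int pageCount) {
  return List.generate(pageCount, (_) => File(path));
}
```

Keying by path (or by content, as addSync does with previewHash) collapses the reloaded pages back into a single asset.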

New approach to the cache
This code is still very dirty; in fact only a cache of File objects is needed, and a lot of unnecessary code is implemented.
The cache handles jpg, png and pdf (not svg yet).

For each photo item it provides a ValueNotifier<ImageProvider?>, so identical items share the same ImageProvider.
For each PDF item it provides a PdfDocument, so every page of the PDF uses the same PdfDocument.

While a note is being read into the Editor, new items are added using addSync, which is fast and synchronous.
addSync method:

It must handle duplicated assets (especially PDFs created by the current version of OrderedCache).
It calculates a fast hash from the first 100 KB of the file and the file size; if the hashes are the same, the files are treated as "identical".
This is important only for compatibility.
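
A rough sketch of such a preview hash, to make the idea concrete. The PR does not say which hash function is used, so FNV-1a (32-bit) here is my assumption, and the names previewBytes and previewHash in this form are illustrative only.

```dart
import 'dart:io';
import 'dart:typed_data';

/// Number of leading bytes that go into the hash (100 KB).
const int previewBytes = 100 * 1024;

/// Synchronous hash of the first 100 KB of [file] plus its size.
/// ASSUMPTION: FNV-1a is used only as a simple stand-in hash function.
int previewHash(File file) {
  final size = file.lengthSync();
  final raf = file.openSync();
  try {
    final Uint8List head =
        raf.readSync(size < previewBytes ? size : previewBytes);
    var hash = 0x811c9dc5; // FNV-1a 32-bit offset basis
    for (final b in head) {
      hash ^= b;
      hash = (hash * 0x01000193) & 0xffffffff; // FNV prime, kept to 32 bits
    }
    // Mix in the file size, byte by byte, so files with equal heads but
    // different lengths get different hashes.
    var s = size;
    for (var i = 0; i < 8; i++) {
      hash ^= s & 0xff;
      hash = (hash * 0x01000193) & 0xffffffff;
      s >>= 8;
    }
    return hash;
  } finally {
    raf.closeSync();
  }
}
```

Two files of equal size that differ only after the first 100 KB get the same value, which is why this check is suitable for recognizing the duplicated per-page copies of one PDF but not as a general equality test.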

In the Editor, the async add method is used when adding a new image.
add method:
It first compares the path, then the file size, and then the hashes against all cache items.
Calculating the hash is very time consuming. For PDFs it would be better to extract /Info and read the author, creation date, etc., and use these to tell different PDFs apart. That would take less time.
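
The comparison order described above could look roughly like the following. This is a sketch under my own assumptions, not the code in this PR: SketchCache, CacheEntry and fullHash are hypothetical names, and a 32-bit FNV-1a hash stands in for whatever digest is really used (a real implementation would want a stronger digest or a byte-by-byte comparison before declaring two files equal).

```dart
import 'dart:io';

/// Hash of the whole file, streamed in chunks.
/// ASSUMPTION: FNV-1a 32-bit as a simple stand-in.
Future<int> fullHash(File file) async {
  var hash = 0x811c9dc5;
  await for (final chunk in file.openRead()) {
    for (final b in chunk) {
      hash ^= b;
      hash = (hash * 0x01000193) & 0xffffffff;
    }
  }
  return hash;
}

/// Hypothetical cache entry: path and size are cheap, hash is computed lazily.
class CacheEntry {
  CacheEntry(this.path, this.size);
  final String path;
  final int size;
  int? hash;
}

class SketchCache {
  final List<CacheEntry> entries = [];

  /// Returns the index of the asset for [file], adding it if it is new.
  Future<int> add(File file) async {
    // 1. Cheapest check: same path.
    final byPath = entries.indexWhere((e) => e.path == file.path);
    if (byPath != -1) return byPath;

    // 2. Only entries with the same size can be identical.
    final size = await file.length();
    int? hash;
    for (var i = 0; i < entries.length; i++) {
      final e = entries[i];
      if (e.size != size) continue;
      // 3. Expensive check, done only when the sizes match.
      hash ??= await fullHash(file);
      e.hash ??= await fullHash(File(e.path));
      if (e.hash == hash) return i;
    }

    entries.add(CacheEntry(file.path, size)..hash = hash);
    return entries.length - 1;
  }
}
```

With this ordering the full hash is never computed for a file whose size is unique in the cache, which is the common case for photos.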

Cache properties:

Every cache item is created and treated as a File (path). Even picked photos are first saved as temporary files and then added to the cache as a File.
Each item provides a PdfDocument for PDFs or a ValueNotifier<ImageProvider?> for images. This saves memory.
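
The memory saving comes from every page or image that refers to the same file receiving the same object. A minimal, Flutter-free sketch of that idea follows; SharedAssetCache is a hypothetical name, and the type parameter T stands in for PdfDocument or ImageProvider so the example runs in plain Dart.

```dart
/// Hands out one shared, lazily opened resource per path.
/// T is a placeholder for PdfDocument or ImageProvider.
class SharedAssetCache<T> {
  SharedAssetCache(this._open);

  /// Opens the resource for a path (for example, parses a PDF).
  final Future<T> Function(String path) _open;

  /// Futures are stored, not results, so concurrent requests for the same
  /// path share a single open operation.
  final Map<String, Future<T>> _opened = {};

  Future<T> get(String path) => _opened.putIfAbsent(path, () => _open(path));

  /// Number of distinct resources opened so far.
  int get openCount => _opened.length;
}
```

With a cache like this, 850 pages asking for the same path trigger one open call and all receive the same instance.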

If somebody wants to test it, simply create a new note from a PDF, save it, close the note and open it again.

I believe the cache needs to be rewritten, as many users work with PDFs and the current version is not functional and cannot handle displaying PDFs with hundreds of pages.

…time and removes all duplicates. Currently only adding to the cache while reading a note fromJson is implemented. previewHash is calculated from the first 100 kB of each file for fast recognition of duplicated assets (all assets which are saved after a note is re-edited).


QubaB added 9 commits September 27, 2025 09:58

implemented pdfDocument in asset cache so it is accessible from all p…

2505d3f

…ages as single object started implementation of lazy ImageProvider

Implemented image provider for png.

3d794d8

asset cache provides imageProvider as Notifier, so when more images s…

efb66ad

…hare the same asset only one image provider is used replaceImage is implemented when png image is resized due to its greater size

added comment about new cache approach

3870794

updated cache properties and description of problems of current cache

77bf194

added copyFile to FileManager - used to copy assets

3d50478

important is add $documentsDirectory to file name otherwise assets are saved to root of file system and it is not allowed

implemented saving of note using new assetCacheAll cache

af9cf7b

FileManager - reworked copyFile to add correct base directory

2b95962
