@QubaB QubaB commented Sep 28, 2025

I have prepared a draft upgrade of assetCache. The current implementation of assetCache, especially its handling of PDF files, does not work correctly.

My code is not very polished, and it contains several unnecessary and inelegant parts (as I am not highly experienced in programming asynchronous applications). However, I hope it may serve as a useful starting point for considering how to adapt the current cache to ensure it functions correctly and efficiently with PDF files.

I also tried to ensure that notes created in the current version of Saber can still be loaded. When a note that contains a PDF is reloaded and then saved again, an asset is created for each page, and each asset contains the entire PDF. This is why AssetCacheAll.addSync is particularly complicated: it uses previewHash, calculated from the first 100 KB of the file and the file size, to identify the many identical files.

I tested this implementation on Windows and Android (Samsung Galaxy Tab A7 Lite). Both were able to display an 850-page PDF; the note loaded quite fast, and pages were displayed almost instantly while paging through it.

A short description of the current state of the cache and of my approach follows.

Problems with the current cache and images:

  1. There are two caches: assetCache (for working) and OrderedAssetCache (for writing).
  2. The bytes of every asset are kept in memory. I do not know whether that is a problem in itself, but it is a problem when adding to the cache,
    because all bytes must be compared with every asset already added.
  3. After the first save of a note with an imported PDF, the PDF is saved as one asset because:
    1. importPdfFromFilePath creates one File, "final pdfFile = File(path);",
    and this File is used to create all page instances.
    2. When saving, all pages are one asset because the File is the same object.
  4. When the note is loaded again and saved, each page is treated as a different PDF.
    The reason is the PdfEditorImage class:
    1. When reading, fromJson creates "pdfFile = FileManager.getFile('$sbnPath${Editor.extension}.$assetIndex');"
    for each page (even if the pages refer to the same asset file).
    2. The PdfEditorImage constructor is called with this File, so each page has its own File.
    3. OrderedCache.add adds each page as a new asset because each page is a different File.
  5. Problems with PdfEditorImage:
    1. PdfEditorImage keeps the bytes of the whole PDF (wasting memory) even though it renders only one page.
    2. It creates its own pdfDocument renderer, so every PDF page gets a new renderer that holds the whole PDF.
    3. While a note is being saved, each page of the PDF is added to OrderedAssetCache separately as a new asset.
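
Points 3 and 4 can be reproduced without Saber. The following is only a minimal sketch, assuming the cache tells assets apart by File object identity as described above; countByIdentity, countByPath and the asset path are hypothetical names used for illustration, not code from this PR.

```dart
import 'dart:io';

/// Counts entries when File handles are compared by object identity.
int countByIdentity(Iterable<File> files) =>
    (Set<File>.identity()..addAll(files)).length;

/// Counts entries when File handles are compared by absolute path.
int countByPath(Iterable<File> files) =>
    files.map((f) => f.absolute.path).toSet().length;

void main() {
  // Hypothetical asset path; the file does not need to exist for this demo.
  const assetPath = 'note.sbn2.0';

  // Import case: one File object is reused for every page.
  final shared = File(assetPath);
  final imported = [shared, shared, shared];

  // Reload case: a fresh File object per page, all pointing at one path.
  final reloaded = [File(assetPath), File(assetPath), File(assetPath)];

  print(countByIdentity(imported)); // 1
  print(countByIdentity(reloaded)); // 3
  print(countByPath(reloaded)); // 1
}
```

Keying on the path (or on a content hash) rather than on the File object gives one entry per asset in both cases.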

New approach to the cache
This code is still very rough, and in fact only a cache of File objects is needed (a lot of unnecessary code is implemented).
The cache handles jpg, png and pdf (not svg yet).

For each photo item it provides a ValueNotifier<ImageProvider?>, so identical items share the same ImageProvider.
For each pdf item it provides a PdfDocument, so every page of a pdf uses the same PdfDocument.

While a note is being read into the Editor, new items are added with addSync, which is fast and synchronous.
addSync method:

  1. It must handle duplicated assets (especially PDFs created by the current version of OrderedCache).
  2. It calculates a fast hash from the first 100 KB of the file and the file size; if the hashes match, the files are treated as "identical".
    This is important only for compatibility.
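
The preview-hash idea can be sketched in plain Dart as follows. This is an illustration under assumptions, not the code in this PR: previewKeySync and fnv1a32 are hypothetical names, and FNV-1a is used only because it needs no packages; the real previewHash may use a different hash function.

```dart
import 'dart:io';
import 'dart:typed_data';

/// Number of leading bytes that feed the preview hash (100 KB).
const int previewBytes = 100 * 1024;

/// FNV-1a (32-bit) over [bytes]. Assumes 64-bit ints (Dart VM), so the
/// intermediate product cannot overflow before it is masked.
int fnv1a32(List<int> bytes) {
  var hash = 0x811C9DC5; // FNV offset basis
  for (final b in bytes) {
    hash ^= b & 0xFF;
    hash = (hash * 0x01000193) & 0xFFFFFFFF; // FNV prime, kept to 32 bits
  }
  return hash;
}

/// Hypothetical key: file size plus the hash of the first 100 KB.
/// Two files with the same key are treated as "identical".
String previewKeySync(File file) {
  final raf = file.openSync(); // read-only by default
  try {
    final size = raf.lengthSync();
    final count = size < previewBytes ? size : previewBytes;
    final Uint8List head = raf.readSync(count);
    return '$size:${fnv1a32(head).toRadixString(16)}';
  } finally {
    raf.closeSync();
  }
}

void main() {
  final dir = Directory.systemTemp.createTempSync('preview_key_demo');
  try {
    final bytes = List<int>.filled(300 * 1024, 42);
    final pageCopy1 = File('${dir.path}/a.pdf')..writeAsBytesSync(bytes);
    final pageCopy2 = File('${dir.path}/b.pdf')..writeAsBytesSync(bytes);
    // Same size and same first 100 KB, so the keys match.
    print(previewKeySync(pageCopy1) == previewKeySync(pageCopy2)); // true
  } finally {
    dir.deleteSync(recursive: true);
  }
}
```

Because only the first 100 KB is read, the cost per file stays constant regardless of PDF size. The trade-off is that two different files with the same size and the same leading 100 KB would be merged, which is why the text limits this shortcut to the compatibility path.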

In the Editor, the async add method is used when a new image is added.
add method:
It first compares paths, then file sizes, and then hashes against all cache items.
Calculating the hash is very time consuming; for PDFs it would be better to extract /Info, read the author, creation date, etc.,
and use these to tell different PDFs apart. That would take less time.
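
The comparison order of the add method can be sketched like this. It is a simplified illustration, not the PR's code: CachedItem, addAsset and fullHash are hypothetical names, the hash is a package-free FNV-1a stand-in, and the sketch hashes only candidates of equal size, which is an assumption about how the expensive step could be limited.

```dart
import 'dart:io';

/// Streams the whole file through FNV-1a (32-bit). A stand-in for
/// whatever full-content hash the cache actually uses.
Future<int> fullHash(File file) async {
  var hash = 0x811C9DC5;
  await for (final chunk in file.openRead()) {
    for (final b in chunk) {
      hash ^= b & 0xFF;
      hash = (hash * 0x01000193) & 0xFFFFFFFF;
    }
  }
  return hash;
}

/// Hypothetical cache entry: path handle, size, and a lazily computed hash.
class CachedItem {
  CachedItem(this.file, this.size, [this.hash]);
  final File file;
  final int size;
  int? hash;
}

/// Returns the index of an existing equal asset, or appends [file]
/// and returns the new index. Checks run from cheapest to most expensive.
Future<int> addAsset(List<CachedItem> items, File file) async {
  // 1. Same path: no I/O needed.
  final path = file.absolute.path;
  for (var i = 0; i < items.length; i++) {
    if (items[i].file.absolute.path == path) return i;
  }

  // 2. Different size means different content.
  final size = await file.length();
  final sameSize = [
    for (var i = 0; i < items.length; i++)
      if (items[i].size == size) i,
  ];
  if (sameSize.isEmpty) {
    items.add(CachedItem(file, size));
    return items.length - 1;
  }

  // 3. Only now pay for hashing the whole file.
  final hash = await fullHash(file);
  for (final i in sameSize) {
    items[i].hash ??= await fullHash(items[i].file);
    if (items[i].hash == hash) return i;
  }
  items.add(CachedItem(file, size, hash));
  return items.length - 1;
}

Future<void> main() async {
  final dir = Directory.systemTemp.createTempSync('add_asset_demo');
  try {
    final a = File('${dir.path}/a.png')..writeAsBytesSync([1, 2, 3]);
    final copyOfA = File('${dir.path}/b.png')..writeAsBytesSync([1, 2, 3]);
    final items = <CachedItem>[];
    print(await addAsset(items, a)); // 0
    print(await addAsset(items, copyOfA)); // 0
    print(items.length); // 1
  } finally {
    dir.deleteSync(recursive: true);
  }
}
```

A 32-bit hash is too weak for real deduplication; a production version would use a stronger digest or, for PDFs, the /Info metadata suggested above.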

Cache properties:

  1. Every cache item is created and treated as a File (path). Even picked photos are first saved as temporary files and then added to the cache as a File.
  2. Each item provides a PdfDocument for PDFs or a ValueNotifier<ImageProvider?> for images. This saves memory.

If anybody wants to test it, simply create a new note from a PDF, save it, close the note and open it again.
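
The memory saving in property 2 comes from handing every page the same object. A minimal, Flutter-free sketch of that sharing is shown below; SharedResourceCache and FakeDocument are hypothetical stand-ins (FakeDocument plays the role of PdfDocument or ValueNotifier<ImageProvider?>), not classes from this PR.

```dart
/// Hands out one shared resource per asset path, created on first request.
class SharedResourceCache<T> {
  SharedResourceCache(this._open);

  /// Creates the resource for a path; called at most once per path.
  final T Function(String path) _open;
  final Map<String, T> _byPath = {};

  T resourceFor(String path) => _byPath.putIfAbsent(path, () => _open(path));

  int get length => _byPath.length;
}

/// Stand-in for an expensive object such as a parsed PDF document.
class FakeDocument {
  FakeDocument(this.path);
  final String path;
}

void main() {
  var opened = 0;
  final cache = SharedResourceCache<FakeDocument>((path) {
    opened++;
    return FakeDocument(path);
  });

  // 850 pages of one PDF all ask for their document.
  final pages = [
    for (var i = 0; i < 850; i++) cache.resourceFor('lecture.pdf'),
  ];

  print(opened); // 1
  print(identical(pages.first, pages.last)); // true
}
```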


I believe the cache needs to be rewritten, as many users work with PDFs and the current version is not functional and cannot handle displaying PDFs with hundreds of pages.

…time and removes all duplicities

So far, only adding to the cache while a note is read in fromJson is implemented.
previewHash is calculated from the first 100 kB of each file for fast recognition of duplicated assets (all assets that are saved after a note is re-edited)
…ages as single object

started implementation of lazy ImageProvider
…hare the same asset only one image provider is used

replaceImage is implemented for the case where a png image is resized because it is too large
it is important to add $documentsDirectory to the file name, otherwise assets are saved to the root of the file system, which is not allowed