Improve combined fstree format #3418

@roman-khimov

Description

Is your feature request related to a problem? Please describe.

I'm always frustrated when reads from FSTree in the combined format take longer than they need to. Combined files carry no OID: [offset, size] map, so the reader has to Seek() through the file object by object to reach the one it needs. With 128 objects in a pack this can take some time.

Describe the solution you'd like

A table with this data can be stored at the end of the file. This is trivial for the writer to do, and the reader can then:

  • open the file, read a chunk
  • try the first object, then any others already read; if found, done
  • if not, seek to (file length - MAX_TABLE_SIZE)
  • read data
  • find the number of entries stored in the last bytes
  • jump backwards through the data to find the object needed
  • Seek to it directly, read
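The fallback path of the steps above (from the seek to the tail onwards) can be sketched roughly as follows. Everything about the layout here is an assumption made for illustration, not the actual FSTree format: payloads are concatenated, followed by fixed-size entries (32-byte OID, uint64 offset, uint32 size), followed by a uint32 entry count in the last 4 bytes. The names writePack, readObject, readFromBytes and maxTableSize are hypothetical.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"io"
)

// Hypothetical layout (NOT the real FSTree format):
//   [payloads...][entry 0]...[entry N-1][uint32 N]
// Each entry: OID (32 bytes) + offset (uint64) + size (uint32), big-endian.
const (
	oidSize      = 32
	entrySize    = oidSize + 8 + 4
	maxObjects   = 128 // pack size mentioned in the issue
	maxTableSize = maxObjects*entrySize + 4
)

type OID [oidSize]byte

var errNotFound = errors.New("object not found in pack")

// writePack concatenates payloads, then appends the table and the entry count.
func writePack(ids []OID, payloads [][]byte) []byte {
	var buf, table bytes.Buffer
	for i, p := range payloads {
		var e [entrySize]byte
		copy(e[:], ids[i][:])
		binary.BigEndian.PutUint64(e[oidSize:], uint64(buf.Len()))
		binary.BigEndian.PutUint32(e[oidSize+8:], uint32(len(p)))
		table.Write(e[:])
		buf.Write(p)
	}
	var n [4]byte
	binary.BigEndian.PutUint32(n[:], uint32(len(ids)))
	buf.Write(table.Bytes())
	buf.Write(n[:])
	return buf.Bytes()
}

// readObject reads the file tail once, scans the table backwards and then
// fetches the payload with a single positioned read.
func readObject(r io.ReaderAt, fileSize int64, id OID) ([]byte, error) {
	tailLen := int64(maxTableSize)
	if tailLen > fileSize {
		tailLen = fileSize
	}
	if tailLen < 4 {
		return nil, errors.New("pack too short")
	}
	tail := make([]byte, tailLen)
	if n, err := r.ReadAt(tail, fileSize-tailLen); n < len(tail) {
		return nil, err
	}
	count := int64(binary.BigEndian.Uint32(tail[tailLen-4:]))
	if count > maxObjects || count*entrySize+4 > tailLen {
		return nil, errors.New("corrupted table")
	}
	tableStart := fileSize - 4 - count*entrySize
	for i := int64(1); i <= count; i++ { // last entry first
		e := tail[tailLen-4-i*entrySize:][:entrySize]
		if !bytes.Equal(e[:oidSize], id[:]) {
			continue
		}
		off := int64(binary.BigEndian.Uint64(e[oidSize:]))
		size := int64(binary.BigEndian.Uint32(e[oidSize+8:]))
		if off < 0 || off > tableStart || size > tableStart-off {
			return nil, errors.New("corrupted table entry")
		}
		obj := make([]byte, size)
		if n, err := r.ReadAt(obj, off); n < len(obj) {
			return nil, err
		}
		return obj, nil
	}
	return nil, errNotFound
}

// readFromBytes is a convenience wrapper for in-memory packs.
func readFromBytes(pack []byte, id OID) ([]byte, error) {
	return readObject(bytes.NewReader(pack), int64(len(pack)), id)
}
```

With an *os.File, readObject would take the file and its size from Stat(). The point of the design is that a miss on the initial chunk costs two positioned reads (tail, then payload) regardless of how many objects the pack holds.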

This works for files generated on the fly from processed PUT requests. But if we have a write cache and use direct batched writing, this can be optimized further by placing the table at the beginning of the file (the format should support both placements). The reader can then get the table and use it right after the initial read.
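One possible way to support both placements is a marker byte at the start of the file that tells the reader where the table lives. The sketch below is only an illustration under assumed conventions: the marker values, the entry encoding (32-byte OID, uint64 offset, uint32 size, big-endian) and the names parseHead, layoutTrailer and layoutHeader are all hypothetical and do not come from the FSTree code.

```go
package main

import (
	"encoding/binary"
	"errors"
)

// Hypothetical markers and entry encoding, for illustration only.
const (
	layoutTrailer byte = 0 // table is at the end of the file
	layoutHeader  byte = 1 // uint32 count and the table follow the marker

	oidSize    = 32
	entrySize  = oidSize + 8 + 4
	maxObjects = 128
)

type entry struct {
	Offset uint64
	Size   uint32
}

// parseHead inspects the first chunk read from a pack. It returns the table
// and true when the table is stored up front. It returns false with a nil
// error when the table is at the end and the reader has to fall back to
// seeking to the tail.
func parseHead(chunk []byte) (map[[oidSize]byte]entry, bool, error) {
	if len(chunk) < 1 {
		return nil, false, errors.New("empty pack")
	}
	switch chunk[0] {
	case layoutTrailer:
		return nil, false, nil
	case layoutHeader:
	default:
		return nil, false, errors.New("unknown layout marker")
	}
	if len(chunk) < 5 {
		return nil, false, errors.New("initial read too short for entry count")
	}
	count := binary.BigEndian.Uint32(chunk[1:5])
	if count > maxObjects {
		return nil, false, errors.New("corrupted entry count")
	}
	n := int(count)
	if len(chunk) < 5+n*entrySize {
		return nil, false, errors.New("initial read too short for table")
	}
	table := make(map[[oidSize]byte]entry, n)
	for i := 0; i < n; i++ {
		e := chunk[5+i*entrySize:][:entrySize]
		var id [oidSize]byte
		copy(id[:], e)
		table[id] = entry{
			Offset: binary.BigEndian.Uint64(e[oidSize:]),
			Size:   binary.BigEndian.Uint32(e[oidSize+8:]),
		}
	}
	return table, true, nil
}
```

If the initial read is sized to cover the marker, the count and a full 128-entry table, a header-layout pack needs no extra read before the reader can seek straight to the object.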

Describe alternatives you've considered

Not known.

Additional context

#2814
#2925

Labels

  • I4: No visible changes
  • S3: Minimally significant
  • U3: Regular
  • enhancement: Improving existing functionality
  • neofs-storage: Storage node application issues
  • performance: More of something per second