Summary 💡
I propose adding the ability to customize the delta topology (the base/target relationships between objects, structured as a forest) during the repack process. Users should be able to define these relationships via stdin or another mechanism. In this structure, if Object B is a child node of Object A, then B is the delta target and A is the delta source (base).
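To make the forest semantics concrete, here is a minimal sketch of how such input could be parsed and validated. Everything in it is an assumption for illustration only: the line format (`<target> <base>`, or a single field for an object stored without a delta), the function names, and the short object names used instead of real object IDs. Git does not accept input like this today.

```python
from typing import Dict, Iterable, Optional


def parse_delta_forest(lines: Iterable[str]) -> Dict[str, Optional[str]]:
    """Parse hypothetical '<target> <base>' lines into a target -> base map.

    A line with a single field declares a root (an object with no delta base).
    Objects that only ever appear as a base are treated as roots as well.
    """
    base_of: Dict[str, Optional[str]] = {}
    for lineno, raw in enumerate(lines, 1):
        fields = raw.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) > 2:
            raise ValueError(f"line {lineno}: expected '<target> [<base>]'")
        target = fields[0]
        base = fields[1] if len(fields) == 2 else None
        # In a forest, every node has at most one parent (one delta base).
        if base_of.get(target) is not None:
            raise ValueError(f"line {lineno}: {target} already has a base")
        base_of[target] = base
        if base is not None:
            base_of.setdefault(base, None)
    return base_of


def chain_depths(base_of: Dict[str, Optional[str]]) -> Dict[str, int]:
    """Return the delta chain depth of every object (roots have depth 0).

    Raises ValueError if the relationships contain a cycle, since a cycle
    would mean an object can never be reconstructed.
    """
    depth: Dict[str, int] = {}
    for start in base_of:
        path = []
        on_path = set()
        node: Optional[str] = start
        # Walk towards the root until we hit a root or an already-solved node.
        while node is not None and node not in depth:
            if node in on_path:
                raise ValueError(f"cycle through {node}")
            on_path.add(node)
            path.append(node)
            node = base_of[node]
        d = depth[node] if node is not None else -1
        # Assign depths on the way back from the root side.
        for obj in reversed(path):
            d += 1
            depth[obj] = d
    return depth


# Example: C is a delta against B, B and D are deltas against A, E stands alone.
forest = parse_delta_forest(["B A", "C B", "D A", "E"])
depths = chain_depths(forest)
```

The depth map is included because chain depth is the quantity that drives reconstruction cost: an object at depth 2 needs two delta applications on top of its root, so a user-supplied forest lets the caller decide where deeper chains are acceptable.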
Motivation 🔦
I use Git in an unusual way: as a compression tool. In large-scale storage scenarios, a customized delta topology is essential for balancing compression ratio against decompression performance.
My investigation of the Git source code shows that the current implementation has limitations in how it handles objects after "delta reuse." As a result, some incremental writes cannot fully reuse similar objects, which leads either to wasted CPU cycles on full repacks or to complex workarounds (such as manually specifying preferred_base). A standard git repack -f is computationally expensive, while using pack-objects with preferred_base directly is both operationally complex and semantically limited.