Description
In hffs, we implement _fetch_range, which retrieves bytes from a remote file without downloading it entirely (see fsspec). This is nice when only parts of a file are needed, but if we want to download the whole file, it would be better to rely on the existing hf_hub_download and the HF cache system.
@mariosasko @lhoestq given your knowledge of fsspec, do you think it would be possible to override the read method so that if read is called with length=-1, we cache the entire file and read it from disk? And if length != -1, we fall back to the normal implementation. Do you see any weird side effects that this could cause?
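To make the idea concrete, here is a minimal sketch of the proposed read behaviour. It is not the hffs implementation: CacheOnFullRead and its download callable are hypothetical names, and download is assumed to return a local path (in practice it could be a closure around hf_hub_download). The demo uses an in-memory buffer in place of the remote file so the snippet runs without network access.

```python
import io
import os
import tempfile
from typing import BinaryIO, Callable


class CacheOnFullRead:
    """Hypothetical wrapper, not part of fsspec or huggingface_hub."""

    def __init__(self, remote: BinaryIO, download: Callable[[], str]) -> None:
        self.remote = remote      # file-like object that performs ranged reads
        self.download = download  # assumed to return the path of a full local copy

    def read(self, length: int = -1) -> bytes:
        if length != -1:
            # Partial read: keep the normal (remote) implementation.
            return self.remote.read(length)
        # Full read: fetch the whole file once, then serve the rest from disk.
        offset = self.remote.tell()
        local_path = self.download()
        with open(local_path, "rb") as f:
            f.seek(offset)
            data = f.read()
        # Keep the remote cursor consistent with what the caller has consumed.
        self.remote.seek(0, os.SEEK_END)
        return data


# Demo: a BytesIO stands in for the remote file, a temp file for the cache.
payload = b"hello world"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)

wrapped = CacheOnFullRead(io.BytesIO(payload), lambda: tmp.name)
head = wrapped.read(5)  # ranged read, served by the remote object
rest = wrapped.read()   # full read, served from the local copy
os.unlink(tmp.name)
```

One side effect this sketch makes visible: a read with length=-1 issued after a partial read still downloads the whole file, even though only the remainder is returned.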
Also, for _fetch_range, instead of always fetching from the remote, we could first check whether the file is available locally.
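A rough sketch of that local-first lookup, under stated assumptions: fetch_range_local_first and remote_fetch are hypothetical names, and local_path is whatever a cache lookup returned. In huggingface_hub, try_to_load_from_cache could plausibly supply that value; since it may return a non-path result when nothing is cached, the sketch only trusts strings that point to an existing file.

```python
import os
import tempfile
from typing import Callable


def fetch_range_local_first(
    start: int,
    end: int,
    local_path: object,
    remote_fetch: Callable[[int, int], bytes],
) -> bytes:
    """Hypothetical helper: read bytes [start, end) from disk if possible."""
    if isinstance(local_path, str) and os.path.isfile(local_path):
        with open(local_path, "rb") as f:
            f.seek(start)
            return f.read(end - start)
    # No usable local copy: fall back to the remote range request.
    return remote_fetch(start, end)


# Demo: a temp file stands in for a cached file, a stub for the remote call.
payload = b"0123456789"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)


def fake_remote(start: int, end: int) -> bytes:
    return b"remote"


from_disk = fetch_range_local_first(2, 5, tmp.name, fake_remote)
from_remote = fetch_range_local_first(2, 5, None, fake_remote)
os.unlink(tmp.name)
```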
Otherwise, I saw that fsspec also defines a BaseCache object that we could extend. Do you think it's worth trying to tweak it to use our existing cache?