Is leveraging custom protocols to specify different data sources a bad idea? #1529
I'm working on a file storage library for work, and I'm trying to abstract away the underlying file storage layer a bit. I got a proof of concept working that uses anonymous classes registered with the protocol registry.
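For readers wondering what such a proof of concept might look like, here is a minimal sketch. It is not the original poster's code: the `register_source` helper, the `reports` protocol name, and the `memory://warehouse/reports` base URL are made up for illustration. It assumes fsspec's `DirFileSystem` and `register_implementation`, and uses the in-memory backend so it runs without external storage.

```python
import fsspec
from fsspec.core import url_to_fs
from fsspec.implementations.dirfs import DirFileSystem


def register_source(protocol_name: str, base_url: str) -> type:
    """Hypothetical helper: build a class on the fly that roots
    `protocol_name://` at `base_url`, then add it to fsspec's registry."""

    class SourceFileSystem(DirFileSystem):
        protocol = protocol_name

        def __init__(self, **storage_options):
            # Resolve the real backend and its protocol-stripped root path
            fs, root = url_to_fs(base_url)
            super().__init__(path=root, fs=fs, **storage_options)

    # clobber=True lets the sketch be re-run without a "name already registered" error
    fsspec.register_implementation(protocol_name, SourceFileSystem, clobber=True)
    return SourceFileSystem


# Illustrative names only: "reports" and the memory:// location are assumptions
ReportsFS = register_source("reports", "memory://warehouse/reports")

# Callers only see the custom protocol; the backend location stays hidden
with fsspec.open("reports://q1.txt", "wb") as f:
    f.write(b"revenue: 42")

# The bytes land under the configured base path in the real backend
stored = fsspec.filesystem("memory").cat_file("/warehouse/reports/q1.txt")
```

Swapping the backend then only means changing the base URL passed to the helper, while calling code keeps using `reports://` paths.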
Replies: 2 comments 2 replies
I have absolutely no problem with you using the registry in this way. You might also be interested in the ReferenceFileSystem ("reference://"), which provides a similar redirection: every path maps to some other file, part of a file, or concrete data. We also have the prefix filesystem, DirFileSystem ("dir://"), which lets you add a prefix to every URL for an arbitrary backend. It sounds like you want something in between these two choices.
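To make the "dir://" option concrete, here is a small sketch of the prefix behaviour. It assumes the class lives at `fsspec.implementations.dirfs.DirFileSystem` and accepts `path` and `fs` keyword arguments; the `/tenant-a` prefix and file names are invented for the example, and the in-memory backend stands in for whatever real storage you use.

```python
import fsspec
from fsspec.implementations.dirfs import DirFileSystem

# Any backend works here; memory:// keeps the sketch self-contained
backend = fsspec.filesystem("memory")
backend.pipe_file("/tenant-a/configs/app.json", b'{"debug": true}')

# Every path given to `scoped` is resolved relative to /tenant-a
scoped = DirFileSystem(path="/tenant-a", fs=backend)
config_bytes = scoped.cat_file("configs/app.json")

# Writes go through the same prefix
scoped.pipe_file("configs/extra.json", b"{}")
extra_bytes = backend.cat_file("/tenant-a/configs/extra.json")
```

The ReferenceFileSystem is the more general of the two, since each key can point at an arbitrary file or byte range rather than sharing one prefix.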
I'm trying to do something similar. My solution works, but I was wondering how you achieved yours?

import fsspec


class MadFileSystem(fsspec.AbstractFileSystem):
    """Forward each call to the filesystem behind ``basepath``."""

    protocol = "mad"

    def __init__(self, basepath: str, storage_options: dict | None = None, **kwargs):
        options = storage_options or kwargs
        # Resolve the wrapped filesystem and its protocol-stripped root path
        fs, fs_url = fsspec.core.url_to_fs(basepath, **options)
        self._fs: fsspec.AbstractFileSystem = fs
        self._fs_url: str = fs_url
        super().__init__(**options)

    def glob(self, path: str, **kwargs):
        return self._fs.glob(self.fix_path(path), **kwargs)

    def info(self, path: str, **kwargs):
        return self._fs.info(self.fix_path(path), **kwargs)

    def _open(self, path: str, **kwargs):
        return self._fs._open(self.fix_path(path), **kwargs)

    def rm(self, path: str, **kwargs):
        return self._fs.rm(self.fix_path(path), **kwargs)

    def mv(self, path1: str, path2: str, **kwargs):
        return self._fs.mv(self.fix_path(path1), self.fix_path(path2), **kwargs)

    def fix_path(self, path: str) -> str:
        # Remove the protocol from the path if it is present
        path = path.removeprefix(f"{self.protocol}://")
        # If the path is not already under fs_url, prefix it with fs_url
        if not path.startswith(self._fs_url):
            path = f"{self._fs_url.rstrip('/')}/{path}"
        return path
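One edge case worth noting in `fix_path` above: a plain `startswith` check treats `/database/x.csv` as already being under a base of `/data`. A possible variant that compares on path-segment boundaries is sketched below. The standalone `resolve_under_root` function is a hypothetical helper, not part of fsspec, and it also strips a leading slash before joining, which the code above does not do.

```python
def resolve_under_root(path: str, root: str, protocol: str = "mad") -> str:
    """Hypothetical helper: return `path` placed under `root`,
    leaving it untouched only when it is already inside `root`."""
    # Drop the custom protocol if the caller included it
    path = path.removeprefix(f"{protocol}://")
    root = root.rstrip("/")
    # Match whole path segments, so "/database" is not mistaken for "/data"
    if path == root or path.startswith(root + "/"):
        return path
    return f"{root}/{path.lstrip('/')}"


relative = resolve_under_root("mad://reports/q1.csv", "/data")
already_rooted = resolve_under_root("/data/reports/q1.csv", "/data")
lookalike = resolve_under_root("/database/x.csv", "/data")
```

Whether a look-alike path should be re-rooted or rejected outright is a design decision; the sketch re-roots it.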