Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add top-level copy/cp function #723

Closed
wants to merge 4 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 3 additions & 1 deletion fsspec/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@
from . import caching
from ._version import get_versions
from .callbacks import Callback
from .core import get_fs_token_paths, open, open_files, open_local
from .core import copy, cp, get_fs_token_paths, open, open_files, open_local
from .exceptions import FSTimeoutError
from .mapping import FSMap, get_mapper
from .registry import (
Expand All @@ -37,6 +37,8 @@
"open",
"open_files",
"open_local",
"copy",
"cp",
"registry",
"caching",
"Callback",
Expand Down
29 changes: 29 additions & 0 deletions fsspec/core.py
Original file line number Diff line number Diff line change
Expand Up @@ -468,6 +468,35 @@ def open_local(url, mode="rb", **storage_options):
return paths


def cp(filea, fileb: OpenFile, callback=None, **open_kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the filesystem API cp has recursive and other parameters - it can do batch jobs. I had imagined the code here as a file-specific copy method of a filesystem which could be overidden in various specific implementations (like FTP!).

This code doesn't use the given callback.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this, I think it will be great to have this useful feature!

Sorry for the nitpick, but might be nice to use src and dst as argument names similar to the python shutil.copyfile function, also maybe mark them as positional only as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Certainly fine with src/dst. I'm less sure there is a need to make them position-only.

"""Copy ``filea`` to ``fileb``.

Parameters
----------
filea: str
The path to a local or remote file.
fileb: OpenFile
The target to write to. Must be in write mode.
callback:

open_kwargs: dict
Additional arguments passed to ``fsspec.open``
when opening ``filea``.
"""
with open(filea, **open_kwargs) as a:
with fileb as b:
while True:
data = a.read(a.fs.blocksize)
if not data:
break
b.write(data)


def copy():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copy = cp

"""Alias for `cp`"""
pass


def get_compression(urlpath, compression):
if compression == "infer":
compression = infer_compression(urlpath)
Expand Down
15 changes: 15 additions & 0 deletions fsspec/tests/test_api.py
Original file line number Diff line number Diff line change
Expand Up @@ -187,6 +187,21 @@ def test_chained_fs():
assert os.listdir(d2) == ["f1"]


def test_cp():
d = tempfile.mkdtemp()
f1 = os.path.join(d, "f1")
f2 = os.path.join(d, "f2")
with open(f1, "wb") as f:
f.write(b"test")

target = fsspec.open(f2, mode="wb")
fsspec.cp(f1, target)
target.close()

with fsspec.open(f2) as f:
assert f.read() == b"test"


@pytest.mark.xfail(reason="see issue #334", strict=True)
def test_multilevel_chained_fs():
"""This test reproduces intake/filesystem_spec#334"""
Expand Down