Skip to content

Conversation

@fjansson
Copy link
Contributor

Proposed as fix for #3201.

http(s) and swift filesystems need the scheme in the path. This commit adds a special case for the swift filesystem, when checking for the scheme presence and generating an exception if the scheme is present.

Also remove one if for the http(s) schemes. Here all schemes can call fs._strip_protocol(path). For the filesystems that need the scheme in the path, this function will not remove it.

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/user-guide/*.rst
  • Changes documented as a new file in changes/
  • GitHub Actions have all passed
  • Test coverage is 100% (Codecov passes)

@github-actions github-actions bot added the needs release notes Automatically applied to PRs which haven't added release notes label Jul 28, 2025
@codecov
Copy link

codecov bot commented Jul 28, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 60.69%. Comparing base (a0c56fb) to head (e9a9b12).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3302   +/-   ##
=======================================
  Coverage   60.68%   60.69%           
=======================================
  Files          78       78           
  Lines        9356     9355    -1     
=======================================
  Hits         5678     5678           
+ Misses       3678     3677    -1     
Files with missing lines Coverage Δ
src/zarr/storage/_fsspec.py 68.15% <100.00%> (+0.37%) ⬆️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

@dstansby dstansby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR - could you also add a test to make sure this is working as intended? Just constructing a FsspecStore with a valid swift URL should be good enough.

http(s) and swift filesystems need the scheme in the path.
This commit adds a special case for the swift filesystem,
when checking for the scheme presence.

Also remove one if for the http(s) schemes. Here all schemes can
call fs._strip_protocol(path). For the filesystems that
need the scheme in the path, this function will not remove it.
@fjansson fjansson force-pushed the fix-swift-path-check branch from 0378530 to e9a9b12 Compare July 29, 2025 12:39
@fjansson
Copy link
Contributor Author

fjansson commented Jul 29, 2025

I tried adding a test, but I don't really know what I'm doing here.

def test_swift_store() -> None:
    """
    Test creating an FsspecStore with a SWIFT URL.
    """
    store = FsspecStore.from_url("swift://swift.dkrz.de/dkrz_xxx/testcontainer/test.zarr")
  • The URL doesn't exist, I think this is fine for testing the logic of leaving swift:// in the path.
  • I think the swiftspec package is optional. Does the test need to be marked as requiring this package somehow? The test now gives "ValueError: Protocol not known: swift" which I think is what happens if swiftspec is not there
  • The pre-commit test complains about assigning to an unused variable, I guess I can assign to _ to show the result will not be used?

@dstansby
Copy link
Contributor

The pre-commit test complains about assigning to an unused variable, I guess I can assign to _ to show the result will not be used?

I think you could just not do any assignment at all, and it would fix this.

The URL doesn't exist, I think this is fine for testing the logic of leaving swift:// in the path.

Yes, that's fine.

I think the swiftspec package is optional. Does the test need to be marked as requiring this package somehow? The test now gives "ValueError: Protocol not known: swift" which I think is what happens if swiftspec is not there

This is a bit of a pain. In order to avoid us having to test different protocols, could we just get rid of the two blocks of code that read:

		# fsspec is not consistent about removing the scheme from the path, so check and strip it here
        # https://github.com/fsspec/filesystem_spec/issues/1722
        if "://" in path and not path.startswith("http"):
            # `not path.startswith("http")` is a special case for the http filesystem (¯\_(ツ)_/¯)
            path = fs._strip_protocol(path)

and instead rely on fsspec to handle raising errors on invalid paths?

I've tried removing them from my local copy of zarr and running the tests, and they all seem to pass. I don't know if @jhamman has anything to add as the author of these blocks of code - is there a reason we shouldn't remove them and depend on fsspec to error on invalid paths?

@fjansson
Copy link
Contributor Author

@dstansby sounds good to me. However, I think in the case of from_url(), the line path = fs._strip_protocol(path) should be kept, to give the file systems that need it a chance to get rid of the protocol in the path. That line can safely be run for http(s) and swift (which want the protocol preserved), those file systems will just return the path unchanged.

On the other hand the error checking, e.g. generate an exception when there is a protocol in the path and the file system doesn't want it, seems fine to leave for the individual file systems.

@dstansby dstansby added this to the 3.1.2 milestone Jul 30, 2025
@fjansson
Copy link
Contributor Author

fjansson commented Aug 5, 2025

@dstansby I found out that url_to_fs already calls _strip_protocol(), so there's no need to call it again.
E.g. here it's stated that url_to_fs splits an url into a filesystem and a path suitable for that filesystem. I think both those blocks can be deleted as you say. I'll make another PR with that to see what happens.

@fjansson fjansson mentioned this pull request Aug 5, 2025
6 tasks
@dstansby
Copy link
Contributor

dstansby commented Aug 5, 2025

Wonderful, thanks for following up on this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

needs release notes Automatically applied to PRs which haven't added release notes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants