Given the glob pattern s3://root_bucket/**/target, I expect to obtain s3://root_bucket/something/target and s3://root_bucket/target. However, only s3://root_bucket/target is returned.
This happens in AWS S3, and it is easily reproducible with moto3:
from typing import Generator, NamedTuple

import pytest
import s3fs


class Fixture(NamedTuple):
    base_url: str
    mock_s3_client: s3fs.S3FileSystem


@pytest.fixture(scope="function")
def build(mock_s3_client: s3fs.S3FileSystem) -> Generator[Fixture, None, None]:
    # setup
    base_url = "s3://mock-bucket/fake-project"
    mock_s3_client.mkdir(base_url)

    # First subfolder, with no objects after target.
    mock_s3_client.mkdir(f"{base_url}/something")
    mock_s3_client.mkdir(f"{base_url}/something/target")
    mock_s3_client.mkdir(f"{base_url}/something/target/folder1")
    mock_s3_client.touch(f"{base_url}/something/target/folder1/file.xml")
    mock_s3_client.mkdir(f"{base_url}/something/target/folder2")
    mock_s3_client.touch(f"{base_url}/something/target/folder2/file.xml")

    # Second subfolder, with objects after target.
    mock_s3_client.mkdir(f"{base_url}/target")
    mock_s3_client.touch(f"{base_url}/target/example.txt")
    mock_s3_client.mkdir(f"{base_url}/target/folder3")
    mock_s3_client.touch(f"{base_url}/target/folder3/file.xml")
    mock_s3_client.mkdir(f"{base_url}/target/folder4")
    mock_s3_client.touch(f"{base_url}/target/folder4/file.xml")

    # verify folder structure
    assert set(mock_s3_client.ls(base_url)) == {
        "mock-bucket/fake-project/something",
        "mock-bucket/fake-project/target",
    }

    # run
    yield Fixture(
        base_url=base_url,
        mock_s3_client=mock_s3_client,
    )


def test_find_subfolders(build: Fixture) -> None:
    glob_pattern = f"{build.base_url}/**/target"
    result = build.mock_s3_client.glob(glob_pattern)
    assert len(result) == 2
    assert set(result) == {
        "s3://mock-bucket/fake-project/something/target",
        "s3://mock-bucket/fake-project/target",
    }
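For reference, the matching behavior expected above can be sketched without S3 at all. The snippet below is only an illustration of the "zero or more directories" reading of `**` (the same reading Python's `glob` uses with `recursive=True`); it is not how s3fs or fsspec implement `glob`. The helpers `glob_to_regex` and `expected_matches` are hypothetical names made up for this sketch, and the paths are written in the bucket/key form without the `s3://` prefix.

```python
import re
from typing import Iterable, List


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a glob in which a '**' segment spans zero or more directories.

    Hypothetical helper for illustration only; not part of s3fs or fsspec.
    """
    regex = ""
    segments = pattern.split("/")
    for i, segment in enumerate(segments):
        last = i == len(segments) - 1
        if segment == "**":
            # A '**' segment may consume zero or more whole path segments.
            regex += ".*" if last else "(?:[^/]+/)*"
        else:
            # '*' stays inside one segment; every other character is literal.
            regex += "[^/]*".join(re.escape(part) for part in segment.split("*"))
            if not last:
                regex += "/"
    return re.compile(regex)


def expected_matches(paths: Iterable[str], pattern: str) -> List[str]:
    """Return the paths that match the whole pattern, sorted."""
    regex = glob_to_regex(pattern)
    return sorted(p for p in paths if regex.fullmatch(p))


# Paths mirroring the fixture above, in bucket/key form.
paths = [
    "mock-bucket/fake-project/something",
    "mock-bucket/fake-project/something/target",
    "mock-bucket/fake-project/something/target/folder1",
    "mock-bucket/fake-project/target",
    "mock-bucket/fake-project/target/example.txt",
]
matches = expected_matches(paths, "mock-bucket/fake-project/**/target")
print(matches)
```

Under this reading both `something/target` and the top-level `target` are matched, which is the result the test above asserts.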
I should note that mkdir doesn't do anything, because S3 does not have folders. In some tools a folder is equivalent to a zero-length file with the same name plus a "/" suffix; however, the API creates implicit folders when they contain something, and this is the convention that s3fs uses.
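To make the "implicit folder" convention concrete, here is a minimal sketch in plain Python. It models a bucket as a flat list of object keys and derives the folders those keys imply. The function `implicit_folders` is a hypothetical name for illustration and is not part of s3fs; the keys are taken from the test fixture in the report, written in bucket/key form.

```python
from typing import Iterable, Set


def implicit_folders(keys: Iterable[str]) -> Set[str]:
    """Return every 'folder' implied by a flat collection of object keys.

    Hypothetical helper for illustration only; not part of s3fs.
    """
    folders: Set[str] = set()
    for key in keys:
        parts = key.strip("/").split("/")
        # Every proper prefix of a key behaves like a folder.
        for depth in range(1, len(parts)):
            folders.add("/".join(parts[:depth]))
    return folders


# Only objects exist in the store; no folder was ever created explicitly.
keys = [
    "mock-bucket/fake-project/something/target/folder1/file.xml",
    "mock-bucket/fake-project/target/example.txt",
]
folders = implicit_folders(keys)
print(sorted(folders))
```

In this model both `something/target` and `target` exist as folders purely because objects live underneath them, which is why the `mkdir` calls in the fixture are not what makes them visible.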
Thanks for the suggestion. I cannot promise a PR anytime soon due to time constraints on my current projects.
Yes, working with "folder-like" objects in S3 is always tricky. However, I see the same behavior when using a real AWS S3 bucket with the "folders" created manually from the UI.