fix for empty s3 object return #22
Pull Request Overview
This PR fixes an issue with empty S3 object return by updating the regex pattern used to filter directory listings in both default and AWS S3 file listing functions.
- Updated regex pattern to filter out empty S3 objects (lines ending with "0 ")
- Applied the fix consistently to both `listFiles.default` and `listFiles.awss3` functions
- Incremented package version to 25.10.1
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| R/listFiles.R | Updated regex patterns in both listing functions to exclude empty objects |
| DESCRIPTION | Bumped version number to reflect the bug fix |
    to <- getDest(db)
    args <- getArgs(db)

    dir <- rsynccli(NULL, to, args = args, pre = pre, intern = TRUE)
Copilot AI, Oct 20, 2025
The regex pattern `(0\\s$)|(\\.$)` lacks explanation. Consider adding a comment to document what patterns are being filtered (e.g., empty S3 objects ending with '0 ' and directory entries ending with '.').
Suggested change:

    dir <- rsynccli(NULL, to, args = args, pre = pre, intern = TRUE)
    # Filter out empty S3 objects (lines ending with '0 ') and directory entries (lines ending with '.')
      args <- if (recursive) "ls --recursive" else "ls"
      dir <- awscli(NULL, dest, args = args, profile = profile, endpoint_url = endpoint_url, intern = TRUE)
    - dir <- dat::extract(dir, ~ !grepl("\\.$", .))
    + dir <- dat::extract(dir, ~!grepl("(0\\s$)|(\\.$)", .))
Copilot AI, Oct 20, 2025
The regex pattern `(0\\s$)|(\\.$)` lacks explanation. Consider adding a comment to document what patterns are being filtered (e.g., empty S3 objects ending with '0 ' and directory entries ending with '.').
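One way to address this review comment, offered as a sketch and not as the PR's actual code, is to name the two alternatives before combining them. The helper `dropListingNoise`, the variable names, and the sample lines are hypothetical; base `grepl` stands in for `dat::extract` so the example is self-contained.

```r
# Hypothetical names for the two alternatives of the pattern in the diff.
endsWithZeroAndSpace <- "0\\s$"  # line ends in "0" plus one whitespace character (no object name)
endsWithDot <- "\\.$"            # line ends in a literal dot
noisePattern <- paste0("(", endsWithZeroAndSpace, ")|(", endsWithDot, ")")

# Hypothetical helper: keep only the lines that match neither alternative.
dropListingNoise <- function(lines) {
  lines[!grepl(noisePattern, lines)]
}

# Made-up listing lines for illustration, not real CLI output.
example <- c(
  "2025-01-01 00:00:00 0 ",                 # nameless zero-size entry: dropped
  "2025-01-01 00:00:00 0 empty.txt",        # named zero-size object: kept
  "2025-01-01 00:00:00 12 data.csv",        # regular object: kept
  "drwxr-xr-x 4096 2025/01/01 00:00:00 ."   # entry ending in ".": dropped
)
result <- dropListingNoise(example)
print(result)
```

Note that under this pattern a named zero-size object is kept, because its line does not end in "0 "; only entries without a name after the size are dropped.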
jan-abel-inwt left a comment
Can you add the comment? That would make it more understandable.
@jan-abel-inwt and I discussed the underlying use case yesterday. When retrieving the list of files from an AWS S3 bucket, the AWS CLI returned the object plus a non-existent empty file without a name:

    2025-06-04 13:40:43 0
    2025-10-17 14:03:11 229408483 Deutschland_Bundesliga.Rdata

The idea of the PR is to remove the empty file. However, since this has not happened before, seems to happen only for this folder, and the migration to Hetzner is pending, we decided to keep it as it is for the moment. If this happens again, we can reopen the PR.
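As a rough check of the use case described in this comment, the proposed pattern can be applied to the two listing lines quoted there. This is a sketch under assumptions: the trailing space after the size `0` is assumed from the PR overview ("lines ending with '0 '"), and base `grepl` is used in place of `dat::extract`.

```r
# Listing lines modeled on the comment above; the trailing space on the
# first line is an assumption, not something visible in the quoted output.
listing <- c(
  "2025-06-04 13:40:43 0 ",
  "2025-10-17 14:03:11 229408483 Deutschland_Bundesliga.Rdata"
)

# Pattern proposed in the PR.
pattern <- "(0\\s$)|(\\.$)"

# Keep only the lines that do not match the pattern.
kept <- listing[!grepl(pattern, listing)]
print(kept)
```

With that assumption, only the `Deutschland_Bundesliga.Rdata` line remains after filtering.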
No description provided.