🧹 refactor: reduce complexity in download_worker of git-fetch.py#236
Conversation
Addresses a code health issue where `download_worker` had a nesting depth of 9. Refactors the inner download block (connection retries, content writing, request error handling) into a single-purpose helper function `process_single_download`. Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
**Summary of Changes**

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request refactors the …
❌ Lint/Format Check Failed. Please run …
**Code Review Summary**

Status: No Issues Found · Recommendation: Merge

Files Reviewed (1 file)

This PR is a clean refactoring that extracts the download logic from `download_worker` into the new `process_single_download` helper.

The refactoring is well-executed and maintains identical behavior.

Reviewed by minimax-m2.5-20260211 · 297,043 tokens
**Code Review**

This pull request effectively refactors the `download_worker` function by extracting the complex download logic into a new `process_single_download` helper. This significantly improves readability and maintainability in the worker. I've added a comment on the new function with suggestions to further enhance its maintainability by addressing a magic number and duplicated code blocks.
```python
def process_single_download(
    conn: http.client.HTTPSConnection,
    host: str,
    url_path: str,
    local_path: Path,
    display_path: str,
    headers: dict[str, str],
) -> http.client.HTTPSConnection:
    """Process a single file download with retries."""
    retries = 3
    while retries > 0:
        try:
            conn.request("GET", url_path, headers=headers)
            resp = conn.getresponse()

            # Check if the server wants to close the connection
            connection_header = resp.getheader("Connection", "").lower()
            should_close = connection_header == "close"

            if resp.status == 200:
                # Create parent directory to avoid FileNotFoundError
                local_path.parent.mkdir(parents=True, exist_ok=True)
                with open(local_path, "wb") as f:
                    while True:
                        chunk = resp.read(65536)
                        if not chunk:
                            break
                        f.write(chunk)
                print(f"✓ {display_path}")

                if should_close:
                    conn.close()
                    conn = http.client.HTTPSConnection(host, timeout=30)

                break
            elif resp.status in (301, 302, 307, 308):
                loc = resp.getheader("Location")
                resp.read()  # Consume body
                if loc:
                    print(f"✗ {display_path}: Redirect to {loc} not handled in persistent mode")
                else:
                    print(f"✗ {display_path}: HTTP {resp.status}")

                if should_close:
                    conn.close()
                    conn = http.client.HTTPSConnection(host, timeout=30)
                break
            else:
                print(f"✗ {display_path}: HTTP {resp.status}")
                resp.read()  # Consume body
                if should_close:
                    conn.close()
                    conn = http.client.HTTPSConnection(host, timeout=30)

                # Non-retriable client errors: fail fast
                if resp.status in (401, 403, 404):
                    break

                # Retry on transient server errors and rate limiting
                if 500 <= resp.status < 600 or resp.status == 429:
                    retries -= 1
                    if retries > 0:
                        continue
                    # Out of retries, give up
                    break

                # Default: treat other statuses as non-retriable
                break
        except (http.client.HTTPException, OSError) as e:
            # Connection might have been closed by server unexpectedly
            conn.close()
            retries -= 1
            if retries > 0:
                conn = http.client.HTTPSConnection(host, timeout=30)
            else:
                print(f"✗ {display_path}: {e}")

    return conn
```
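The worker side of this extraction is not shown in the quoted diff. A minimal sketch of how a simplified `download_worker` could consume a queue and thread the returned connection through the helper follows; the queue-item shape, the `None` sentinel, and the injected `download` callable are assumptions made so the sketch runs standalone rather than the script's actual signature:

```python
import queue
from pathlib import Path
from typing import Callable, Optional

def download_worker(
    q: queue.Queue,
    host: str,
    headers: dict,
    download: Callable,  # stands in for process_single_download
    conn: Optional[object] = None,
) -> None:
    """Drain the queue, delegating each item to the download helper.

    The helper returns the (possibly re-created) connection, which is
    carried into the next iteration -- the pattern the diff above uses.
    """
    while True:
        item = q.get()
        if item is None:  # sentinel: no more work for this worker
            q.task_done()
            break
        url_path, local_path, display_path = item
        conn = download(conn, host, url_path, local_path, display_path, headers)
        q.task_done()
```

Keeping the queue mechanics here and the HTTP mechanics in the helper is what brings the nesting depth down: the worker no longer contains the retry loop or the status branching.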
While this refactoring is a good step in reducing complexity in `download_worker`, the extracted `process_single_download` function itself has some maintainability issues that could be improved:

- **Magic number:** The timeout value `30` is hardcoded in multiple places (here and elsewhere in the file). This should be defined as a module-level constant, e.g. `HTTP_TIMEOUT = 30`, for better maintainability.
- **Duplicated logic (DRY violation):** The logic for closing and reopening a connection when `should_close` is true is repeated three times within this function:

  ```python
  if should_close:
      conn.close()
      conn = http.client.HTTPSConnection(host, timeout=30)
  ```

  This could be extracted into a small, private helper function (e.g. `_reconnect(conn, host)`) to reduce code duplication and make the intent clearer.
Pull request overview
Refactors `git-fetch.py`'s threaded download path by extracting the per-file HTTP download/retry logic out of `download_worker()` into a dedicated helper, reducing nesting and separating connection logic from queue consumption.
Changes:

- Added `process_single_download()` to encapsulate per-file GET/retry/status handling and connection re-creation.
- Simplified `download_worker()` to focus on queue management and delegate download behavior to the helper.
```python
def process_single_download(
    conn: http.client.HTTPSConnection,
    host: str,
    url_path: str,
    local_path: Path,
    display_path: str,
    headers: dict[str, str],
) -> http.client.HTTPSConnection:
    """Process a single file download with retries."""
```
🎯 **What:** Extracted the deeply nested download logic in `Cachyos/Scripts/WIP/gh/git-fetch.py`'s `download_worker` function into a new helper function `process_single_download`.

💡 **Why:** `download_worker` previously had a nesting depth of 9, heavily populated with nested `try`-`except`, `while` loops, and `if`-`elif`-`else` conditional branches. Extracting the core download algorithm (including retries, header management, and file writing) into a standalone function reduces the depth significantly. This isolates the connection-loop logic from the queue-management logic, greatly improving readability and maintainability.

✅ **Verification:** `python3 -m py_compile Cachyos/Scripts/WIP/gh/git-fetch.py`, `ruff check Cachyos/Scripts/WIP/gh/git-fetch.py`, and `python3 Cachyos/Scripts/WIP/gh/test_git_fetch_mock.py` (which passed 12/12 tests in ~0.010s).

✨ **Result:** Improved maintainability, cleaner control flow in `download_worker`, and a reduction of maximum structural nesting from 9 to 4. Behavior is preserved, enabling easier modifications to the download-management logic down the line.

PR created automatically by Jules for task 12605209367485809336, started by @Ven0m0.