Skip to content

Conversation

@siccous
Copy link

@siccous siccous commented Oct 19, 2025

User description

Hello,

few improvements for Gitea integration mostly to catch the features from other integrations.

  • gitea_app renamed to gitea in configuration (gitea_app not used in the codebase)
  • PR ignore logic
  • persistent comment support
  • honor handle_push_trigger and push_commands implementation

PR Type

Enhancement


Description

  • Rename gitea_app to gitea in configuration for consistency

  • Add PR ignore logic to filter PRs by repository, author, title, labels, and branches

  • Implement persistent comment support via publish_persistent_comment method

  • Honor handle_push_trigger and push_commands implementation for push events

  • Add repository cloning support with token authentication

  • Improve command handling with settings updates and auto-command execution


Diagram Walkthrough

flowchart LR
  A["Gitea Webhook Events"] --> B["PR/Push/Comment Handler"]
  B --> C["should_process_pr_logic"]
  C --> D["Filter by Repo/Author/Title/Labels/Branches"]
  D --> E["Execute Commands"]
  E --> F["Persistent Comments"]
  E --> G["Clone with Token"]
Loading

File Walkthrough

Relevant files
Enhancement
gitea_provider.py
Add persistent comments and clone support with token auth

pr_agent/git_providers/gitea_provider.py

  • Store gitea_access_token as instance variable for reuse in clone
    operations
  • Add get_latest_commit_url() and get_comment_url() methods for URL
    retrieval
  • Implement publish_persistent_comment() wrapper method for persistent
    comment support
  • Add get_pr_id() method to generate PR identifier
  • Add get_git_repo_url() method to construct repository clone URL
  • Implement _prepare_clone_url_with_token() method to embed
    authentication token in clone URL
  • Fix return type annotation in create_inline_comment() method
  • Remove unused hashlib import
  • Clean up whitespace and formatting
+58/-7   
gitea_app.py
Add PR filtering and push command handling logic                 

pr_agent/servers/gitea_app.py

  • Add should_process_pr_logic() function to filter PRs based on
    repository, author, title, labels, and branches
  • Implement _perform_commands_gitea() function to execute commands with
    settings updates and auto-command support
  • Add PR ignore logic checking before processing PR events
  • Implement handle_push_trigger and push_commands support for push
    events
  • Add repository settings application via apply_repo_settings()
  • Replace hardcoded /review --incremental with configurable
    push_commands
  • Import re module for regex pattern matching and
    update_settings_from_args utility
  • Remove unused asyncio import
  • Clean up whitespace
+99/-6   
Configuration changes
configuration.toml
Rename gitea_app to gitea and add push_commands                   

pr_agent/settings/configuration.toml

  • Rename configuration section from [gitea_app] to [gitea] for
    consistency
  • Add push_commands configuration array to [gitea] section
  • Maintain existing url, handle_push_trigger, and pr_commands settings
+5/-1     

@qodo-merge-for-open-source
Copy link
Contributor

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
Token in clone URL

Description: The method '_prepare_clone_url_with_token' embeds the personal access token directly into
the git clone URL, which risks token leakage via process lists, shell history, logs, or
remote URL exposure; consider using credential helpers or header-based auth instead.
gitea_provider.py [712-738]

Referred Code
#For example, to clone:
#https://github.com/Codium-ai/pr-agent-pro.git
#Need to embed inside the github token:
#https://<token>@github.com/Codium-ai/pr-agent-pro.git

gitea_token = self.gitea_access_token
gitea_base_url = self.base_url
scheme = gitea_base_url.split("://")[0]
scheme += "://"
if not all([gitea_token, gitea_base_url]):
    get_logger().error("Either missing auth token or missing base url")
    return None
base_url = gitea_base_url.split(scheme)[1]
if not base_url:
    get_logger().error(f"Base url: {gitea_base_url} has an empty base url")
    return None
if base_url not in repo_url_to_clone:
    get_logger().error(f"url to clone: {repo_url_to_clone} does not contain {base_url}")
    return None
repo_full_name = repo_url_to_clone.split(base_url)[-1]
if not repo_full_name:


 ... (clipped 6 lines)
ReDoS via regex

Description: The PR ignore logic uses unbounded regular expressions from configuration on multiple
request fields, which could enable ReDoS or overly broad matches if attackers control
titles or branches; validate or bound regex complexity and consider timeouts.
gitea_app.py [151-207]

Referred Code
def should_process_pr_logic(body) -> bool:
    try:
        pull_request = body.get("pull_request", {})
        title = pull_request.get("title", "")
        pr_labels = pull_request.get("labels", [])
        source_branch = pull_request.get("head", {}).get("ref", "")
        target_branch = pull_request.get("base", {}).get("ref", "")
        sender = body.get("sender", {}).get("login")
        repo_full_name = body.get("repository", {}).get("full_name", "")

        # logic to ignore PRs from specific repositories
        ignore_repos = get_settings().get("CONFIG.IGNORE_REPOSITORIES", [])
        if ignore_repos and repo_full_name:
            if any(re.search(regex, repo_full_name) for regex in ignore_repos):
                get_logger().info(f"Ignoring PR from repository '{repo_full_name}' due to 'config.ignore_repositories' setting")
                return False

        # logic to ignore PRs from specific users
        ignore_pr_users = get_settings().get("CONFIG.IGNORE_PR_AUTHORS", [])
        if ignore_pr_users and sender:
            if any(re.search(regex, sender) for regex in ignore_pr_users):


 ... (clipped 36 lines)
Ticket Compliance
🎫 No ticket provided
- [ ] Create ticket/issue <!-- /create_ticket --create_ticket=true -->

</details></td></tr>
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Consistent Naming Conventions

Objective: All new variables, functions, and classes must follow the project's established naming
standards

Status: Passed

Single Responsibility for Functions

Objective: Each function should have a single, well-defined responsibility

Status: Passed

When relevant, utilize early return

Objective: In a code snippet containing multiple logic conditions (such as 'if-else'), prefer an
early return on edge cases than deep nesting

Status: Passed

No Dead or Commented-Out Code

Objective: Keep the codebase clean by ensuring all submitted code is active and necessary

Status:
Commented code: New code includes multiple commented-out command handling lines that appear to be leftover
alternatives rather than documentation.

Referred Code
    # commands = get_settings().get("gitea.pr_commands", [])
    await _perform_commands_gitea("pr_commands", agent, body, api_url)
    # for command in commands:
    #     await agent.handle_request(api_url, command)
elif action == "synchronized":
    # Handle push to PR
    commands_on_push = get_settings().get(f"gitea.push_commands", {})
    handle_push_trigger = get_settings().get(f"gitea.handle_push_trigger", False)
    if not commands_on_push or not handle_push_trigger:
        get_logger().info("Push event, but no push commands found or push trigger is disabled")
        return
    get_logger().debug(f'A push event has been received: {api_url}')
    await _perform_commands_gitea("push_commands", agent, body, api_url)
    # for command in commands_on_push:
    #     await agent.handle_request(api_url, command)
Robust Error Handling

Objective: Ensure potential errors and edge cases are anticipated and handled gracefully throughout
the code

Status:
Bare except: Function get_pr_id() uses a bare except without logging or specifying exception types,
which can hide errors.

Referred Code
def get_pr_id(self):
    try:
        pr_id = f"{self.repo}/{self.pr_number}"
        return pr_id
    except:
        return ""
Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@qodo-merge-for-open-source
Copy link
Contributor

qodo-merge-for-open-source bot commented Oct 19, 2025

PR Code Suggestions ✨

Latest suggestions up to bf1cc50

CategorySuggestion                                                                                                                                    Impact
Possible issue
Add defensive null checks

Guard against cases where self.last_commit may be None or missing html_url to avoid
AttributeError. Return an empty string or log and fallback safely when data is
unavailable.

pr_agent/git_providers/gitea_provider.py [225-226]

 def get_latest_commit_url(self) -> str:
-    return self.last_commit.html_url
+    if not getattr(self, "last_commit", None):
+        self.logger.error("No last commit available")
+        return ""
+    return getattr(self.last_commit, "html_url", "") or ""
Suggestion importance[1-10]: 7

__

Why: Good defensive check to avoid AttributeError if self.last_commit or its html_url are missing; improves robustness with minimal change.

Medium
General
Replace bare except with checks

Avoid a bare except that hides real errors and provide explicit checks to prevent
NameError/AttributeError. Log when values are missing and return a consistent
fallback.

pr_agent/git_providers/gitea_provider.py [530-535]

 def get_pr_id(self):
-    try:
-        pr_id = f"{self.repo}/{self.pr_number}"
-        return pr_id
-    except:
+    repo = getattr(self, "repo", None)
+    pr_number = getattr(self, "pr_number", None)
+    if repo is None or pr_number is None:
+        self.logger.error("Missing repo or pr_number for PR id")
         return ""
+    return f"{repo}/{pr_number}"
Suggestion importance[1-10]: 6

__

Why: Replacing a bare except with explicit attribute checks and logging improves error transparency and maintainability; moderate impact.

Low
Remove unnecessary return value

_perform_commands_gitea is async and currently returns {} which is not awaited by
the caller; also returning a value is unnecessary for a handler. Simplify by
returning early without a value for consistent coroutine behavior.

pr_agent/servers/gitea_app.py [135-136]

-if not should_process_pr_logic(body): # Here we already updated the configuration with the repo settings
-    return {}
+if not should_process_pr_logic(body):  # Here we already updated the configuration with the repo settings
+    return
Suggestion importance[1-10]: 5

__

Why: Returning {} from an async helper is unused and inconsistent; changing to return is a small correctness/style improvement with low impact.

Low

Previous suggestions

Suggestions up to commit bf1cc50
CategorySuggestion                                                                                                                                    Impact
Possible issue
Fix config key paths

The config keys use upper-case sections that don't match the TOML ([gitea]/[config]
vs "CONFIG."). Normalize to the correct lowercase "config." paths to ensure
filters actually apply.

pr_agent/servers/gitea_app.py [162-205]

-ignore_repos = get_settings().get("CONFIG.IGNORE_REPOSITORIES", [])
+ignore_repos = get_settings().get("config.ignore_repositories", [])
 ...
-ignore_pr_users = get_settings().get("CONFIG.IGNORE_PR_AUTHORS", [])
+ignore_pr_users = get_settings().get("config.ignore_pr_authors", [])
 ...
-ignore_pr_title_re = get_settings().get("CONFIG.IGNORE_PR_TITLE", [])
+ignore_pr_title_re = get_settings().get("config.ignore_pr_title", [])
 ...
-ignore_pr_labels = get_settings().get("CONFIG.IGNORE_PR_LABELS", [])
+ignore_pr_labels = get_settings().get("config.ignore_pr_labels", [])
 ...
-ignore_pr_source_branches = get_settings().get("CONFIG.IGNORE_PR_SOURCE_BRANCHES", [])
-ignore_pr_target_branches = get_settings().get("CONFIG.IGNORE_PR_TARGET_BRANCHES", [])
+ignore_pr_source_branches = get_settings().get("config.ignore_pr_source_branches", [])
+ignore_pr_target_branches = get_settings().get("config.ignore_pr_target_branches", [])
Suggestion importance[1-10]: 8

__

Why: The code uses uppercase "CONFIG.*" keys while settings and similar usage elsewhere indicate lowercase "config.*"; correcting ensures filters actually work, impacting core behavior.

Medium
Guard missing commit data

Guard against a missing last_commit or absent html_url to avoid attribute errors.
Return an empty string and log a warning when the commit information isn't
available.

pr_agent/git_providers/gitea_provider.py [225-226]

 def get_latest_commit_url(self) -> str:
+    if not getattr(self, "last_commit", None) or not getattr(self.last_commit, "html_url", None):
+        self.logger.warning("Latest commit URL is unavailable")
+        return ""
     return self.last_commit.html_url
Suggestion importance[1-10]: 7

__

Why: Adds a simple guard to prevent attribute errors if last_commit or html_url is missing; correct and improves robustness with minimal risk.

Medium
General
Robust URL token injection

The URL parsing is brittle and may mis-handle subpaths or differing schemes. Use
urllib.parse.urlsplit to reliably inject the token and rebuild the URL, preserving
host and path safely.

pr_agent/git_providers/gitea_provider.py [711-738]

+from urllib.parse import urlsplit, urlunsplit
+
 def _prepare_clone_url_with_token(self, repo_url_to_clone: str) -> str | None:
-    ...
-    scheme = gitea_base_url.split("://")[0]
-    scheme += "://"
-    ...
-    base_url = gitea_base_url.split(scheme)[1]
-    if not base_url:
-        get_logger().error(f"Base url: {gitea_base_url} has an empty base url")
+    gitea_token = self.gitea_access_token
+    gitea_base_url = self.base_url
+    if not all([gitea_token, gitea_base_url, repo_url_to_clone]):
+        get_logger().error("Either missing auth token, base url, or repo url")
         return None
-    if base_url not in repo_url_to_clone:
-        get_logger().error(f"url to clone: {repo_url_to_clone} does not contain {base_url}")
+
+    try:
+        base_parts = urlsplit(gitea_base_url)
+        repo_parts = urlsplit(repo_url_to_clone)
+        if base_parts.netloc != repo_parts.netloc:
+            get_logger().error(f"url to clone: {repo_url_to_clone} does not match base host {base_parts.netloc}")
+            return None
+
+        netloc_with_token = f"{gitea_token}@{repo_parts.netloc}"
+        clone_url = urlunsplit((repo_parts.scheme, netloc_with_token, repo_parts.path, repo_parts.query, repo_parts.fragment))
+        return clone_url
+    except Exception as e:
+        get_logger().error(f"Failed to prepare clone url: {e}")
         return None
-    repo_full_name = repo_url_to_clone.split(base_url)[-1]
-    ...
-    clone_url = scheme
-    clone_url += f"{gitea_token}@{base_url}{repo_full_name}"
-    return clone_url
Suggestion importance[1-10]: 6

__

Why: Replacing brittle string splits with urlsplit/urlunsplit is a solid improvement to handle varied URLs safely; beneficial but not critical, and requires adding imports.

Low
Suggestions up to commit bf1cc50
CategorySuggestion                                                                                                                                    Impact
Possible issue
Handle missing latest commit

Guard against self.last_commit being None to avoid AttributeError. Return an empty
string or a fallback when commit data isn't available.

pr_agent/git_providers/gitea_provider.py [225-226]

 def get_latest_commit_url(self) -> str:
+    if not getattr(self, "last_commit", None) or not getattr(self.last_commit, "html_url", None):
+        self.logger.error("Latest commit URL is unavailable")
+        return ""
     return self.last_commit.html_url
Suggestion importance[1-10]: 7

__

Why: This guards against self.last_commit or its html_url being missing, preventing an AttributeError and returning a safe default. It's a reasonable robustness improvement aligned with nearby code that already checks self.last_commit elsewhere.

Medium
Safely access comment URL

Validate comment structure before accessing html_url to prevent runtime errors. Log
and return a safe default if the attribute is missing.

pr_agent/git_providers/gitea_provider.py [228-230]

 def get_comment_url(self, comment) -> str:
-    return comment.html_url
+    try:
+        url = getattr(comment, "html_url", None) or (comment.get("html_url") if isinstance(comment, dict) else None)
+        if not url:
+            self.logger.error("Comment URL is unavailable")
+            return ""
+        return url
+    except Exception as e:
+        self.logger.error(f"Failed to get comment URL: {e}")
+        return ""
Suggestion importance[1-10]: 6

__

Why: Validates the comment argument before accessing html_url, avoiding potential runtime errors and providing a fallback. It's a minor but sensible defensive change; impact is moderate.

Low
General
Unify return type in handler

Ensure consistent return types in async handlers; returning a dict from this branch
while other branches return None can confuse callers. Return None to match other
early exits.

pr_agent/servers/gitea_app.py [135-136]

-if not should_process_pr_logic(body): # Here we already updated the configuration with the repo settings
-    return {}
+if not should_process_pr_logic(body):  # Here we already updated the configuration with the repo settings
+    return
Suggestion importance[1-10]: 5

__

Why: Returning None instead of {} in _perform_commands_gitea keeps early exits consistent with other branches. The change is minor and stylistic but can reduce confusion for callers expecting None.

Low
Suggestions up to commit bf1cc50
CategorySuggestion                                                                                                                                    Impact
Possible issue
Defer filtering until configured

Move config-based filtering after applying repo-specific settings to ensure filters
use the correct per-repo configuration. Otherwise, PRs may be incorrectly ignored
when repo overrides exist.

pr_agent/servers/gitea_app.py [75-78]

 if event == "pull_request":
-    if not should_process_pr_logic(body):
-        get_logger().debug(f"Request ignored: PR logic filtering")
-        return {}
+    # Defer filtering until after repo settings are applied in the handler
+    if action in ["opened", "reopened", "synchronized"]:
+        await handle_pr_event(body, event, action, agent)
Suggestion importance[1-10]: 8

__

Why: The PR already introduces repo-setting application and filtering inside _perform_commands_gitea; removing early filtering here avoids using stale config and prevents incorrectly skipping PRs. This aligns with the PR's new flow and fixes a potential logic bug.

Medium
Guard against missing commit

Guard against self.last_commit being None to avoid AttributeError when a PR has no
commits yet or the value hasn't been set. Return an empty string or log
appropriately when the URL is unavailable.

pr_agent/git_providers/gitea_provider.py [225-226]

 def get_latest_commit_url(self) -> str:
+    if not getattr(self, "last_commit", None) or not getattr(self.last_commit, "html_url", None):
+        self.logger.debug("No latest commit URL available")
+        return ""
     return self.last_commit.html_url
Suggestion importance[1-10]: 7

__

Why: Adding a None-check for self.last_commit prevents a potential AttributeError and is consistent with nearby safety checks (e.g., in publish_inline_comments). It's a small but practical robustness improvement.

Medium
General
Robust tokenized clone URL building

Use url parsing to robustly inject the token and handle URLs with paths, ports, or
credentials. The current split-based approach can fail on edge cases and is
error-prone.

pr_agent/git_providers/gitea_provider.py [711-738]

+from urllib.parse import urlparse, urlunparse
+
 def _prepare_clone_url_with_token(self, repo_url_to_clone: str) -> str | None:
-    ...
-    scheme = gitea_base_url.split("://")[0]
-    scheme += "://"
-    ...
-    base_url = gitea_base_url.split(scheme)[1]
-    if not base_url:
-        get_logger().error(f"Base url: {gitea_base_url} has an empty base url")
+    gitea_token = self.gitea_access_token
+    gitea_base_url = self.base_url
+    if not gitea_token or not gitea_base_url:
+        get_logger().error("Either missing auth token or missing base url")
         return None
-    if base_url not in repo_url_to_clone:
-        get_logger().error(f"url to clone: {repo_url_to_clone} does not contain {base_url}")
+
+    try:
+        base_parts = urlparse(gitea_base_url)
+        repo_parts = urlparse(repo_url_to_clone)
+        # Ensure the repo host matches the configured base host
+        if base_parts.hostname != repo_parts.hostname or base_parts.scheme != repo_parts.scheme:
+            get_logger().error(f"url to clone: {repo_url_to_clone} does not match base url {gitea_base_url}")
+            return None
+        # Build netloc with token
+        netloc = repo_parts.hostname
+        if repo_parts.port:
+            netloc = f"{netloc}:{repo_parts.port}"
+        netloc = f"{gitea_token}@{netloc}"
+        injected = repo_parts._replace(netloc=netloc)
+        return urlunparse(injected)
+    except Exception as e:
+        get_logger().error(f"Failed to prepare clone url: {e}")
         return None
-    repo_full_name = repo_url_to_clone.split(base_url)[-1]
-    ...
-    clone_url = scheme
-    clone_url += f"{gitea_token}@{base_url}{repo_full_name}"
-    return clone_url
Suggestion importance[1-10]: 6

__

Why: Replacing string splits with urllib.parse improves correctness for ports and edge cases. It's a solid maintainability and reliability enhancement, though not critical to core functionality.

Low
Suggestions up to commit bf1cc50
CategorySuggestion                                                                                                                                    Impact
Possible issue
Load repository-specific settings before checks

Modify should_process_pr_logic to accept api_url and call apply_repo_settings at
the beginning. This ensures repository-specific ignore settings are loaded and
applied correctly.

pr_agent/servers/gitea_app.py [151-159]

-def should_process_pr_logic(body) -> bool:
+def should_process_pr_logic(body, api_url: str) -> bool:
     try:
+        apply_repo_settings(api_url)
         pull_request = body.get("pull_request", {})
         title = pull_request.get("title", "")
         pr_labels = pull_request.get("labels", [])
         source_branch = pull_request.get("head", {}).get("ref", "")
         target_branch = pull_request.get("base", {}).get("ref", "")
         sender = body.get("sender", {}).get("login")
         repo_full_name = body.get("repository", {}).get("full_name", "")
 ...

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 8

__

Why: The suggestion correctly identifies a critical bug where should_process_pr_logic fails to load repository-specific settings, leading to incorrect filtering logic. Applying this fix is essential for the feature to work as intended.

Medium
High-level
Abstract PR filtering logic to avoid duplication

To improve maintainability and avoid code duplication, centralize the PR
filtering logic. Instead of the Gitea-specific should_process_pr_logic function,
create a generic function that operates on a GitProvider object, making it
reusable across all git providers.

Examples:

pr_agent/servers/gitea_app.py [151-207]
def should_process_pr_logic(body) -> bool:
    try:
        pull_request = body.get("pull_request", {})
        title = pull_request.get("title", "")
        pr_labels = pull_request.get("labels", [])
        source_branch = pull_request.get("head", {}).get("ref", "")
        target_branch = pull_request.get("base", {}).get("ref", "")
        sender = body.get("sender", {}).get("login")
        repo_full_name = body.get("repository", {}).get("full_name", "")


 ... (clipped 47 lines)

Solution Walkthrough:

Before:

# in pr_agent/servers/gitea_app.py

async def handle_request(body: Dict[str, Any], event: str):
    # ...
    if event == "pull_request":
        if not should_process_pr_logic(body): # Gitea-specific logic
            return {}
        # ...
        await handle_pr_event(body, event, action, agent)
    # ...

def should_process_pr_logic(body) -> bool:
    # Extracts PR details directly from Gitea webhook `body`
    pull_request = body.get("pull_request", {})
    title = pull_request.get("title", "")
    # ... more direct extractions
    
    # ... filtering logic based on extracted details
    return True

After:

# in pr_agent/git_providers/utils.py (or similar shared location)

def should_process_pr(provider: GitProvider) -> bool:
    # Generic filtering logic using provider methods
    title = provider.get_pr_title()
    author = provider.get_pr_author()
    # ...
    if title in get_settings().config.ignore_pr_title:
        return False
    # ...
    return True

# in pr_agent/servers/gitea_app.py
async def handle_request(body: Dict[str, Any], event: str):
    # ...
    if event == "pull_request":
        pr_url = body.get("pull_request", {}).get("url")
        provider = GiteaProvider(pr_url) # Instantiate provider early
        if not should_process_pr(provider): # Generic logic
            return {}
        # ...
Suggestion importance[1-10]: 7

__

Why: The suggestion correctly identifies that the new PR filtering logic in should_process_pr_logic is tightly coupled to the Gitea webhook payload, which will lead to code duplication across providers, and proposes a valid architectural improvement to centralize this logic.

Medium
Learned
best practice
Use robust URL parsing and normalization

Use urlparse to parse/validate URLs and build clone URLs safely; enforce scheme
and strip trailing slashes when constructing repository URLs.

pr_agent/git_providers/gitea_provider.py [711-738]

+from urllib.parse import urlparse, urlunparse
+
 def get_git_repo_url(self, issues_or_pr_url: str) -> str:
-    return f"{self.base_url}/{self.owner}/{self.repo}.git" #base_url / <OWNER>/<REPO>.git
+    # Ensure base_url has scheme and no trailing slash; construct canonical repo URL
+    parsed = urlparse(self.base_url)
+    scheme = parsed.scheme or "https"
+    netloc = parsed.netloc or parsed.path  # handle cases where base_url lacked scheme
+    path = f"/{self.owner}/{self.repo}.git"
+    return urlunparse((scheme, netloc.rstrip("/"), path, "", "", ""))
 
 def _prepare_clone_url_with_token(self, repo_url_to_clone: str) -> str | None:
-    #For example, to clone:
-    #https://github.com/Codium-ai/pr_agent-pro.git
-    #Need to embed inside the github token:
-    #https://<token>@github.com/Codium-ai/pr_agent-pro.git
-
-    gitea_token = self.gitea_access_token
-    gitea_base_url = self.base_url
-    scheme = gitea_base_url.split("://")[0]
-    scheme += "://"
-    if not all([gitea_token, gitea_base_url]):
+    if not self.gitea_access_token or not self.base_url:
         get_logger().error("Either missing auth token or missing base url")
         return None
-    base_url = gitea_base_url.split(scheme)[1]
-    if not base_url:
-        get_logger().error(f"Base url: {gitea_base_url} has an empty base url")
-        return None
-    if base_url not in repo_url_to_clone:
-        get_logger().error(f"url to clone: {repo_url_to_clone} does not contain {base_url}")
-        return None
-    repo_full_name = repo_url_to_clone.split(base_url)[-1]
-    if not repo_full_name:
-        get_logger().error(f"url to clone: {repo_url_to_clone} is malformed")
+
+    repo_parsed = urlparse(repo_url_to_clone)
+    base_parsed = urlparse(self.base_url)
+
+    # Normalize in case base_url misses scheme
+    base_netloc = base_parsed.netloc or base_parsed.path
+    if not base_netloc:
+        get_logger().error(f"Base url: {self.base_url} is invalid")
         return None
 
-    clone_url = scheme
-    clone_url += f"{gitea_token}@{base_url}{repo_full_name}"
+    if base_netloc not in (repo_parsed.netloc or ""):
+        get_logger().error(f"url to clone: {repo_url_to_clone} does not match base host {base_netloc}")
+        return None
+
+    # Build clone URL with token in netloc userinfo
+    scheme = repo_parsed.scheme or (base_parsed.scheme or "https")
+    netloc = f"{self.gitea_access_token}@{repo_parsed.netloc}"
+    clone_url = urlunparse((scheme, netloc, repo_parsed.path, "", "", ""))
     return clone_url

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 6

__

Why:
Relevant best practice - Replace brittle string URL parsing with robust parsing and enforce/normalize formats (Patterns 1 and 4).

Low
Normalize and validate commands configuration

Ensure commands is a list of strings and normalize its type; validate boolean
flags and provide safe defaults before iteration.

pr_agent/servers/gitea_app.py [137-149]

-commands = get_settings().get(f"gitea.{commands_conf}")
+commands_raw = get_settings().get(f"gitea.{commands_conf}", [])
+# Normalize to list[str]
+if isinstance(commands_raw, str):
+    commands = [commands_raw]
+elif isinstance(commands_raw, list):
+    commands = [c for c in commands_raw if isinstance(c, str)]
+else:
+    commands = []
+
 if not commands:
-    get_logger().info(f"New PR, but no auto commands configured")
+    get_logger().info("New PR, but no auto commands configured")
     return
+
 get_settings().set("config.is_auto_command", True)
-for command in commands:
-    split_command = command.split(" ")
-    command = split_command[0]
-    args = split_command[1:]
+for cmd in commands:
+    parts = cmd.split()
+    if not parts:
+        continue
+    command, *args = parts
     other_args = update_settings_from_args(args)
-    new_command = ' '.join([command] + other_args)
+    new_command = " ".join([command] + other_args)
     get_logger().info(f"{commands_conf}. Performing auto command '{new_command}', for {api_url=}")
     await agent.handle_request(api_url, new_command)
  • Apply / Chat
Suggestion importance[1-10]: 5

__

Why:
Relevant best practice - Validate and normalize configuration values before use; guard nested settings access (Patterns 2 and 3).

Low
General
Avoid using a bare except

In get_pr_id, replace the broad try...except block with hasattr checks for
self.repo and self.pr_number to avoid catching unexpected exceptions and make
the code's intent clearer.

pr_agent/git_providers/gitea_provider.py [530-535]

 def get_pr_id(self):
-    try:
+    if hasattr(self, 'repo') and hasattr(self, 'pr_number'):
         pr_id = f"{self.repo}/{self.pr_number}"
         return pr_id
-    except:
-        return ""
+    return ""
  • Apply / Chat
Suggestion importance[1-10]: 5

__

Why: The suggestion correctly identifies a bare except clause, which is poor practice, and proposes a more explicit and safer check using hasattr, improving code robustness and readability.

Low
  • More

@LuChron
Copy link

LuChron commented Oct 22, 2025

@CodiumAI-Agent /improve

1 similar comment
@exa-mohamedi
Copy link

@CodiumAI-Agent /improve

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants