Convert SCP-style URLs (no explicit scheme) into proper SSH URLs #1061

Open
wants to merge 20 commits into
base: main
from

Conversation

Listener430
Collaborator

@Listener430 Listener430 commented Feb 13, 2025

what

Done:

  1. Vendoring URLs are sometimes provided in a non-standard, SCP-style Git URL format, which omits the scheme and uses a colon as the separator. So that Go's URL parser can process them, they are now converted into fully qualified URLs (using SSH or HTTPS).
  2. Vendoring now honors tokens for GitLab and Bitbucket for HTTPS vendoring.
  3. Sensitive data is now masked in debug statements in the custom detector.
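
To illustrate item 1, here is a minimal sketch of the SCP-to-SSH conversion. The regular expression is the one quoted in the review of this PR; the helper name `scpToSSH` and the default user `git` are assumptions made for the example and are not necessarily what `CustomGitDetector.Detect` does.

```go
package main

import (
	"fmt"
	"regexp"
)

// scpPattern is the SCP-style pattern quoted in the review comments:
// optional user, host containing a dot, a colon, then the repository path.
var scpPattern = regexp.MustCompile(`^(([\w.-]+)@)?([\w.-]+\.[\w.-]+):([\w./-]+)(\.git)?(.*)$`)

// scpToSSH is a hypothetical helper. It rewrites "user@host:path" as
// "ssh://user@host/path" and reports whether the input matched.
func scpToSSH(src string) (string, bool) {
	m := scpPattern.FindStringSubmatch(src)
	if m == nil {
		return src, false // not SCP-style, leave untouched
	}
	user, host, path, suffix, rest := m[2], m[3], m[4], m[5], m[6]
	if user == "" {
		user = "git" // assumption: default to the conventional "git" user
	}
	return fmt.Sprintf("ssh://%s@%s/%s%s%s", user, host, path, suffix, rest), true
}

func main() {
	out, ok := scpToSSH("git@github.com:cloudposse/terraform-null-label.git?ref=0.25.0")
	fmt.Println(out, ok)
}
```

URLs that already carry a scheme (`https://`, `ssh://`) do not match the pattern, because the text before the first colon must contain a dot, so they pass through unchanged.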

This is a spin-off of #984 that further extends the custom detector logic.

Testing

non-standard SCP-style links handling
github ssh vendor pull

Token injection was tested with Bitbucket and GitLab (HTTPS) for private and public repos, plus SSH vendoring for both.
The cases are listed here because there are no dedicated tests or repos available for testing on Bitbucket/GitLab.

gitlab over ssh private repo
gitlab over https private repo with a token
bitbucket public repo over ssh
bitbucket private repo over ssh
bitbucket https public repo with token set and no token set works
bitbucket https private repo
gitlab over https public repo no auth

why

  1. Links without an explicit scheme were not handled correctly, e.g. this one failed:
    git::git@github.com:cloudposse/terraform-null-label.git?ref={{.Version}}
  2. Credentials for HTTPS vendoring were read from the token only for GitHub, but not for Bitbucket and GitLab.
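
As a rough illustration of item 2, the sketch below injects a token into an HTTPS URL for a known host. The helper name `injectToken` is hypothetical, and the per-host usernames (`x-access-token`, `oauth2`, `x-token-auth`) are assumptions based on common provider conventions, not values confirmed by this PR. The token is passed in as a parameter because the exact environment variable names are not shown in this conversation.

```go
package main

import (
	"fmt"
	"net/url"
)

// tokenUsers maps a host to the username sent alongside the token.
// Assumption: these usernames follow each provider's usual HTTPS convention.
var tokenUsers = map[string]string{
	"github.com":    "x-access-token",
	"gitlab.com":    "oauth2",
	"bitbucket.org": "x-token-auth",
}

// injectToken is a hypothetical helper that adds basic-auth credentials
// to an HTTPS URL for a recognized host.
func injectToken(rawURL, token string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	user, known := tokenUsers[u.Hostname()]
	// Skip non-HTTPS schemes (tokens do not apply to SSH), unknown hosts,
	// empty tokens, and URLs that already carry credentials.
	if u.Scheme != "https" || !known || token == "" || u.User != nil {
		return rawURL, nil
	}
	u.User = url.UserPassword(user, token)
	return u.String(), nil
}

func main() {
	out, err := injectToken("https://gitlab.com/group/project.git", "tok123")
	fmt.Println(out, err)
}
```

The skip conditions mirror points raised in the review: unrecognized hosts are left alone, and a token is only injected when no credentials are already present.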

references

Summary by CodeRabbit

  • New Features

    • Improved Git URL processing with support for converting SCP-style addresses into valid SSH or HTTPS links.
    • Expanded token authentication, now accommodating multiple Git-hosting services for seamless credential management.
    • Enhanced logging provides clearer visibility and more informative error messaging during Git operations.
    • Added new test cases for simulating dry-run vendoring operations and ensuring credential security in logs.
    • New YAML configuration files enhance management of vendor dependencies and CLI configurations.
    • New environment variables introduced for Bitbucket and GitLab authentication, enhancing CLI configuration options.
    • Introduction of a function to mask sensitive information in URLs.
  • Refactor

    • Unified Git detection to offer broader support across various Git providers.
    • Transitioned to structured logging for improved clarity and usability of log messages.

@Listener430 Listener430 added the enhancement New feature or request label Feb 13, 2025
@Listener430 Listener430 requested a review from osterman February 13, 2025 07:23
@Listener430 Listener430 self-assigned this Feb 13, 2025
@Listener430 Listener430 requested a review from a team as a code owner February 13, 2025 07:23
Contributor

coderabbitai bot commented Feb 13, 2025

📝 Walkthrough

The changes involve a significant refactoring of the CustomGitHubDetector, now renamed to CustomGitDetector, to handle various Git URL formats, including those from GitHub, Bitbucket, and GitLab. The Detect method has been enhanced with detailed logging and improved error handling. New logic accommodates SCP-style URLs and expands token injection for multiple platforms. Additionally, new test cases and configuration files have been introduced to support these changes, enhancing the overall functionality and logging capabilities of the system.

Changes

| File | Change Summary |
| --- | --- |
| internal/exec/go_getter_utils.go | Renamed CustomGitHubDetector to CustomGitDetector; enhanced the Detect method with detailed logging, introduced regex detection and conversion for SCP-style URLs, expanded token injection logic for GitHub, Bitbucket, and GitLab; updated error handling for SSH scheme checks. |
| tests/snapshots/TestCLICommands_atmos_vendor_pull_ssh.stderr.golden | Added logging statements for command execution visibility; logs include processing details and warnings for TTY detection. |
| tests/test-cases/demo-vendoring.yaml | Added two new test cases: "atmos vendor pull ssh" (disabled by default) simulating a dry-run vendoring operation, and "atmos vendor pull custom detector credentials leakage" (enabled) ensuring injected credentials are not leaked in logs. |
| tests/fixtures/scenarios/vendor-pulls-ssh/atmos.yaml | Introduced structured configuration for CLI settings, including base_path, vendor, components, stacks, workflows, logs, validate, commands, integrations, schemas, templates, and settings. |
| tests/fixtures/scenarios/vendor-pulls-ssh/vendor.yaml | Added a new YAML configuration file for managing vendor dependencies, defining AtmosVendorConfig with a component sourced from a Git repository. |
| website/docs/cli/configuration/configuration.mdx | Added new environment variables for Bitbucket and GitLab authentication, enhancing CLI configuration options. |
| pkg/utils/url_utils.go | Introduced a new function MaskBasicAuth for masking basic authentication credentials in URLs. |
| tests/snapshots/TestCLICommands_atmos_vendor_pull_custom_detector_credentials_leakage.stderr.golden | Enhanced logging for vendoring process, detailing command execution and confirming successful component vendoring. |
| internal/exec/vendor_utils.go | Added import for github.com/charmbracelet/log and updated logging in generateSkipFunction to use structured logging. |
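
For readers unfamiliar with the masking step, the following is a minimal sketch of what a function like `MaskBasicAuth` might do. It is not the code in `pkg/utils/url_utils.go`; the lowercase name `maskBasicAuth`, the `REDACTED` placeholder, and the generic error are assumptions chosen for the example.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// maskBasicAuth is a hypothetical stand-in for a credential-masking helper.
// It replaces any userinfo in the URL with a fixed placeholder so the URL
// can be logged safely.
func maskBasicAuth(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		// url.Parse errors embed the raw input, which may contain the
		// credentials, so a generic error is returned instead.
		return "", errors.New("invalid URL")
	}
	if u.User != nil {
		u.User = url.UserPassword("REDACTED", "REDACTED")
	}
	return u.String(), nil
}

func main() {
	masked, err := maskBasicAuth("https://x-access-token:secret@github.com/org/repo.git?ref=v1")
	fmt.Println(masked, err)
}
```

A plain alphanumeric placeholder is used because characters such as `*` are percent-encoded by `net/url` when they appear in the userinfo part.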

Suggested reviewers

  • aknysh
  • osterman
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (5)
internal/exec/go_getter_utils.go (5)

15-15: Consider using a more descriptive alias
Short aliases like l might hamper clarity. Renaming to log or logger could improve readability.


70-70: Suggest clarifying the field name
source could be named more explicitly, such as sourceURI or originalSource, for improved code clarity.


90-112: Add optional support for custom ports
SCP-style URLs sometimes specify a custom port (e.g., git@host:port/repo). The current regex won’t match those.

- scpPattern := regexp.MustCompile(`^(([\w.-]+)@)?([\w.-]+\.[\w.-]+):([\w./-]+)(\.git)?(.*)$`)
+ scpPattern := regexp.MustCompile(`^(([\w.-]+)@)?([\w.-]+\.[\w.-]+)(:[0-9]+)?:([\w./-]+)(\.git)?(.*)$`)

136-138: Convert TBC comment into a TODO
Documenting pending enhancements is good. Consider adding a // TODO: or opening an issue.

Would you like me to open an issue for broadening token injection support?


139-164: Ensure consistent config toggles
Only GitHub token injection is governed by InjectGithubToken. Consider adding similar toggles for Bitbucket and GitLab for uniformity.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 161c074 and a43fd72.

📒 Files selected for processing (1)
  • internal/exec/go_getter_utils.go (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (17)
  • GitHub Check: [mock-macos] tests/fixtures/scenarios/complete
  • GitHub Check: [mock-macos] examples/demo-vendoring
  • GitHub Check: [mock-macos] examples/demo-context
  • GitHub Check: [mock-macos] examples/demo-component-versions
  • GitHub Check: [mock-macos] examples/demo-atlantis
  • GitHub Check: [mock-windows] examples/demo-context
  • GitHub Check: [mock-windows] examples/demo-component-versions
  • GitHub Check: [mock-windows] examples/demo-atlantis
  • GitHub Check: [mock-linux] tests/fixtures/scenarios/complete
  • GitHub Check: [mock-linux] examples/demo-vendoring
  • GitHub Check: [mock-linux] examples/demo-context
  • GitHub Check: Acceptance Tests (windows-latest, windows)
  • GitHub Check: Acceptance Tests (ubuntu-latest, linux)
  • GitHub Check: Docker Lint
  • GitHub Check: [k3s] demo-helmfile
  • GitHub Check: [localstack] demo-localstack
  • GitHub Check: Summary
🔇 Additional comments (11)
internal/exec/go_getter_utils.go (11)

11-11: Looks good
No immediate concerns. The regexp import is necessary for the new SCP-style detection.


61-61: Confirmed
Adding git::ssh to the list of valid schemes aligns with go-getter usage.


66-68: Renaming is consistent
This rename clarifies support for multiple Git hosting platforms.


81-88: Excellent documentation
The inline comments clearly explain how SCP-style URLs are handled.


115-115: Duplicate concern: potential credential exposure
Similar to the previous logging of src, sensitive info may be leaked.


123-126: Clear error handling
Returning an error when no SSH agent is available is straightforward.


128-133: Non-standard host scenario well-handled
Skipping token injection for unrecognized hosts is logical.


166-175: Token injection logic is robust
Injecting the token only if credentials aren’t already present is correct.


177-185: Subdir detection
Appending //. for top-level repos is a known go-getter approach. Looks good.


187-197: Default shallow clone
Specifying depth=1 improves performance but may break use cases needing full history. Confirm this suits your workflows.


208-208: Registering the new CustomGitDetector
Ensuring the custom detector runs first is correct.
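
The `//.` subdirectory and `depth=1` defaults mentioned in the comments above can be sketched roughly as follows. The helper name `withGetterDefaults` is hypothetical, and the logic is a simplified reading of the review notes rather than the code under review.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// withGetterDefaults is a hypothetical helper that applies two defaults to a
// go-getter style Git URL: a "//." subdirectory when none is given, and a
// shallow clone depth of 1 when the caller did not set one.
func withGetterDefaults(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	// go-getter separates the repository from the subdirectory with "//".
	if !strings.Contains(u.Path, "//") {
		u.Path += "//."
	}
	q := u.Query()
	if q.Get("depth") == "" {
		q.Set("depth", "1") // shallow clone unless the user asked otherwise
	}
	u.RawQuery = q.Encode() // Encode sorts the query keys alphabetically
	return u.String(), nil
}

func main() {
	out, err := withGetterDefaults("https://github.com/org/repo.git?ref=v1.0.0")
	fmt.Println(out, err)
}
```

Because an explicit `depth` is preserved, workflows that need full history can still opt out of the shallow clone, which addresses the caveat raised in the 187-197 comment.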

coderabbitai[bot]
coderabbitai bot previously approved these changes Feb 13, 2025
@osterman
Member

To test vendoring of SSH-style URLs without an SSH key, test with --dry-run, log level debug, and a snapshot instead. Note that tokens are not valid for SSH authentication.

@osterman
Member

Windows tests are failing:

=== RUN   TestCLICommands/atmos_vendor_pull_(no_tty)
    cli_test.go:901: Stderr diff mismatch for "D:\\a\\atmos\\atmos\\tests\\snapshots\\TestCLICommands_atmos_vendor_pull_(no_tty).stdout.golden":
--- expected
+++ actual
@@ -1,7 +1,12 @@
 INFO Vendoring from 'vendor.yaml'
 WARN No TTY detected. Falling back to basic output. This can happen when no terminal is attached or when commands are pipelined.
-INFO ✓ github/stargazers (main)
-INFO ✓ weather (main)
-INFO ✓ ipinfo (main)
+ERRO Failed to vendor github/stargazers: error : failed to download package: subdir "examples%5Cdemo-library%5Cgithub%5Cstargazers" not found
+INFO x github/stargazers (main)
+ERRO Failed to vendor weather: error : failed to download package: subdir "examples%5Cdemo-library%5Cweather" not found
+INFO x weather (main)
+ERRO Failed to vendor ipinfo: error : failed to download package: subdir "examples%5Cdemo-library%5Cipinfo" not found
+INFO x ipinfo (main)
+INFO Vendored 0 components. Failed to vendor 3 components.
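
One plausible reading of the log above is that the subdirectory was built with Windows path separators: `%5C` is the percent-encoded form of a backslash. The sketch below only demonstrates that encoding effect and one way to normalize it; the helper names are hypothetical, and whether this is the actual cause of the failure is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// normalizeSubdir is a hypothetical helper that converts backslashes to
// forward slashes so the value is safe to use as a URL subdirectory.
func normalizeSubdir(p string) string {
	return strings.ReplaceAll(p, `\`, "/")
}

// escapedSubdir shows how net/url encodes a path when it is rendered.
func escapedSubdir(p string) string {
	return (&url.URL{Path: p}).EscapedPath()
}

func main() {
	winPath := `examples\demo-library\weather`
	fmt.Println(escapedSubdir(winPath))                  // backslashes become %5C
	fmt.Println(escapedSubdir(normalizeSubdir(winPath))) // forward slashes survive
}
```

`strings.ReplaceAll` is used here instead of `filepath.ToSlash` so that the example behaves the same on every operating system.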
+

@Listener430
Collaborator Author

To test vendoring of SSH-style URLs without an SSH key, test with --dry-run, log level debug, and a snapshot instead. Note that tokens are not valid for SSH authentication.

I have added the test, but it is disabled for now because the current dry-run implementation in internal\exec\vendor_component_utils.go exits before reaching the custom detectors and go-getter. Let me know if there are any objections to adjusting the vendor_component_utils.go code so that the dry run operates differently.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/test-cases/demo-vendoring.yaml (1)

58-59: Clean up YAML formatting.

Fix the following formatting issues:

  • Line 58: Remove trailing spaces
  • Line 59: Add newline at end of file
-        - "No SSH authentication method found"  
-      exit_code: 0
\ No newline at end of file
+        - "No SSH authentication method found"
+      exit_code: 0
+
🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 58-58: trailing spaces

(trailing-spaces)


[error] 59-59: no new line character at the end of file

(new-line-at-end-of-file)

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a43fd72 and ef4a300.

📒 Files selected for processing (3)
  • internal/exec/go_getter_utils.go (3 hunks)
  • tests/snapshots/TestCLICommands_atmos_vendor_pull_ssh.stderr.golden (1 hunks)
  • tests/test-cases/demo-vendoring.yaml (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal/exec/go_getter_utils.go
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: Build (windows-latest, windows)
  • GitHub Check: Summary
🔇 Additional comments (1)
tests/test-cases/demo-vendoring.yaml (1)

43-59: LGTM! Well-structured test case for SSH URL handling.

The test case effectively validates:

  • SSH-style URL handling
  • Dry-run behavior
  • Error messaging for missing SSH authentication
  • Debug logging output

This aligns perfectly with the PR objectives for SCP-style URL support.

coderabbitai[bot]
coderabbitai bot previously approved these changes Feb 14, 2025
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
internal/exec/go_getter_utils.go (2)

81-118: Strong implementation of SCP-style URL handling.

The regex pattern and transformation logic effectively handle SCP-style Git URLs while maintaining security through URL masking.

However, consider adding support for IPv6 hosts in the SCP pattern:

-scpPattern := regexp.MustCompile(`^(([\w.-]+)@)?([\w.-]+\.[\w.-]+):([\w./-]+)(\.git)?(.*)$`)
+scpPattern := regexp.MustCompile(`^(([\w.-]+)@)?(\[[:0-9a-fA-F]+\]|[\w.-]+\.[\w.-]+):([\w./-]+)(\.git)?(.*)$`)

152-209: Comprehensive token injection implementation.

The token injection logic effectively handles multiple Git hosting services with proper fallback mechanisms.

Consider adding Azure DevOps support:

 switch host {
+case "dev.azure.com":
+    tokenSource = "AZURE_DEVOPS_TOKEN"
+    token = os.Getenv(tokenSource)
+    if token == "" {
+        tokenSource = "ATMOS_AZURE_DEVOPS_TOKEN"
+        token = os.Getenv(tokenSource)
+    }
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a038294 and 75cca41.

📒 Files selected for processing (1)
  • internal/exec/go_getter_utils.go (3 hunks)
🧰 Additional context used
🧠 Learnings (1)
internal/exec/go_getter_utils.go (1)
Learnt from: Listener430
PR: cloudposse/atmos#1061
File: internal/exec/go_getter_utils.go:74-75
Timestamp: 2025-02-13T07:30:28.946Z
Learning: In the `CustomGitDetector.Detect` method of `internal/exec/go_getter_utils.go`, verbose debug logging of raw URLs is intentionally kept for debugging purposes, despite potential credential exposure risks.
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: Build (windows-latest, windows)
  • GitHub Check: Summary
🔇 Additional comments (5)
internal/exec/go_getter_utils.go (5)

11-15: LGTM! Good choice of logging library.

The addition of the regexp package and charmbracelet/log library enhances the code's capabilities for URL parsing and structured logging.


66-71: LGTM! Clear and descriptive type definition.

The renaming from CustomGitHubDetector to CustomGitDetector better reflects its broader scope of handling multiple Git hosting services.


136-142: Good defensive programming with SSH agent check.

The SSH agent check with debug logging helps users identify potential authentication issues early.


224-234: LGTM! Efficient shallow clone configuration.

Setting depth=1 by default is a good optimization for reducing bandwidth and improving clone speed.


235-242:

❓ Verification inconclusive

Verify URL masking error handling.

While the URL masking logic is good, the error case only logs but doesn't propagate the error.


🏁 Script executed:

#!/bin/bash
# Check if there are any other instances where URL masking errors are handled similarly
rg -A 2 "MaskBasicAuth.*err.*:=" .

Length of output: 36


URL Masking Error Handling Verification

In internal/exec/go_getter_utils.go (lines 235–242), the error returned from u.MaskBasicAuth is logged without being propagated. Our grep search didn't reveal additional instances where this pattern appears, so it's unclear whether this behavior is intentional or if error propagation is expected for consistency. Please manually verify if propagating the error here (or elsewhere) would better serve the error-handling strategy across the codebase.

  • Verify if propagating the error instead of only logging it is consistent with the rest of the project’s error-handling approach.
  • Confirm that this pattern isn't isolated to this function and that similar operations handle errors appropriately.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
tests/fixtures/scenarios/vendor-pulls-ssh/atmos.yaml (2)

6-13: Terraform Components Configuration Added.
The components section clearly configures Terraform settings such as the base path, auto-approval, and initialization behavior, which enhances clarity in component management. Note: There are trailing spaces detected on line 13; please remove them for cleaner formatting.

🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 13-13: trailing spaces

(trailing-spaces)


26-30: Remove Excessive Blank Lines.
There are several consecutive blank lines at the end of the file (notably around line 29), which might affect readability. Please remove the extra blank lines.

🧰 Tools
🪛 YAMLlint (1.35.1)

[warning] 29-29: too many blank lines

(5 > 0) (empty-lines)

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 75cca41 and a9f7366.

📒 Files selected for processing (1)
  • tests/fixtures/scenarios/vendor-pulls-ssh/atmos.yaml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (4)
  • GitHub Check: Build (windows-latest, windows)
  • GitHub Check: website-deploy-preview
  • GitHub Check: Analyze (go)
  • GitHub Check: Summary
🔇 Additional comments (3)
tests/fixtures/scenarios/vendor-pulls-ssh/atmos.yaml (3)

1-3: New Base Path and Settings Section.
The configuration now defines a base path and enables GitHub token injection. This aligns well with the PR’s objectives for handling token-driven authentication.


14-21: Stacks Configuration Section.
The stacks section is neatly defined with a base path, inclusion/exclusion patterns, and a naming convention. This modular approach will help maintain consistency across environments.


22-25: Log Configuration Added.
The logs section sets the output file and log level explicitly, which should improve troubleshooting and log management.

coderabbitai[bot]
coderabbitai bot previously approved these changes Feb 19, 2025
@aknysh aknysh added minor New features that do not break anything and removed enhancement New feature or request labels Feb 19, 2025
@aknysh
Member

aknysh commented Feb 19, 2025

@Listener430 please review the failing tests

coderabbitai[bot]

This comment was marked as outdated.

coderabbitai[bot]

This comment was marked as outdated.

@Listener430
Collaborator Author

@Listener430 please review the failing tests

@aknysh done, please kindly take a look.

@Listener430 Listener430 requested a review from osterman February 21, 2025 16:51
Labels
minor New features that do not break anything needs-cloudposse Needs Cloud Posse assistance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Vendor issue when using SSH-formatted Git URLs
3 participants