Skip to content

OCPBUGS-83514: fix(aws): use stable error message for endpoint service adoption failure#8925

Open
sdminonne wants to merge 1 commit into
openshift:mainfrom
sdminonne:OCPBUGS-83514
Open

OCPBUGS-83514: fix(aws): use stable error message for endpoint service adoption failure#8925
sdminonne wants to merge 1 commit into
openshift:mainfrom
sdminonne:OCPBUGS-83514

Conversation

@sdminonne

@sdminonne sdminonne commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Adds unit tests for endpoint service creation error paths in reconcileAWSEndpointServiceStatus, covering adoption success/failure and non-InvalidParameter API errors
  • Adds AGENTS.md documenting the condition message stability rules for the AWS endpoint service controller

The controller.go fix (using errors.New(apiErr.ErrorCode()) instead of fmt.Errorf with variable error content) was merged in a prior PR. This PR adds the missing test coverage and documentation.

Test plan

  • Verify new tests pass: go test ./hypershift-operator/controllers/platform/aws/...
  • Verify existing tests are not broken

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Documentation

    • Added guidance on keeping AWS endpoint service condition messages stable and deterministic to prevent status churn.
  • Bug Fixes

    • Improved how AWS endpoint service status updates report errors, avoiding variable AWS SDK details that could cause repeated/unstable condition changes.
    • Ensured adopted endpoint service names are preserved on successful adoption, with clearer failures when adoption does not apply.
  • Tests

    • Added table-driven test coverage for AWS endpoint service creation failures, including adoption success, adoption describe/adoption-miss scenarios, and non-adoption errors.

@openshift-merge-bot

Copy link
Copy Markdown
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci-robot openshift-ci-robot added jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. labels Jul 3, 2026
@openshift-ci-robot

Copy link
Copy Markdown

@sdminonne: This pull request references Jira Issue OCPBUGS-83514, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

Summary

  • Adds unit tests for endpoint service creation error paths in reconcileAWSEndpointServiceStatus, covering adoption success/failure and non-InvalidParameter API errors
  • Adds AGENTS.md documenting the condition message stability rules for the AWS endpoint service controller

The controller.go fix (using errors.New(apiErr.ErrorCode()) instead of fmt.Errorf with variable error content) was merged in a prior PR. This PR adds the missing test coverage and documentation.

Test plan

  • Verify new tests pass: go test ./hypershift-operator/controllers/platform/aws/...
  • Verify existing tests are not broken

🤖 Generated with Claude Code

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci Bot added do-not-merge/needs-area area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release area/platform/aws PR/issue for AWS (AWSPlatform) platform and removed do-not-merge/needs-area labels Jul 3, 2026
@openshift-ci openshift-ci Bot requested review from bryan-cox and devguyio July 3, 2026 17:24
@coderabbitai

coderabbitai Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 0007cb6d-8be6-4af9-9885-7a9a20281ad5

📥 Commits

Reviewing files that changed from the base of the PR and between b418677 and 7ab4f80.

📒 Files selected for processing (2)
  • hypershift-operator/controllers/platform/aws/AGENTS.md
  • hypershift-operator/controllers/platform/aws/controller_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • hypershift-operator/controllers/platform/aws/AGENTS.md
  • hypershift-operator/controllers/platform/aws/controller_test.go

📝 Walkthrough

Walkthrough

This PR adds AWS Endpoint Service controller guidance for stable, deterministic condition messages instead of returning raw or wrapped AWS SDK errors. It also adds a smithy-go-based unit test covering reconcileAWSEndpointServiceStatus creation failures, including InvalidParameter adoption success and failure paths plus non-InvalidParameter errors, and verifies the returned error codes and EndpointServiceName status updates.

🚥 Pre-merge checks | ✅ 11
✅ Passed checks (11 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title matches the PR's core AWS endpoint service adoption/error-stability theme, even though this patch mainly adds tests and documentation.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed New subtest titles are static, descriptive strings; no pod names, timestamps, UUIDs, namespaces, or other generated data appear in titles.
Test Structure And Quality ✅ Passed The new tests are isolated table-driven unit tests, use gomock/fake clients consistently, and add no cluster resources, waits, or cleanup concerns.
Topology-Aware Scheduling Compatibility ✅ Passed Only AWS controller tests and AGENTS.md were added; no manifests, replicas, affinity, nodeSelector, tolerations, or topology-dependent scheduling logic were introduced.
Ipv6 And Disconnected Network Test Compatibility ✅ Passed PASS: The PR adds a Go unit test and AGENTS.md only; no new Ginkgo e2e tests, hardcoded IPv4, or external connectivity were introduced.
No-Weak-Crypto ✅ Passed The PR only adds AWS controller docs and tests; no MD5/SHA1/DES/RC4/3DES/Blowfish/ECB, custom crypto, or secret comparisons appear in the diff.
Container-Privileges ✅ Passed PASS: The PR only changes a markdown doc and Go unit tests; no container/K8s manifests were added or modified, and no privileged/root/allowPrivilegeEscalation settings appear in the diff.
No-Sensitive-Data-In-Logs ✅ Passed Patch only adds tests and an AGENTS doc; no runtime logging of secrets/PII was added, and diff search found no sensitive-data log statements.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands.

@openshift-ci

openshift-ci Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: sdminonne
Once this PR has been reviewed and has the lgtm label, please assign sjenning for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot

Copy link
Copy Markdown

@sdminonne: This pull request references Jira Issue OCPBUGS-83514, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)
Details

In response to this:

Summary

  • Adds unit tests for endpoint service creation error paths in reconcileAWSEndpointServiceStatus, covering adoption success/failure and non-InvalidParameter API errors
  • Adds AGENTS.md documenting the condition message stability rules for the AWS endpoint service controller

The controller.go fix (using errors.New(apiErr.ErrorCode()) instead of fmt.Errorf with variable error content) was merged in a prior PR. This PR adds the missing test coverage and documentation.

Test plan

  • Verify new tests pass: go test ./hypershift-operator/controllers/platform/aws/...
  • Verify existing tests are not broken

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Documentation

  • Added guidance for consistent error messages in AWS endpoint service status updates, helping avoid unnecessary status churn and unstable conditions.

  • Bug Fixes

  • Improved handling of AWS creation errors so end-user status updates now use stable, predictable messages.

  • Preserved adopted endpoint service names when adoption succeeds, and surfaced clearer failures when adoption checks do not complete.

  • Tests

  • Added coverage for AWS endpoint service creation error scenarios, including adoption success, adoption failure, and non-adoption errors.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@hypershift-operator/controllers/platform/aws/controller_test.go`:
- Around line 234-247: The adoption-failure test cases in
reconcileAWSEndpointServiceStatus are expecting the wrong error text: the
current flow still returns the original InvalidParameter from
CreateVpcEndpointServiceConfiguration unless the implementation wraps it. Update
these controller_test.go cases to assert the actual returned code/message from
the create failure, or change the reconcileAWSEndpointServiceStatus
adoption-failure path to wrap the error with the adoption context before
returning it.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 301c558d-954e-46d4-b3a8-cfd7c5ecc3f8

📥 Commits

Reviewing files that changed from the base of the PR and between 8e1aa48 and b418677.

📒 Files selected for processing (2)
  • hypershift-operator/controllers/platform/aws/AGENTS.md
  • hypershift-operator/controllers/platform/aws/controller_test.go

Comment thread hypershift-operator/controllers/platform/aws/controller_test.go
@codecov

codecov Bot commented Jul 3, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 43.37%. Comparing base (9aeb1f3) to head (7ab4f80).
⚠️ Report is 23 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8925      +/-   ##
==========================================
+ Coverage   43.28%   43.37%   +0.09%     
==========================================
  Files         771      771              
  Lines       95503    95534      +31     
==========================================
+ Hits        41335    41441     +106     
+ Misses      51284    51209      -75     
  Partials     2884     2884              

see 9 files with indirect coverage changes

Flag Coverage Δ
cmd-support 36.87% <ø> (+0.20%) ⬆️
cpo-hostedcontrolplane 45.31% <ø> (-0.01%) ⬇️
cpo-other 45.10% <ø> (ø)
hypershift-operator 53.71% <ø> (+0.12%) ⬆️
other 31.68% <ø> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

The error returned from reconcileAWSEndpointServiceStatus when endpoint
service adoption fails now includes the underlying cause from
findExistingVpcEndpointService. The error is stable across reconcile
loops because the adoption lookup only fails with deterministic messages
(API error codes, static strings, or fixed LB ARNs) and never includes
variable content like AWS request IDs.

Add tests covering the endpoint service creation error paths:
InvalidParameter with successful adoption, failed adoption, describe
failure, and non-InvalidParameter API errors.

OCPBUGS-83514

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@openshift-ci

openshift-ci Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

@sdminonne: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release area/platform/aws PR/issue for AWS (AWSPlatform) platform jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants