Collaborator

@chetan-rns chetan-rns commented Nov 28, 2025

What does this PR do / why we need it:
Currently, AppProjects coming from autonomous agents are stored on the principal under a name prefixed with the agent name, to avoid duplicate resource names. When the agent or the principal restarts, the two exchange resync events to bring both components back in sync. While handling these events, the principal must look up local resources by the prefixed name rather than by the name used on the agent.

  1. Use prefixed names instead of the AppProject name from the autonomous agent
  2. Add additional e2e tests to verify different resync scenarios of an autonomous agent
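
The naming scheme described above can be sketched as follows. This is a minimal illustration, assuming the prefix is the agent name followed by a hyphen, as the review comments in this PR describe. `prefixedProjectName` is a hypothetical helper, not the repository's `agentPrefixedProjectName`, whose implementation is not shown in this conversation.

```go
package main

import (
	"errors"
	"fmt"
)

// prefixedProjectName is a hypothetical helper that builds the name under
// which the principal stores an autonomous agent's AppProject.
// Assumption: the format is "<agent-name>-<project-name>".
func prefixedProjectName(projectName, agentName string) (string, error) {
	if projectName == "" || agentName == "" {
		return "", errors.New("project name and agent name must be non-empty")
	}
	return agentName + "-" + projectName, nil
}

func main() {
	// The agent knows the project as "sample"; the principal has to look it
	// up under the prefixed name while handling a resync.
	name, err := prefixedProjectName("sample", "agent-autonomous")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(name)
}
```

If the principal looked up the unprefixed name instead, the lookup would find nothing, which is consistent with the deletion behavior reported in the linked issue.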

Which issue(s) this PR fixes:

Fixes #660

How to test changes / Special notes to the reviewer:

Checklist

  • Documentation update is required by this PR (and has been updated) OR no documentation update is required.

Summary by CodeRabbit

  • New Features

    • Use agent-identifying name instead of raw client ID for resync operations; better handling of autonomous AppProject naming.
    • Added trace logging for resource resync events.
  • Bug Fixes

    • Improved error handling for agent identity extraction.
    • Ensure AppProject keys are added/removed under agent name for autonomous flows.
  • Tests

    • Extensive end-to-end tests for autonomous AppProject resyncs covering restarts, updates, and deletions.

Assisted-by: Cursor
Signed-off-by: Chetan Banavikalmutt <[email protected]>
coderabbitai bot commented Nov 28, 2025

Walkthrough

Extract agent name from JSON-encoded remote ClientID and pass agentName into resync handlers; update principal callbacks and event processing to handle SourceNamespaces for autonomous agents; add E2E tests covering AppProject resync scenarios across principal/agent restarts.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Agent inbound handling | agent/inbound.go, agent/inbound_test.go | Parse JSON-encoded AuthSubject from remote.ClientID(), extract agentName, handle unmarshal errors, and use agentName instead of raw ClientID when invoking resync handler methods. Test updated to set JSON-encoded AuthSubject as ClientID. |
| Resync handler signatures & logging | internal/resync/resync.go | Handler method signatures updated to accept agentName (replacing clientID parameter) across ProcessSyncedResourceListRequest, ProcessIncomingSyncedResource, ProcessRequestUpdateEvent, and ProcessIncomingResourceResyncRequest. Adds trace-level log for processed resource resync updates. |
| Principal callbacks & event processing | principal/callbacks.go, principal/event.go | Creation/deletion callbacks for AppProject now also add/remove resource keys under SourceNamespaces[0] for autonomous agents; event processing prefixes AppProject names for autonomous mode before calling resync handler. |
| E2E tests (autonomous resync scenarios) | test/e2e/resync_test.go | Adds numerous end-to-end tests covering AppProject resync across principal/agent restart, update, and delete scenarios in autonomous mode; introduces createAutonomousAppProject() and createAutonomousApp() helpers. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

  • Pay attention to JSON unmarshalling and error paths in agent/inbound.go.
  • Verify all resync handler callsites and signatures in internal/resync are consistently updated.
  • Review principal/callbacks.go for correct handling of SourceNamespaces[0] (bounds and semantics).
  • Validate name-prefixing logic in principal/event.go integrates with existing cache/lookup behavior.
  • Inspect new E2E tests for flakiness and correct setup/teardown.

Suggested reviewers

  • jgwest
  • mikeshng
  • jannfis

Poem

🐰 In tunnels bright I hop and sing,
Agent names now find their ring,
Projects stitched from far-off lands,
Recreated by careful hands,
Hooray — no more surprise goodbyes! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title concisely describes the main change: using prefixed AppProject names during resync operations to fix the deletion issue. |
| Linked Issues check | ✅ Passed | The changes implement all key objectives: extracting agent names, using prefixed AppProject names in resync handlers, updating method signatures, and adding comprehensive e2e tests. |
| Out of Scope Changes check | ✅ Passed | All changes are focused on fixing the resync issue: agent name extraction, method signature updates, prefixed name handling, and relevant test coverage. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 72e539f and d558fea.

📒 Files selected for processing (1)
  • test/e2e/resync_test.go (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
test/e2e/resync_test.go (3)
test/e2e/fixture/goreman.go (3)
  • StopProcess (10-13)
  • IsProcessRunning (23-54)
  • StartProcess (16-19)
test/e2e/fixture/toxyproxy.go (1)
  • CheckReadiness (109-127)
test/e2e/fixture/fixture.go (1)
  • EnsureUpdate (174-198)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Run end-to-end tests
  • GitHub Check: Run unit tests
  • GitHub Check: Lint Go code
  • GitHub Check: Build & cache Go code
  • GitHub Check: Build and push image
  • GitHub Check: Analyze (go)
🔇 Additional comments (11)
test/e2e/resync_test.go (11)

20-20: LGTM!

The reflect import is correctly added to support DeepEqual comparisons in the new test assertions.


831-879: LGTM!

The test correctly fetches the agent's AppProject spec, derives the expected spec with autonomous transformations (SourceNamespaces and Destinations), and verifies the principal's spec matches after restart.


881-926: LGTM!

The test correctly verifies that AppProject resources remain synchronized on both clusters after an agent restart, using the agent's spec as the source of truth.


928-964: LGTM!

The test correctly verifies that updates made to the principal's copy of an autonomous AppProject are reverted when the principal restarts, since the agent is the source of truth in autonomous mode.


966-1002: LGTM!

The test correctly verifies that updates made on the agent are synchronized to the principal when the principal restarts, respecting the agent as source of truth in autonomous mode.


1004-1039: LGTM!

The test correctly verifies that updates made on the agent while it's down are synchronized to the principal when the agent reconnects.


1041-1078: LGTM!

The test correctly verifies that updates made to the principal's copy are reverted when the agent reconnects, maintaining the agent as the source of truth.


1127-1159: LGTM!

The test correctly verifies that when an AppProject is deleted from the agent, it is also deleted from the principal when the principal restarts.


1161-1194: LGTM!

The test correctly verifies that AppProject deletions on the agent are propagated to the principal when the agent reconnects.


1196-1245: LGTM!

The test correctly fetches the agent's AppProject spec and verifies that the principal recreates the AppProject with the proper transformations when the agent reconnects.


1384-1407: LGTM!

The helper correctly creates an autonomous AppProject on the agent cluster and verifies it's synchronized to the principal with the expected prefix.


@chetan-rns chetan-rns changed the title from "Use prefixed resync" to "fix: use prefixed AppProject name while handling resyncs" Nov 28, 2025
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
principal/event.go (1)

586-607: Critical: Prefixing logic is unreachable due to conflicting mode check.

The condition at line 587 returns an error if agentMode != types.AgentModeManaged, which means only managed mode requests proceed past that point. However, line 599 checks for agentMode == types.AgentModeAutonomous, which will never be true since autonomous requests are rejected at line 588.

This defeats the purpose of the fix for issue #660 - the prefixing logic for autonomous agents will never execute.

The mode check at line 587 should likely allow autonomous mode for EventRequestUpdate events, or the prefixing block should be relocated before the mode check:

 	case event.EventRequestUpdate:
-		if agentMode != types.AgentModeManaged {
-			return fmt.Errorf("principal can only handle request update in the managed mode")
-		}
-
 		incoming := &event.RequestUpdate{}
 		if err := ev.DataAs(incoming); err != nil {
 			return err
 		}

 		// For autonomous agents, the agent sends RequestUpdate with the local AppProject name (e.g., "sample"),
 		// but the principal stores it with a prefixed name (e.g., "agent-autonomous-sample").
 		// We need to add the prefix before looking it up locally.
 		if agentMode == types.AgentModeAutonomous && incoming.Kind == "AppProject" {
 			prefixedName, err := agentPrefixedProjectName(incoming.Name, agentName)
 			if err != nil {
 				return fmt.Errorf("could not prefix project name: %w", err)
 			}
 			incoming.Name = prefixedName
 		}

 		return resyncHandler.ProcessRequestUpdateEvent(ctx, agentName, incoming)
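
The control-flow problem described in this comment can be reproduced in isolation. The sketch below is illustrative only: `mode`, `resolveNameGuarded`, and `resolveName` are hypothetical stand-ins for the principal's event handling, and the prefix format is assumed to be `<agent>-<project>`.

```go
package main

import (
	"errors"
	"fmt"
)

type mode string

const (
	modeManaged    mode = "managed"
	modeAutonomous mode = "autonomous"
)

var errManagedOnly = errors.New("principal can only handle request update in the managed mode")

// resolveNameGuarded mirrors the ordering flagged in the review: the
// managed-only guard returns early, so the autonomous branch never runs.
func resolveNameGuarded(m mode, kind, name, agent string) (string, error) {
	if m != modeManaged {
		return "", errManagedOnly
	}
	if m == modeAutonomous && kind == "AppProject" { // never true after the guard
		name = agent + "-" + name
	}
	return name, nil
}

// resolveName mirrors the suggested ordering: with the guard removed,
// autonomous AppProject names are prefixed before the local lookup.
func resolveName(m mode, kind, name, agent string) string {
	if m == modeAutonomous && kind == "AppProject" {
		return agent + "-" + name
	}
	return name
}

func main() {
	_, err := resolveNameGuarded(modeAutonomous, "AppProject", "sample", "agent-autonomous")
	fmt.Println("guarded:", err)
	fmt.Println("unguarded:", resolveName(modeAutonomous, "AppProject", "sample", "agent-autonomous"))
}
```

Moving the prefixing block above the guard would not help on its own, because the guard would still reject the autonomous request afterwards; the guard itself has to admit autonomous mode.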
🧹 Nitpick comments (2)
agent/inbound_test.go (1)

1473-1482: Remove debug print statement.

Line 1481 contains a debug print statement that should be removed before merging.

 	subjectJSON, err := json.Marshal(subject)
 	if err != nil {
 		t.Fatalf("Failed to marshal subject: %v", err)
 	}
 	a.remote.SetClientID(string(subjectJSON))
-	fmt.Println("a.remote.ClientID()", a.remote.ClientID())
 	err = a.queues.Create(a.remote.ClientID())
agent/inbound.go (1)

432-440: Consider using strings.TrimPrefix for cleaner prefix stripping.

The manual prefix stripping logic is correct but could be simplified using strings.TrimPrefix.

+	"strings"
 ...
 		// For autonomous agents, the principal stores AppProjects with a prefixed name (agent-name + "-" + project-name).
 		// When the principal sends a RequestUpdate, it uses the prefixed name. We need to strip the prefix
 		// before looking up the resource locally.
 		if incoming.Kind == "AppProject" {
 			prefix := agentName + "-"
-			if len(incoming.Name) > len(prefix) && incoming.Name[:len(prefix)] == prefix {
-				incoming.Name = incoming.Name[len(prefix):]
-			}
+			incoming.Name = strings.TrimPrefix(incoming.Name, prefix)
 		}

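Note: strings.TrimPrefix returns the original string unchanged if the prefix is not present, so the behavior is equivalent for every name that is longer than the prefix, and the code is cleaner. The one difference is a name exactly equal to the prefix: the manual check leaves it unchanged, while strings.TrimPrefix reduces it to an empty string.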

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 101d4c8 and 72e539f.

📒 Files selected for processing (6)
  • agent/inbound.go (5 hunks)
  • agent/inbound_test.go (3 hunks)
  • internal/resync/resync.go (1 hunks)
  • principal/callbacks.go (2 hunks)
  • principal/event.go (1 hunks)
  • test/e2e/resync_test.go (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (5)
internal/resync/resync.go (2)
internal/logging/logfields/logfields.go (2)
  • Kind (58-58)
  • Name (59-59)
internal/logging/logging.go (1)
  • Trace (285-287)
agent/inbound_test.go (1)
internal/auth/interface.go (1)
  • AuthSubject (19-22)
principal/event.go (2)
pkg/types/types.go (1)
  • AgentModeAutonomous (31-31)
internal/logging/logfields/logfields.go (2)
  • Kind (58-58)
  • Name (59-59)
principal/callbacks.go (2)
internal/resources/resources.go (1)
  • NewResourceKeyFromAppProject (62-71)
internal/backend/interface.go (1)
  • Namespace (124-127)
test/e2e/resync_test.go (3)
internal/backend/interface.go (2)
  • Namespace (124-127)
  • AppProject (98-108)
test/e2e/fixture/goreman.go (3)
  • StopProcess (10-13)
  • IsProcessRunning (23-54)
  • StartProcess (16-19)
test/e2e/fixture/toxyproxy.go (1)
  • CheckReadiness (109-127)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Run unit tests
  • GitHub Check: Run end-to-end tests
  • GitHub Check: Lint Go code
  • GitHub Check: Build & cache Go code
  • GitHub Check: Build and push image
  • GitHub Check: Analyze (go)
🔇 Additional comments (6)
internal/resync/resync.go (1)

159-159: LGTM!

The trace log addition is consistent with the existing logging pattern at line 198 and properly uses the logfields constants.

principal/callbacks.go (2)

183-193: LGTM!

The resource key handling correctly differentiates between autonomous and non-autonomous agents. For autonomous agents, using SourceNamespaces[0] as the agent name aligns with how the principal sets this field in processAppProjectEvent, ensuring proper resource tracking for resync operations.


302-312: LGTM!

The deletion callback mirrors the creation callback logic, properly removing resource keys under the correct agent name for autonomous agents via SourceNamespaces[0].
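
A bounds-checked version of the SourceNamespaces[0] lookup might look like the sketch below. Everything here is hypothetical: `projectSpec` is a minimal stand-in for an AppProject spec, and `trackingAgentName` with its `fallback` parameter is an illustrative helper, not the code in principal/callbacks.go.

```go
package main

import "fmt"

// projectSpec is a hypothetical, minimal stand-in for an AppProject spec.
type projectSpec struct {
	SourceNamespaces []string
}

// trackingAgentName returns the agent name under which a resource key would
// be tracked. Assumption: for autonomous agents the principal has written the
// agent name into SourceNamespaces[0]. The length check avoids an
// index-out-of-range panic when the slice is empty.
func trackingAgentName(spec projectSpec, autonomous bool, fallback string) string {
	if autonomous && len(spec.SourceNamespaces) > 0 {
		return spec.SourceNamespaces[0]
	}
	return fallback
}

func main() {
	spec := projectSpec{SourceNamespaces: []string{"agent-autonomous"}}
	fmt.Println(trackingAgentName(spec, true, "unknown"))
	fmt.Println(trackingAgentName(projectSpec{}, true, "unknown"))
}
```

Using the same helper in both the creation and deletion callbacks would keep the add and remove paths keyed on the same agent name.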

agent/inbound.go (1)

391-397: LGTM on agent name extraction.

The JSON unmarshalling of AuthSubject correctly extracts the agent name from the client ID. Error handling properly propagates failures.
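
The extraction step can be sketched as follows. The struct is a local stand-in for the repository's AuthSubject: its field names and JSON tags are assumptions made for illustration, and `agentNameFromClientID` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// authSubject is a local stand-in for the repository's AuthSubject type.
// Assumption: the field names and JSON tags here are illustrative only.
type authSubject struct {
	ClientID string `json:"clientID"`
	Mode     string `json:"mode"`
}

// agentNameFromClientID decodes a JSON-encoded subject and returns the agent
// name it carries, propagating any unmarshal error to the caller.
func agentNameFromClientID(raw string) (string, error) {
	var subject authSubject
	if err := json.Unmarshal([]byte(raw), &subject); err != nil {
		return "", fmt.Errorf("could not unmarshal client ID: %w", err)
	}
	return subject.ClientID, nil
}

func main() {
	raw := `{"clientID":"agent-autonomous","mode":"autonomous"}`
	name, err := agentNameFromClientID(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(name)
}
```

A caller that previously passed the raw client ID string to the resync handler would pass the decoded name instead.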

test/e2e/resync_test.go (2)

831-878: LGTM on autonomous AppProject resync tests.

The test correctly validates that AppProjects are retained on both clusters after principal restart. The expected spec transformation logic properly accounts for the autonomous-specific modifications (SourceNamespaces set to agent name, destinations pointing to agent cluster).


1380-1403: LGTM on createAutonomousAppProject helper.

The helper correctly creates an AppProject on the autonomous agent cluster and waits for it to propagate to the principal with the expected prefixed name.

codecov-commenter commented Nov 28, 2025

Codecov Report

❌ Patch coverage is 29.62963% with 19 lines in your changes missing coverage. Please review.
✅ Project coverage is 45.56%. Comparing base (101d4c8) to head (d558fea).
⚠️ Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| principal/callbacks.go | 0.00% | 8 Missing ⚠️ |
| agent/inbound.go | 53.84% | 4 Missing and 2 partials ⚠️ |
| principal/event.go | 0.00% | 4 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #666      +/-   ##
==========================================
- Coverage   45.62%   45.56%   -0.06%     
==========================================
  Files          90       90              
  Lines        9991    10011      +20     
==========================================
+ Hits         4558     4562       +4     
- Misses       4965     4978      +13     
- Partials      468      471       +3     

Signed-off-by: Chetan Banavikalmutt <[email protected]>
Member

@jgwest jgwest left a comment

LGTM, thanks @chetan-rns!

@jgwest jgwest merged commit c8217aa into argoproj-labs:main Nov 30, 2025
18 checks passed
chetan-rns added a commit to chetan-rns/argocd-agent that referenced this pull request Dec 2, 2025
chetan-rns added a commit to chetan-rns/argocd-agent that referenced this pull request Dec 2, 2025
jannfis pushed a commit that referenced this pull request Dec 2, 2025
Development

Successfully merging this pull request may close these issues.

AppProjects are deleted from principal after principal restart (autonomous mode)
