Skip to content

Conversation

@pappz
Copy link
Contributor

@pappz pappz commented Oct 31, 2025

Describe your changes

  • Only execute the update when there is human interaction with the "Connect" button in the UI.
  • Run the installation process from a detached binary

Issue ticket number and link

Stack

Checklist

  • Is it a bug fix
  • Is a typo/documentation fix
  • Is a feature enhancement
  • It is a refactor
  • Created tests that fail without the change (if possible)

By submitting this pull request, you confirm that you have read and agree to the terms of the Contributor License Agreement.

Documentation

Select exactly one:

  • I added/updated documentation for this change
  • Documentation is not needed for this change (explain why)

Docs PR URL (required if "docs added" is checked)

Paste the PR link from https://github.com/netbirdio/docs here:

https://github.com/netbirdio/docs/pull/__

@pappz pappz changed the title Feat/auto upgrade fork [client] auto-update Oct 31, 2025
@pappz pappz force-pushed the feat/auto-upgrade-fork branch from a93f3b7 to d3fd3f0 Compare November 7, 2025 21:54
@pappz pappz force-pushed the feat/auto-upgrade-fork branch from 60598f8 to 6a52694 Compare November 10, 2025 15:03
@pappz pappz force-pushed the feat/auto-upgrade-fork branch from e332550 to f15c80c Compare November 11, 2025 13:50
@pappz pappz marked this pull request as ready for review November 12, 2025 11:33
@pappz
Copy link
Contributor Author

pappz commented Nov 12, 2025

@coderabbitai review this PR

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 12, 2025

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash

# Get the list of files changed in the PR
echo "=== Files changed in PR #4732 ==="
gh pr view 4732 --repo netbirdio/netbird --json files --jq '.files[] | .path'

echo -e "\n=== PR diff ==="
# Get the full diff of the PR
gh pr diff 4732 --repo netbirdio/netbird

Length of output: 264151


@pappz I'll review this PR for you. This is a substantial feature addition implementing client auto-update functionality. Let me analyze the key aspects:

Architecture Overview

The PR implements a two-process architecture for self-updates:

  1. Service Process: Initiates updates via RunInstallation()
  2. Updater Process: Detached child that performs actual installation and restarts services

This separation is well-designed to handle the limitation that a service cannot update itself while running.

Key Strengths

Strong Cryptographic Foundation: The reposign package implements a robust two-tier key hierarchy (root keys → artifact keys) with Ed25519 signatures and BLAKE2s hashing

Security Features:

  • Timestamp binding prevents replay attacks
  • Key expiration and revocation support
  • Signature verification before installation
  • Multiple root key support for key rotation

Good Documentation: Comprehensive package documentation in doc.go files

Result Communication: Clean IPC mechanism via JSON files between updater and service

Concerns & Recommendations

1. Security: Missing Signature Verification Before Execution

The RunInstallation() downloads and verifies signatures, but I don't see explicit verification that the downloaded installer matches the signature before the updater binary is copied and launched. The verification happens in Setup() which runs in the updater process.

Recommendation: Move artifact verification to happen in the service process before copying the updater binary.

2. Error Handling: Silent Failures

In client/cmd/root.go:

func Execute() error {
	if isUpdateBinary() {
		return updateCmd.Execute()
	}
	return rootCmd.Execute()
}

If updateCmd.Execute() fails, the error is returned but there's no logging. Users won't know why the update failed.

Recommendation: Add logging before returning errors from update execution.

3. Resource Leak: Missing Cleanup

In installer_run_darwin.go, the startUIAsUser() function:

launchCmd := exec.Command("launchctl", "asuser", userInfo.Uid, "open", "-a", uiBinary)
if err := launchCmd.Start(); err != nil {
    return fmt.Errorf("failed to start UI process: %w", err)
}
if err := launchCmd.Process.Release(); err != nil {
    log.Warnf("failed to release UI process: %v", err)
}

The command could fail after Start() but before Release(), leaving the process handle unreleased.

Recommendation: Use defer for cleanup:

if err := launchCmd.Start(); err != nil {
    return fmt.Errorf("failed to start UI process: %w", err)
}
defer func() {
    if err := launchCmd.Process.Release(); err != nil {
        log.Warnf("failed to release UI process: %v", err)
    }
}()

4. Race Condition: Update State Management

In manager.go, the update state is persisted but there's a window between checking the result and deleting the state where a crash could leave stale state:

func (m *Manager) CheckUpdateSuccess(ctx context.Context) {
    reason := m.lastResultErrReason()
    if reason != "" {
        // Publish error event
    }
    
    updateState, err := m.loadAndDeleteUpdateState(ctx)
    // If process crashes here, state is deleted but success event not published

Recommendation: Delete state only after successfully publishing events.

5. Platform-Specific: macOS Homebrew Handling

The Homebrew update path sends SIGTERM to the current process which may not guarantee a clean shutdown if there are active connections:

err = process.Signal(syscall.SIGTERM)

Recommendation: Consider implementing graceful shutdown with a timeout, similar to how Docker handles SIGTERM + SIGKILL.

6. Testing: Limited Integration Tests

I see unit tests for the signing system but no integration tests for the full update flow. The update process involves multiple steps across processes.

Recommendation: Add integration tests that verify:

  • Full update cycle from check → download → install → verify
  • Failure recovery scenarios
  • Service restart behavior

7. Observability: Missing Metrics

There's no telemetry for update operations. Consider tracking:

  • Update attempts vs. successes
  • Time to complete updates
  • Failure reasons
  • Version distributions

8. Documentation: Installation Flow Unclear

The installer_common.go comment says:

// updateCmd.SysProcAttr = &syscall.SysProcAttr{
//     CreationFlags: syscall.CREATE_NEW_PROCESS_GROUP | 0x00000008,
// }

This commented code suggests incomplete Windows process detachment. Is this intentional?

Recommendation: Either implement properly or document why it's not needed.

Minor Issues

  • Hardcoded paths: /var/lib/netbird/tmp-install should be configurable
  • Magic numbers: 5 * time.Minute rate limiting could be a constant
  • Error messages: Some errors don't include enough context for debugging
  • Missing null checks: updateManager is checked for nil in some places but not others

Verdict

This is well-architected code with strong security foundations, but it needs refinement in error handling, resource management, and testing before

merging. The cryptographic implementation is solid, but the update orchestration layer needs hardening.

Suggested Priority:

  1. 🔴 High: Fix signature verification timing (Security)
  2. 🔴 High: Add integration tests (Reliability)
  3. 🟡 Medium: Improve error handling and logging
  4. 🟡 Medium: Fix resource leaks
  5. 🟢 Low: Add observability metrics

Would you like me to focus on any specific area in more detail?

Make auto-update behavior configurable by adding a parameter to menuUpClick.
When connecting normally, auto-update is enabled (true). When switching profiles,
auto-update is disabled (false) to prevent automatic updates during profile changes.
@sonarqubecloud
Copy link

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants