@jra3 jra3 commented Oct 20, 2025

Summary

Implements ENG-2245: publishes antimetal.agent.v1.Instance resources to CloudInventory, with relationships to the hosting Pod and system node, so that agent configuration can be targeted at specific instances.

Changes

Resource Publishing

  • Instance resources now published to CloudInventory at agent startup
  • Periodic refresh every 1 minute so the resource never reaches its 5-minute TTL
  • Publish errors are non-fatal: the agent continues if the initial publish fails

Relationships Created

  • Pod → Instance via kubernetes.v1.RegisteredAs predicate
  • SystemNode ↔ Instance bidirectional via antimetal.runtime.v1.Contains/ContainedBy
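
As a rough picture of the resulting graph, the edges can be modelled as subject/predicate/object triples. The sketch below is illustrative only: `ResourceRef`, `Relationship`, and `instanceRelationships` are hypothetical types and names, and the Pod and SystemNode type URLs are placeholders rather than the constants used in internal/runtime/publish.go. Only the predicate names and the Instance type come from this PR.

```go
package main

import "fmt"

// ResourceRef identifies a node in the resource graph (hypothetical shape).
type ResourceRef struct {
	TypeURL   string
	Name      string
	Namespace string // left empty for resources that are not namespaced
}

// Relationship is one directed edge: Subject --Predicate--> Object.
type Relationship struct {
	Subject   ResourceRef
	Predicate string
	Object    ResourceRef
}

// instanceRelationships returns the edges listed in this section:
// Pod -> Instance, plus SystemNode <-> Instance in both directions.
func instanceRelationships(pod, node, instance ResourceRef) []Relationship {
	return []Relationship{
		{Subject: pod, Predicate: "kubernetes.v1.RegisteredAs", Object: instance},
		{Subject: node, Predicate: "antimetal.runtime.v1.Contains", Object: instance},
		{Subject: instance, Predicate: "antimetal.runtime.v1.ContainedBy", Object: node},
	}
}

func main() {
	// Placeholder type URLs and names, for illustration only.
	pod := ResourceRef{TypeURL: "example/Pod", Name: "agent-abc12", Namespace: "antimetal-system"}
	node := ResourceRef{TypeURL: "example/SystemNode", Name: "node-1"}
	instance := ResourceRef{TypeURL: "antimetal.agent.v1.Instance", Name: "instance-1"}

	for _, r := range instanceRelationships(pod, node, instance) {
		fmt.Printf("%s --%s--> %s\n", r.Subject.Name, r.Predicate, r.Object.Name)
	}
}
```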

Configuration

  • Added Pod metadata environment variables via Kubernetes downward API:
    • POD_NAME - for Pod → Instance relationships
    • POD_NAMESPACE - for proper namespace scoping
    • POD_UID - for unique pod identification
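
A helper for reading these variables might look something like the sketch below. It is not the code in pkg/config/environment/environment.go: `PodMetadata` and `podMetadataFromEnv` are hypothetical names, and treating a missing name or namespace as "not in Kubernetes" is an assumption. The lookup function is injected so the helper can be exercised without touching the process environment.

```go
package main

import (
	"fmt"
	"os"
)

// PodMetadata holds the values injected through the downward API.
type PodMetadata struct {
	Name      string
	Namespace string
	UID       string
}

// podMetadataFromEnv reads POD_NAME, POD_NAMESPACE and POD_UID using the
// given lookup function (os.LookupEnv in production). It reports false when
// the name or namespace is missing or empty, which is taken here to mean
// the agent is not running inside a Pod.
func podMetadataFromEnv(lookup func(string) (string, bool)) (PodMetadata, bool) {
	name, okName := lookup("POD_NAME")
	ns, okNS := lookup("POD_NAMESPACE")
	uid, _ := lookup("POD_UID")
	if !okName || !okNS || name == "" || ns == "" {
		return PodMetadata{}, false
	}
	return PodMetadata{Name: name, Namespace: ns, UID: uid}, true
}

func main() {
	if md, ok := podMetadataFromEnv(os.LookupEnv); ok {
		fmt.Printf("running as pod %s/%s (uid %s)\n", md.Namespace, md.Name, md.UID)
	} else {
		fmt.Println("pod metadata unavailable; Pod relationships would be skipped")
	}
}
```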

Implementation

  • internal/runtime/publish.go - Instance publishing logic with relationship creation
  • pkg/config/environment/environment.go - Pod metadata helpers
  • cmd/main.go - Startup publish + periodic refresh goroutine
  • config/agent/agent.yaml - Downward API configuration

Behavior

Kubernetes Environment

  • ✅ Instance resource published with full metadata
  • ✅ Pod → Instance relationship created
  • ✅ SystemNode ↔ Instance relationships created
  • ✅ Resources refreshed every 1 minute (5-minute TTL)

Non-Kubernetes Environment

  • ✅ Instance resource published
  • ⚠️ Pod relationships skipped (graceful degradation)
  • ✅ SystemNode relationships created (if machine-id available)
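
The degradation rules in the two lists above amount to "emit only the edges whose endpoints are known". A minimal sketch, assuming empty strings mark unavailable identifiers (`Edge` and `planEdges` are hypothetical names, not taken from the PR):

```go
package main

import "fmt"

// Edge is a simplified directed relationship between named resources.
type Edge struct {
	From      string
	Predicate string
	To        string
}

// planEdges returns the relationships to publish for an Instance. An empty
// podName (non-Kubernetes environment) skips the Pod edge; an empty nodeName
// (no usable host identifier) skips the SystemNode edges. The Instance itself
// is published regardless, so an empty result is not an error.
func planEdges(instance, podName, nodeName string) []Edge {
	var edges []Edge
	if podName != "" {
		edges = append(edges, Edge{From: podName, Predicate: "kubernetes.v1.RegisteredAs", To: instance})
	}
	if nodeName != "" {
		edges = append(edges,
			Edge{From: nodeName, Predicate: "antimetal.runtime.v1.Contains", To: instance},
			Edge{From: instance, Predicate: "antimetal.runtime.v1.ContainedBy", To: nodeName},
		)
	}
	return edges
}

func main() {
	// Non-Kubernetes host with a known node identifier: only SystemNode edges.
	for _, e := range planEdges("instance-1", "", "node-1") {
		fmt.Printf("%s --%s--> %s\n", e.From, e.Predicate, e.To)
	}
}
```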

Testing

  • Build passes
  • Linter clean
  • KIND cluster testing
  • CloudInventory query verification
  • Relationship traversal testing

Related

Closes ENG-2245

Wiki Documentation

Companion wiki documentation committed separately:

  • Environment-Configuration.md - POD_* environment variables
  • Resource-Types.md - Instance resource type and relationships
  • Kubernetes-Deployment.md - Downward API configuration

@jra3 jra3 force-pushed the feat/instance_cloudinventory_node branch 4 times, most recently from 9fff0d3 to 1a4ddec Compare October 20, 2025 22:36
@jra3 jra3 requested a review from haq204 October 21, 2025 15:37
@jra3 jra3 marked this pull request as ready for review October 21, 2025 15:37
jra3 and others added 5 commits November 5, 2025 16:31
…and node relationships

Publish antimetal.agent.v1.Instance resources to CloudInventory with
graph relationships connecting Pod → Instance and SystemNode ↔ Instance.
This enables the control plane to target agent configuration by pod
name and track which system node hosts each agent instance.

Implementation includes:
- Add Pod metadata environment variables via Kubernetes downward API
- Create environment package helpers for accessing pod metadata
- Implement PublishInstance function with relationship creation
- Use kubernetes.v1.RegisteredAs predicate for Pod → Instance edges
- Use runtime.v1.Contains/ContainedBy for bidirectional SystemNode edges
- Gracefully degrade when not running in Kubernetes
- Refactor periodic publishing from bare goroutine to Instance Manager
- Add K8s DNS-1123 validation for pod names and namespaces
- Add comprehensive unit tests for publish.go (6 test cases)
- Add debug logging when pod metadata unavailable
- Document K8s Pod TypeUrl constant with API reference link
- Add comment explaining Instance ResourceRef has no namespace
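
For context on the DNS-1123 validation item above: Kubernetes requires Pod names to be DNS-1123 subdomains (at most 253 characters) and namespace names to be DNS-1123 labels (at most 63 characters). The sketch below reimplements those rules with the standard library only. It is not the validation code from this PR, which may instead rely on the helpers in k8s.io/apimachinery; the function names here are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

const (
	maxLabelLen     = 63  // DNS-1123 label limit
	maxSubdomainLen = 253 // DNS-1123 subdomain limit

	// A label is lowercase alphanumerics and '-', starting and ending
	// with an alphanumeric character.
	labelFmt = `[a-z0-9]([-a-z0-9]*[a-z0-9])?`
)

var (
	labelRe     = regexp.MustCompile(`^` + labelFmt + `$`)
	subdomainRe = regexp.MustCompile(`^` + labelFmt + `(\.` + labelFmt + `)*$`)
)

// validNamespace reports whether s is a valid DNS-1123 label.
func validNamespace(s string) bool {
	return len(s) <= maxLabelLen && labelRe.MatchString(s)
}

// validPodName reports whether s is a valid DNS-1123 subdomain,
// that is, one or more labels joined by dots.
func validPodName(s string) bool {
	return len(s) <= maxSubdomainLen && subdomainRe.MatchString(s)
}

func main() {
	for _, name := range []string{"agent-7d9f", "agent.v2", "Agent", "-agent", ""} {
		fmt.Printf("pod name %q valid: %v\n", name, validPodName(name))
	}
	for _, ns := range []string{"kube-system", "team.a"} {
		fmt.Printf("namespace %q valid: %v\n", ns, validNamespace(ns))
	}
}
```

Validating before publishing keeps malformed environment values from turning into dangling Pod references in the graph.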

The Instance resource is always published even if relationship creation
fails, ensuring the agent is visible in CloudInventory regardless of
deployment environment. The Instance Manager follows the controller-
runtime manager pattern for lifecycle management and runs on all agent
instances (not leader-elected).

Closes ENG-2245

Co-Authored-By: Claude <[email protected]>
Signed-off-by: John Allen <[email protected]>

Fix systemd-container package dependency conflict by upgrading
libsystemd-shared and systemd packages before LVH VM setup.

Error was: systemd-container depends on libsystemd-shared (= 255.4-1ubuntu8.11)
but 255.4-1ubuntu8.10 was cached.

Address PR 233 review feedback with three refinements:

1. Adjust "Published Instance relationships" to debug log level (V(1))
   for consistency with other relationship publishing operations

2. Add bidirectional Pod↔Instance relationship using both RegisteredAs
   (Pod→Instance) and Underlying (Instance→Pod) predicates to enable
   traversal from either direction in the resource graph

3. Replace inline machine ID logic with host.CanonicalName() for system
   node identification, providing proper fallback chain (MachineInfo →
   CloudProviderID → FQDN → MachineID) and consistency with other
   host identification code

These changes improve codebase consistency and relationship model
completeness without altering functional behavior.
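
The fallback chain in item 3 is a "first non-empty candidate wins" lookup. The sketch below only illustrates that ordering. It is not the implementation of host.CanonicalName(): `HostInfo` and `canonicalName` are hypothetical, and representing each source (including MachineInfo) as a plain string is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// HostInfo collects the candidate identifiers in priority order.
// Empty fields mean the source was unavailable.
type HostInfo struct {
	MachineInfo     string
	CloudProviderID string
	FQDN            string
	MachineID       string
}

var errNoIdentity = errors.New("no host identifier available")

// canonicalName returns the first non-empty identifier, following the order
// MachineInfo -> CloudProviderID -> FQDN -> MachineID.
func canonicalName(h HostInfo) (string, error) {
	for _, candidate := range []string{h.MachineInfo, h.CloudProviderID, h.FQDN, h.MachineID} {
		if candidate != "" {
			return candidate, nil
		}
	}
	return "", errNoIdentity
}

func main() {
	// Only FQDN and MachineID are known, so FQDN wins.
	name, err := canonicalName(HostInfo{FQDN: "node-1.example.internal", MachineID: "abc123"})
	if err != nil {
		fmt.Println("skipping SystemNode relationships:", err)
		return
	}
	fmt.Println("system node name:", name)
}
```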

Co-Authored-By: Claude <[email protected]>
Signed-off-by: John Allen <[email protected]>

@jra3 jra3 force-pushed the feat/instance_cloudinventory_node branch from d3e10ff to ee6c4f3 Compare November 5, 2025 21:46
jra3 added 2 commits November 5, 2025 17:09
…ationships

Update TestCreatePodRelationships to correctly expect 2 bidirectional
relationships (RegisteredAs and Underlying) instead of just 1. The test
now validates both the forward relationship (Pod → Instance) and the
inverse relationship (Instance → Pod), matching the pattern used in
TestCreateSystemNodeRelationships.

This aligns the test expectations with the actual implementation that
creates bidirectional relationships for graph traversal in both
directions.

Remove periodic instance resource republishing in favor of single
publish at startup. The intake worker already handles TTL extension
by sending delta version heartbeats to the backend, making periodic
republishing unnecessary.

This change removes the instance.Manager component and its 1-minute
refresh interval, reducing architectural complexity and eliminating
redundant overhead.

Addresses PR #233 review feedback.

@jra3 jra3 requested a review from haq204 November 14, 2025 18:39