Skip to content

Conversation

@yaseenisolated
Copy link

@yaseenisolated yaseenisolated commented Nov 24, 2025

Stacked PRs:


feat: add pyrocope continuous profiling to activator

add dependencies

feat: add pyrocope continuous profiling to activator

add dependencies

feat: Add feature gates for debug dashboard, event persistence, and r… (#78)

feat: add command line args for pyroscope tags

@yaseenisolated yaseenisolated mentioned this pull request Nov 24, 2025
yaseenisolated added a commit that referenced this pull request Nov 24, 2025
add dependencies

stack-info: PR: #82, branch: yaseenisolated/stack/1
@yaseenisolated yaseenisolated changed the base branch from release-1.18-cerebrium to yaseenisolated/stack/2 November 24, 2025 14:12
@yaseenisolated yaseenisolated changed the base branch from yaseenisolated/stack/2 to release-1.18-cerebrium November 24, 2025 14:18
yaseenisolated added a commit that referenced this pull request Nov 24, 2025
add dependencies

stack-info: PR: #82, branch: yaseenisolated/stack/1
@yaseenisolated yaseenisolated changed the base branch from release-1.18-cerebrium to yaseenisolated/stack/2 November 24, 2025 14:18
@CerebriumAI CerebriumAI deleted a comment from claude bot Nov 24, 2025
@CerebriumAI CerebriumAI deleted a comment from claude bot Nov 24, 2025
@CerebriumAI CerebriumAI deleted a comment from claude bot Nov 24, 2025
@yaseenisolated yaseenisolated changed the base branch from yaseenisolated/stack/2 to release-1.18-cerebrium November 24, 2025 14:22
yaseenisolated added a commit that referenced this pull request Nov 24, 2025
add dependencies

stack-info: PR: #82, branch: yaseenisolated/stack/1
@yaseenisolated yaseenisolated changed the base branch from release-1.18-cerebrium to yaseenisolated/stack/2 November 24, 2025 14:22
yaseenisolated added a commit that referenced this pull request Dec 1, 2025
add dependencies

stack-info: PR: #82, branch: yaseenisolated/stack/1
@CerebriumAI CerebriumAI deleted a comment from claude bot Dec 1, 2025
@CerebriumAI CerebriumAI deleted a comment from claude bot Dec 1, 2025
@yaseenisolated yaseenisolated changed the base branch from yaseenisolated/stack/2 to release-1.18-cerebrium December 1, 2025 17:40
yaseenisolated added a commit that referenced this pull request Dec 1, 2025
add dependencies

feat: add pyrocope continuous profiling to activator

add dependencies

stack-info: PR: #82, branch: yaseenisolated/stack/1

feat: Add feature gates for debug dashboard, event persistence, and r… (#78)

feat: add command line args for pyroscope tags
@yaseenisolated yaseenisolated changed the base branch from release-1.18-cerebrium to yaseenisolated/stack/3 December 1, 2025 17:43
@claude
Copy link

claude bot commented Dec 1, 2025

Pull Request Review: Pyroscope Continuous Profiling Integration

Thank you for adding continuous profiling support to the activator! This is a valuable observability feature. Below is my detailed review:


Positive Aspects

  1. Clean Integration: The Pyroscope integration is well-isolated and optional, only activating when PYROSCOPE_SERVER_ADDRESS is configured
  2. Comprehensive Profile Coverage: All relevant profile types are enabled (CPU, allocations, goroutines, mutex, block)
  3. Secure Credentials: Basic auth credentials are properly handled through environment variables
  4. Graceful Degradation: Profiler startup failures are logged but don't crash the application
  5. Good Tag Support: The parsePyroscopeTags function provides flexible custom tagging with proper parsing

🔍 Issues & Concerns

1. Critical: Missing Tests

Severity: High

The parsePyroscopeTags function has no unit tests. According to the repository's testing guidelines in CLAUDE.md, all new functionality should include comprehensive tests.

Recommendation: Add a test file cmd/activator/main_test.go with test cases:

func TestParsePyroscopeTags(t *testing.T) {
    tests := []struct {
        name     string
        input    string
        expected map[string]string
    }{
        {
            name:     "empty string",
            input:    "",
            expected: map[string]string{},
        },
        {
            name:     "single tag",
            input:    "env=prod",
            expected: map[string]string{"env": "prod"},
        },
        {
            name:     "multiple tags",
            input:    "env=prod;cluster=canary;region=us-west",
            expected: map[string]string{"env": "prod", "cluster": "canary", "region": "us-west"},
        },
        {
            name:     "tags with spaces",
            input:    "env = prod ; cluster = canary",
            expected: map[string]string{"env": "prod", "cluster": "canary"},
        },
        {
            name:     "empty tag segments",
            input:    "env=prod;;cluster=canary",
            expected: map[string]string{"env": "prod", "cluster": "canary"},
        },
        {
            name:     "empty key ignored",
            input:    "=value;env=prod",
            expected: map[string]string{"env": "prod"},
        },
        {
            name:     "missing value allowed",
            input:    "env=;cluster=canary",
            expected: map[string]string{"env": "", "cluster": "canary"},
        },
        {
            name:     "no equals sign ignored",
            input:    "invalid;env=prod",
            expected: map[string]string{"env": "prod"},
        },
    }

    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            result := parsePyroscopeTags(tt.input)
            if \!reflect.DeepEqual(result, tt.expected) {
                t.Errorf("parsePyroscopeTags(%q) = %v, want %v", tt.input, result, tt.expected)
            }
        })
    }
}

2. Security: Credential Exposure in Logs

Severity: Medium

Line 184 logs the server address when profiling starts:

log.Printf("Pyroscope profiler started, sending to %s", env.PyroscopeServerAddress)

If the PyroscopeServerAddress contains embedded credentials (e.g., https://user:[email protected]), they would be logged. While the code uses separate basic auth fields, users might still include credentials in the URL.

Recommendation: Sanitize the URL before logging:

// Parse and sanitize URL for logging
sanitizedAddr := env.PyroscopeServerAddress
if u, err := url.Parse(env.PyroscopeServerAddress); err == nil && u.User \!= nil {
    u.User = nil
    sanitizedAddr = u.String()
}
log.Printf("Pyroscope profiler started, sending to %s", sanitizedAddr)

3. Performance: Profile Rate Defaults

Severity: Low

The default values for mutex and block profiling may have performance implications:

  • MutexProfileRate: 100 (line 104)
  • BlockProfileRate: 10000 (line 105)

Context:

  • MutexProfileRate = 1 means profile every mutex contention event (high overhead)
  • BlockProfileRate in nanoseconds means profile blocks > 10μs

For the activator (a critical hot-path component handling all traffic), these defaults might introduce measurable overhead.

Recommendation: Document the performance impact in comments and consider more conservative defaults:

// MutexProfileRate controls mutex profiling sampling rate (1 = profile every event)
// Higher values reduce overhead. Recommended: 100-1000 for production
MutexProfileRate int `split_words:"true" default:"1000"`

// BlockProfileRate controls block profiling threshold in nanoseconds
// Only blocks longer than this value are profiled. Recommended: 100000+ for production
BlockProfileRate int `split_words:"true" default:"100000"`

4. Code Quality: Missing Input Validation

Severity: Low

The parsePyroscopeTags function doesn't validate or limit the size of the input string or number of tags. A malicious or misconfigured environment variable could cause excessive memory allocation.

Recommendation: Add reasonable limits:

const (
    maxTagStringLength = 4096  // Reasonable limit for tag string
    maxTags           = 50     // Reasonable limit for number of tags
)

func parsePyroscopeTags(tagsStr string) map[string]string {
    tags := make(map[string]string)
    if tagsStr == "" || len(tagsStr) > maxTagStringLength {
        return tags
    }
    
    pairs := strings.Split(tagsStr, ";")
    if len(pairs) > maxTags {
        pairs = pairs[:maxTags] // Truncate to max
    }
    
    // ... rest of parsing logic
}

5. Documentation: Missing Configuration Examples

Severity: Low

The PR lacks documentation on how to configure Pyroscope in a Kubernetes deployment. Users need to know:

  • How to set environment variables in the activator deployment
  • Example ConfigMap or deployment patch
  • How to verify profiling is working

Recommendation: Add a comment block or update relevant documentation:

# Example: Enable Pyroscope profiling for activator
# Add to config-deployment ConfigMap or activator deployment:
env:
- name: PYROSCOPE_SERVER_ADDRESS
  value: "http://pyroscope.monitoring.svc.cluster.local:4040"
- name: PYROSCOPE_TAGS
  value: "environment=production;cluster=us-west-2"
# Optional: Add basic auth
- name: PYROSCOPE_BASIC_AUTH_USER
  valueFrom:
    secretKeyRef:
      name: pyroscope-creds
      key: username
- name: PYROSCOPE_BASIC_AUTH_PASS
  valueFrom:
    secretKeyRef:
      name: pyroscope-creds
      key: password

6. Minor: Inconsistent Error Handling

Severity: Very Low

Line 181 uses log.Printf for errors while the rest of the file uses structured logging with logger. However, this is before the structured logger is initialized, so it's acceptable—but worth noting for consistency.


📊 Performance Considerations

  1. Runtime Overhead: Continuous profiling has minimal overhead (~1-2% CPU) when properly configured, which is acceptable for observability
  2. Memory Impact: The pyroscope-go client buffers profile data in memory. Under high load, this could add a few MB of memory usage
  3. Network Impact: Profile data is sent periodically (default: every 10s). Ensure network bandwidth is sufficient

🔒 Security Assessment

  1. ✅ Credentials properly separated from URL
  2. ✅ Optional feature (off by default)
  3. ⚠️ Consider sanitizing logged URLs
  4. ✅ No hardcoded credentials
  5. ✅ Uses established, reputable library (Grafana Pyroscope)

📝 Recommendations Summary

Must Address Before Merge:

  1. Add unit tests for parsePyroscopeTags

Should Address:
2. Sanitize URLs in log output
3. Document performance implications of profile rates
4. Add input validation/limits to parsePyroscopeTags

Nice to Have:
5. Add deployment configuration examples
6. Consider more conservative default profile rates for production


Overall Assessment

This is a solid feature addition that will provide valuable production insights. The implementation is clean and follows good practices. With the addition of tests and minor security improvements, this will be ready to merge.

Suggested Action: Address the test coverage issue (#1) before merging. Other issues can be addressed in follow-up PRs if needed.


Great work on this feature! Let me know if you'd like help implementing any of these recommendations.

@CerebriumAI CerebriumAI deleted a comment from claude bot Dec 1, 2025
add dependencies

feat: add pyrocope continuous profiling to activator

add dependencies

stack-info: PR: #82, branch: yaseenisolated/stack/1

feat: Add feature gates for debug dashboard, event persistence, and r… (#78)

feat: add command line args for pyroscope tags
@yaseenisolated yaseenisolated changed the base branch from yaseenisolated/stack/3 to release-1.18-cerebrium December 1, 2025 18:20
@yaseenisolated yaseenisolated changed the base branch from release-1.18-cerebrium to yaseenisolated/stack/3 December 1, 2025 18:21
@claude
Copy link

claude bot commented Dec 1, 2025

Code Review: Pyroscope Continuous Profiling Integration

Thank you for adding continuous profiling support to the activator! This is a valuable addition for production observability. Below is my detailed review:


Strengths

  1. Optional & Non-Breaking: The integration is entirely optional (only activates when PYROSCOPE_SERVER_ADDRESS is set), ensuring backward compatibility.

  2. Comprehensive Profile Types: Excellent coverage of profile types including CPU, memory allocations, goroutines, mutex, and block profiles - this will provide deep insight into activator performance.

  3. Good Error Handling: Non-fatal error handling for Pyroscope initialization prevents the activator from failing to start if profiling setup fails.

  4. Flexible Configuration: Support for custom tags and basic auth provides good flexibility for different deployment scenarios.

  5. Clean Helper Function: The parsePyroscopeTags function is well-implemented with proper error handling and edge cases (empty strings, whitespace).


🔍 Code Quality Issues

1. Security: Credentials in Environment Variables

Location: cmd/activator/main.go:101-102

Issue: Storing credentials in environment variables is not ideal for production use.

Recommendation: Consider using Kubernetes secrets mounted as files, or integrate with a secrets manager. Document this security consideration in comments:

// SECURITY NOTE: For production use, consider using Kubernetes secrets 
// or a secrets manager instead of environment variables for credentials
PyroscopeBasicAuthUser string `split_words:"true"`
PyroscopeBasicAuthPass string `split_words:"true"`

2. Inconsistent Logging

Location: cmd/activator/main.go:206-210

Issue: Using log.Printf instead of the structured logger (logger).

Problem: The structured zap logger is created at line 166, but Pyroscope initialization at line 206 uses the standard log package, creating inconsistent logging.

Fix:

if _, err := pyroscope.Start(pyroscopeConfig); err != nil {
    logger.Warnw("Failed to start Pyroscope profiler", zap.Error(err))
} else {
    logger.Infow("Pyroscope profiler started", zap.String("server", env.PyroscopeServerAddress))
}

3. Missing Configuration Validation

Location: cmd/activator/main.go:173

Issue: No validation that PyroscopeServerAddress is a valid URL.

Recommendation: Add URL validation:

if env.PyroscopeServerAddress != "" {
    if _, err := url.Parse(env.PyroscopeServerAddress); err != nil {
        logger.Warnw("Invalid Pyroscope server address, profiling disabled", 
            zap.String("address", env.PyroscopeServerAddress), 
            zap.Error(err))
        goto skipPyroscope
    }
    // ... rest of pyroscope setup
}
skipPyroscope:

4. Default Profile Rates Not Documented

Location: cmd/activator/main.go:104-105

Issue: Default values (100, 10000) are not explained.

Recommendation: Add comments explaining the implications:

// MutexProfileRate controls mutex profiling sampling rate (1 in N events)
// Higher = less overhead but less detail. Default 100 captures 1% of mutex events.
MutexProfileRate int `split_words:"true" default:"100"`

// BlockProfileRate sets the fraction of goroutine blocking events recorded (nanoseconds)
// Only events blocking >= this duration are recorded. Default 10000ns = 10µs.
BlockProfileRate int `split_words:"true" default:"10000"`

⚠️ Potential Issues

1. Performance Impact Not Documented

Concern: Continuous profiling, especially with all profile types enabled, can have non-trivial overhead (typically 1-5% CPU).

Recommendation:

  • Add comments documenting expected overhead
  • Consider making profile types configurable rather than enabling all by default
  • Document testing performed to validate acceptable overhead

2. No Graceful Shutdown

Issue: Pyroscope profiler is started but never explicitly stopped on shutdown.

Consideration: While Pyroscope likely handles this internally, explicitly stopping it during graceful shutdown would be cleaner:

// After line 210, store the profiler instance
var profilerSession *pyroscope.Profiler
if env.PyroscopeServerAddress != "" {
    // ... setup code ...
    profilerSession, err = pyroscope.Start(pyroscopeConfig)
    // ... error handling ...
}

// In shutdown section (after line 423):
if profilerSession != nil {
    logger.Info("Stopping Pyroscope profiler")
    profilerSession.Stop()
}

3. Hostname Tag May Not Be Unique

Location: cmd/activator/main.go:179

Issue: Using PodName as "hostname" tag is good, but in large deployments, additional context (namespace, cluster) would be helpful.

Recommendation:

tags["hostname"] = env.PodName
tags["namespace"] = system.Namespace() // Add namespace context

🧪 Testing Considerations

Missing Test Coverage

Critical: No unit tests for the new functionality.

Recommendations:

  1. Test parsePyroscopeTags:
// Test cases needed:
// - Empty string
// - Valid single tag
// - Multiple tags
// - Tags with whitespace
// - Tags without values
// - Malformed tags (no '=' sign)
// - Semicolons at start/end
  1. Integration Test: Mock Pyroscope server to test initialization paths (success/failure).

  2. Configuration Test: Validate all environment variable combinations.


📝 Documentation Recommendations

  1. Add configuration documentation to a README or deployment guide:

    • Required vs optional environment variables
    • Expected format for PYROSCOPE_TAGS
    • Performance implications
    • Security best practices for credentials
  2. Add example deployment manifest:

env:
- name: PYROSCOPE_SERVER_ADDRESS
  value: "https://pyroscope.example.com"
- name: PYROSCOPE_TAGS
  value: "env=prod;cluster=us-east-1"
- name: PYROSCOPE_BASIC_AUTH_USER
  valueFrom:
    secretKeyRef:
      name: pyroscope-credentials
      key: username

🔒 Security Considerations

  1. Credential Exposure: As mentioned above, environment variables are logged and visible in process listings.

  2. Network Security: Ensure Pyroscope server communication uses TLS (validate https:// scheme).

  3. Data Sensitivity: Profile data can contain sensitive information (function names, stack traces). Ensure Pyroscope server is properly secured.


🎯 Summary & Recommendations

Severity Levels:

  • 🔴 High: Inconsistent logging, missing tests
  • 🟡 Medium: Credential security, missing validation, no graceful shutdown
  • 🟢 Low: Documentation, default rate explanations

Suggested Changes Before Merge:

  1. Must Fix:

    • Use structured logger (logger) instead of log.Printf
    • Add unit tests for parsePyroscopeTags
  2. 🤔 Should Fix:

    • Add URL validation for PyroscopeServerAddress
    • Document performance impact and default profile rates
    • Add graceful shutdown for profiler
  3. 💭 Nice to Have:

    • Make profile types configurable
    • Add namespace to tags
    • Document security best practices
    • Provide example Kubernetes manifests

Overall Assessment

This is a solid implementation that follows good practices for optional feature integration. The code is clean and well-structured. With the suggested fixes (especially logging consistency and test coverage), this will be a valuable addition for production debugging and performance analysis of the activator component.

Great work on enabling continuous profiling - this will be extremely valuable for understanding activator behavior under load! 🚀

@yaseenisolated yaseenisolated merged commit 57637a3 into yaseenisolated/stack/3 Dec 2, 2025
51 checks passed
yaseenisolated added a commit that referenced this pull request Dec 2, 2025
* go mod vendor

stack-info: PR: #90, branch: yaseenisolated/stack/3

* feat: add pyrocope continuous profiling to activator (#82)

add dependencies

feat: add pyrocope continuous profiling to activator

add dependencies

stack-info: PR: #82, branch: yaseenisolated/stack/1

feat: Add feature gates for debug dashboard, event persistence, and r… (#78)

feat: add command line args for pyroscope tags
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants