
PLTF-2513: add laminar services to default NO_PROXY in openhands chart #636

Merged: aivong-openhands merged 2 commits into main from av/laminar-no-proxy on May 15, 2026

@aivong-openhands aivong-openhands commented May 15, 2026

Summary

When proxy_enabled=1 and analytics_enabled=1, conversation traces from the enterprise-server and runtime pods fail to reach Laminar. The trace POST goes to http://laminar-app-server-service:8000/v1/traces, but the bare in-cluster short name is not in the computed NO_PROXY allowlist, so the request is routed through HTTP_PROXY and dropped when the proxy cannot resolve cluster-internal DNS.

Root cause

replicated/openhands.yaml seeds $computedNoProxy with an allowlist of in-cluster short names plus .svc/.cluster.local suffix rules. laminar-app-server-service is a bare hostname (no dots), so neither suffix rule matches, and it was not in the short-name allowlist. The enterprise-server pod therefore proxies trace POSTs out of the cluster.
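The matching behaviour described above can be illustrated with a minimal sketch. It assumes the common NO_PROXY convention (exact hostname match, or match on a dotted suffix); real HTTP clients differ in details such as CIDR, port, and wildcard handling, so this is a simplified model rather than the exporter's actual logic. The `bypasses_proxy` helper, the shortened `seed` value, and the fully qualified `.openhands.svc` form are illustrative assumptions, not taken from the chart.

```python
def bypasses_proxy(host: str, no_proxy: str) -> bool:
    """Simplified NO_PROXY check: exact match or dotted-suffix match.

    Assumption: models the common convention only; real clients vary.
    """
    host = host.lower()
    for entry in no_proxy.split(","):
        # Normalise each entry and ignore a leading dot, so ".svc" and "svc"
        # behave the same way in this model.
        entry = entry.strip().lower().lstrip(".")
        if not entry:
            continue
        if host == entry or host.endswith("." + entry):
            return True
    return False


# Hypothetical, shortened seed: two short names plus the suffix rules.
seed = "keycloak,kubernetes,.svc,.cluster.local"

# Bare short name: no dots, so neither suffix rule can match.
print(bypasses_proxy("laminar-app-server-service", seed))  # False

# A fully qualified form (hypothetical) would match the .svc suffix rule.
print(bypasses_proxy("laminar-app-server-service.openhands.svc", seed))  # True

# With the short name added to the allowlist, the bare name bypasses the proxy.
print(bypasses_proxy("laminar-app-server-service", seed + ",laminar-app-server-service"))  # True
```

Under this model, a bare short name can only be exempted by listing it explicitly, which is why the fix extends the short-name allowlist rather than adding another suffix rule.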

Observed on the replicated-01 AWS embedded-cluster install with an ngrok→mitmproxy HTTP_PROXY. mitmdump logged:

POST http://laminar-app-server-service:8000/v1/traces
 << [Errno 8] nodename nor servname provided, or not known

Fix

Append the 12 laminar in-cluster short names to $computedNoProxy, gated on analytics_enabled=1 so installs without Laminar do not get a bloated NO_PROXY:

  • laminar-app-server-service
  • laminar-clickhouse
  • laminar-frontend
  • laminar-postgres
  • laminar-query-engine
  • laminar-quickwit-control-plane
  • laminar-quickwit-indexer
  • laminar-quickwit-janitor
  • laminar-quickwit-metastore
  • laminar-quickwit-searcher
  • laminar-rabbitmq
  • laminar-redis

Test plan

Verified on replicated-01 AWS embedded-cluster install with proxy_enabled=1, analytics_enabled=1, and HTTP_PROXY/HTTPS_PROXY pointed at http://8.tcp.ngrok.io:25057 (ngrok→mitmproxy). User no_proxy field left unset.

  • NO_PROXY on the enterprise-server pod contains all 12 laminar short names

    $ kubectl -n openhands exec openhands-7f8b5c5499-qcl9r -- env | grep ^NO_PROXY
    NO_PROXY=127.0.0.1,cluster.local,keycloak,keycloak-headless,kubernetes,localhost,
      oh-main-lite-llm,oh-main-runtime-api,openhands-integrations-service,openhands-litellm,
      openhands-mcp-service,openhands-minio,openhands-runtime-api,openhands-service,svc,
      replicated-01.aws.aivong.platform-team.all-hands.dev,
      app.replicated-01.aws.aivong.platform-team.all-hands.dev,
      auth.app.replicated-01.aws.aivong.platform-team.all-hands.dev,
      llm-proxy.replicated-01.aws.aivong.platform-team.all-hands.dev,
      runtime-api.replicated-01.aws.aivong.platform-team.all-hands.dev,
      runtime.replicated-01.aws.aivong.platform-team.all-hands.dev,
      laminar-app-server-service,laminar-clickhouse,laminar-frontend,laminar-postgres,
      laminar-query-engine,laminar-quickwit-control-plane,laminar-quickwit-indexer,
      laminar-quickwit-janitor,laminar-quickwit-metastore,laminar-quickwit-searcher,
      laminar-rabbitmq,laminar-redis,
      localhost,127.0.0.1,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,.svc,.cluster.local,
      169.254.169.254,github.com,.github.com
    

    OH_AGENT_SERVER_ENV (propagated to runtime pods) contains the same set.

  • Conversation started, trace visible in Laminar UI

    Conversation: https://app.replicated-01.aws.aivong.platform-team.all-hands.dev/conversations/d596ecc6300e4cebadc023c20ea199be
    Trace in Laminar: https://analytics.app.replicated-01.aws.aivong.platform-team.all-hands.dev/project/9c91ea8c-ae9c-455e-a51a-e659f300fb61/traces?pastHours=24&traceId=17dc9597-e350-cc00-27a8-0cbc3df4a8da&chat=true

  • laminar-app-server logs show /v1/traces POSTs from inside the cluster

    $ kubectl -n openhands logs laminar-app-server-676c765fd4-jg7pw --all-containers \
        | grep "v1/traces"
    2026-05-15T05:23:45Z INFO actix_web::middleware::logger:
      10.244.43.114 "POST /v1/traces HTTP/1.1" 200 0 "-" "OTel-OTLP-Exporter-Python/1.39.1"
    2026-05-15T05:23:50Z INFO actix_web::middleware::logger:
      10.244.43.114 "POST /v1/traces HTTP/1.1" 200 0 "-" "OTel-OTLP-Exporter-Python/1.39.1"
    2026-05-15T05:23:55Z INFO actix_web::middleware::logger:
      10.244.43.114 "POST /v1/traces HTTP/1.1" 200 0 "-" "OTel-OTLP-Exporter-Python/1.39.1"
    

    Source IP 10.244.43.114 is the agent runtime pod runtime-phnaihibosdnvmtt-588ff889b9-x2w5q, confirming traces are flowing pod→service over the in-cluster network (not via the external proxy). All responses are 200 OK. No matching POST http://laminar-app-server-service:8000/v1/traces failures in mitmdump.

  • Install with proxy_enabled=1 and analytics_enabled=0; confirm NO_PROXY does not include the laminar names

    Not yet retested on this install (would require disabling analytics and reinstalling). Confirmed by inspection of the template: the laminar append is gated on {{repl if ConfigOptionEquals "analytics_enabled" "1" }}, matching the gate that conditionally deploys the laminar services themselves (replicated/openhands.yaml:752).
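Both NO_PROXY checks in the test plan could be scripted instead of read by eye. The following is a minimal sketch, assuming the value is captured as a single string from `env | grep ^NO_PROXY`; the helper names and the two sample values are hypothetical and shortened, not output from a real install.

```python
# The 12 laminar in-cluster short names listed in the Fix section.
LAMINAR_SHORT_NAMES = (
    "laminar-app-server-service",
    "laminar-clickhouse",
    "laminar-frontend",
    "laminar-postgres",
    "laminar-query-engine",
    "laminar-quickwit-control-plane",
    "laminar-quickwit-indexer",
    "laminar-quickwit-janitor",
    "laminar-quickwit-metastore",
    "laminar-quickwit-searcher",
    "laminar-rabbitmq",
    "laminar-redis",
)


def _entries(no_proxy: str) -> set:
    """Split a NO_PROXY value (with or without the 'NO_PROXY=' prefix) into entries."""
    value = no_proxy.split("=", 1)[1] if no_proxy.startswith("NO_PROXY=") else no_proxy
    return {entry.strip() for entry in value.split(",") if entry.strip()}


def missing_laminar_names(no_proxy: str) -> list:
    """Names that should be present when analytics_enabled=1 but are not."""
    present = _entries(no_proxy)
    return [name for name in LAMINAR_SHORT_NAMES if name not in present]


def unexpected_laminar_names(no_proxy: str) -> list:
    """Names that are present but should be absent when analytics_enabled=0."""
    present = _entries(no_proxy)
    return [name for name in LAMINAR_SHORT_NAMES if name in present]


# Hypothetical, shortened sample values for the two configurations.
enabled_sample = "NO_PROXY=localhost,.svc," + ",".join(LAMINAR_SHORT_NAMES)
disabled_sample = "NO_PROXY=localhost,.svc,.cluster.local"

print(missing_laminar_names(enabled_sample))      # []
print(unexpected_laminar_names(disabled_sample))  # []
```

An empty list from `missing_laminar_names` corresponds to the first test-plan item, and an empty list from `unexpected_laminar_names` corresponds to the analytics_enabled=0 item that was confirmed only by inspection.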

Workaround (existing installs)

Append laminar-app-server-service to the NO_PROXY field in the Replicated Admin Console config and redeploy.

When proxy_enabled=1 and analytics_enabled=1, the enterprise-server pod
posts traces to http://laminar-app-server-service:8000/v1/traces. The
bare in-cluster short name did not match the existing NO_PROXY seed or
its .svc/.cluster.local suffix rules, so the request was routed through
HTTP_PROXY and dropped when the proxy could not resolve cluster DNS.

Append the 12 laminar in-cluster short names to $computedNoProxy when
analytics_enabled=1, matching how the laminar services are conditionally
deployed.
@aivong-openhands aivong-openhands marked this pull request as ready for review May 15, 2026 05:27

@all-hands-bot all-hands-bot left a comment

🟢 Good taste - Clean fix with excellent evidence.

[RISK ASSESSMENT]

  • [Overall PR] ⚠️ Risk Assessment: 🟢 LOW

Single-line config change that correctly adds in-cluster laminar services to NO_PROXY when analytics is enabled. Well-gated, matches the condition for deploying laminar services, and thoroughly tested with production verification showing traces now flow directly to laminar-app-server without being routed through the external proxy.

VERDICT:
Worth merging: Solves a real production bug with the right approach.

KEY INSIGHT:
Proper NO_PROXY configuration for in-cluster services is essential when using HTTP proxies in Kubernetes - this fix correctly exempts internal services from external proxy routing.

@aivong-openhands aivong-openhands merged commit cf8ab16 into main May 15, 2026
2 checks passed
@aivong-openhands aivong-openhands deleted the av/laminar-no-proxy branch May 15, 2026 05:29