Adding Envoy Gateway as control plane ingress #312
Merged
47 commits
db255ec
Adding charts for envoy-gateway
3f2fa53
Fixing envoy-proxy repo url
23b43bd
Fixing name of envoy-gateway chart and aliasing
fdb6a5d
Updating chart version
247ea21
Adding envoy gateway and routes
907f8ec
Cleaning up charts to remove unused blocks
a190f34
Adding missing EG variables
39bd801
Updating global variables to new format
cab6c6a
Fixing port for flyteconsole
5443050
Removing auth-proxy reference in selfhosted
f330f62
Adding gate check for enabled features
f495938
Moving to using values.yaml
f73c6bd
Refactoring charts for envoy gateway
923f4df
Adding backend policy for http/2 on grpc for selfmanaged/hosted
e47c926
Adding timeouts and buffer limits to connections
5485f25
Adding bypass for unprotected endpoint and identity filter
5859dae
Fixing validation error on timeouts
7808e35
Adding redis caching for rate limiting
36b774f
Renaming some configs
826f473
Fixing usage of dig
26c9fb4
Refactoring config names
5486748
Fixing naming of services
115065b
Removing unused services
581ca34
Fix redis url
0307858
Fixing backend traffic policy to merge rate limit and connenction tim…
30de1d8
Bumping up the default value for rps
5df5c6d
Testing rate limiting
0df936c
Reverting rate limit rps back to desired value
5f32f59
Cleaning up rate limit config
afa9731
Fixing grpc routes for self-managed/hosted to match ingress-nginx
5ff7c7f
Fixing control plane auth plugins deployment
eaf5e12
Fixing filter plugins...I hope
1d7bc3e
Updating the comment in values files
4a377e3
Cleaning up the values files
834b5b8
Giving gateway service a consistent name
afb1c61
Adding self-signed cert and route for handling intra cluster communic…
a03985d
Removing comment
1b500fe
Adding loginUrl for redirect
947ffc9
Removing v2 gating
3d3d179
Updating tests
7604013
Updating keep alive settings
f1a4483
Http2 keepalive is not valid
552f933
Adding missing rate limiting policy on protected grpc routes
330abab
Fixing values inconsistency for intracluster
df97756
Updating readme files
b981d25
Fixing expected helm charts for test
fd20e2f
Merge branch 'main' into laura/pii-108-add-gateway-api
108 changes: 108 additions & 0 deletions
charts/controlplane/templates/common/_backendtrafficpolicy.yaml
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,108 @@ | ||
| {{- define "control-plane-library.backendtrafficpolicy" }} | ||
| # BackendTrafficPolicy — configures Envoy→gRPC backend connection settings. | ||
| # Replaces nginx grpc_connect_timeout, grpc_read_timeout, grpc_send_timeout. | ||
| # Two policies (one per GRPCRoute) so h2c and timeouts are scoped to gRPC traffic only. | ||
| # | ||
| # requestTimeout applies to unary calls; maxStreamDuration applies to streaming calls. | ||
| # "0s" for maxStreamDuration means no limit (equivalent to grpc_read_timeout 604800s on streaming routes). | ||
| # Both protected and unprotected GRPCRoutes contain streaming methods so both get the same config. | ||
| # | ||
| # Rate limit is also included here (when enabled) because route-level BTPs override gateway-level ones, | ||
| # so the gateway-level rate-limit BTP below would be suppressed for these two GRPCRoutes without it. | ||
| apiVersion: gateway.envoyproxy.io/v1alpha1 | ||
| kind: BackendTrafficPolicy | ||
| metadata: | ||
| name: {{ template "flyte.name" . }}-grpc-protected-h2c | ||
| namespace: {{ template "flyte.namespace" . }} | ||
| spec: | ||
| targetRefs: | ||
| - group: gateway.networking.k8s.io | ||
| kind: GRPCRoute | ||
| name: {{ template "flyte.name" . }}-grpc-protected | ||
| timeout: | ||
| tcp: | ||
| connectTimeout: "1200s" # grpc_connect_timeout 1200s | ||
| http: | ||
| requestTimeout: "1200s" # grpc_read_timeout 1200s (unary calls) | ||
| maxStreamDuration: "0s" # no limit for streaming (grpc_read_timeout 604800s on streaming routes) | ||
| tcpKeepalive: | ||
| probes: 9 | ||
| idleTime: "15s" | ||
| interval: "15s" | ||
| http2: {} | ||
| {{- if .Values.envoyGateway.rateLimit.enabled }} | ||
| rateLimit: | ||
| type: Global | ||
| global: | ||
| rules: | ||
| - clientSelectors: | ||
| - sourceCIDR: | ||
| type: Distinct | ||
| value: "0.0.0.0/0" | ||
| limit: | ||
| requests: {{ .Values.envoyGateway.rateLimit.requestsPerUnit | default 100 }} | ||
| unit: {{ .Values.envoyGateway.rateLimit.unit | default "Second" }} | ||
| {{- end }} | ||
| --- | ||
| apiVersion: gateway.envoyproxy.io/v1alpha1 | ||
| kind: BackendTrafficPolicy | ||
| metadata: | ||
| name: {{ template "flyte.name" . }}-grpc-unprotected-h2c | ||
| namespace: {{ template "flyte.namespace" . }} | ||
| spec: | ||
| targetRefs: | ||
| - group: gateway.networking.k8s.io | ||
| kind: GRPCRoute | ||
| name: {{ template "flyte.name" . }}-grpc-unprotected | ||
| timeout: | ||
| tcp: | ||
| connectTimeout: "1200s" # grpc_connect_timeout 1200s | ||
| http: | ||
| requestTimeout: "1200s" # grpc_read_timeout 1200s (unary calls) | ||
| maxStreamDuration: "0s" # no limit for WatchExecutionStatusUpdates streaming | ||
| tcpKeepalive: | ||
| probes: 9 | ||
| idleTime: "15s" | ||
| interval: "15s" | ||
| http2: {} | ||
| {{- if .Values.envoyGateway.rateLimit.enabled }} | ||
| rateLimit: | ||
| type: Global | ||
| global: | ||
| rules: | ||
| - clientSelectors: | ||
| - sourceCIDR: | ||
| type: Distinct | ||
| value: "0.0.0.0/0" | ||
| limit: | ||
| requests: {{ .Values.envoyGateway.rateLimit.requestsPerUnit | default 100 }} | ||
| unit: {{ .Values.envoyGateway.rateLimit.unit | default "Second" }} | ||
| {{- end }} | ||
| {{- if .Values.envoyGateway.rateLimit.enabled }} | ||
| --- | ||
| # Global per-source-IP rate limit — replaces nginx.ingress.kubernetes.io/limit-rps annotation. | ||
| # Requires EG rateLimit backend (envoyproxy/ratelimit + Redis) to be running. | ||
| # Enable via envoyGateway.rateLimit.enabled: true once the backend is confirmed healthy. | ||
| apiVersion: gateway.envoyproxy.io/v1alpha1 | ||
| kind: BackendTrafficPolicy | ||
| metadata: | ||
| name: {{ template "flyte.name" . }}-global-rate-limit | ||
| namespace: {{ template "flyte.namespace" . }} | ||
| spec: | ||
| targetRefs: | ||
| - group: gateway.networking.k8s.io | ||
| kind: Gateway | ||
| name: {{ template "flyte.name" . }} | ||
| rateLimit: | ||
| type: Global | ||
| global: | ||
| rules: | ||
| - clientSelectors: | ||
| - sourceCIDR: | ||
| type: Distinct | ||
| value: "0.0.0.0/0" | ||
| limit: | ||
| requests: {{ .Values.envoyGateway.rateLimit.requestsPerUnit | default 100 }} | ||
| unit: {{ .Values.envoyGateway.rateLimit.unit | default "Second" }} | ||
| {{- end }} | ||
| {{- end }} |
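
The template above reads three keys under `envoyGateway.rateLimit`. The fragment below is a minimal sketch of the matching values, assuming the defaults visible in the template (`100` requests per `Second`). The chart's actual values.yaml is not shown in this diff, so the layout is inferred from the `.Values` lookups and may differ from the real file.

```yaml
# Hypothetical values.yaml fragment, inferred from the .Values paths in the template.
envoyGateway:
  rateLimit:
    # When false, the two gRPC policies render without a rateLimit block
    # and the gateway-level rate-limit policy is not rendered at all.
    enabled: false
    # The template falls back to 100 when this key is unset.
    requestsPerUnit: 100
    # The template falls back to "Second" when this key is unset.
    unit: Second
```

Because the route-level policies repeat the same `rateLimit` block as the gateway-level policy, changing `requestsPerUnit` or `unit` here affects all three rendered policies together.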
23 changes: 23 additions & 0 deletions
charts/controlplane/templates/common/_clienttrafficpolicy.yaml
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,23 @@ | ||
| {{- define "control-plane-library.clienttrafficpolicy" }} | ||
| # ClientTrafficPolicy — configures inbound client connection settings on the Gateway. | ||
| # Replaces nginx server-snippet: client_header_timeout, client_body_timeout, | ||
| # client_header_buffer_size, and large_client_header_buffers. | ||
| apiVersion: gateway.envoyproxy.io/v1alpha1 | ||
| kind: ClientTrafficPolicy | ||
| metadata: | ||
| name: {{ template "flyte.name" . }}-client-policy | ||
| namespace: {{ template "flyte.namespace" . }} | ||
| spec: | ||
| targetRefs: | ||
| - group: gateway.networking.k8s.io | ||
| kind: Gateway | ||
| name: {{ template "flyte.name" . }} | ||
| timeout: | ||
| http: | ||
| requestReceivedTimeout: "0s" # client_header_timeout 604800 | ||
| streamIdleTimeout: "0s" # client_body_timeout 604800 | ||
| connection: | ||
| # large_client_header_buffers 64 32k = 2Mi total; mitigates 400 errors from large cookies | ||
| # at the /me auth endpoint (see PE-1101). | ||
| bufferLimit: "2Mi" | ||
| {{- end }} |
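
Both files only `define` named templates, so nothing is rendered until a chart template includes them. The sketch below shows one way a consuming chart might do that. The file name and the `envoyGateway.enabled` gate are assumptions for illustration and are not confirmed by this diff; `include` is the standard Helm function for rendering a named template.

```yaml
# Hypothetical templates/envoy-policies.yaml in a consuming chart.
# `envoyGateway.enabled` is an assumed flag name, not taken from the diff.
{{- if .Values.envoyGateway.enabled }}
{{ include "control-plane-library.clienttrafficpolicy" . }}
---
{{ include "control-plane-library.backendtrafficpolicy" . }}
{{- end }}
```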
Could you add a section somewhere that explains a bit more about what the setup looks like with both enabled? The pre- and post-migration states are pretty clear, but it's a bit fuzzy how ingress will function during the dual-deployment phase.
I'll update the comment in a follow-up PR. Essentially, the two aren't meant to run side by side for long; keeping both means we can switch back if something goes wrong during the migration. There is a flag (set in the Terraform in the cloud repo) that updates external-dns to use either "ingress" or "httproute"/"grpcroute" as its target, which tells DNS to use either nginx or Envoy. I have also set up weighted routing so we can shift only a small percentage of requests during testing if we would like.
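
As a rough illustration of the switch described above: the real toggle is a Terraform flag in another repo and is not shown here, so the fragment below is only a sketch of what it might resolve to, assuming external-dns is deployed with a Helm chart that accepts a `sources` list. `ingress`, `gateway-httproute`, and `gateway-grpcroute` are external-dns source names.

```yaml
# Hypothetical external-dns Helm values; the actual wiring lives in Terraform.
# Pre-migration or rollback: DNS records follow the ingress-nginx Ingress objects.
sources:
  - ingress
# Post-migration: DNS records follow the Gateway API routes served by Envoy Gateway.
# sources:
#   - gateway-httproute
#   - gateway-grpcroute
```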