-
Notifications
You must be signed in to change notification settings - Fork 493
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add an OVN Observability API enhancement #1693
base: master
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,290 @@ | ||
--- | ||
title: ovn-observability-api | ||
authors: | ||
- npinaeva | ||
reviewers: # Include a comment about what domain expertise a reviewer is expected to bring and what area of the enhancement you expect them to focus on. For example: - "@networkguru, for networking aspects, please look at IP bootstrapping aspect" | ||
- dceara | ||
- msherif1234 | ||
- jotak | ||
approvers: # A single approver is preferred, the role of the approver is to raise important questions, help ensure the enhancement receives reviews from all applicable areas/SMEs, and determine when consensus is achieved such that the EP can move forward to implementation. Having multiple approvers makes it difficult to determine who is responsible for the actual approval. | ||
- trozet | ||
api-approvers: # In case of new or modified APIs or API extensions (CRDs, aggregated apiservers, webhooks, finalizers). If there is no API change, use "None" | ||
- "None" | ||
creation-date: 2024-10-02 | ||
last-updated: 2024-10-02 | ||
tracking-link: # link to the tracking ticket (for example: Jira Feature or Epic ticket) that corresponds to this enhancement | ||
- https://issues.redhat.com/browse/SDN-5226 | ||
--- | ||
|
||
# OVN Observability API | ||
|
||
## Summary | ||
|
||
OVN Observability is a feature that provides visibility for ovn-kubernetes-managed network by using sampling mechanism. | ||
That means, that network packets generate samples with user-friendly description to give more information about what and why is happening in the network. | ||
In openshift, this feature is integrated with the Network Observability Operator. | ||
|
||
## Motivation | ||
|
||
The purpose of this enhancement is to define an API to configure OVN observability in an openshift cluster. | ||
|
||
### User Stories | ||
|
||
- Observability: As a cluster admin, I want to enable OVN Observability integration with netobserv so that I can use network policy correlation feature. | ||
- Debuggability: As a cluster admin, I want to debug a networking issue using OVN Observability feature. | ||
|
||
### Goals | ||
|
||
- Design a CRD to configure OVN Observability for given user stories | ||
|
||
### Non-Goals | ||
|
||
- Discuss OVN Observability feature in general | ||
|
||
## Proposal | ||
|
||
There is a number of k8s features that can be used to configure OVN Observability right now, and some more that we expect | ||
to be added in the future. | ||
Currently supported features are: | ||
- NetworkPolicy | ||
- (Baseline)AdminNetworkPolicy | ||
- EgressFirewall | ||
- UDN Isolation | ||
- Multicast ACLs | ||
|
||
In the future more observability features, like egressIP and services, but also networking-specific features | ||
like logical port ingress/egress will be added. | ||
|
||
Every feature may set a sampling probability, that defines how often a packet will be sampled, in percent. 100% means every packet | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is per feature granularity sufficient? Should it be per feature per namespace instead? If we were to extend this to UDNs, I can imagine we want to have some per tenant control on the observability config. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. do you expect an admin to disable observability in some namespaces, but enable in others? |
||
will be sampled, 0% means no packets will be sampled. | ||
|
||
Currently only on/off mode is supported, where on mode turns all available features with 100% probability. | ||
|
||
### Observability use case | ||
|
||
For observability use case, a cluster admin can set a sampling probability for each feature across the whole cluster. | ||
As sampling configuration may affect cluster performance, not every admin may want all features enabled with 100% probability. | ||
We expect this type of configuration to change rarely. | ||
|
||
This configuration should be handled by ovn-kubernetes to configure sampling, and by netobserv ebpf agent to "subscribe" to | ||
the samples enabled by this configuration. | ||
|
||
### Debuggability use case | ||
|
||
Debuggability is expected to be turned on during debugging sessions, and turned off after the session is finished. | ||
It will most likely have 100% probability for required features, and also may enable more networking-related features, like | ||
logical port ingress/egress for cluster admins. Debugging configuration is expected to have more granular filters to | ||
debug specific problems with minimal overhead. For example, per-node or per-namespace filtering may be required. | ||
|
||
Debuggability sampling may be used with or without Network Observability Operator being installed: | ||
- without netobserv, samples may be printed or forwarded to a file by ovnkube-observ binary, which is a part of ovnkube-node pod on every node. | ||
- with neobserv, samples may be displayed in a similar to observability use case way, but may need another interface to ensure | ||
cluster users won't be distracted by samples generated for debugging. | ||
|
||
This configuration should be handled by ovn-kubernetes to configure sampling, and by netobserv ebpf agent OR ovnkube-observ script | ||
to "subscribe" to the samples enabled by this configuration. | ||
|
||
### Workflow Description | ||
|
||
1. Cluster admin wants to enable OVN Observability for admin-managed features. | ||
2. Cluster admin created a CR with 100% probabilities for (Baseline)AdminNetworkPolicy and EgressFirewall. | ||
3. Messages like "Allowed by admin network policy test, direction Egress" appear on netobserv UI. | ||
4. Cluster admin sees CPU usage by ovnkube pods grows up 10%, and reduces the sampling probability to 50%. | ||
5. Still most of the flows have correct enrichment with lower overhead. | ||
6. Admin is happy. Until... | ||
7. A user from project "bigbank" reports that they can't access the internet from pod A. | ||
8. Cluster admin enables debugging for project "bigbank" with 100% probability for NetworkPolicies. | ||
9. netobserv UI or ovnkube-oberv script shows a sample with <`source pod A`, `dst_IP`> saying "Denied by network policy in namespace bigbank, direction Egress" | ||
10. Admin now has to convince the user that they are missing an allow egress rule to `dst_IP` for pod A. | ||
11. Network policy is fixed, debugging can be turned off. | ||
|
||
|
||
### API Extensions | ||
|
||
#### Observability filtering for debugging | ||
|
||
Mostly for debugging (but also can be used for observability if needed), we may need more granular filtering, | ||
like per-node or per-namespace filtering. | ||
|
||
- Per-node filtering may be implemented by adding a node selector to `ObservabilityConfig`, and is easy to implement in ovn-k | ||
as every node has its own ovn-k instance with OVN-IC. | ||
- Per-namespace filtering can only be applied to the namespaced Observability Features, like NetworkPolicy and EgressFirewall. | ||
Cluster-scoped features, like (Baseline)AdminNetworkPolicy, can only be additionally filtered based on name. Label selectors | ||
can't be used, as observability module doesn't watch any objects. | ||
|
||
#### ObservabilityConfig CRD | ||
|
||
The CRD for this feature will be a part of ovn-kubernetes to allow using it upstream. | ||
|
||
```go | ||
|
||
// ObservabilityConfig described OVN Observability configuration. | ||
// | ||
// +genclient | ||
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object | ||
// +kubebuilder:resource:path=observabilityconfig,scope=cluster | ||
// +kubebuilder:singular=observabilityconfig | ||
// +kubebuilder:object:root=true | ||
// +kubebuilder:subresource:status | ||
type ObservabilityConfig struct { | ||
metav1.TypeMeta `json:",inline"` | ||
metav1.ObjectMeta `json:"metadata,omitempty"` | ||
// +kubebuilder:validation:Required | ||
// +required | ||
Spec ObservabilitySpec `json:"spec"` | ||
// +optional | ||
Status ObservabilityStatus `json:"status,omitempty"` | ||
} | ||
|
||
type ObservabilitySpec struct { | ||
// CollectorID is the ID of the collector that should be unique across the cluster, and will be used to receive configured samples. | ||
// +kubebuilder:validation:Required | ||
// +kubebuilder:validation:Minimum=1 | ||
// +required | ||
CollectorID int32 `json:"collectorID"` | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. what is the value in the user providing the collector ID? Why not let OVNK choose a random value and insert it? Is it due to coordination with netobserv? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this ID is used by netobserv and the debug script, to say which observConfig we want to collect. Sometimes you may want 2 different ObservabilityConfigs (e.g. for netobserv and for debug script), then you need to distinguish them. As debug script doesn't have kube-api access it needs to know the ID There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So when this CR is created, couldn't OVNK assign an ID, rather than the user having to provide one? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. we could do that, but it will require a centralized (aka cluster-manager) watcher to assign IDs. Then we will need to report that ID back via status, so that it can be used with debug script. |
||
// Features is a list of Observability features that can generate samples and their probabilities for a given collector. | ||
// +kubebuilder:validation:Required | ||
// +kubebuilder:validation:MinItems=1 | ||
// +required | ||
Features []FeatureConfig `json:"features"` | ||
|
||
// Filter allows to apply ObservabilityConfig in a granular manner. | ||
// +optional | ||
Filter *Filter `json:"filter,omitempty"` | ||
} | ||
|
||
type FeatureConfig struct { | ||
// Probability is the probability of the feature being sampled in percent. | ||
// +kubebuilder:validation:Required | ||
// +kubebuilder:validation:Minimum=0 | ||
// +kubebuilder:validation:Maximum=100 | ||
// +required | ||
Probability int32 | ||
// Feature is the Observability feature that should be sampled. | ||
// +kubebuilder:validation:Required | ||
// +required | ||
Feature ObservabilityFeature | ||
} | ||
|
||
// ObservabilityStatus contains the observed status of the ObservabilityConfig. | ||
type ObservabilityStatus struct { | ||
// +listType=map | ||
// +listMapKey=type | ||
Conditions []metav1.Condition `json:"conditions,omitempty"` | ||
} | ||
|
||
// Filter allows to apply ObservabilityConfig in a granular manner. | ||
// Currently, it supports node and namespace based filtering. | ||
// If both node and namespace filters are specifies, they are logically ANDed. | ||
// +kubebuilder:validation:MinProperties=1 | ||
type Filter struct { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. would it make sense to make the filter be on a per feature (that supports it) level? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yeah potentially. The idea is that if you have a problem on a given node or in a given namespace(s), you want to enable 100% sampling for all features on that node and/or namespace. So it is potentially more convenient to specify it once per config vs per every feature. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. hmm but the sampling configuration is per feature. So I guess with this design if I want to sample a specific feature on 1 node, and another feature on another node, I would make 2 CRs? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yep, the problem here is that I don't know what kind of filtering will actually be useful, and I would like to avoid over-complicating the API if possible. |
||
// nodeSelector applies ObservabilityConfig only to nodes that match the selector. | ||
// +optional | ||
NodeSelector *metav1.LabelSelector `json:"nodeSelector,omitempty"` | ||
// namespaces is a list of namespaces to which the ObservabilityConfig should be applied. | ||
// It only applies to the namespaced features, currently that includes NetworkPolicy and EgressFirewall. | ||
// +kubebuilder:MinItems=1 | ||
// +optional | ||
Namespaces *[]string `json:"namespaces,omitempty"` | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. maybe this is more of a future topic for discussion, but I was just thinking...wouldn't another good filter be on UDN? That way i know my users in network A are having problems i just want to turn on sampling for netpol in network A, which would then translate to on the backend to every network policy for namespaces where that network is active? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah I think it is technically doable, and also can be considered an extension for config- or feature-wide filter |
||
} | ||
|
||
// ObservabilityStatus contains the observed status of the ObservabilityConfig. | ||
type ObservabilityStatus struct { | ||
// +listType=map | ||
// +listMapKey=type | ||
Conditions []metav1.Condition `json:"conditions,omitempty"` | ||
} | ||
|
||
// +kubebuilder:validation:Enum=NetworkPolicy;AdminNetworkPolicy;EgressFirewall;UDNIsolation;MulticastIsolation | ||
type ObservabilityFeature string | ||
|
||
const ( | ||
NetworkPolicy ObservabilityFeature = "NetworkPolicy" | ||
AdminNetworkPolicy ObservabilityFeature = "AdminNetworkPolicy" | ||
EgressFirewall ObservabilityFeature = "EgressFirewall" | ||
UDNIsolation ObservabilityFeature = "UDNIsolation" | ||
MulticastIsolation ObservabilityFeature = "MulticastIsolation" | ||
) | ||
``` | ||
|
||
For now, we only have use cases for 2 different `ObservabilityConfig`s in a cluster, and ovn-kubernetes will likely have a | ||
limit for the number of CRs that will be handled. | ||
|
||
For the integration with netobserv, we could use a label on `ObservabilityConfig` to signal that configured samples should be | ||
reflected via netobserv. For example, `network.openshift.io/observability: "true"`. | ||
|
||
### Topology Considerations | ||
|
||
#### Hypershift / Hosted Control Planes | ||
|
||
As long as netobserv can be installed on the management or hosted cluster, integration should work fine. | ||
If we need to make it work for shared-cluster netobserv deployment, more work.testing will be needed. | ||
Netobserv team confirmed that separate installations for management and hosted clusters are supported as a result of https://issues.redhat.com/browse/NETOBSERV-1196. | ||
|
||
For debuggability mode without netobserv integration, everything should work separately on every cluster with its own ovn-kubernetes instance. | ||
|
||
#### Standalone Clusters | ||
#### Single-node Deployments or MicroShift | ||
|
||
SNO should be supported with netobserv, Microshift is not. | ||
Microshift most likely will not be supported even without netobserv, as observability requires new features to be enabled | ||
and adds overhead to the cluster. | ||
|
||
### Implementation Details/Notes/Constraints | ||
|
||
Filtering for debugging use case may be tricky as explained in "Observability filtering for debugging" section. | ||
But it is important to implement, as OVN implementation works in a way, where performance optimizations can only be | ||
applied when there is only 1 `ObservabilityConfig` for a given feature. That means, adding a second `ObservabilityConfig` | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Unfortunately core OVN documentation doesn't mention the performance impact with multiple collectors attached to a single Sample/ACL but it should. I'll try to post a patch in ovn-org/ovn for that. If that lands in time it would be good to link to the core OVN documentation from here. |
||
for a feature will have a bigger performance impact than creating a new one. | ||
|
||
### Risks and Mitigations | ||
|
||
The biggest risk of enabling observability is performance overhead. | ||
This will be mitigated by scale-testing and sampling percentage configuration. | ||
|
||
### Drawbacks | ||
|
||
- e2e testing upstream is not possible until github runners allow using kernel 6.11 | ||
|
||
## Test Plan | ||
|
||
- e2e downstream | ||
- perf/scale testing for dataplane and control plane as a GA plan | ||
- netobserv testing for correct GUI representation | ||
|
||
## Graduation Criteria | ||
|
||
### Dev Preview -> Tech Preview | ||
|
||
None | ||
|
||
### Tech Preview -> GA | ||
|
||
- More testing (upgrade, downgrade) | ||
- Available by default, but no overhead until observability config CR is created | ||
- Run perf/scale tests and document related metrics to track perf overhead, write down any limitations/recommendations | ||
based on the workload and cluster size | ||
- User facing documentation created in openshift-docs and netobserv docs | ||
npinaeva marked this conversation as resolved.
Show resolved
Hide resolved
|
||
- Enable netobserv metrics based on samples | ||
|
||
### Removing a deprecated feature | ||
|
||
## Upgrade / Downgrade Strategy | ||
|
||
No upgrade impact is expected, as this feature will only be enabled when a new CR is created. | ||
|
||
## Version Skew Strategy | ||
|
||
All components were designed to be backwards-compatible | ||
- old kernel + new OVS: we control these versions in a cluster and make sure to bump OVS version on supported kernel | ||
- old OVS + new OVN: OVN uses "sample" action that is available in older OVS versions | ||
- old OVN + new ovn-k: rolled out at the same time, no version skew is expected | ||
- old kernel/OVS/ovn-kubernetes + new netobserv: netobserv can detect if it's running on older ocp version and warn when the users try to enable the feature. | ||
|
||
## Operational Aspects of API Extensions | ||
|
||
Perf/scale considerations will be documented as a part of GA plan. | ||
|
||
## Support Procedures | ||
|
||
## Alternatives |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should describe exactly what this means:
"default network pod trying to talk to the UDN pod on the default-network interface"