Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SDN-5519: OVN-K: Localnet API support #1750

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ormergi
Copy link
Contributor

@ormergi ormergi commented Feb 4, 2025

Introduce API for Localnet topology networks.

This proposal outlines the Localnet API utilizing OVN-Kubernetes ClusterUserDefinedNetwork (CUDN) CRD.
And extending the CUDN CRD to allow creating localnet topology networks.

https://issues.redhat.com/browse/SDN-5519

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 4, 2025
Copy link
Contributor

openshift-ci bot commented Feb 4, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ormergi ormergi force-pushed the localnet-api-ext branch 7 times, most recently from f552b5c to 466d214 Compare February 5, 2025 13:37
@ormergi ormergi marked this pull request as ready for review February 6, 2025 17:32
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 6, 2025
@openshift-ci openshift-ci bot requested review from danwinship and dougbtv February 6, 2025 17:33
@ormergi
Copy link
Contributor Author

ormergi commented Feb 6, 2025

/cc @maiqueb @ricardomaraschini

Copy link
Contributor

@maiqueb maiqueb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor changes requested.

From my perspective, you can start fishing for an SDN reviewer, and an API reviewer.

- "@ormergi"
reviewers:
- "@maiqueb"
- TBD
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

start networking this PR in the SDN team, and find an API reviewer from them (I'd suggest @npinaeva ) and an approver.

We also want someone involved from the API team asap. We don't want them getting involved too late.


## Proposal

### Summary
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another summary ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to express the proposal in few sentences to enable reviewers realize whats the proposed solution idea.

bridge-mappings:
- localnet: physnet <---- has to match the NAD config.spec.name. OR
the NAD config.spec.physicalNetworkName
bridge: br-ex <--------- OVN-K switch
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bridge: br-ex <--------- OVN-K switch
bridge: br-ex <--------- OVS switch

You're not forced to use br-ex. You can use nmstate to create and configure whatever other switch you want.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


As of today the CUDN CRD controller does not allow mutating `spec.network`.
In order to comply with NAD spec mutation support, the CUDN CRD controller should allow spec.network mutation for
localnet topologies, specifically MTU and VLAN.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plus the physicalNetworkName and ignored subnets.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


The CUDN `spec.network` follows the [discriminated-union](https://github.com/openshift/enhancements/blob/master/dev-guide/api-conventions.md#discriminated-unions),
convention.
The `spec.network.topology` serves as the union discriminator, it should accept `Locannet` option.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The `spec.network.topology` serves as the union discriminator, it should accept `Locannet` option.
The `spec.network.topology` serves as the union discriminator, it should accept `Localnet` option.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

| VLAN | The network VLAN ID. | Yes |
| Subnets | List of CIDRs used for the pod network across the cluster.Dual-stack clusters may set 2 subnets (one for each IP family), otherwise only 1 subnet is allowed. The format should match standard CIDR notation (for example, "10.128.0.0/16"). This field must be omitted if `ipam.mode` is `Disabled`. | Yes |
| ExcludeSubnets | List of CIDRs removed from the specified CIDRs in `subnets`.The format should match standard CIDR notation (for example, "10.128.0.0/16"). This field must be omitted if `subnetes` is unset or `ipam.mode` is `Disabled`. | Yes |
| IPAM | Contains IPAM-related configuration for the network. When ipam.lifecycle=Persistent enable workloads have persistent IP addresses. For example: Virtual Machines will have the same IP addresses along their lifecycle (stop, start migration, reboots). | Yes |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm conflicted. Should we actually move the subnets + excludesubnets attributes to IPAM ? ...

It kind of makes sense ... but would make it diverge from L2.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggested changing IPAM config in recent IPAM config API changes to have subents but unfortunately it wasn't accepted.

Its like you said if we change it for localnet topology configuration, it diverge from other typologies. I think we should avoid that 😬

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we definitely want all network types to be consistent, and we discussed this (more than once :D) before.
I think one of the reason to leave it separate was our plan to use subnets field for egress setup but with disabled ipam, so subnets isn't necessarily a part of ipam

- When Subnets consist of IPv6 CIDR, minimum MTU should be 1280.
- `VLAN`:
- VLAN IDs `0` (untagged),`1` (default), highest VLAN ID is `4095`.
Minimum: 2, Maximum: 4095.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed. Also added ref.

"topology": "localnet",
"name": "cluster.udn.test-net",
"physicalNetworkName: "tenantblue",
"mtu": 1500,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could you fix the formatting ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

ormergi

This comment was marked as duplicate.

@ormergi
Copy link
Contributor Author

ormergi commented Feb 9, 2025

Changes Address #1750 (review)

Copy link
Contributor

@maiqueb maiqueb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requesting reviews from our peers .

- "@maiqueb"
- TBD
approvers:
- TBD
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JoelSpeed could you review this API ? We've tried to keep it really small and focused on the API itself.

Or find someone on your team that could lend a hand.

- "@ormergi"
reviewers:
- "@maiqueb"
- TBD
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@npinaeva you're my go-to API person in the SDN team.

Can we beg you for some reviews ?... Or has tried to keep this really short.

@maiqueb
Copy link
Contributor

maiqueb commented Feb 10, 2025

@ormergi you need to fix the markdown linter ...

Copy link
Member

@npinaeva npinaeva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense!

- `VLAN`: <br/>
According to [dot1q (IEEE 802.1Q)](https://ieeexplore.ieee.org/document/10004498),
VID (VLAN ID) is 12-bits field, providing 4096 values; 0 - 4095. <br/>
The VLAN IDs `0` and `4095` are reserved, `1` is the default. <br/>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does "1 is the default" mean? I thought default option was unspecified?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VLAN ID 1 is also reserved, it is used for management and usually the default. Sorry for confusion.
Fixed.

// +kubebuilder:validation:XValidation:rule="!has(self.ipam) || !has(self.ipam.mode) || self.ipam.mode != 'Disabled' || !has(self.subnets)", message="Subnets must be unset when ipam.mode is Disabled"
// +kubebuilder:validation:XValidation:rule="!has(self.ipam) || !has(self.ipam.mode) || self.ipam.mode != 'Disabled' || self.role == 'Secondary'", message="Disabled ipam.mode is only supported for Secondary network"
// +kubebuilder:validation:XValidation:rule="!has(self.subnets) || !has(self.mtu) || !self.subnets.exists_one(i, isCIDR(i) && cidr(i).ip().family() == 6) || self.mtu >= 1280", message="MTU should be greater than or equal to 1280 when IPv6 subent is used"
// +kubebuilder:validation:XValidation:rule="!has(self.excludeSubnets) || has(self.excludeSubnets) && !has(self.subnets)", message="ExcludeSubnets must be unset when subents is unset"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if we want to review CEL validation here or in the implementation PR, but I think this should be
!has(self.excludeSubnets) || has(self.subnets) not sure if we want to add IP family validation, may be costly for 25 items

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed CEL validation rules for now, we can revisit them in the implementation PR.

Thanks for the suggestion on this rule, noted 🙂

// The format should match standard CIDR notation (for example, "10.128.0.0/16").
// This field must be omitted if `subnetes` is unset or `ipam.mode` is `Disabled`.
//
// +optional
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will need its own type, DualStackCIDRs has MaxItems=2

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!
I changed it to []CIDR.


// VLAN is the VLAN ID.
//
// +kubebuilder:validation:Minimum=2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what happened to vlan=1?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VLAN ID 1 is also reserved by the standard, used for management.
The bridge (the bridging entity, e.g.: switch, linux-bridge) defines each port with VLAN 1if not spesifed.

| VLAN | The network VLAN ID. | Yes |
| Subnets | List of CIDRs used for the pod network across the cluster.Dual-stack clusters may set 2 subnets (one for each IP family), otherwise only 1 subnet is allowed. The format should match standard CIDR notation (for example, "10.128.0.0/16"). This field must be omitted if `ipam.mode` is `Disabled`. | Yes |
| ExcludeSubnets | List of CIDRs removed from the specified CIDRs in `subnets`.The format should match standard CIDR notation (for example, "10.128.0.0/16"). This field must be omitted if `subnetes` is unset or `ipam.mode` is `Disabled`. | Yes |
| IPAM | Contains IPAM-related configuration for the network. When ipam.lifecycle=Persistent enable workloads have persistent IP addresses. For example: Virtual Machines will have the same IP addresses along their lifecycle (stop, start migration, reboots). | Yes |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we definitely want all network types to be consistent, and we discussed this (more than once :D) before.
I think one of the reason to leave it separate was our plan to use subnets field for egress setup but with disabled ipam, so subnets isn't necessarily a part of ipam

// Secondary network is only assigned to pods that use `k8s.v1.cni.cncf.io/networks` annotation to select given network.
//
// +kubebuilder:validation:Required
// +kubebuilder:validation:XValidation:rule="self != 'Secondary'", message="Localnet topology is supported for secondary networks only."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense to leave role here explicitly, but is it ever possible to enable this for primary network?

Copy link
Contributor

@maiqueb maiqueb Feb 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have no plans, and IMHO, the use-case is pointless.

It doesn't need egress, it will not have SNAT to the node's IP addresses, etc ...

I don't see it happening.

Copy link
Contributor Author

@ormergi ormergi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@npinaeva PTAL

- `VLAN`: <br/>
According to [dot1q (IEEE 802.1Q)](https://ieeexplore.ieee.org/document/10004498),
VID (VLAN ID) is 12-bits field, providing 4096 values; 0 - 4095. <br/>
The VLAN IDs `0` and `4095` are reserved, `1` is the default. <br/>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VLAN ID 1 is also reserved, it is used for management and usually the default. Sorry for confusion.
Fixed.

// +kubebuilder:validation:XValidation:rule="!has(self.ipam) || !has(self.ipam.mode) || self.ipam.mode != 'Disabled' || !has(self.subnets)", message="Subnets must be unset when ipam.mode is Disabled"
// +kubebuilder:validation:XValidation:rule="!has(self.ipam) || !has(self.ipam.mode) || self.ipam.mode != 'Disabled' || self.role == 'Secondary'", message="Disabled ipam.mode is only supported for Secondary network"
// +kubebuilder:validation:XValidation:rule="!has(self.subnets) || !has(self.mtu) || !self.subnets.exists_one(i, isCIDR(i) && cidr(i).ip().family() == 6) || self.mtu >= 1280", message="MTU should be greater than or equal to 1280 when IPv6 subent is used"
// +kubebuilder:validation:XValidation:rule="!has(self.excludeSubnets) || has(self.excludeSubnets) && !has(self.subnets)", message="ExcludeSubnets must be unset when subents is unset"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed CEL validation rules for now, we can revisit them in the implementation PR.

Thanks for the suggestion on this rule, noted 🙂

// The format should match standard CIDR notation (for example, "10.128.0.0/16").
// This field must be omitted if `subnetes` is unset or `ipam.mode` is `Disabled`.
//
// +optional
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!
I changed it to []CIDR.


// VLAN is the VLAN ID.
//
// +kubebuilder:validation:Minimum=2
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VLAN ID 1 is also reserved by the standard, used for management.
The bridge (the bridging entity, e.g.: switch, linux-bridge) defines each port with VLAN 1if not spesifed.

@ormergi
Copy link
Contributor Author

ormergi commented Feb 12, 2025

Changes: address #1750 (review)

@ormergi
Copy link
Contributor Author

ormergi commented Feb 12, 2025

@trozet could you please have a look? 🙏

Comment on lines 403 to 404
// +required
Network NetworkSpec `json:"network"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is spec.network already a required field today?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

```go
// +kubebuilder:validation:XValidation:rule="self == oldSelf", message="Network spec is immutable"
```
This validation should be changed to allow mutation localnet topology configurations, at least MTU and VLAN.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you saying that spec.network should not be immutable only if the spec.network.topology is set to Localnet?

How do you plan to implement that change to the validation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and let me add some context to allowing spec mutations for localnet topology:
Allowing spec mutation for localnet motivation followed by recorded customer cases describing the current API limitations flaws (e.g.: immutable spec) and impact on UX (bad) https://issues.redhat.com/browse/PLMCORE-10896.
For starters we want to allow spec mutation mostly for the following fields: VLAN, MTU, excludeSubnets and physicalNetworkName.
Having it localnet topology settings mutable improve the UX because it save re-creating CRs in case of misconfiguration, and allow changes to the topology settings on day2.

Regarding to CEL rule validations required changes, what I had in mind is inverting the spec mutation validation we have today on spec.network as follows:

  1. Remove mutation restrictions on spec.network.
  2. Add CEL rule validations to restrict mutations of spec.network.topology, spec.network.layer2, and spec.network.layer3.
  3. Add CEL rule validations to restrict mutation of localnet topology settings: spec.network.localnet.role, spec.network.localnet.subnetes and spec.network.topology.ipam.

Depending on OVN-K maintainers feedback we can, do broader change and allow spec mutations for other topologies as well (layer2 and layer3), naturally it will simplify the required CEL rule validations.

I though we better start with hardest restrictions we can that make sense, and upon necessity remove them later on, keeping things backward compatible.
WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that approach sounds reasonable. I do think it would be worth explicitly including this approach in the enhancement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added text about allowing spec mutations for localnet configurations, at the API level.

//
// +kubebuilder:validation:Required
// +required
Role NetworkRole `json:"role"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the NetworkRole a type that already exists today? Does it have the appropriate enum tags to limit the allowed values to only Secondary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, NetworkRole type exist and limited to Primary and Secondary,
please see https://github.com/ovn-kubernetes/ovn-kubernetes/blob/master/go-controller/pkg/crd/userdefinednetwork/v1/shared.go#L170.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you intending for this field to also allow setting Primary and Secondary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see #1750 (comment)

For localnet typologies we dont see it happen in the near future, hence Secondary only for now.
Since we want to maintain similar semantics to exiting typologies, this field should remain required.
Make sense?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarifying here: there is no use case for having a primary UDN whose topology is localnet.

@everettraven what does the openshift API guidelines say about this scenario ?... What is more important ? Consistency across the API (forcing the user to specify always the same value) or make the user provide less input ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My specific concern here is that you specify that a user can only set Secondary in the Godoc, but the type you are using has validation that allows setting both Primary and Secondary. There is no additional validation that I see enforces that this must only be set to Secondary. Without that explicit validation in place your Godoc is not accurate.

Copy link
Contributor Author

@ormergi ormergi Feb 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 422 to 431
// +kubebuilder:validation:Required
// +required
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only use the +required tag.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally, ensure it is clearly stated in the Godoc that the field is required. Users using something like oc explain ... can't see the tags.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Localnet topology configuration type
```go
type LocalnetConfig struct {
// Role describes the network role in the pod.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

References to fields in godoc should use the serialized field name as per OpenShift API conventions: https://github.com/openshift/enhancements/blob/master/dev-guide/api-conventions.md#use-json-field-names-in-godoc

Suggested change
// Role describes the network role in the pod.
// role describes the network role in the pod.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is a "network role" ? Is this something that users are expected to already be familiar with? Why would they care about setting this value? What does it let them achieve?

These are all questions that I would expect to have answered when reading the godoc for this field.

For more guidance on some things to consider to write user readable documentation in godoc, see https://github.com/openshift/enhancements/blob/master/dev-guide/api-conventions.md#write-user-readable-documentation-in-godoc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is a "network role" ? Is this something that users are expected to already be familiar with? Why would they care about setting this value? What does it let them achieve?

Network role control whether the pod interface will act as the primary or secondary. primary used as the default gateway for the pod. Its up to the user to deciede what kind of network they like to create.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added text to godoc.


// IPAM section contains IPAM-related configuration for the network.
//
// +optional
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explicitly mention that this field is optional in the godoc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

// +optional
IPAM *IPAMConfig `json:"ipam,omitempty"`

// MTU is the maximum transmission unit for a network.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// MTU is the maximum transmission unit for a network.
// mtu is the maximum transmission unit for a network.

Why does a user care about setting this field? What does it allow them to achieve?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It enable users align the MTU used across the stack, fitting the MTU of the pod network interface to the MTU set on the node interface.

In a scenario physicalNetworkName points to a network interface who has certain MTU settings, e.g: 1600, the pod interface should have the exact same MTU as the node network interface.
Otherwise, the MTU wont be aligned across the stack (pod has MTU X node has MTU Y), it could result in network disruptions and bad performance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added text to godoc

Comment on lines 457 to 484
// +kubebuilder:validation:Minimum=576
// +kubebuilder:validation:Maximum=65536
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explicitly mention the minimum and maximum values in the Godoc so users using something like oc explain ... can see the minimum and maximum constraints.

You mentioned previously in the enhancement that this should have a higher minimum value if the subnets are IPv6 - is there any way to enforce that at admission time? if not, how are planning to enforce it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Building on this a bit more, if you aren't able to enforce this increased minimum based on subnet types (IPv4 vs IPv6) at admission time is it because of the API structure? is there a change you could make to the API structure to more easily enforce that at admission time?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mentioned previously in the enhancement that this should have a higher minimum value if the subnets are IPv6 - is there any way to enforce that at admission time? if not, how are planning to enforce it?

It is enforced using CEL validation rule on Layer2 and Layer3 configuration types, please see
https://github.com/ovn-kubernetes/ovn-kubernetes/blob/master/go-controller/pkg/crd/userdefinednetwork/v1/shared.go#L93
https://github.com/ovn-kubernetes/ovn-kubernetes/blob/master/go-controller/pkg/crd/userdefinednetwork/v1/shared.go#L28

The proposed Localnet topology configuration type should have similar validation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added text to godoc about min/max constrains

// +optional
MTU int32 `json:"mtu,omitempty"`

// VLAN is the VLAN ID.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// VLAN is the VLAN ID.
// vlan is the VLAN ID.

Why does a user care about setting this field? What does it allow them to achieve?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a scenerio physicalNetworkName points to a network who has some VLAN applied, it should be configured in the spec.
Otherwise pods connected to the subject network will not be able to communicate with other hosts on the network.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added text to godoc

Comment on lines 464 to 493
// +kubebuilder:validation:Minimum=2
// +kubebuilder:validation:Maximum=4094
// +optional
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explicitly mention these constraints in the godoc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor Author

@ormergi ormergi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@everettraven thank you so much for the feedback, I am working on resolving you comments.

Comment on lines 403 to 404
// +required
Network NetworkSpec `json:"network"`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

```go
// +kubebuilder:validation:XValidation:rule="self == oldSelf", message="Network spec is immutable"
```
This validation should be changed to allow mutation localnet topology configurations, at least MTU and VLAN.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and let me add some context to allowing spec mutations for localnet topology:
Allowing spec mutation for localnet motivation followed by recorded customer cases describing the current API limitations flaws (e.g.: immutable spec) and impact on UX (bad) https://issues.redhat.com/browse/PLMCORE-10896.
For starters we want to allow spec mutation mostly for the following fields: VLAN, MTU, excludeSubnets and physicalNetworkName.
Having it localnet topology settings mutable improve the UX because it save re-creating CRs in case of misconfiguration, and allow changes to the topology settings on day2.

Regarding to CEL rule validations required changes, what I had in mind is inverting the spec mutation validation we have today on spec.network as follows:

  1. Remove mutation restrictions on spec.network.
  2. Add CEL rule validations to restrict mutations of spec.network.topology, spec.network.layer2, and spec.network.layer3.
  3. Add CEL rule validations to restrict mutation of localnet topology settings: spec.network.localnet.role, spec.network.localnet.subnetes and spec.network.topology.ipam.

Depending on OVN-K maintainers feedback we can, do broader change and allow spec mutations for other topologies as well (layer2 and layer3), naturally it will simplify the required CEL rule validations.

I though we better start with hardest restrictions we can that make sense, and upon necessity remove them later on, keeping things backward compatible.
WDYT?

// Role describes the network role in the pod.
//
// Allowed value is "Secondary".
// Secondary network is only assigned to pods that use `k8s.v1.cni.cncf.io/networks` annotation to select given network.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The network will be defined as a secondary network, pods will need to specify Mutltus networks annotation in other to connect to that network. For example: given CR with metadata.name mynet, pod should have k8s.v1.cni.cncf.io/networks: mynet annotation.

What makes it work is the fact the CUDN controller creates a NAD at the target namespace, allowing pods to connect to it.
The corresponding NAD meta.name will be the same as the CUDN CR meta.name.

//
// +kubebuilder:validation:Required
// +required
Role NetworkRole `json:"role"`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, NetworkRole type exist and limited to Primary and Secondary,
please see https://github.com/ovn-kubernetes/ovn-kubernetes/blob/master/go-controller/pkg/crd/userdefinednetwork/v1/shared.go#L170.

// +required
Role NetworkRole `json:"role"`

// PhysicalNetworkName the OVS bridge-mapping's network-name configured in the nodes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The JSON tag name should be PhysicalNetworkName, sorry for the confusion I will fix it.

There is not interaction with other fields in this API.
There is an interaction with Kubernetes-nmsate NodeNetworkConfigurationPolicy (NNCP) API though:
the CUDN spec.network.localnet.physicalNetworkName should point to the NNCP spec.desiredState.ovn.bridge-mappings as mentioned earlier in the doc.

// +optional
ExcludeSubnets []CIDR `json:"excludeSubnets,omitempty"`

// IPAM section contains IPAM-related configuration for the network.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The IPAM stands for OVN-K IPAM configuration for the subject network, its implemented for Layer2 topology configuration type, please see [1] it has prerry nice godoc explaining what it does 🙂

The proposal is to have Localnet topology configuration type has similar ipam cofiguration type as Layer2 has.

Comment on lines 457 to 484
// +kubebuilder:validation:Minimum=576
// +kubebuilder:validation:Maximum=65536
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mentioned previously in the enhancement that this should have a higher minimum value if the subnets are IPv6 - is there any way to enforce that at admission time? if not, how are planning to enforce it?

It is enforced using CEL validation rule on Layer2 and Layer3 configuration types, please see
https://github.com/ovn-kubernetes/ovn-kubernetes/blob/master/go-controller/pkg/crd/userdefinednetwork/v1/shared.go#L93
https://github.com/ovn-kubernetes/ovn-kubernetes/blob/master/go-controller/pkg/crd/userdefinednetwork/v1/shared.go#L28

The proposed Localnet topology configuration type should have similar validation.

// +optional
MTU int32 `json:"mtu,omitempty"`

// VLAN is the VLAN ID.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a scenerio physicalNetworkName points to a network who has some VLAN applied, it should be configured in the spec.
Otherwise pods connected to the subject network will not be able to communicate with other hosts on the network.

// +optional
IPAM *IPAMConfig `json:"ipam,omitempty"`

// MTU is the maximum transmission unit for a network.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It enable users align the MTU used across the stack, fitting the MTU of the pod network interface to the MTU set on the node interface.

In a scenario physicalNetworkName points to a network interface who has certain MTU settings, e.g: 1600, the pod interface should have the exact same MTU as the node network interface.
Otherwise, the MTU wont be aligned across the stack (pod has MTU X node has MTU Y), it could result in network disruptions and bad performance.

Localnet topology configuration type
```go
type LocalnetConfig struct {
// Role describes the network role in the pod.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is a "network role" ? Is this something that users are expected to already be familiar with? Why would they care about setting this value? What does it let them achieve?

Network role control whether the pod interface will act as the primary or secondary. primary used as the default gateway for the pod. Its up to the user to deciede what kind of network they like to create.

@ormergi ormergi changed the title OVN-K: Localnet API support SDN-5634: OVN-K: Localnet API support Feb 13, 2025
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Feb 18, 2025
@ormergi
Copy link
Contributor Author

ormergi commented Feb 18, 2025

Changes: resolve #1750 (review)

Copy link
Contributor

openshift-ci bot commented Feb 18, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from trozet. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@npinaeva
Copy link
Member

Yes, you can mutate NADs. You only need to restart the workloads attached to it for them to consume the new version - assuming the network is in-sync.

thank you for clarifying! I think this part should go in the enhancement, considering it is the first time we allow to mutate spec. Otherwise /lgtm

@npinaeva its mentioned here, is it fine or you expect something else?

I mean the fact that you need to restart workloads to consume the new changes

@ormergi ormergi force-pushed the localnet-api-ext branch 3 times, most recently from dca226c to c44363a Compare February 19, 2025 14:08
@ormergi
Copy link
Contributor Author

ormergi commented Feb 19, 2025

Changes: resolve #1750 (comment)

@ormergi
Copy link
Contributor Author

ormergi commented Feb 19, 2025

I mean the fact that you need to restart workloads to consume the new changes

@npinaeva I have added text to clarify this point please see https://github.com/openshift/enhancements/compare/dca226ce6916f77252e10f97ad4e5a9093596245..c44363a24ffebf59b4d2cafd3584499da82afb06

@ormergi ormergi force-pushed the localnet-api-ext branch 2 times, most recently from 3bb9bc5 to 6fc5b82 Compare February 19, 2025 15:02
// For Localnet topology only `Secondary` is allowed.
// Secondary network is only assigned to pods that use `k8s.v1.cni.cncf.io/networks` annotation to select given network.
// +required
// +kubebuilder:validation:Enum=Secondary
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If NetworkRole already specifies an enum validation tag this won't work. To use an Enum tag this needs to be its own type alias to a string or a string. The only other path to restrict this further in this case would be a CEL expression.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using a CEL expression you would end up with a potentially misleading CRD schema though because Primary would show up as an allowed value in the schema but in reality it would be blocked by your XValidation

Copy link
Contributor Author

@ormergi ormergi Feb 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can avoid it by moving the enum validation for NetworkRole definition to each topology config definition (where NetworkRole used), as follows:

  1. Remove the enum validation exist on NetworkRole definition
  2. On Layer2Config definition, add enum validation to role field allowing Primary or Secondary.
  3. On Layer3Config definition, add enum validation to role field allowing Primary or Secondary.
  4. Finally, in the proposed LocalnetConfig definition, add enum validation to role field allowing Secondary only.

@everettraven WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that sounds reasonable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ormergi
Copy link
Contributor Author

ormergi commented Feb 20, 2025

Changes:

@ormergi
Copy link
Contributor Author

ormergi commented Feb 20, 2025

@everettraven @maiqueb @npinaeva I think I resolved all comments, could you please take another look?

@ormergi
Copy link
Contributor Author

ormergi commented Feb 20, 2025

Changes: fix indentations

Copy link
Contributor

@maiqueb maiqueb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments, nothing big.

Real close to having the +1 again.

1. `name`
The underlying network name.
- Should match the node OVS bridge-mapping network-name.
- In case Kubernetes-nmstate is used, should match the `NodeNetworkConfigurationPolicy` (NNCP) `spec.desiredState.ovn.bridge-mappings` value.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should actually match the localnet attribute of an item on that slice / array.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added text to clarify that

2. `physicalNetworkName`
Points to the node OVS bridge-mapping network-name - the network-name mapped to the node OVS bridge that provides access to that network.
(Can be defined using Kubernetes-nmstate NNCP - `spec.desiredState.ovn.bridge-mappings`)
- Overrides the `name` attribute.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

defined in .1 , for clarity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

As of today the CUDN CRD controller does not allow mutating `spec.network`.

OVN-Kubernetes support NAD spec mutations on day 2.
For the new the spec to take place, restarting the workloads is necessary.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
For the new the spec to take place, restarting the workloads is necessary.
For the new network configuration (defined in the spec) to take place, the user must restart the workloads.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

network:
topology: Localnet
localnet:
name: "tenantblue"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still prefer physicalNetworkName but, maybe just name is good enough.

##### Subnets
The subnets and exclude-subnets should be in CIDR form, similar to Layer2 topology subnets.

##### Persistent IPs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I miss explicitly indicating which is the default. Which I assume to be false.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes default is false. Added text to clarify that.

- `Role`:
- Required.
- Can be Secondary only when topology=Localnet.
- Having Role explicitly makes the API predictable and consistent with other topologies. In addition it enables extending localnet for Primary networks.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's no use case for primary UDN in localnet. It's pointless.

... but I understand your point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the text to omit primary role

localnet:
role: Secondary
physicalNetworkName: tenantblue
subnets: ["192.168.100.0/24", "2001:dbb::/64"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for completeness, let's also exclude the GW IPs here (one for each subnet).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this be a /32 ?

Comment on lines +292 to +301
labels:
k8s.ovn.org/user-defined-network: ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this label be on the namespace ?...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CRDs controller label NADs with this label making it easier to aggregate for maintenance by admins (view all UDN NAD, etc..).
The primary UDN label for namespace is the same.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, fair enough.

// Allowed values are "Layer3", "Layer2" and "Localnet".
// Layer3 topology creates a layer 2 segment per node, each with a different subnet. Layer 3 routing is used to interconnect node subnets.
// Layer2 topology creates one logical switch shared by all nodes.
// Localnet topology attach to the overlay local network. Enables egress to the provider's physical network.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kind of always seen this the other way around; Localnet topology (which is an OVN term) attaches to the physical network .

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the text saying localnet attach to physical network

2. Add CEL rule validations to restrict mutations of:
`spec.network.topology`, `spec.network.layer2`, and `spec.network.layer3`.
3. Add CEL rule validations to restrict mutation of localnet topology settings:
`spec.network.localnet.role`, `spec.network.localnet.subnetes` and `spec.network.topology.ipam`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
`spec.network.localnet.role`, `spec.network.localnet.subnetes` and `spec.network.topology.ipam`.
`spec.network.localnet.role`, `spec.network.localnet.subnets` and `spec.network.topology.ipam`.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Fixed.

@ormergi
Copy link
Contributor Author

ormergi commented Feb 20, 2025

Changes: address #1750 (review)
@maiqueb PTAL 🙏

@maiqueb
Copy link
Contributor

maiqueb commented Feb 20, 2025

/lgtm

I just think you have a typo (using a /31 subnet instead of a /32)

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Feb 20, 2025
Copy link
Contributor

openshift-ci bot commented Feb 20, 2025

@ormergi: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants