Introduce prefix allocation #2611
Hi @aramprice, I just watched the recording of the foundational infra meeting from April 17th. To clarify a bit:
The idea for IPv6 is to use as much native network routing as possible, i.e. not to use an overlay network for IPv6. This means that the IP addresses assigned to the containers will be delegated from an IPv6 prefix that is assigned to the Diego VM.
To your question about whether those IP addresses would move in case of an evacuation: no. These IP addresses are not "sticky" to the app or app instance and would not move. The goal is to make the networking setup for Diego simpler by using "native" IPv6 addressing. Traffic aimed at a particular container will reach the Diego VM via its CIDR range, and the Diego VM's kernel can then forward the traffic to the container's virtual NIC.
Please also note that the networks are supposed to be dual stack, not pure IPv6. So you would want to assign multiple networks (at least one IPv4 and one IPv6) to the same VM, as @fmoehler already mentioned in the call.
The "prefix" parameter is also the size of the prefix to delegate to each VM from a larger range. Have a look at the discussion while creating the RFC for a more extensive example.
The VM is supposed to self-assign an address. Usually this is the
Hi @peanball, thanks for that additional context - the info and the RFC were helpful. In thinking about the overall changes to BOSH, a few goals or principles (maybe too strong a word) came to mind:
This isn't to imply disagreement about the current state of things, only to capture my thoughts at the moment.
@aramprice thanks for your feedback! I just want to elaborate a little on our current idea, but of course this is open for ideas. Regarding your points:
Thanks @aramprice, I fully agree with you there. The logical unit of "IP address", with or without netmask (i.e. prefix), should be supported in either scenario and contain the logic of representation. As @fmoehler mentioned, omitting the And there should not be assumptions about dual stack, but dual stack should be possible. So far, BOSH has supported either v4 or v6, not both at the same time. The mechanism to support more than one network is a classic n+1 problem and can be solved as such where possible. We just happen to choose n=2, with one v4 and one v6 network. This should be the mindset.
@fmoehler I have a general comment on the term In the end this is the prefix (size) delegated to each instance attached to this subnet, right? Can we somehow express this better? My suggestion would be
alternatively
What is this change about?
As described in RFC https://github.com/cloudfoundry/community/blob/main/toc/rfc/rfc-0038-ipv6-dual-stack-for-cf.md, BOSH should be able to allocate IP prefixes (IPv4 and IPv6). Currently BOSH only supports attaching IP addresses with a /32 (IPv4) or /128 (IPv6) prefix, i.e. single IP addresses. To support prefix allocation, a new property called 'prefix' is introduced in the networks section of the cloud config. Please refer to the example below for a manual network:
This example network tells BOSH to assign a CIDR range instead of a single IP address, i.e. to slice the /56 network into multiple /80 networks.
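The arithmetic of slicing a /56 into /80 blocks can be sketched with Python's standard `ipaddress` module. This is only an illustration, not the director's code: the range `2001:db8::/56` is a documentation prefix chosen for the example, and the variable names are assumptions.

```python
import ipaddress

# Assumed example values: 2001:db8::/56 is a documentation range,
# not a range taken from the PR.
subnet_range = ipaddress.ip_network("2001:db8::/56")
prefix = 80

# A /56 contains 2 ** (80 - 56) distinct /80 blocks.
block_count = 2 ** (prefix - subnet_range.prefixlen)

# subnets() returns a generator, so only the blocks we ask for are built.
blocks = subnet_range.subnets(new_prefix=prefix)
first_blocks = [str(next(blocks)) for _ in range(3)]

print(block_count)
print(first_blocks)
```

Each entry in `first_blocks` is one candidate range that could be handed to a single VM, so a /56 subnet with `prefix` 80 has room for roughly 16.7 million such delegations.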
The IP address allocation in the BOSH director is adapted to take these prefixes into account (previously the director simply counted up by 1).
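A minimal sketch of what "stepping by block size instead of by 1" could look like, written in Python rather than the director's Ruby. The function name `next_free_block` and its parameters are hypothetical, and the sketch ignores reserved, static, and gateway addresses that a real allocator would also have to skip.

```python
import ipaddress

def next_free_block(subnet_range, prefix, allocated):
    """Return the first /prefix block in subnet_range that overlaps no
    already allocated block, or None if the range is exhausted."""
    network = ipaddress.ip_network(subnet_range)
    taken = [ipaddress.ip_network(a) for a in allocated]
    # Number of addresses per block: 1 for /32 or /128, 2**48 for an IPv6 /80.
    step = 2 ** (network.max_prefixlen - prefix)
    candidate = int(network.network_address)
    last = int(network.broadcast_address)
    while candidate + step - 1 <= last:
        # type(network) keeps the address family (IPv4Network or IPv6Network).
        block = type(network)((candidate, prefix))
        if not any(block.overlaps(t) for t in taken):
            return block
        candidate += step  # jump a whole block, not a single address
    return None

print(next_free_block("2001:db8::/56", 80, ["2001:db8::/80"]))
print(next_free_block("10.0.0.0/24", 32, ["10.0.0.0/32"]))
```

With a prefix of 32 (or 128 for IPv6) the step is 1, so the old "count up by 1" behaviour falls out as a special case of the same loop.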
One major change is how the IP addresses are stored in the database. The address_str field in the ip_addresses table will change from storing the IP address as an integer representation to storing it in CIDR notation. This change is necessary so that the prefix information is not lost when the IP address is stored. It also has the advantage that an IpAddrOrCidr object can be created directly from the string coming from the database.
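Why the integer form loses information can be shown in a few lines of Python. The helper names are hypothetical, and the old format is assumed to be the address written as a decimal integer string, based only on the description above.

```python
import ipaddress

def to_integer_str(cidr):
    """Assumed old-style value: the address as a decimal integer string."""
    return str(int(ipaddress.ip_interface(cidr).ip))

def to_cidr_str(cidr):
    """New-style value: address and prefix length in CIDR notation."""
    return ipaddress.ip_interface(cidr).with_prefixlen

single_address = "10.0.0.4/32"
delegated_block = "10.0.0.4/30"

# Both entries collapse to the same integer, so the prefix cannot be recovered.
same_integer = to_integer_str(single_address) == to_integer_str(delegated_block)

# The CIDR strings stay distinct and can be parsed back with the prefix intact.
restored = ipaddress.ip_network(to_cidr_str(delegated_block))

print(same_integer, restored.prefixlen)
```

The round trip through `ip_network` plays the role that constructing an IpAddrOrCidr object from the database string plays in the director.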
This PR also changes the RPC interface for the create_vm method. It will include a separate field called "prefix". The prefix information is sent in a separate field so that existing CPIs do not break: older CPIs that do not support prefix allocation will simply ignore this field. Below you can find an example network section of the create_vm call:
The prefix here is 32, indicating a single IP address.
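The backward-compatibility argument can be illustrated with a small Python sketch. Everything here apart from the separate "prefix" field is an assumption: the shape of the network section, the other field names, and both handler functions are hypothetical and not taken from the PR or from any real CPI.

```python
import ipaddress

# Hypothetical network section of a create_vm call. Only the existence of a
# separate "prefix" field comes from the PR description.
network_settings = {
    "default": {
        "type": "manual",
        "ip": "10.0.0.5",
        "prefix": "32",
        "netmask": "255.255.255.0",
        "gateway": "10.0.0.1",
    }
}

def legacy_cpi_address(network):
    """A CPI that predates prefix allocation reads only the fields it knows,
    so the extra "prefix" key is silently ignored."""
    return network["ip"]

def prefix_aware_cpi_address(network):
    """A prefix-aware CPI combines ip and prefix. If the field is absent,
    fall back to a single address (/32 for IPv4, /128 for IPv6)."""
    address = ipaddress.ip_address(network["ip"])
    prefix = int(network.get("prefix", address.max_prefixlen))
    return str(ipaddress.ip_network(f"{address}/{prefix}"))

print(legacy_cpi_address(network_settings["default"]))
print(prefix_aware_cpi_address(network_settings["default"]))
```

Because the address itself stays in the existing field, an older CPI keeps working unchanged, while a newer one can treat the pair as a CIDR range.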
What tests have you run against this PR?
Include a comprehensive list of all tests run successfully.
How should this change be described in bosh release notes?
Something brief that conveys the change and is written with the Operator audience in mind.
See previous release notes for examples.
Does this PR introduce a breaking change?
Does this introduce changes that would require operators to take care in order to upgrade without a failure?
Tag your pair, your PM, and/or team!
It's helpful to tag a few other folks on your team or your team alias in case we need to follow up later.