Skip to content

RoCEv2 OTG Model #405

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 57 commits into
base: master
Choose a base branch
from
Open

RoCEv2 OTG Model #405

wants to merge 57 commits into from

Conversation

satyamsinghKey
Copy link
Collaborator

@satyamsinghKey satyamsinghKey commented Feb 11, 2025

Issue: #403

New PR Link: (#415)

Redocly View:
https://redocly.github.io/redoc/?url=https://raw.githubusercontent.com/open-traffic-generator/models/dev_rocev2/artifacts/openapi.yaml&nocors#tag/Configuration/operation/set_config

Objectives:
Add support of RoCEv2 Protocol in OTG model
Added fields under Configuration and Monitor

Sample Code Snippet:

    config = api.config()
    (device,) = config.devices.device(name="device")

    # device config
    eth = device.ethernets.add()
    eth.name = "eth"
    eth.connection.port_name = port.name
    eth.mac = "00:00:00:00:00:11"
    ipv4 = eth.ipv4_addresses.add()
    ipv4.name = "ipv4"
    ipv4.address = "21.1.1.1"
    ipv4.prefix = 24
    ipv4.gateway = "21.1.1.2"

    rocev2v4 = device.rocev2
    rocev2v4_int = rocev2v4.ipv4_interfaces.add()
    rocev2v4_int.ipv4_name = ipv4.name
    rocev2v4_peer = rocev2v4_int.peers.add()
    rocev2v4_peer.name = "RoCEv2 1"
    rocev2v4_peer.num_of_qps = 2
    rocev2v4_peer.remote_end_point_ip_address = "2.2.2.2"
    rocev2v4_peer.ib_mtu = 15
    flow_settings = rocev2v4_peer.flow_settings.add()
    flow_settings.dscp = 25


    # push configuration
    api.set_config(config)

    # start all protocols
    control_state = api.control_state()
    control_state.protocol.all.state = control_state.protocol.all.START
    api.set_control_state(control_state)

    # create a query for rocev2 metrics
    req = api.metrics_request()
    req.rocev2.peer_names = ["RoCEv2 1"]
    results = api.get_metrics(req)

satyamsinghKey and others added 30 commits November 21, 2024 20:49
@satyamsinghKey satyamsinghKey changed the title Dev rocev2 RoCEv2 OTG Model Feb 11, 2025
@@ -51,19 +51,26 @@ components:
items:
$ref: '../flow/flow.yaml#/components/schemas/Flow'
x-field-uid: 6
stateful_flows:
description: |-
stateful flows currently on RoCEv2 present.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not for RoCEv2 alone. So probably something like "The stateful flows that will be configured for the traffic generator, for protocols such as RoCEv2 where packet rate and/or contents can change dynamically based on packets received from the connected device "

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

modified description

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@apratimmukherjee , @satyamsinghKey - I have doubt if we will change the packet content. For example for Roce, the content of the packet does not change. What changes dynamically/state-fully is packet rate (due to detection of congestion in the network), sequence of packet (for example the old packet may get send again due to dynamic retransmission) etc. In a prefect scenarios what changes in every packet is SEQ num. iCRC etc. But in the first reading it felt by "content" we are meaning payload, which does not change. So you may think of omitting the word "content".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe instead of 'contents' we can use 'contents of protocol headers' .. I mean flags in IPv4 header or window etc. for TCP etc. which is unlike stateless flows where nothing really changes pkt to pkt .

x-field-uid: 3
rate:
description: |-
Two valid choices are line_rate and burst_mode, if line_rate is selected, per_port_settings will be used to configure.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more clarity would be good here.
e.g. when line_rate mode is chosen , it is required that all flows with a tx endpoint on such a port must choose line_rate mode [ unless mixed mode is alowed ???]
In this case, all the rdma flows will transmit data in interleaved manner based on qp.message size as fast as possible but not exceeding the line_rate value configured in port.rate .
If burst_mode is chosen , all flows with Tx endpoint on a port should be configured in burst_mode. Then each rdma flows sends packets as per rflow.message_size configuration for fixed / infinite times with burst_gap time between each iteration.
Some level of clarity on what is expected to go on the wire for each mode is needed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

modified description as suggested

Two valid choices are line_rate and burst_mode, if line_rate is selected, per_port_settings will be used to configure.
type: string
x-enum:
line_rate:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

port_line_rate gives a hint to configure the line_rate in the port attributes .

type: integer
format: uint64
x-field-uid: 5
connect_request_tx:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't these flap like regular tcp based sessions such as bgp ; any reason not to have a flap count.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can

rocev2_verb:
description: >-
RoCEv2 Verb, Available options are: none, write, wrtie_with_immediate, send, send_with_immediate and read: The corresponding flow will not take part in traffic.
type: string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be a choice. Allows immediate_data to be made visible only for _immediate types. Looks like it is not relevant for the other types.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is correct

format: hex
default: "00000000"
x-field-uid: 7
message_size:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is message_size part of session bring up or is it needed solely for data flows e.g. could a single QP potentially first send a payload of size 1000 bytes and then another of size 10000 bytes ? In which probably should be somewhere in stateful_flow configuration as properties of the data flow , maybe as an array with ixnetwork supporting a single value only for all flows between same qp endpoints. Or is this specification bound that each qp can have single datagram being split and sent on it ?

properties:
connection_type:
description: >-
There are multiple connection types. Valid values are : Reliable Connection (RC), Reliable Datagram (RD), Extended Reliable Connection (XRC),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the time of this implementation, please consider only RC. So from description excludes other modes. Or, explicitly mention that supported mode is RC.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not applicable for IxN but other test vendors whose configuration will be based on the model

x-field-uid: 7
message_size:
description: >-
The Maximum message size that is allowed to transfer depends on the MTU size and the number of VLANs configured on the interfaces.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maximum bytes transferred will depend on IB MTU (not Ethernet MTU). So, given message size will be broken into multiple packets viz. WF, WM, WL etc. based on the IB MTU value. We take IB MTU as an user input also.

type: array
items:
$ref: '../flow/stateful_flow.yaml#/components/schemas/StatefulFlow'
x-field-uid: 7
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You cannot change x-uids of existing attributes. Please add new attributes to existing objects at the end else it will break clients running tests with older snappi versions.

Configuration for RoCEv2 peer settings.
type: object
properties:
num_of_qps:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Flow settings is an array so number of times user add a new flow setting will define size of the flow settings array not this. Might need discussion.

type: object
required: [num_of_qps, destination_ip_address, ib_mtu]
properties:
num_of_qps:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to previous, the times user adds an element into the array will define the size .. should not be needing this .

description: >-
Configuration for Rocev2 IPv4 peers.
type: object
required: [num_of_qps, destination_ip_address, ib_mtu]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Name probably should be must , cant ib_mtu be non required with default value ?
num_qps probably can be removed completely.

description: >-
Configuration for Rocev2 IPv6 peer settings.
type: object
required: [num_of_qps, destination_ip_address, ib_mtu]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

num_of_qps is probably just removable and ib_mtu should not be required since it has a default value.
In general every attribute should either have a default_value or be required but not both at the same time.

type: integer
format: uint32
x-field-uid: 1
ib_mtu:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can just move the ib_mtu to v4 and v6 peer and just remove this object. Model is easier to review and maintain. Or call it RoceAttributes and just have ib_mtu in that and refer to that . Dont need flow_settings which can be directly in the b4 / v6 peer.

x-field-uid: 1
tx_endpoint:
description: |-
Tx Endpoint. Empty for Rocev2.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not clear what is meant by empty. Is it like RSVP-TE where the src defines the destination and must choose correct egress for specific ingress and no choice is available ?
It is not clear to me how this will help user control which QPs to actually send traffic on if this is empty. If a field is of no value we need to removed it.

on qp.message size as fast as possible but not exceeding the line_rate value configured.
If burst_mode is chosen, all flows with Tx endpoint on a port should be configured in burst_mode.
Then each rdma flows sends packets as per rocev2_flow_settings.message_size configuration
for fixed / infinite times with burst_gap time between each iteration.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was just a suggestion. The text might need more clarity and accuracy in terms of what the different modes are trying to achieve which needs to come from deeper understanding of rocev2 traffic control needs,
If it is indeed required that certain modes require and mandate all flows on a port to exhibit same type of behaviour, current model would leave a lot of room for error in configuration . We probably need to rethink .
Maybe
rocev2_flows :
array of rocev2_ transmit ports :
port_name
choice
target_line_mode
line rate value desired across all qp values txed from this port
array of flows
src qp endpoint (only qps on this port can be selected)
optional dst port ( for tracking maybe if needed )
other flow parameters
burst_mode ( indivdual control per flow )
array of flows
tx endpoint
optional (expected ) rx endpoint for 2 arm flows and stats
burst_mode
choice
continuous
fixed
burst_count
Not clear whether inter_batch is for line_rate or burst_mode or both -> Needs to fitted in correctly accordingly with proper explanation and expectation from implementation what is the behaviour of that setting,

properties:
inter_batch_period_unit:
description: |-
inter batch period units.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inter -> Inter
Needs more info on that all this means to packet generation by a tool implementing this,

inter_batch_period:
description: |-
Inter batch Period.
type: integer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

period.

@apratimmukherjee apratimmukherjee added the enhancement New feature or request label Mar 7, 2025
@satyamsinghKey
Copy link
Collaborator Author

A new PR has been created with the latest RoCEv2 model changes. Please refer to that PR for latest updates.
Link to new PR : (#415)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants