-
Notifications
You must be signed in to change notification settings - Fork 770
mlx5: ignore QP max_recv_wr when SRQ is used #1624
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
According to the manual, max_recv_wr and max_recv_sge are ignored by `ibv_create_qp` if the QP is to be associated with an SRQ. Fix the bug where `ibv_create_qp` may fail due to uninitialized `max_recv_wr`. Signed-off-by: ZHOU Huaping <[email protected]>
34f065f
to
d5e4173
Compare
@@ -1770,6 +1770,12 @@ static int mlx5_get_max_recv_wr(struct mlx5_context *ctx, | |||
struct mlx5dv_qp_init_attr *mlx5_qp_attr, | |||
uint32_t *max_recv_wr) | |||
{ | |||
/* Ignore max_recv_wr when SRQ is used. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the issue that you hit without that change ? not clear from the code and the commit log.
What are the values of the below in your case ?
ctx->max_recv_wr
attr->cap.max_recv_wr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my case attr->cap.max_recv_wr
is way greater than ctx->max_recv_wr
because it contains random bytes on stack. However, since the manual notes that "The attributes max_recv_wr and max_recv_sge are ignored by ibv_create_qp() if the QP is to be associated with an SRQ" (https://man7.org/linux/man-pages/man3/ibv_create_qp.3.html), I believe that any value of attr->cap.max_recv_wr
is acceptable when QP is associated with SRQ.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without this change I will get EINVAL
from ibv_create_qp()
if attr->cap.max_recv_wr > cap.max_recv_wr
even if the QP is to be associated with SRQ.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked around at other drivers, is that case handled at all ?
Why should an application use SRQ and pass attr->cap.max_recv_wr which is different than 0 at all ?
If the issue is really important to handle, we may consider setting attr->cap.max_recv_wr to 0 in the verbs layer to let it work for all drivers around.
Please check and let's get other feedback here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked around at other drivers, is that case handled at all ?
I'm not sure if other drivers handle this issue correctly, since I only have mellanox NIC at hand.
Why should an application use SRQ and pass attr->cap.max_recv_wr which is different than 0 at all ?
Because as I mentioned before, the manual said that "The attributes max_recv_wr and max_recv_sge are ignored by ibv_create_qp() if the QP is to be associated with an SRQ" (https://man7.org/linux/man-pages/man3/ibv_create_qp.3.html), which means that assigning different values to max_recv_wr should produce the same result when SRQ is used. Besides, the manual didn't mention that "The attributes max_recv_wr and max_recv_sge should be set to 0 if the QP is to be associated with an SRQ", so it is valid for an application to use SRQ and pass attr->cap.max_recv_wr
which is different than 0. The value of attr->cap.max_recv_wr
may come from uninitialized stack variables, since the application may construct the ibv_qp_init_attr
on stack and not explicitly set the max_recv_wr
when using SRQ.
If the issue is really important to handle, we may consider setting attr->cap.max_recv_wr to 0 in the verbs layer to let it work for all drivers around.
Yeah I think that is viable choice, too.
do we have resolution here? |
I would expect another general solution that will fit all drivers and close this one. |
According to the manual, max_recv_wr and max_recv_sge are ignored by
ibv_create_qp
if the QP is to be associated with an SRQ. Fix the bug whereibv_create_qp
may returnEINVAL
whenattr->cap.max_recv_wr > cap.max_recv_wr
even if the QP is to be associated with SRQ.The code snippet to reproduce this issue: