
IO_URING better than DPDK backend? #2622

Open
GeorgKreuzmayr opened this issue Jan 17, 2025 · 3 comments

@GeorgKreuzmayr

Hey everyone,

I am currently benchmarking the Seastar user-space TCP stack and was surprised by the results: the IO_URING backend performed better than the DPDK backend.

Results:
IO_URING Backend: 23 Gbit/s per direction
DPDK Backend: ~5-10 Gbit/s per direction

Setup:
c7gn.16xlarge instance type
AWS cluster placement group
Full-duplex communication
16 TCP connections
1 Shard (CPU core)
Seastar commit: 871079a
Ubuntu 24.04

Also, I found that as I increased the number of connections, at some point the DPDK backend broke down completely.

You can find the example that I benchmarked in this repository.
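
For context, the core of such a throughput benchmark looks roughly like the sketch below. This is a simplified sketch, not the exact code from the linked repository; the peer address, message size, iteration count, and the command-line flags in the comments are assumptions based on common Seastar options.

```cpp
// Run sketch (flag names are assumptions based on common Seastar options):
//   DPDK:     ./tcp_bench --smp 1 --network-stack native --dpdk-pmd
//   io_uring: ./tcp_bench --smp 1 --reactor-backend io_uring
#include <cstring>
#include <seastar/core/app-template.hh>
#include <seastar/core/coroutine.hh>
#include <seastar/core/seastar.hh>
#include <seastar/core/temporary_buffer.hh>
#include <seastar/net/api.hh>

using namespace seastar;

int main(int argc, char** argv) {
    app_template app;
    return app.run(argc, argv, [] () -> future<> {
        // Placeholder peer address; the real benchmark opens 16 connections
        // and sends in both directions at once.
        connected_socket sock = co_await connect(socket_address(ipv4_addr("10.0.0.2", 9000)));
        output_stream<char> out = sock.output();
        temporary_buffer<char> msg(16 * 1024);             // message size is an assumption
        std::memset(msg.get_write(), 'x', msg.size());     // payload contents do not matter
        for (int i = 0; i < 1'000'000; ++i) {
            co_await out.write(msg.get(), msg.size());     // copies into the stack's buffers
        }
        co_await out.flush();
        co_await out.close();
    });
}
```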

Are these performance numbers expected?
Is there any known issue with many TCP connections using the DPDK backend?

@dorlaor
Contributor

dorlaor commented Jan 17, 2025 via email

@GeorgKreuzmayr
Author

GeorgKreuzmayr commented Jan 17, 2025

Thank you for getting back to me on this!

> 16 connections isn't much.

I agree, but as I mentioned above, the whole app broke when I used more connections than that with the DPDK backend.

> I wasn't aware we support iouring out of the box that easily.

The IO_URING backend seems to be the default. From what I understand, you probably use something like AF_XDP sockets to send and receive raw Ethernet frames, similar to what you can do with DPDK.

> Can you verify that in both cases only a single hardware thread is used?

Yes

> Another option is to use larger msg sizes, where zero copy will be more effective.

I don't think it is possible to use "zero-copy" TCP, as the memory I am passing to your TCP stack is not pinned, which is required for the NIC to DMA it onto the wire.
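
For what it's worth, a larger-message send loop could look roughly like the sketch below. It is a sketch only; I am assuming the temporary_buffer overload of output_stream::write, which hands ownership of the buffer to the stack rather than copying it.

```cpp
#include <cstring>
#include <seastar/core/coroutine.hh>
#include <seastar/core/iostream.hh>
#include <seastar/core/temporary_buffer.hh>

using namespace seastar;

// Larger-message send loop (sketch; the temporary_buffer overload of
// output_stream::write is assumed to pass ownership to the stack without copying).
future<> send_large_messages(output_stream<char>& out, std::size_t total_bytes) {
    constexpr std::size_t msg_size = 128 * 1024;        // larger messages, as suggested
    for (std::size_t sent = 0; sent < total_bytes; sent += msg_size) {
        temporary_buffer<char> msg(msg_size);           // freshly allocated per message
        std::memset(msg.get_write(), 'x', msg.size());
        // Ownership moves into the stack; whether the NIC can DMA straight from
        // this memory still depends on how the backend allocates/pins buffers.
        co_await out.write(std::move(msg));
    }
    co_await out.flush();
}
```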

@dorlaor
Contributor

dorlaor commented Jan 17, 2025 via email
