Benchmark is mostly idle at 10 connections #29
Comments
For the next rerun, I have a few relevant changes in v20.41.0 on master: load_test now takes a byte length, and you can specify any length (it switches between short, medium, and long messages as needed). |
For 16 kB messages at 500 connections there's a more than 100% diff:
Using message size of 16000 bytes
Using message size of 16000 bytes
So those graphs are quite misleading as of now. |
Can reproduce this 👍 Areas to improve:
We also cannot use the normal
    pub enum MutCow<'a, B>
    where
        B: 'a + ToOwned + ?Sized,
        <B as ToOwned>::Owned: AsRef<B> + AsMut<B>,
    {
        Borrowed(&'a mut B),
        Owned(<B as ToOwned>::Owned),
    }
|
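A minimal sketch of how a `MutCow` like the one in the comment above could be used to mutate a payload in place, whether the buffer is borrowed or owned. The `Deref`/`DerefMut` impls and the `xor_in_place` helper are hypothetical and do not appear in the thread; the 4-byte XOR is only an illustration of an in-place transform, similar in shape to WebSocket frame masking.

```rust
use std::ops::{Deref, DerefMut};

// Same shape as the enum quoted in the thread.
pub enum MutCow<'a, B>
where
    B: 'a + ToOwned + ?Sized,
    <B as ToOwned>::Owned: AsRef<B> + AsMut<B>,
{
    Borrowed(&'a mut B),
    Owned(<B as ToOwned>::Owned),
}

// Hypothetical impl: read access regardless of which variant is held.
impl<'a, B> Deref for MutCow<'a, B>
where
    B: 'a + ToOwned + ?Sized,
    <B as ToOwned>::Owned: AsRef<B> + AsMut<B>,
{
    type Target = B;

    fn deref(&self) -> &B {
        match self {
            MutCow::Borrowed(b) => &**b,
            MutCow::Owned(o) => o.as_ref(),
        }
    }
}

// Hypothetical impl: mutable access without cloning in either case.
impl<'a, B> DerefMut for MutCow<'a, B>
where
    B: 'a + ToOwned + ?Sized,
    <B as ToOwned>::Owned: AsRef<B> + AsMut<B>,
{
    fn deref_mut(&mut self) -> &mut B {
        match self {
            MutCow::Borrowed(b) => &mut **b,
            MutCow::Owned(o) => o.as_mut(),
        }
    }
}

// Hypothetical helper: XOR every byte with a repeating 4-byte mask, in place.
pub fn xor_in_place(data: &mut MutCow<'_, [u8]>, mask: [u8; 4]) {
    for (i, byte) in data.iter_mut().enumerate() {
        *byte ^= mask[i % 4];
    }
}
```

Unlike `std::borrow::Cow`, whose borrowed variant holds a shared reference and therefore has to clone before mutating, the borrowed variant here holds `&mut B`, so the caller's buffer is modified directly.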
Meh, I just realised |
Wrap relevant task in https://docs.rs/tokio/latest/tokio/task/fn.unconstrained.html to avoid forced yields. |
Cool, I was playing with tokio-uring a while ago, and it seems doable to add feature-gated code to support tokio-uring TCP streams. https://docs.rs/tokio-uring/latest/tokio_uring/net/struct.TcpStream.html#method.read |
Published fastwebsockets 0.4.2. @uNetworkingAB, you might be interested in these charts: |
Current analysis:
|
Ah, yes, writev with 2 chunks beats write for long messages; that's not something I've bothered with (yet?). The short message bars make no sense though; they definitely do not match what I see here. I see at least 40% better short-message perf (1 kB and less) with uWS. You never tried v21, right? Even v20 beats fastwebsockets v0.4.2 on small messages by at least 15%, but the diff is extremely apparent in v21. |
Does v21 use epoll/kqueue by default for EchoServer? |
Don't get me wrong, this competition is good. I'm already looking at adding no-copy writev sends for anything above a threshold. This is good, and I can confirm those numbers, but the current short message numbers are way off. v21 defaults to epoll; there is a release post on how to compile with io_uring, but you need Linux 6.0 or later. |
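For illustration, the "writev with 2 chunks above a threshold" idea discussed above can be sketched in Rust with `std::io::Write::write_vectored`. This is a rough sketch under assumptions, not how uWS or fastwebsockets implement it: the `send_frame` name and the `VECTORED_THRESHOLD` value are hypothetical (the thread does not state a cutoff), and it only applies to a plain, uncompressed, non-TLS stream.

```rust
use std::io::{self, IoSlice, Write};

/// Hypothetical cutoff in bytes; the thread does not state an actual value.
const VECTORED_THRESHOLD: usize = 4096;

/// Sends an already-encoded frame header followed by its payload.
fn send_frame<W: Write>(out: &mut W, header: &[u8], payload: &[u8]) -> io::Result<()> {
    if payload.len() < VECTORED_THRESHOLD {
        // Small message: copying into one contiguous buffer is cheap.
        let mut buf = Vec::with_capacity(header.len() + payload.len());
        buf.extend_from_slice(header);
        buf.extend_from_slice(payload);
        return out.write_all(&buf);
    }

    // Large message: pass header and payload as two chunks, no payload copy.
    let total = header.len() + payload.len();
    let mut written = 0;
    while written < total {
        let res = if written < header.len() {
            // Header not fully sent yet: offer the rest of it plus the payload.
            out.write_vectored(&[IoSlice::new(&header[written..]), IoSlice::new(payload)])
        } else {
            // Only payload bytes remain after a partial write.
            out.write(&payload[written - header.len()..])
        };
        match res {
            Ok(0) => return Err(io::ErrorKind::WriteZero.into()),
            Ok(n) => written += n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(())
}
```

Whether this actually avoids a copy depends on the writer: `write_vectored` has a default implementation that writes only the first non-empty buffer, so the benefit only shows up for writers that override it with a real vectored write.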
Small msgs with uWS v21 EchoServer
It does degrade to 10% but I cannot reproduce the drastic ~40% here. |
It needs Linux 6.0. You are on 5.19. You also need to recompile the load_test so that it uses io_uring. Otherwise you just have epoll trying to stress io_uring. You know it's right if strace only lists io_uring_enter, for both EchoServer and load_test. |
I want to compare epoll-based implementations for now, to find out why you see a 40% degradation. The uWS EchoServer I compiled uses epoll, and the results above are for that. Is the 40% diff you see because of io_uring? (Then that explains the diff.) |
Yes 40% is from io_uring on Linux 6.0. There are features of 6.0 that are very central to that bigger diff and that's why I target this kernel version as minimum. This backend will be default as soon as it is stable, so it would be very strange to exclude it. Anyways, first thing is probably adding this writev send path so we don't have gigantic diffs on bigger messages. I did remember why I never added it though - it's not applicable for compressed messages or SSL, so it's a very specific bypass for only non-ssl, non-compressed, big messages. |
Cool, the 40% diff will be relevant once fastwebsockets has an io_uring backend. Opened #31 for tracking io_uring support. Self note: Add SSL benchmarks sometime in the future. Anyways, I believe most of the things have been fixed, and I'll continue to improve perf on small msgs (max 10% diff is fine for now). Feel free to open more related issues - this has been constructive 👍 |
Yes, competition creates incentive to improve, which is good. I will have the writev fix done any time now. |
Oh wow, uWS is 10% faster on 16 kb echoes with writev now :D |
The current run of only 10 connections is not enough to stress the servers, at least not uWS. Here is a list of differences between uWS and fastwebsockets at different numbers of connections, assuming 1 kB messages. 10 connections has the least difference, so it's a natural pick if one wants to convey a minimal diff:
So it's pretty easy to tell there are some scaling issues that aren't being conveyed with the low count of only 10 connections. This can be improved by using more connections.
Edit: oh wow for 16 kB messages the diff is 56% at 200 connections
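For readers comparing numbers like the ones above, here is a small sketch of one way to compute such a percentage. The thread does not say how its diffs were calculated, so taking the slower result as the baseline is an assumption, and the `percent_diff` name is hypothetical.

```rust
/// Relative throughput difference in percent.
/// Assumption: the slower of the two results is used as the baseline.
fn percent_diff(a: f64, b: f64) -> f64 {
    let (fast, slow) = if a >= b { (a, b) } else { (b, a) };
    (fast - slow) / slow * 100.0
}
```

Under this definition, a diff of more than 100% means one server handled more than twice the throughput of the other.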