Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor the IOCP event loop (timers, ...) #15238

Merged
merged 16 commits into from
Dec 6, 2024

Conversation

ysbaddaden
Copy link
Contributor

@ysbaddaden ysbaddaden commented Nov 29, 2024

Upgrades the IOCP event loop for Windows to be on par with the Polling event loops (epoll, kqueue) on UNIX. After a few low hanging fruits (enqueue multiple fibers on each call, for example) the last commit completely rewrites the #run method:

  • store events in pairing heaps;
  • high resolution timers (CreateWaitableTimer);
  • block forever/never (no need for timeout);
  • cancelling timeouts (no more dead fibers);
  • thread safety (parallel timer de/enqueues) for RFC #0002;
  • interrupt run using completion key instead of an UserAPC for RFC #0002 (untested).

The main issue is that the API to associate a wait object to a completion port (see c/ntdll.cr) only goes back to Windows 10 (see this article) while we aim to be compatible with Windows 7+. Go searches the symbols at runtime and falls back to use the timeout parameter when the API isn't available. I guess we could do the same, making sure to interrupt when we enqueue a timer that happens sooner.

I'll skip the HR timers into another PR, and implement the Windows 7+ solution for now?

It was actually easy to dynamically load the symbols at runtime (thanks to @HertzDevil paving the way) so we can use the HR timers on Windows 10+ and fallback to a timeout on legacy Windows.

The spec/std_spec test suite are passing in a Windows 11 VM, in any combination of MT/no MT and legacy timeout/HR timers.

Built on top of #15226. You may skip the first commit. Rebased and ready for review.

TODO after merge: add the new dependency to ntdll.dll in the book and other places.

`Fiber#cancel_timeout` is a misnommer: it doesn't cancel any timeout but
cancels the timeout action of a `select` expression only. It has no
effect on overlapped operations.
We can dequeue more than one operation when we check for completed
overlapped operations. It should be a little more efficient than
returning to the evloop after each resumable fiber.
- store timers in pairing heap (EventLoop::Timers)
- use high resolution timers (CreateWaitableTimer)
- associate HR timer to IOCP (to interrupt blocking wait)
- use post queued completion to interrupt blocking wait (instead of user APC)
- thread safety for parallel timer enqueue / dequeue
- overhaul of #run to make sure timeouts are cancelled (no dead fibers)

Issue: the API to associate a HR timer to an IOCP requires Windows 10+
while we claim to support Windows 7+.
This allows the IOCP event loop to check for the presence of the symbols
in the `ntdll.dll` library at runtime and to only use the high
resolution timer on Windows 10+ and fallback to a timeout for the wait
queued completion status call (and a manual interrupt when we queue a
timer that expires sooner).
@ysbaddaden ysbaddaden marked this pull request as ready for review December 2, 2024 12:25
@ysbaddaden
Copy link
Contributor Author

Rebased from master & ready for review 🙇

src/crystal/event_loop/iocp.cr Outdated Show resolved Hide resolved
src/crystal/system/win32/iocp.cr Outdated Show resolved Hide resolved
src/crystal/system/win32/iocp.cr Outdated Show resolved Hide resolved
src/crystal/event_loop/iocp.cr Outdated Show resolved Hide resolved
src/crystal/system/win32/waitable_timer.cr Outdated Show resolved Hide resolved
The IOCP event loop now always waits forever when blocking, we don't
need to schedule a fiber with a long expiration anymore.
@ysbaddaden
Copy link
Contributor Author

ysbaddaden commented Dec 3, 2024

Following a comment by @HertzDevil we don't need to keep a fiber with a far away expiration timer anymore on Windows.

I pushed one last commit to drop it.

@straight-shoota straight-shoota added this to the 1.15.0 milestone Dec 4, 2024
@straight-shoota straight-shoota merged commit f8db68e into crystal-lang:master Dec 6, 2024
69 checks passed
@ysbaddaden ysbaddaden deleted the feature/evloop-iocp branch December 6, 2024 13:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants