-
-
Notifications
You must be signed in to change notification settings - Fork 606
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Maybe need to reset fpu status word on context switch #1020
Comments
https://wiki.osdev.org/FPU also says:
So maybe we also need to use EMMS on context switch. And probably also in init_fpu() called on new threads (in by thread_main_c()). |
Note that if we do FNINIT and EMMS after saving the FPU state (see issue #1019), then this issue will also be solved: when we interrupt B with some funky FPU state we will reset the FPU state so the kernel will get a clean one, but when we switch back to A the same clean state will remain and we can return to A fine. |
I am not an expert on how many cycles given instruction costs but per this FNINIT costs 25 cycles and I am not sure how much percentage wise it will slow down context switch in OSv. How would we measure that? But it is more important to have correct and possibly slower code than fast bug buggy. In any case we might eventually optimize this logic by examining if FPU was actually used by the thread we are switching from. I think the information in paragraphs "13.5.4 Processor Tracking of XSAVE-Managed State" and "13.6 OPERATION OF XSAVE" of this document https://www.ti.uni-bielefeld.de/html/teaching/WS1819/techinf1/64-ia-32-architectures-software-developers-manual.pdf may help us accomplish this. |
Applying this patch:
seems to have addresses following issues:
Given that EMMS instruction costs only 1 cycle adding it to every task switch does not dramatically increase its cost at least based on results of running tests/misc-ctxsw.cc. It would be nice to add proper unit test doing FPU/MMX/SSE work in one thread and something filling the FPU stack without clearing in other thread. |
This issue was created due to a thought experiment - I don't have evidence that it leads to an actual bug.
When we return to a thread A after a context switch from another thread B, if the reason we left thread A was an asynchronous event (e.g., interrupt) we will restore A's full FPU state. However, if the reason we left A was a function call (e.g., sleep), we never restore A's FPU state and just keep the one B had (at some arbitrary point in the middle of its computation). The status of the FPU registers doesn't matter (because they are caller-saved), but in commit 0959f20 we realized the control words do matter - changes in thread B will change A as well. In this issue I suggest the status word (FSW) is important too - if thread B had something in its FPU stack, we will return to A thinking there is something in the stack. There shouldn't be. The Linux ABI says that we need to return from a function with an empty stack (unless the function returns a floating point value). On on context switch back to A via thread::switch_to() we should probably clear the FPU stack via FNINIT (we need to do this before loading the control word, of course, because fninit resets that too). If A stopped running because of a function call, this clear stack is what it will receive, but if A stopped on an interrupt, after the switch back to A it will return from interrupt(), leading the fpu_lock to be restored, and the stack will be filled again with what it may have had while A was running.
I need to think if there's any SSE equivalent to the above that needs fixing (I think there is no register stack in SSE, but not sure...).
The text was updated successfully, but these errors were encountered: