daxpy 10x slower on macOS (Haswell) #1470
Comments
Nobody here currently has a Mac, I fear. From anecdotal evidence in previous issues (#730, JuliaLang/julia#901) you could try removing the
Can you check whether the virtual machine shows AVX2 in /proc/cpuinfo (i.e. whether Haswell or Sandybridge is used)?
Yes, it does.
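For anyone repeating the check above programmatically, here is a minimal sketch in C. It assumes the usual Linux layout where /proc/cpuinfo has a line starting with "flags" followed by space-separated feature names; the function names are made up for this example and are not part of OpenBLAS.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if `flag` appears as a whole word in `line`, else 0. */
static int line_has_flag(const char *line, const char *flag)
{
    size_t n = strlen(flag);
    const char *p = line;
    if (n == 0)
        return 0;
    while ((p = strstr(p, flag)) != NULL) {
        /* Require a delimiter on both sides so "avx" does not match "avx2". */
        int starts = (p == line) || p[-1] == ' ' || p[-1] == '\t' || p[-1] == ':';
        int ends = p[n] == '\0' || p[n] == ' ' || p[n] == '\n';
        if (starts && ends)
            return 1;
        p += n;
    }
    return 0;
}

/* Scans a cpuinfo-style file for a "flags" line containing `flag`.
   Returns 1 if found, 0 if not found, -1 if the file cannot be opened.
   Assumption: each flags line fits in the 8 KiB buffer. */
int cpuinfo_has_flag(const char *path, const char *flag)
{
    char line[8192];
    int found = 0;
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    while (!found && fgets(line, sizeof line, f)) {
        if (strncmp(line, "flags", 5) == 0)
            found = line_has_flag(line, flag);
    }
    fclose(f);
    return found;
}
```

Calling cpuinfo_has_flag("/proc/cpuinfo", "avx2") inside the Linux VM should return 1 if the guest sees AVX2. macOS has no /proc/cpuinfo, so the call returns -1 there.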
Using the current
Does killing any and all .align16 in the daxpy microkernel file change anything?
Yes, commenting that out in
Is it possible to perhaps have a new release once we fix this, given that we haven't had one for a long time? We're happy (in the Julia community) to try out an RC and give feedback before release.
Ah, does it have anything to do with this commit comment? According to the Mac developer docs:
As I understand,
@simonbyrne yes, sorry for not digging down to the original source earlier. And thanks for the pointer to p2align, which certainly looks cleaner than adding ifdefs around each and every .align (though the underlying issue seems to be an Apple flaw, if their align is identical to p2align).
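To make the .align versus .p2align point concrete, here is a small arithmetic sketch in C. It assumes, as the comments above suggest, that Apple's assembler reads the .align operand as a power-of-two exponent (like .p2align), while GNU as on x86 Linux reads it as a byte count. The helper names are invented for this illustration.

```c
#include <stdint.h>

/* `.align N` when N is taken as a byte count (assumed x86 Linux behaviour). */
uint64_t align_bytes_as_count(unsigned n)
{
    return n;
}

/* `.align N` or `.p2align N` when N is taken as an exponent: 2^N bytes
   (assumed macOS behaviour for .align; .p2align means this everywhere). */
uint64_t align_bytes_as_exponent(unsigned n)
{
    return (uint64_t)1 << n;
}

/* Padding bytes needed to advance `offset` to the next multiple of
   `alignment`, where `alignment` is a power of two. */
uint64_t padding_needed(uint64_t offset, uint64_t alignment)
{
    return (alignment - (offset & (alignment - 1))) & (alignment - 1);
}
```

Under these assumptions, a directive written as .align 16 asks for 16-byte alignment on Linux but 2^16 = 65536-byte alignment on macOS, whereas .p2align 4 asks for 16 bytes on both.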
No worries. Thanks for all your effort maintaining OpenBLAS.
Thank you. We might just apply the patch for now in that case.
I have emailed @xianyi about adding more people as admin and perhaps even moving the project to an OpenBLAS organization to help maintain it better.
PR merged for Haswell and Sandybridge. From Wikipedia, I suspect Nehalem may also be needed for the 2010-12 models of Mac Pro and iMac?
I believe so: I think we build macOS Julia for Nehalem, perhaps @staticfloat can confirm?
We build with
PR incoming. I now understand that in the early assembly kernels from libGoto this issue was already catered for by using ALIGN_ macros from common_x86_64.h.
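The macro approach mentioned above could look roughly like the sketch below. The SKETCH_ names are hypothetical stand-ins and are not copied from common_x86_64.h; the idea is only that one macro hides the platform difference so kernels do not need an ifdef around every directive. It assumes the same operand semantics discussed earlier in the thread.

```c
/* Hypothetical stand-ins for ALIGN_-style macros; the real definitions
   in common_x86_64.h may differ. */
#if defined(__APPLE__)
#define SKETCH_ALIGN_16 ".align 4"   /* operand assumed to be an exponent: 2^4 = 16 bytes */
#else
#define SKETCH_ALIGN_16 ".align 16"  /* operand assumed to be a byte count */
#endif

/* .p2align always takes an exponent, so no platform switch is needed. */
#define SKETCH_P2ALIGN_16 ".p2align 4"

/* Accessors so the chosen directive text can be inspected from C. */
const char *align16_directive(void)
{
    return SKETCH_ALIGN_16;
}

const char *p2align16_directive(void)
{
    return SKETCH_P2ALIGN_16;
}
```

Either form requests 16-byte alignment on both platforms; the .p2align variant simply avoids the conditional.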
Actually, according to the GitHub documentation I should have sufficient rights for creating a release. As that was never discussed as a possible "duty", I would still prefer to do this only if contact with xianyi cannot be established. Also, I do not have access to the SourceForge repository used for providing the precompiled Windows version, or to the openblas.net project page that links to it.
The current website seems to be built from @xianyi's personal GitHub page:
Tagging for 0.3.0 as xianyi created that milestone recently; hopefully the release will happen soon.
Calls to daxpy seem to be about 10x slower on macOS than when run inside a Linux VM on the same machine (both give the config as "USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell"). This affects a lot of code which depends on it (e.g. dgebrd).
downstream: JuliaLang/LinearAlgebra.jl#501
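A rough way to reproduce a comparison like this is a small timing harness, sketched below in C. It is not the benchmark used for the numbers in this report. To stay self-contained it times a plain-C reference loop; the function names, vector fill values, and alpha are arbitrary choices for the example.

```c
#include <stdlib.h>
#include <time.h>

/* Plain-C reference for y <- alpha*x + y, used as a stand-in for BLAS. */
void ref_daxpy(int n, double alpha, const double *x, double *y)
{
    for (int i = 0; i < n; i++)
        y[i] += alpha * x[i];
}

/* Returns average CPU seconds per call over `reps` calls on vectors of
   length n, or -1.0 on invalid arguments or allocation failure.
   Note: clock() reports CPU time, not wall-clock time. */
double time_daxpy(int n, int reps)
{
    if (n <= 0 || reps <= 0)
        return -1.0;
    double *x = malloc((size_t)n * sizeof *x);
    double *y = malloc((size_t)n * sizeof *y);
    if (!x || !y) {
        free(x);
        free(y);
        return -1.0;
    }
    for (int i = 0; i < n; i++) {
        x[i] = 1.0;
        y[i] = 0.0;
    }

    clock_t t0 = clock();
    for (int r = 0; r < reps; r++)
        ref_daxpy(n, 0.5, x, y); /* to time OpenBLAS, call cblas_daxpy(n, 0.5, x, 1, y, 1) here */
    clock_t t1 = clock();

    /* Read one result so the compiler cannot discard the loop. */
    volatile double sink = y[n - 1];
    (void)sink;

    free(x);
    free(y);
    return (double)(t1 - t0) / CLOCKS_PER_SEC / reps;
}
```

To compare platforms, include cblas.h, swap in the cblas_daxpy call noted in the comment, link against the same OpenBLAS build on macOS and in the Linux VM, and compare the returned times for the same n and reps.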