Document jl_zero_subnormals and also mention denormals in the document #12132

Closed
yuyichao opened this issue Jul 13, 2015 · 12 comments
Labels
docs This change adds or pertains to documentation good first issue Indicates a good issue for first-time contributors to Julia

Comments

@yuyichao
Contributor

Ref: julia-users

It took me a really long time to figure out why there's such a performance difference for exactly the same code but different input. I did check the floating point and performance tips sections of the manual but didn't find anything. It would be nice to mention jl_zero_subnormals and its performance/accuracy impact in the documentation.

It's also worth noting that @fastmath does not have exactly the same effect as -ffast-math (at least in this regard). It also took me a long time to figure out why.

The documentation also only mentions subnormals rather than denormals (#3105). I agree the interface should be consistent, but it would be nice to mention both to help searching, especially since the wiki page is still using "denormal".

@yuyichao yuyichao added docs This change adds or pertains to documentation good first issue Indicates a good issue for first-time contributors to Julia labels Jul 13, 2015
@eschnett
Contributor

What is the difference between @fastmath in Julia and -ffast-math in Clang? What does jl_zero_subnormals do -- can you point to documentation? Are you maybe trying to say that @fastmath could/should somehow "call" jl_zero_subnormals?

@pao
Member

pao commented Jul 13, 2015

> can you point to documentation

I never wrote the documentation (related to the lack of a Julia interface--you can only ccall it). Briefly, it sets/clears the FTZ and DAZ flags on Intel processors, which cause those processors to zero out denormal floating-point numbers, avoiding a very expensive set of microcoded operations. It's inherently nonportable to different processors (most don't need it anyways), only helps in a very limited set of circumstances (your algorithm needs to produce exceedingly small floating-point numbers which you don't actually care about the value of), and of course loses precision when it kicks in (which needs to be an acceptable trade).

It's a run-time, not compile-time operation, so it doesn't fit well with fastmath-type options. Like several other similar run-time controls it really would be better with deterministic resource handling.
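As a rough illustration of the run-time toggle described above, here is a minimal Julia sketch. It assumes a Julia version that exposes `set_zero_subnormals` and `get_zero_subnormals` in Base (the wrapper discussed later in this thread), so no `ccall` is needed. Whether the flushed result is actually zero depends on the hardware supporting the mode, so the sketch prints what it observes rather than promising a value.

```julia
# Assumption: Base provides set_zero_subnormals / get_zero_subnormals.
tiny = floatmin(Float32)          # smallest positive normal Float32
previous = get_zero_subnormals()  # remember the current mode

set_zero_subnormals(false)        # IEEE gradual underflow
gradual = tiny / 2f0              # falls below the normal range

supported = set_zero_subnormals(true)  # false if the hardware lacks the mode
flushed = tiny / 2f0              # same operation, evaluated with the mode requested

set_zero_subnormals(previous)     # restore whatever was set before

println("subnormal without flushing: ", issubnormal(gradual))
println("flush mode supported:       ", supported)
println("result with flushing:       ", flushed)
```

The mode is a property of the running thread's floating-point state, not of the compiled code, which is why it behaves more like the rounding-mode controls than like `@fastmath`.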

Which of course leaves the question of what to do with it. I'm not really sure myself, which is why it sits there unloved.

@yuyichao
Contributor Author

Pretty much what @pao said. I don't think @fastmath should enable it either, but it should be mentioned in the docs (given that most Julia users will probably run on an Intel or AMD CPU where this is relevant).

@ArchRobison
Contributor

I concur with @pao's analysis. It's more like the rounding mode controls in rounding.jl. Perhaps adding a set_zero_subnormals function there would be appropriate? Since I work for Intel, I'm happy to create a PR :-)

It might also be worth documenting William Kahan's suggestion that I heard at a conference. He pointed out that in cases where subnormals show up, often they are modeling a physical system where the values are really not that close to zero. E.g., the coldest temperature ever achieved falls well short of a Float32 subnormal. So just inject some noise up front to avoid the problem.

This line is a C++ example of the noise trick. The code models acoustic waves. When I would inject an impulse, the frame rate of the code would slow down for a while, and then speed up. I was puzzled, but eventually I caught on that the numerical method was creating values that exponentially decayed away from the initial impulse. The exponential would pass through subnormals on its way to zero. After a while, the impulse would spread out and bounce off reflectors, creating enough noise to make the subnormals go away, and the frame rate sped up. I even built a version that colored subnormals in the output. It was neat to watch the frontier of subnormals spread out on the display. Injecting noise up front solved the problem.
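A minimal Julia sketch of the noise trick, for readers who want to try it. This is not the C++ code referred to above; the function name `count_subnormals`, the decay factor, and the noise level `1f-20` are all made up for illustration. It counts how many times a decaying Float32 value lands in the subnormal range, with and without a small constant added each step.

```julia
# Hypothetical helper: count subnormal values produced by a decaying signal.
# `decay` and `noise` are illustrative parameters, not taken from the thread.
function count_subnormals(nsteps; decay = 0.5f0, noise = 0f0)
    x = 1f0                      # initial impulse
    hits = 0
    for _ in 1:nsteps
        x = x * decay + noise    # exponential decay plus an optional noise floor
        hits += issubnormal(x)   # Bool promotes to Int
    end
    return hits
end

println("pure decay:       ", count_subnormals(200))
println("with noise floor: ", count_subnormals(200; noise = 1f-20))
```

With the noise term, the value settles near a small but normal number instead of decaying through the subnormal range, so the slow path is never taken. The noise level has to be chosen so that it is negligible for the physical quantity being modeled.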

@yuyichao
Contributor Author

> When I would inject an impulse, the frame rate of the code would slow down for a while, and then speed up. I was puzzled, but eventually I caught on that the numerical method was creating values that exponentially decayed away from the initial impulse.

This is exactly how I hit this issue. There's a decay, but I don't care about the actual values after it decays to a small value...

> Perhaps adding a set_zero_subnormals function there would be appropriate?

Do you mean simply a Julia wrapper for the ccall, or a with_... version just like the ones for rounding? IMHO, the simple wrapper would be nice to have (and easier to document). I've thought about the with_... version, but I don't want an anonymous function to ruin type inference...

@yuyichao
Contributor Author

Oh, and BTW, thanks for the noise trick. I will probably keep setting DAZ for my calculation, but I'll try the noise trick if I somehow cannot or don't want to write assembly or otherwise set the register to the proper values.

@ArchRobison
Contributor

I was thinking of just a wrapper for the ccall for now, unless there is popular demand for a with... version. We should probably include a get_zero_subnormals function too, for the sake of users who want to restore the state.

@yuyichao
Contributor Author

Or just let the set function return that. CPU feature detection, which is basically what the current return value is, should be a different function if necessary.

@ArchRobison
Contributor

Design issue for comment: should set_zero_subnormals return a value indicating success/failure, or return the previous mode? First choice is consistent with set_rounding. Second choice is more convenient for writing a sequence to set the mode, do something, and restore the mode.
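To make the "set the mode, do something, restore the mode" sequence concrete, here is a hedged Julia sketch. It assumes the first convention (the setter returns a success flag) together with a separate `get_zero_subnormals` getter, as proposed earlier in the thread. The helper name `with_zero_subnormals` is hypothetical and not part of Base.

```julia
# Hypothetical helper, not part of Base: run f() with subnormals flushed
# to zero where the hardware allows it, then restore the previous mode.
function with_zero_subnormals(f)
    previous = get_zero_subnormals()        # mode to restore afterwards
    ok = set_zero_subnormals(true)          # success flag, like set_rounding
    ok || @warn "zero-subnormals mode is not supported on this hardware"
    try
        return f()
    finally
        set_zero_subnormals(previous)       # restore even if f() throws
    end
end

# Usage: the closure is the "do something" step.
total = with_zero_subnormals() do
    sum(abs2, Float32[1f-30, 2f-30, 3f-30])
end
println(total)
```

If the setter returned the previous mode instead, the getter call would fold into the setter, at the cost of needing another way to report unsupported hardware.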

@ScottPJones
Contributor

Is it likely to fail often? If failure is infrequent, why not throw an exception and return the old mode?

@yuyichao
Contributor Author

> First choice is consistent with set_rounding.

Ahh. Wasn't aware of that. Probably keep the current behavior then.

Without such a context (of set_rounding), I'd prefer making it easier to reset since, unlike with the rounding mode, the user usually won't complain if the CPU doesn't have performance issues dealing with denormal values and thus doesn't provide an interface to disable them.

The two interfaces should be roughly equally easy to use, and IMHO consistency is slightly more important here.

ArchRobison pushed a commit to ArchRobison/julia that referenced this issue Jul 16, 2015
This mode sets the FZ/DAZ features on x86 processors that support them.
See issue JuliaLang#12132 for discussion.
ArchRobison pushed a commit to ArchRobison/julia that referenced this issue Jul 17, 2015
…ero" mode.

This mode sets the FZ/DAZ features on x86 processors that support them.
See issue JuliaLang#12132 for discussion.
@yuyichao
Contributor Author

Closed by #12172.

5 participants