Fold FixedBoundedFloatStrategy into FloatStrategy #3327
Conversation
Thanks Phillip! This is a much nicer patch than I was anticipating would be required, so pending the comments below and some passing tests, I'm keen to merge!
(PyCon US sprints are this Mon+Tues, so I might even see if someone wants to do a filter-rewriting PR...)
allow_nan: bool,
allow_subnormal: bool,
width: int,
bounds: Optional[Tuple[float, float]] = None,
Let's make this a min_value and max_value, defaulting to 0.0 and +inf respectively. We can then be much less conditional everywhere else.
I was also unsatisfied with the bimodal-ness of this logic. For one, I think the unbounded FloatStrategy still needs to be able to return negatives. Also, the way we do bounded selection is based on fractional_float, which I don't think will work with +inf. Or it's possible (likely?) that I'm missing the context for why these aren't issues.

One thing I did try: including the appropriate NASTY_FLOATS when bounds are specified. This sort of works, but requires special handling to avoid -0.0 and a possibly empty list. Then some shrinking tests started failing in a way that made me worry I was causing problems there. Insights appreciated!
The underlying FloatStrategy is only positive IIRC, so st.floats() has to return a union of two Strategy objects if the allowed range spans zero.
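For illustration only (this is not the actual Hypothesis implementation), here is a sketch of that kind of union: a range spanning zero is served by composing two non-negative sub-ranges and negating one of them. The helper name floats_spanning_zero is made up for this sketch.

```python
# Hedged sketch: if the underlying strategy only produces non-negative floats,
# a range spanning zero can be covered by a union of two strategies - one for
# [0, max_value] and one for [0, -min_value] with the sign flipped.
from hypothesis import strategies as st

def floats_spanning_zero(min_value: float, max_value: float):
    assert min_value < 0 < max_value
    positive = st.floats(min_value=0.0, max_value=max_value)
    negative = st.floats(min_value=0.0, max_value=-min_value).map(lambda x: -x)
    return positive | negative  # `|` unions the two strategies
```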
It appears that the unbounded case doesn't do this union. But this is a big function, and I know that I am not following all of it.
Oh, right, yeah - unbounded floats are positive-only in conjecture.utils (or conjecture.floats? anyway...), but a sign bit is added before you get to FloatStrategy.
Yeah. Still happy to refactor as you see fit - LMK!
Let's push the management of sign-bits-for-bounded-floats down, from st.floats() into FloatStrategy. The high-level goal here is that FloatStrategy should hold enough of the arguments to fully represent a call to floats() - so that the solver/fuzzer/filter-rewriting interface can pretty much "wrap the current FloatStrategy.do_draw()".
Kinda annoying now, sorry, but I think it'll pay off in the medium-term.
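As a rough sketch of that shape (attribute names here are assumptions, not the final API): a strategy object that carries the full floats() parameterisation, so a solver or fuzzer backend only needs to wrap do_draw().

```python
# Sketch under assumptions: the strategy instance holds every argument needed
# to describe a floats() call, so external backends can interpret the same
# parameters rather than re-deriving them from composed strategies.
from dataclasses import dataclass

@dataclass
class FloatStrategySketch:
    min_value: float
    max_value: float
    allow_nan: bool
    allow_infinity: bool
    allow_subnormal: bool
    width: int

    def do_draw(self, data):
        # Sign handling, nasty-float injection, and rejection would live here;
        # the point is only that the full parameterisation is on the instance.
        raise NotImplementedError
```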
Not annoying!

Ok, so you might imagine roughly this logic getting incorporated into the FloatStrategy class (though it needs to be implemented rather differently, since strategy-composition isn't a tactic that would be as readily available). Is that about right?

Backing up, some questions that come to mind:

- Would the ideal mid-level interface be identical to st.floats()? (FWIW, I think CrossHair could implement st.floats(), FloatStrategy, or anything in between without much trouble.) One could argue that some amount of normalization and consistency checking is better off at the higher level.
- Does it matter whether that interface is a function or a class? I'm not familiar with the filter-rewriting stuff, but I imagine that gets a lot easier if this interface is a class-level thing, right? And that's why we care?
> Ok, so you might imagine roughly this logic getting incorporated into the FloatStrategy class (though it needs to be implemented rather differently, since strategy-composition isn't a tactic that would be as readily available). Is that about right?
Yep, that sounds right to me. Caveat: we should check that this doesn't noticeably hurt performance in the interim period.
> Would the ideal mid-level interface be identical to st.floats()? (FWIW, I think CrossHair could implement st.floats(), FloatStrategy, or anything in between without much trouble.) One could argue that some amount of normalization and consistency checking is better off at the higher level.
I think the mid-level interface I would want is "like floats(), but inclusive bounds are required and there's no sign bit or allow_subnormal". So we're really just talking about finite floats in some contiguous range, plus maybe nan (only if inf is allowed). It's tractable to efficiently generate this with bit-twiddling, getting close enough to right that at least half of attempts are valid - and that's good enough that rejection sampling is really fast.

It's deeply weird to humans, but this should work really well for fuzzers and pretty well for solvers too. Then FloatStrategy is responsible for composing positive and negative ranges (if applicable), zero (if subnormals disallowed), etc.
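A minimal sketch of that bit-twiddling idea (not the Hypothesis implementation; it ignores drawing from a ConjectureData buffer and just uses random): for non-negative finite doubles, the IEEE-754 bit pattern is monotonic in the float value, so a contiguous float range maps to a contiguous integer range, and masking plus rejection gets a valid draw quickly.

```python
# Hedged sketch: draw a float in the inclusive range [lo, hi], with
# 0 <= lo <= hi < inf, via bit patterns and rejection sampling.
import math
import random
import struct

def float_to_bits(f: float) -> int:
    return struct.unpack(">Q", struct.pack(">d", f))[0]

def bits_to_float(b: int) -> float:
    return struct.unpack(">d", struct.pack(">Q", b))[0]

def draw_float_in_range(lo: float, hi: float) -> float:
    assert 0.0 <= lo <= hi < math.inf
    lo_bits, hi_bits = float_to_bits(lo), float_to_bits(hi)
    # Mask down to just enough bits to cover hi_bits: at least half of masked
    # draws land at or below hi_bits, so rejection terminates quickly (though
    # a lo very close to hi can still make acceptance rare in this sketch).
    mask = (1 << hi_bits.bit_length()) - 1
    while True:
        candidate = random.getrandbits(64) & mask
        if lo_bits <= candidate <= hi_bits:
            return bits_to_float(candidate)
```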
> Does it matter whether that interface is a function or a class? I'm not familiar with the filter-rewriting stuff, but I imagine that gets a lot easier if this interface is a class-level thing, right? And that's why we care?

The filter rewriting will be invoked via FloatStrategy.filter(), so I'd build FloatStrategy.do_draw() directly in terms of the mid-level interface. I'd imagined that the interface is basically "provide an object with these callbacks", which, yeah, is very likely to be an instance of some class with those methods.
Apologies for delays - I was travelling this past week. I think it's close; maybe take a peek and see if I got the right idea? To emulate the strategy-unioning that was previously happening, I've moved the sign bit up front. A few niggly tests are still causing me trouble, though - hints welcome!
Hmm, this looks sensible to me at an evening glance! Of what I see, I'm most suspicious of the offset logic; ideally we'd get to a mask-based rather than addition-based solution (see e.g. #2907, though I'd like to do as little as possible of that to get this working and merged).
min_value: Real = -math.inf,
max_value: Real = math.inf,
These should definitely be float annotations - we convert other types in st.floats(), and it's useful to have the exact bounds here.

I'd expect to also need an allow_subnormal flag sometimes? But maybe not yet...
ACK on the float annotations.

allow_subnormal could be part of the mid-level interface, I think. Another decision I thought about but don't understand the consequences of: is it better to filter the subnormals out, or to instead cast them into normals, like I'm doing with widths? (Assuming it's not part of this interface.)
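The two options being weighed, as a hedged sketch (using sys.float_info.min, the smallest positive normal double; neither is necessarily what the PR ends up doing):

```python
# Sketch of the trade-off: filter subnormal draws out, or cast them into the
# normal range (sign-preserving clamp to the smallest normal), similar in
# spirit to the width handling mentioned above.
import sys

SMALLEST_NORMAL = sys.float_info.min  # smallest positive *normal* double

def is_subnormal(x: float) -> bool:
    return x != 0.0 and abs(x) < SMALLEST_NORMAL

def clamp_subnormal(x: float) -> float:
    # Option A: cast subnormals into normals.
    if is_subnormal(x):
        return SMALLEST_NORMAL if x > 0 else -SMALLEST_NORMAL
    return x

def reject_subnormal(draw) -> float:
    # Option B: filter subnormals out by redrawing.
    while True:
        x = draw()
        if not is_subnormal(x):
            return x
```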
Feels like some sort of bit mask should be possible; I can play with this idea this week.
At first glance, it feels like bounded integer draws for the exponent (and, for very small ranges, the mantissa) would be even more targeted than a bitmask, right?

Unrelated: I'm getting a test failure that I think suggests an independent problem; even on the main branch, the following assert happens, but I think it shouldn't. Could you confirm and suggest a fix? The logic in …
Bounded integer draws are just bitmask-and-rejection-sampling: if you've got the right number of bits, then in expectation you succeed in less than two attempts 😁. Admittedly that goes down to a merely better-than-one-in-four chance for floats, assuming that we calculate the mantissa-mask based on the observed exponent, but that's still pretty good IMO, and the simplicity of an atomic representation which works with our existing mutators is probably worth the small overhead.

As to the other: fixed in 224425e, though I'm pretty sure it's truly unreachable via the public API.
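A small illustration of that bitmask-and-reject integer draw (again using random rather than Hypothesis's internals):

```python
# Uniform integer in [0, upper]: mask to upper.bit_length() bits and redraw on
# overshoot.  Since 2**(k-1) <= upper < 2**k, each attempt succeeds with
# probability greater than 1/2, so fewer than two attempts are needed on
# average.
import random

def draw_bounded_integer(upper: int) -> int:
    if upper == 0:
        return 0
    nbits = upper.bit_length()
    while True:
        value = random.getrandbits(nbits)
        if value <= upper:
            return value
```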
Ah! Got it. Take a look at what I've got now - I'm not sure I'm doing the "make your own luck" process right? Although I'm drawing bits, I'm using integer modulo instead of the mask still - seems behaviorally similar, but perhaps nice to avoid the secondary rejection stage.

Thank you! If this looks good, I can add something for subnormal rejection ... and then we might be good?
Yeah, I think we're close! My only real concern at this point is performance, which is at least not catastrophically worse at the moment. Based on a code review I don't even have a specific reason for concern beyond the test failure below, but it would be nice to check a profile of something float-intensive (unique arrays maybe?) before and after. Happy to do that over the weekend if necessary.

_______________ test_floats_order_worse_than_their_integral_part _______________
Traceback (most recent call last):
  File "tests/conjecture/test_float_encoding.py", line 101, in test_floats_order_worse_than_their_integral_part
    @given(st.integers(1, 2**53), st.floats(0, 1).filter(lambda x: x not in (0, 1)))
  File "hypothesis/core.py", line 1237, in wrapped_test
    raise the_error_hypothesis_found
  File "hypothesis/internal/healthcheck.py", line 27, in fail_health_check
    raise FailedHealthCheck(message, label)
hypothesis.errors.FailedHealthCheck: It looks like your strategy is filtering out a lot of data.
Health check found 50 filtered examples but only 9 good ones.
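One way such a before/after profile could be taken (this harness is an assumption for illustration, not part of the PR; it uses unique lists of floats as the float-intensive workload):

```python
# Hedged sketch: profile a float-heavy Hypothesis test and compare the
# cumulative time spent in the float strategy before and after the change.
import cProfile

from hypothesis import given, settings, strategies as st

@settings(max_examples=1000, database=None)
@given(st.lists(st.floats(min_value=0, max_value=1), min_size=50, unique=True))
def float_heavy(xs):
    pass

profiler = cProfile.Profile()
profiler.enable()
float_heavy()  # calling the @given-wrapped function runs the whole test
profiler.disable()
profiler.print_stats(sort="cumulative")
```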
I haven't been able to successfully diagnose these (seemingly sporadic) health check failures yet, but I presume I'm doing something that's not great ... somewhere. I'll continue to poke at it, but if you have suggestions on how you diagnose this kind of problem, I'm all ears.

FWIW, I also did a microbenchmark on just generating floats (without any sort of post-filter applied); it seems slower, but by a small amount in comparison to how long, for example, …
Main suggestion is to set the …
There's that high bit that's used to select the integer interpretation (heh, looking at it now, it seems like I severed that path with the clamper ... but we can fix that). We could expand that integer interpretation to allow for some number of decimal places. Maybe put the decimal digits up front, least-significant-first, so shrinking is still nice. If we wanted to get super fancy, we could vary the number of decimal digits by the size of the allowed range.
Just going with quick responses, because it's late, but I'd rather give you something before the weekend: …
Acknowledged. I've pushed the more straightforward changes at least. I also briefly thought above that I did something bad to the integers representation with the clamper. Not true; we should be all good. (The clamper operates on the real float after un-lex-ing and is independent from that representation.)
LGTM, next steps are "on the weekend I'll have time and brainspace to attack the exponent thing".
And I can't say this often or sincerely enough, thank you for your generous and skillful contributions to some of the trickiest parts of Hypothesis and the broader testing ecosystem 😍
def test_minimal_non_boundary_float():
    x = minimal(floats(min_value=1, max_value=9), lambda x: x > 2)
-    assert 2 < x < 3
+    assert x == 3  # (the smallest integer > 2)
This is good, and I'll want to additionally tighten up the other tests to check we get the exact shrinking behavior we want. After we deal with the large-negative-exponent thing.
Did some tightening, in this file at least. Any other places?
This file is all that springs to mind, and it looks like you've tightened it all up 👍
Oh, how about adding a new test that floats between say (1.1, 1.6) shrinks to 1.5? Asymmetric interval going to nice values, rather than nice fractions of the allowed interval.
OK, reasoning this through: …

Thoughts? (Also, explicit reminder for the record: I have a policy that any contributor can ask me to take over a PR if they don't want to work on it anymore - I want contributing to be a good experience, and to get PRs even when the issue might turn out to be a tarpit!)
Clever! Love it. And not a big change to make. I've just pushed something. Let me know if I understood you correctly, and then I'll go back through the TODOs and whatnot.
Yep, that's exactly what I was thinking of (and the division-by-mask to implement it is remarkably elegant) |
Just a few tiny tweaks, and then we're ready to merge 🎉 🎉 🎉
Not quite getting the result I expected with shrinking (1.1, 1.6) to 1.5. Investigation in progress ...
Well! I didn't figure it out. (1.1, 1.9) DOES get to 1.5, and I've added that test. But all is not well: (1.1, 1.6) yields 1.501953125.

There is a one-bit difference between 1.501953125 and 1.5; in the lex form, it's the last bit in the next-to-last byte. The shrinker successfully clears the earlier mantissa bytes. We do attempt to clear that bit, but at the same time, we roll over the final byte to 255. When bit-reversed, those are the high bits of the mantissa, and we run afoul of the clamper upper bound. And that's why increasing the upper bound to 1.9 makes it work.

I've noticed some float-specific shrinking logic (but have not yet spent more than a moment trying to grok it). Even when attempting to disable that logic, I'm seeing approximately the same behavior on this test, so I suspect it's not relevant here. But I don't fully trust myself on that assertion either.

Ehhhh. So, not sure exactly what to do next. Possibly it's not really a regression. At the same time, I might be making changes that make pre-existing suboptimal shrinking more likely. What do you think?
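For reference, the single-bit gap is easy to confirm on the raw IEEE-754 encoding (the lexicographic form discussed above rearranges the mantissa bits, but the underlying difference is the same one bit):

```python
# Quick check that 1.501953125 and 1.5 differ in exactly one bit of their
# IEEE-754 double representation (1.501953125 == 1.5 + 2**-9).
import struct

def bits(f: float) -> int:
    return struct.unpack(">Q", struct.pack(">d", f))[0]

diff = bits(1.501953125) ^ bits(1.5)
assert diff == 1 << 43 and bin(diff).count("1") == 1
print(hex(diff))  # 0x80000000000
```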
Well, we have a substantial improvement over the status quo already - in the worst cases, shrinking merely isn't noticeably better. Since I think fixing this last case would be a substantial adventure into the shrinker, I'm inclined to merge this PR as-is; and I'll write up a tracking issue for the floats(1.1, 1.6) case tonight.
Thanks again for all your work on this!
Exciting! I know first-hand that shepherding PRs is a ton of work too. Thank you for all your help!
Closes #2907.