
Fold FixedBoundedFloatStrategy into FloatStrategy #3327

Merged — 27 commits, merged into master on May 24, 2022
Conversation


@pschanely pschanely commented May 2, 2022

Closes #2907.


@Zac-HD Zac-HD left a comment


Thanks Phillip! This is a much nicer patch than I was anticipating would be required, so pending the comments below and some passing tests, I'm keen to merge!

(PyCon US sprints are this Mon+Tues, so I might even see if someone wants to do a filter-rewriting PR...)

hypothesis-python/RELEASE.rst (review thread, outdated; resolved)
```python
allow_nan: bool,
allow_subnormal: bool,
width: int,
bounds: Optional[Tuple[float, float]] = None,
```
Member:

Let's make this a min_value and max_value, defaulting to 0.0 and +inf respectively. We can then be much less conditional everywhere else.

Contributor Author:

I also was unsatisfied with the bimodal-ness of this logic.

So, for one, I think the unbounded FloatStrategy still needs to be able to return negatives. Also, the way we do bounded selection is based on fractional_float, which I don't think will work with +inf. Or it's possible (likely?) that I'm missing the context for why these aren't issues.

One thing I did try - including the appropriate NASTY_FLOATs when bounds are specified. This sort of works, but requires special handling to avoid -0.0 and a possibly empty list. Then some shrinking tests started failing in a way that made me worry I was causing problems there. Insights appreciated!

Member:

The underlying FloatStrategy is only positive IIRC, so st.floats() has to return a union of two Strategy objects if the allowed range spans zero.

Contributor Author:

It appears that the unbounded case doesn't do this union. But this is a big function, and I know that I am not following all of it.

Member:

Oh, right, yeah, unbounded float generation is positive-only in conjecture.utils (or conjecture.floats? anyway...), with a sign bit added before you get to FloatStrategy.

Contributor Author:

Yeah. Still happy to refactor as you see fit - LMK!

Member:

Let's push the management of sign-bits-for-bounded-floats down, from st.floats() into FloatsStrategy. The high-level goal here is that FloatsStrategy should hold enough of the arguments to fully represent a call to floats() - so that the solver/fuzzer/filter-rewriting interface will pretty much "wrap the current FloatsStrategy.do_draw()".

Kinda annoying now, sorry, but I think it'll pay off in the medium-term.
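As a purely hypothetical illustration of "hold enough of the arguments to fully represent a call to floats()", the strategy might carry fields like these — the class name and defaults below are illustrative, not Hypothesis's actual internals:

```python
# Hypothetical shape only: the fields mirror st.floats()'s signature,
# so a solver/fuzzer backend could wrap do_draw() with full information.
from dataclasses import dataclass

@dataclass(frozen=True)
class FloatsStrategySketch:
    min_value: float = float("-inf")
    max_value: float = float("inf")
    allow_nan: bool = True
    allow_subnormal: bool = True
    width: int = 64
```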

Contributor Author:

Not annoying!

OK, so you might imagine roughly this logic getting incorporated into the FloatStrategy class (though it would need to be implemented rather differently, since strategy-composition isn't a tactic that would be as readily available). Is that about right?

Backing up, some questions that come to mind:

  1. Would the ideal mid-level interface be identical to st.floats()? (FWIW, I think CrossHair could implement st.floats(), FloatStrategy, or anything in between without much trouble) One could argue that some amount of normalization and consistency checking is better off at the higher level.
  2. Does it matter whether that interface is a function or a class? I'm not familiar with the filter-rewriting stuff, but I imagine that gets a lot easier if this interface is a class-level thing, right? And that's why we care?

Member:

> OK, so you might imagine roughly this logic getting incorporated into the FloatStrategy class (though it would need to be implemented rather differently, since strategy-composition isn't a tactic that would be as readily available). Is that about right?

Yep, that sounds right to me. Caveat: we should check that this doesn't noticeably hurt performance in the interim period.

> 1. Would the ideal mid-level interface be identical to st.floats()? (FWIW, I think CrossHair could implement st.floats(), FloatStrategy, or anything in between without much trouble) One could argue that some amount of normalization and consistency checking is better off at the higher level.

I think a mid-level interface I would want is "like floats(), but inclusive bounds are required and there's no sign bit or allow_subnormal". So we're really just talking about finite floats in some contiguous range, plus maybe nan (only if inf is allowed). It's tractable to efficiently generate this with bit-twiddling to get close enough to right that at least half of attempts are valid, and that's good enough that rejection sampling is really fast.

It's deeply weird to humans, but this should work really well for fuzzers and pretty well for solvers too. Then FloatsStrategy is responsible for composing positive and negative ranges (if applicable), zero (if subnormals disallowed), etc.
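A rough sketch of that bit-twiddling-plus-rejection idea for non-negative bounds (illustrative only — `draw_in_range` and its structure are not Hypothesis's actual internals): the IEEE-754 bit patterns of non-negative floats are ordered the same way as the floats themselves, so masking a random integer down to the bit-width of the upper bound's pattern lands in range often enough for rejection sampling to be cheap.

```python
import random
import struct

def float_to_bits(x: float) -> int:
    return struct.unpack(">Q", struct.pack(">d", x))[0]

def bits_to_float(n: int) -> float:
    return struct.unpack(">d", struct.pack(">Q", n))[0]

def draw_in_range(lo: float, hi: float, rng: random.Random) -> float:
    """Draw a finite float in [lo, hi] (requires 0 <= lo <= hi): for
    non-negative floats the bit patterns are monotonic, so we mask a
    random integer to hi's bit-width and reject out-of-range draws."""
    lo_bits, hi_bits = float_to_bits(lo), float_to_bits(hi)
    mask = (1 << hi_bits.bit_length()) - 1
    while True:
        candidate = rng.getrandbits(64) & mask
        if lo_bits <= candidate <= hi_bits:
            return bits_to_float(candidate)
```

Note the distribution this produces is roughly log-uniform over exponents, which is exactly the "deeply weird to humans" part.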

> 2. Does it matter whether that interface is a function or a class? I'm not familiar with the filter-rewriting stuff, but I imagine that gets a lot easier if this interface is a class-level thing, right? And that's why we care?

The filter rewriting will be invoked via FloatsStrategy.filter(), so I'd build FloatsStrategy.do_draw() directly in terms of the mid-level interface. I'd imagined that the interface is basically "provide an object with these callbacks", which, yeah, is very likely to be an instance of some class with those methods.

@pschanely pschanely marked this pull request as ready for review May 3, 2022 01:46
@pschanely

Apologies for delays - I was travelling this past week. I think it's close; maybe take a peek and see if I got the right idea?

To emulate the strategy-unioning that was previously happening, I've moved the sign bit up front. A few niggly tests are still causing me trouble, though - hints welcome!

@Zac-HD Zac-HD left a comment

Hmm, this looks sensible to me at an evening glance! Of what I see, I'm most suspicious of the offset logic; ideally we'd get to a mask-based rather than addition-based solution (see e.g. #2907, though I'd like to do as little as possible of that to get this working and merged).

Comment on lines 195 to 196:

```python
min_value: Real = -math.inf,
max_value: Real = math.inf,
```
Member:

These should definitely be float annotations - we convert other types in st.floats(), and it's useful to have the exact bounds here.

I'd expect to also need an allow_subnormal flag sometimes? But maybe not yet...

Contributor Author:

ACK on the float annotations.

allow_subnormal could be part of the mid-level interface, I think. Another decision I considered but don't understand the consequences of: is it better to filter the subnormals out, or to cast them into normals, as I'm doing with widths? (Assuming it's not part of this interface.)

@pschanely

> Hmm, this looks sensible to me at an evening glance! Of what I see, I'm most suspicious of the offset logic; ideally we'd get to a mask-based rather than addition-based solution (see e.g. #2907, though I'd like to do as little as possible of that to get this working and merged).

Feel like some sort of bit mask should be possible; I can play with this idea this week.

@pschanely pschanely requested a review from DRMacIver as a code owner May 10, 2022 18:59
@pschanely

> Hmm, this looks sensible to me at an evening glance! Of what I see, I'm most suspicious of the offset logic; ideally we'd get to a mask-based rather than addition-based solution (see e.g. #2907, though I'd like to do as little as possible of that to get this working and merged).
>
> Feel like some sort of bit mask should be possible; I can play with this idea this week.

At first glance, it feels like bounded integer draws for the exponent (and for very small ranges, the mantissa) would be even more targeted than a bitmask, right?

Unrelated: I'm getting a test failure that I think suggests an independent problem. Even on the main branch, the following assertion fires, though I think it shouldn't. Could you confirm and suggest a fix? The logic in _calc_p_continue is rather opaque.

```
$ python -c 'from hypothesis.internal.conjecture.utils import _calc_p_continue; _calc_p_continue(1e-5, 1906)'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "hypothesis-python/src/hypothesis/internal/conjecture/utils.py", line 468, in _calc_p_continue
    assert 0 < p_continue < 1
AssertionError
```

@Zac-HD

Zac-HD commented May 11, 2022

Bounded integer draws are just bitmask-and-rejection-sampling: if you've got the right number of bits, then in expectation you succeed in fewer than two attempts 😁. Admittedly that goes down to a merely better-than-one-in-four chance for floats, assuming that we calculate the mantissa mask based on the observed exponent, but that's still pretty good IMO, and the simplicity of an atomic representation which works with our existing mutators is probably worth the small overhead.
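For concreteness, the bitmask-and-rejection-sampling scheme for bounded integers looks roughly like this (a sketch, not Hypothesis's actual implementation):

```python
import random

def draw_bounded_int(n: int, rng: random.Random) -> int:
    """Uniform draw from [0, n]: mask a random value down to
    n.bit_length() bits and reject anything above n.  Since
    n >= 2**(bits - 1), each attempt succeeds with probability
    greater than 1/2, so we expect fewer than two attempts."""
    bits = n.bit_length()
    if bits == 0:
        return 0  # n == 0: only one possible value
    while True:
        value = rng.getrandbits(bits)
        if value <= n:
            return value
```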

As to the other, fixed in 224425e though I'm pretty sure it's truly unreachable via the public API.

@pschanely

> Bounded integer draws are just bitmask-and-rejection-sampling: if you've got the right number of bits, then in expectation you succeed in fewer than two attempts 😁. Admittedly that goes down to a merely better-than-one-in-four chance for floats, assuming that we calculate the mantissa mask based on the observed exponent, but that's still pretty good IMO, and the simplicity of an atomic representation which works with our existing mutators is probably worth the small overhead.

Ah! Got it. Take a look at what I've got now - I'm not sure I'm doing the "make your own luck" process right? Although I'm drawing bits, I'm using integer modulo instead of the mask still - seems behaviorally similar, but perhaps nice to avoid the secondary rejection stage.

> As to the other, fixed in 224425e though I'm pretty sure it's truly unreachable via the public API.

Thank you!

If this looks good, I can add something for subnormal rejection ... and then we might be good?

@Zac-HD

Zac-HD commented May 12, 2022

Yeah, I think we're close! My only real concern at this point is performance, which is at least not catastrophically worse at the moment. From code review I don't even have a specific reason for concern beyond the test failure below, but it would be nice to compare a profile of something float-intensive (unique arrays, maybe?) before and after. Happy to do that over the weekend if necessary.

```
_______________ test_floats_order_worse_than_their_integral_part _______________
Traceback (most recent call last):
  File "tests/conjecture/test_float_encoding.py", line 101, in test_floats_order_worse_than_their_integral_part
    @given(st.integers(1, 2**53), st.floats(0, 1).filter(lambda x: x not in (0, 1)))
  File "hypothesis/core.py", line 1237, in wrapped_test
    raise the_error_hypothesis_found
  File "hypothesis/internal/healthcheck.py", line 27, in fail_health_check
    raise FailedHealthCheck(message, label)
hypothesis.errors.FailedHealthCheck: It looks like your strategy is filtering out a lot of data.
    Health check found 50 filtered examples but only 9 good ones.
```

@pschanely

I haven't been able to successfully diagnose these (seemingly sporadic) health check failures yet, but I presume I'm doing something that's not great ... somewhere. I'll continue to poke at it, but if you have suggestions on how you diagnose this kind of problem, I'm all ears.

FWIW, I also did a microbenchmark on just generating floats (without any sort of post-filter applied); it seems slower, but by a small amount in comparison to how long, for example, ConjectureData.draw_bits() takes.

@Zac-HD

Zac-HD commented May 13, 2022

Main suggestion is to set --hypothesis-seed=0, which should at least mean you can get a deterministic repro. Cycle through seeds if that's what it takes...

@pschanely

Maybe working (partly) in the lex_to_float/float_to_lex space would help? I suspect though that we're just going to need some more bits to choose whether we're in the log-or-linear mode, which maybe could be determined based on the "spare" entropy defined by the disallowed floats somehow?

There's that high bit that's used to select the integer interpretation (heh, looking at it now, it seems like I severed that path with the clamper ... but we can fix that). We could expand that integer interpretation to allow for some number of decimal places. Maybe put the decimal digits up front, least-significant-first, so shrinking is still nice. If we wanted to get super fancy, we could vary the number of decimal digits by the size of the allowed range.

@Zac-HD

Zac-HD commented May 19, 2022

Just going with quick responses, because it's late but I'd rather give you something before the weekend:

  • Let's just set aside the distributional problem for now. It is important, but it's also a small and easily-isolated part of the problem; if everything else is lined up I can probably snipe it in half a day but not much less. Worst case, we spend some extra entropy to use a Sampler which can give us whatever distribution over exponents we want (which might actually be a good move 🤔, also works well with swarm testing...)
  • I am not crazy enough to try something with decimal digits. Integers and binary digits are plenty, and already supported by the format.
  • But regardless of that part, next steps are per the review above.

@pschanely

> • Let's just set aside the distributional problem for now. It is important, but it's also a small and easily-isolated part of the problem; if everything else is lined up I can probably snipe it in half a day but not much less. Worst case, we spend some extra entropy to use a Sampler which can give us whatever distribution over exponents we want (which might actually be a good move 🤔, also works well with swarm testing...)

Acknowledged. I've pushed the more straightforward changes at least.

I also briefly thought above that I did something bad to the integer representation with the clamper. Not true; we should be all good. (The clamper operates on the real float after un-lex-ing, and is independent of that representation.)

@Zac-HD Zac-HD left a comment

LGTM, next steps are "on the weekend I'll have time and brainspace to attack the exponent thing".

And I can't say this often or sincerely enough, thank you for your generous and skillful contributions to some of the trickiest parts of Hypothesis and the broader testing ecosystem 😍

hypothesis-python/RELEASE.rst (review thread, outdated; resolved)
hypothesis-python/tests/cover/test_float_utils.py (review thread, outdated; resolved)
Comment on lines 94 to +96:

```diff
 def test_minimal_non_boundary_float():
     x = minimal(floats(min_value=1, max_value=9), lambda x: x > 2)
-    assert 2 < x < 3
+    assert x == 3  # (the smallest integer > 2)
```
Member:

This is good, and I'll want to additionally tighten up the other tests to check we get the exact shrinking behavior we want. After we deal with the large-negative-exponent thing.

Contributor Author:

Did some tightening, in this file at least. Any other places?

Member:

This file is all that springs to mind, and it looks like you've tightened it all up 👍

Oh, how about adding a new test that floats between say (1.1, 1.6) shrinks to 1.5? Asymmetric interval going to nice values, rather than nice fractions of the allowed interval.

@Zac-HD

Zac-HD commented May 23, 2022

OK, reasoning this through:

  • If the initially-drawn float is permitted, we want to keep it.
    • Which means that make_float_clamper will be responsible for fixing the distribution
    • It doesn't need to be perfect; any reasonably-bug-revealing heuristic will do - the whole point is to make it possible to use fancier techniques like CrossHair and HypoFuzz
  • I think the best option is to "clamp" by choosing from a uniform random distribution (like the old FixedBoundedFloats() strategy).
    • This works out to a 50/50 split between uniform and log-uniform (i.e. unbiased bits) for floats(0, 1)
    • when more exponents are allowed, the uniform component is less important
    • when fewer are allowed (e.g. floats(0.25, 1)) then the uniform component is weighted more heavily but distorts the distribution less.
  • In order to balance many-possible-values with low-correlation-with-initial-draw-failing, I propose that we should just use all the mantissa bits to form a float in the range $[0, 1)$ and then use that as FixedBoundedFloats() did.
    • In cases where the allowed interval spans zero, let's keep the sign bit and treat our multiplier as "indexing" into the allowed interval-with-that-sign
    • Specifically, let's exclude zero from this, and only generate between the least nonzero float and the max value for positive (etc for negative); this makes handling allow_subnormal=False much easier
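A minimal sketch of the keep-if-valid-else-uniform-clamp idea above (illustrative only: `clamp_float` is a hypothetical name, and the real make_float_clamper works on bit patterns and also handles signs, zero-exclusion, and subnormals):

```python
import struct

MANTISSA_MASK = (1 << 52) - 1  # low 52 bits of an IEEE-754 double

def clamp_float(drawn: float, lo: float, hi: float) -> float:
    """Keep the initially-drawn float if it's already permitted;
    otherwise reuse its 52 mantissa bits as a fraction in [0, 1) and
    map that uniformly onto [lo, hi], as FixedBoundedFloatStrategy did."""
    if lo <= drawn <= hi:
        return drawn
    bits = struct.unpack(">Q", struct.pack(">d", drawn))[0]
    fraction = (bits & MANTISSA_MASK) / (1 << 52)
    return lo + fraction * (hi - lo)
```

Because the mantissa bits of the out-of-range draw are reused, the clamped value is still a deterministic function of the underlying bytes, which is what keeps shrinking and mutation well-behaved.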

Thoughts?

(also, explicit reminder for the record: I have a policy that any contributor can ask me to take over a PR if they don't want to work on it anymore - I want contributing to be a good experience, and to get PRs even when the issue might turn out to be a tarpit!)

@pschanely

pschanely commented May 23, 2022

> Thoughts?

Clever! Love it. And not a big change to make. I've just pushed something.

Let me know if I understood you correctly, and then I'll go back through the TODOs and whatnot.

@Zac-HD

Zac-HD commented May 23, 2022

Yep, that's exactly what I was thinking of (and the division-by-mask to implement it is remarkably elegant).

@Zac-HD Zac-HD left a comment

Just a few tiny tweaks, and then we're ready to merge 🎉 🎉 🎉

hypothesis-python/tests/nocover/test_simple_numbers.py (review thread, outdated; resolved)

@pschanely

Not quite getting the result I expected with shrinking (1.1, 1.6) to 1.5. Investigation in progress ...

@pschanely

> Not quite getting the result I expected with shrinking (1.1, 1.6) to 1.5. Investigation in progress ...

Well! I didn't figure it out. (1.1, 1.9) DOES get to 1.5, and I've added that test. But all is not well. (1.1, 1.6) yields 1.501953125. Here is my best attempt at explaining what happens, but I'm not confident about my understanding:

There is a one-bit difference between 1.501953125 and 1.5; in the lex form, it's the last bit in the next-to-last byte. The shrinker successfully clears the earlier mantissa bytes. We do attempt to clear that bit, but at the same time we roll the final byte over to 255. When bit-reversed, those are the high bits of the mantissa, and we run afoul of the clamper's upper bound. That's why increasing the upper bound to 1.9 makes it work.

I've noticed some float-specific shrinking logic (but have not yet spent more than a moment trying to grok it). Even when attempting to disable that logic, I'm seeing approximately the same behavior on this test, so I suspect it's not relevant here. But I don't fully trust myself on that assertion either.

Ehhhh. So, not sure exactly what to do next. Possibly it's not really a regression. At the same time, I might be making changes that make pre-existing suboptimal shrinking more likely. What do you think?
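For reference, the one-bit difference described above is easy to confirm from the raw IEEE-754 bit patterns (a quick illustration of the plain bit patterns, not the bit-reversed lex encoding):

```python
import struct

def float_bits(x: float) -> int:
    """64-bit IEEE-754 pattern of a float, as an integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]

# 1.501953125 == 1.5 + 2**-9, so the two patterns differ in exactly
# one mantissa bit (bit 43 of the 52-bit mantissa field).
diff = float_bits(1.5) ^ float_bits(1.501953125)
assert diff == 1 << 43
assert bin(diff).count("1") == 1
```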

@Zac-HD Zac-HD left a comment

Well, we have a substantial improvement over the status quo already - in the worst cases, shrinking merely isn't noticeably better. Since I think fixing this last case would be a substantial adventure into the shrinker, I'm inclined to merge this PR as-is; I'll write up a tracking issue for the floats(1.1, 1.6) case tonight.

Thanks again for all your work on this!

@Zac-HD Zac-HD enabled auto-merge May 24, 2022 18:24
@Zac-HD Zac-HD merged commit ef5ec3c into HypothesisWorks:master May 24, 2022
@pschanely

Exciting! I know first hand that shepherding PRs is a ton of work too. Thank you for all your help!


Successfully merging this pull request may close these issues.

Overhauling the st.floats() internals