Support coverage-guided generation #84
Comments
@vlsi Thanks for the suggestion. Guided generation looks interesting. Can you give a concrete example of what you imagine the integration of Zest with jqwik could look like for the user? As for the integration, it would most probably require a new kind of generation mode and a way to inject the random stream and the guidance. My intuition says it should be possible, but it will require a deeper understanding on my side of how Zest and JQF work. If you can convince me with good examples from a user's perspective I'm absolutely willing to tackle it :-) |
Let us take Apache Calcite as an example. SQL has expressions (e.g. Of course, there's an implementation of the expression optimizer. So the optimizer has properties to be verified. For instance, the optimizer must not throw exceptions while doing its optimization. The current jqwik / junit-quickcheck approach would be to generate expression trees at random, and pass them to the optimizer. This case for the end user could look as follows: public class RexFuzzerTest {
private RexSimplify rexOptimizer;
public RexFuzzerTest() {
rexOptimizer = null; // setup the optimizer appropriately
}
@Property
// or @Property(coverageGuidance=true)
// or @Property @Guidance(type=ZEST) @ZestParams(include="org.apache.calcite.*")
public void hello(@ForAll RexNode value) {
rexOptimizer.simplify(value); // we check that optimizer does not throw exceptions
}
} Note: I am not sure regarding the way guidance should be configured. At the end of the day, I don't think you can (easily) add a The user would have to enable JQF's bytecode instrumentation somehow. It is available as a Note: currently JQF is integrated with Does that answer your question? In other words, from an end-user perspective, the very basic case is the same as with simple What ZEST does, it analyzes the execution outcomes, and it provides you with a If you ask for more advanced cases then it would be nice to be able to see extra statistics from
You might want to check |
@vlsi As for your second question:
I'm probably never going to do that. Here's my argument: Jupiter, the default engine, is a jack-of-all-trades, which in turn means that it's not well suited for special needs. Jupiter extension points have been designed to be composable; that's e.g. the reason why JUnit4's model of wrapping the execution of tests and suites was given up and replaced by before and after hooks. Nevertheless, generic composability is difficult to get right, and it requires sticking to some limitations. Jupiter's lifecycle is not really well suited for running an indeterminate number of tests scheduled by the feedback coming from previous test runs. Dynamic test generation allows that to some degree, but then you will miss out on shrinking, statistics reporting and a few other subtleties that make the developer experience of jqwik the way it is now, and the way I like it. That said, it would probably be possible to come up with a jqwik extension for Jupiter that could inject parameters, run the individual tries as dynamic tests and even do standard shrinking. But it would
JUnit platform test engines, of which jqwik is one example, were introduced to fill the gap that Jupiter extensions leave open: different syntax to specify tests, different test lifecycle, difficulties with composability and side effects. BTW, there's a reason that junit-quickcheck has its own JUnit4 runner and is not just a set of JUnit 4 rules: Runners in JUnit 4 are (to some degree) what engines are in JUnit 5; they are not composable. As you mention with |
javaagent configuration is mostly done by the IDE or the build system, so this should be doable. I still don't get how providing random bytes can direct the generation of certain values, since the values generated from certain random bytes cannot really be predicted without knowing the implementation of all generators involved, and there can be many of those. What am I missing or failing to understand? |
I guess it is described in zest paper (see
AFAIU, the idea is pretty much the same as the mutation and selection in a typical genetic optimizer directs the population to the desired goal. Genetic optimizers do not need to know the implementation of the underlying systems.
It does not need to predict the output. |
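To make the mutation-and-selection idea concrete, here is a minimal sketch of such a feedback loop. It is not Zest's actual algorithm: the class name `MutationLoopSketch`, the 4-byte initial input and the `run` callback (which stands in for an instrumented test execution that reports covered branch IDs) are all assumptions made for illustration. The loop never inspects how bytes are turned into values; it only keeps inputs that happened to reach new coverage.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical sketch: mutate saved byte inputs and keep those that reach new coverage. */
final class MutationLoopSketch {

    /**
     * @param run executes the test on the raw bytes and returns the IDs of the covered branches
     *            (in a real setup these would come from bytecode instrumentation)
     * @return the initial input plus every mutant that reached a previously unseen branch
     */
    static List<byte[]> explore(Function<byte[], Set<Integer>> run, int iterations, long seed) {
        Random random = new Random(seed);
        List<byte[]> corpus = new ArrayList<>();
        Set<Integer> seenBranches = new HashSet<>();

        // Start from an arbitrary input (assumption: four zero bytes).
        byte[] initial = new byte[4];
        seenBranches.addAll(run.apply(initial));
        corpus.add(initial);

        for (int i = 0; i < iterations; i++) {
            // Selection: pick a saved input. Mutation: overwrite one random byte.
            byte[] candidate = corpus.get(random.nextInt(corpus.size())).clone();
            candidate[random.nextInt(candidate.length)] = (byte) random.nextInt(256);

            // Feedback: addAll returns true only if the candidate covered something new.
            if (seenBranches.addAll(run.apply(candidate))) {
                corpus.add(candidate);
            }
        }
        return corpus;
    }
}
```

The generators downstream would simply consume `candidate` as their source of randomness, so an input that is "close" to an interesting one tends to produce a value that is close to an interesting value.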
I would like to see an annotation in line with the existing approach. For example: @Property(generation = GenerationMode.ZEST_DRIVEN, shrinking = ShrinkingMode.FULL)
@Report(Reporting.COVERAGE)
void arbitraryUserIsAlwaysValidForSerDes(@ForAll("arbitraryUser") UserForTests arbitraryUser) {
} PS: I like the proposed idea. Does it mean that tuning properties to cover various cases would generally not be needed? (Genetic programming can be good in many cases.) |
Frankly speaking, I have no idea at this point. In general, it looks to be useful even without tuning knobs. The use cases I have in mind are Apache Calcite ( PS. It would probably be fun to have |
Thanks for the info. I can use it (when integrated with jqwik) as soon as there is something to test. I have a real-world spring-data-rest project (in an alpha version, though) which will reach its customers this year, and testing is one of our priorities.
To find bugs more easily, you can try also "reactive jdbc driver". See its postgresql client code for example: https://github.com/r2dbc/r2dbc-postgresql |
I'm not sure the current Zest/JQF supports multi-threaded applications :-/ |
Yep, you are right, though using it when the business logic runs in the main thread only would be OK. See rohanpadhye/JQF#41. On the other hand, this is not my case. In spring-data-rest there is some Tomcat in the main thread (perhaps), or the main thread is my web test client (an HTTP client which sends HTTP requests from the jqwik test). |
@vlsi I looked at the examples coming with JQF. They don't seem to define any special case guidance which confuses me. Take this simple example: @Fuzz
public void insert(int @Size(min=100, max=100)[] elements) {
BinaryTree b = new BinaryTree();
for (int e : elements) {
b.insert(e);
}
} What difference in generated data would you expect - if any - compared to
running on Vanilla junit-quickcheck? |
Note: ZestGuidance is instantiated and passed via static field in GuidedFuzzing Frankly speaking, the current Zest / JQF code seems to be focused to be used as Is it what you are looking for? |
Would be cool. When do you have time this week? |
Feel free to select a suitable timeslot (15-20-30 min?) at https://doodle.com/vlsi |
Please post some "key points" as a result of the talk if possible. It makes for nice reading, and it is an interesting approach coming alive :) |
Hi! I am the main developer of JQF.
I added an example in the README to demonstrate a good use case. I can copy that here: @Fuzz /* The args to this method will be generated automatically by JQF */
public void testMap2Trie(Map<String, Integer> map, String key) {
// Key should exist in map
assumeTrue(map.containsKey(key)); // the test is invalid if this predicate is not true
// Create new trie with input `map`
Trie trie = new PatriciaTrie(map);
// The key should exist in the trie as well
assertTrue(trie.containsKey(key)); // fails when map = {"x": 1, "x\0": 2} and key = "x"
} Running only random sampling reveals no assertion violations even after hours of input generation. In fact, most inputs do not even satisfy the (You might argue that this property test can be written in a better way that might help find the bug via random sampling alone, but this is just an extreme example showing the utility of coverage guidance)
Just to clarify, this is so that instrumented class files can report coverage events to the unique guidance when executing branches and method calls. This is similar to coverage tools such as JaCoCo, which use static fields to collect code coverage. Having multiple guidances live simultaneously would require multiple versions of an instrumented class. Although it could be possible to bind guidances to class-loader instances, there hasn't been a need for such a setup as of yet.
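A small sketch of why a static field ends up being the meeting point: instrumented code has no reference to the test or the guidance, so the only thing it can call is something globally reachable. The names `CoverageSink` and `InstrumentedAbs` are hypothetical and are not JQF classes; the "instrumentation" is written by hand here.

```java
import java.util.function.IntConsumer;

/** Hypothetical global sink; instrumented code can only reach it through a static field. */
final class CoverageSink {
    // Exactly one listener at a time, which is why two guidances cannot be live simultaneously.
    private static volatile IntConsumer listener = branchId -> { };

    static void install(IntConsumer newListener) {
        listener = newListener;
    }

    /** The call an instrumenter would insert at each branch. */
    static void branch(int branchId) {
        listener.accept(branchId);
    }
}

/** What a class might look like after instrumentation (written by hand for this sketch). */
final class InstrumentedAbs {
    static int abs(int x) {
        if (x < 0) {
            CoverageSink.branch(1); // inserted: negative branch taken
            return -x;
        }
        CoverageSink.branch(2); // inserted: non-negative branch taken
        return x;
    }
}
```

Installing a second listener replaces the first one, which mirrors the limitation described above; binding the sink to a class loader instead would need one instrumented copy of each class per guidance.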
JQF has a Maven plugin, which lets you run (Zest by default) via |
Thanks, that is useful, however, it looks like you answer to a slightly different question. I agree you need to talk to the instrumentation engine. That is fine, and you call However, I agree Of course,
@rohanpadhye , What I mean by The only use of |
Thanks for chiming in. Much appreciated. The example is a motivating one that I could strive for to get running with jqwik as a first step. When I try to run it with As for the instrumenting classloader, would it suffice to load the test container class through this classloader? I'm wondering which kind of lifecycle hook might be needed. Last question: Would you be willing to move all junit4/junit-quickcheck related code in a separate artefact - or the other way round to create a jqf-core module? |
That's a reasonable suggestion and I'll open an issue to make this change.
Yes, that is correct. There wasn't a need for exposing the fuzzing engine as a library as of now. However, if separating components into reusable library-like packages helps projects like jqwik, I am fine with refactoring.
I created a standalone example here: https://github.com/rohanpadhye/jqf-zest-example. Let me know if following the README works.
I am not sure of this. JQF currently uses a separate plugin because of the control of specifying a custom classloader. I am not sure how to change the classloader used by the default test runner.
Handling the dependency on junit-quickcheck should be straightforward, since all the dependent classes are already isolated into one directory. Regarding dependency on JUnit4, I will have to investigate further as to whether there is anything outside of this package that depends on JUnit classes. @vlsi already opened an issue in rohanpadhye/JQF#80 for this, which can be used if there is a need for this refactoring. |
To clarify, this is what the JQF Maven plugin currently does: I am not sure how to get this effect without using a custom plugin (i.e., how to change the classloader used by |
@rohanpadhye I could get the isolated example to run (many thanks!). It does not find a unique failure, though. Using mvn:repro fails with
BTW, the following jqwik property is able to detect the PatriciaTrie bug:
You have to use the latest snapshot version "1.2.3-SNAPSHOT" though, because I had forgotten to add
This will require some experimentation to find out. In the best case an annotation will be enough to swap out classloaders for parts of the system. In a worse case a javaagent corresponding to your class loader might be required. In the very worst case both a Maven and Gradle plugin will be necessary. |
@rohanpadhye Tried the example a few more times. Eventually I got a unique failure. After some code reading I guess that integrating JQF with jqwik will require an agent similar to what QuickTheories is doing. Here's an excerpt from https://github.com/quicktheories/QuickTheories#coverage-guidance:
This probably means that any kind of integration should be in an optional artefact. |
There might be other Luckily, there's So I would expect that jqwik should provide pluggable points for implementation of guided fuzzing, and implementation of |
@vlsi Even then. The dependency on JQF (without Zest) would be too strong IMO to put it into jqwik's core. Alternatively jqwik had to duplicate a lot of what JQF is bringing to the table. What jqwik has to offer in its core is the ability to change the stream of random bytes and to trigger some behaviour after each execution of a property. Given that I'd hope that the rest can be done as a module on top. |
No-one suggested adding a dependency on JQF I guess :) |
@vlsi To summarize our talk:
|
Here's my current take considering all your input: interface GenerationGuidance {
/**
* Returns a reference to an iterator that will deliver
* integer values to feed the pseudo-random number generator for the next try.
*
* @throws IllegalStateException if there is no next try available
*/
Iterator<Integer> nextTry();
/**
* Decide if another sample can be tried.
*
* Method could potentially block to wait for guiding algorithm to finish.
*
* If it returns false generation will be finished.
*/
boolean hasNextTry();
/**
* Callback for observing actual generated sample passed to the property method.
*/
void observeGeneratedSample(List<Object> sample);
/**
* Handles the result of a property try.
*/
void handleResult(TryResult result);
}
interface TryResult {
enum Status {
SATISFIED,
FALSIFIED,
INVALID
}
Status status();
Optional<Throwable> throwable();
} |
Generated values do not always have equals/hashCode, so there's not much the guidance can do to compare the values. JQF itself does not use |
#84 (comment) looks good, except I would skip
It might be slightly better to have I wonder if it makes sense to unify the result type of I'm not sure, however, In other words, something along the lines of: Then, if the user wants to reproduce the failure (or hard-code a single test case), they might use a non-guided WDYT? |
As far as I know
Maybe, maybe not. Too much speculation for the time being. I'd rather start with a proof of concept and then learn from that.
There's already a different way to reproduce a failing sample so that's covered for the time being. |
So here's the current state. I removed the interface GenerationGuidance {
/**
* Returns a reference to an iterator that will deliver
* integer values to feed the pseudo-random number generator for the next try.
*
* @throws IllegalStateException if there is no next try available
*/
Iterator<Integer> nextTry();
/**
* Decide if another sample can be tried.
*
* Method could potentially block to wait for guiding algorithm to finish.
*
* If it returns false generation will be finished.
*/
boolean hasNextTry();
/**
* Handles the result of a property try.
*/
void handleResult(TryExecutionResult result);
}
interface TryExecutionResult {
enum Status {
SATISFIED,
FALSIFIED,
INVALID
}
Status status();
Optional<Throwable> throwable();
} |
I'm wondering if interface ByteStream extends Closeable {
byte next();
boolean hasNext();
} |
There's
Just in case: |
That's what I meant with "not straightforward". I don't think I'll go with either |
I don't think it is worth spending time on the discussion like this, however, I truly do not see why Stream does support the notion of
In other words, Java architects were more or less fine with using I don't see why it does not |
Absolutely not worth it. We just don't agree here. |
Just in case: val iter = IntStream.of(1, 2, 3, 4).iterator()
while (iter.hasNext()) {
println(iter.nextInt())
} ^^^ the above is straightforward stream processing to me |
It's creating an iterator again. And creation would probably have to go through |
@vlsi Now that we must accept that we won't agree on every design aspect, do you still consider it worthwhile going forward with a proof of concept? |
This is my current best bet: interface GenerationGuidance {
/**
* Returns a reference to a source that will deliver
* integer values to feed the pseudo-random number generator for the next try.
*
* @throws IllegalStateException if there is no next try available
*/
TryGenerationSource nextTry();
/**
* Decide if another sample can be tried.
* <p>
* Method could potentially block to wait for guiding algorithm to finish.
* <p>
* If it returns false generation will be finished.
*/
boolean hasNextTry();
/**
* Handles the result of a property try.
*/
void handleResult(TryExecutionResult result);
}
interface TryExecutionResult {
enum Status {
SATISFIED,
FALSIFIED,
INVALID
}
Status status();
Optional<Throwable> throwable();
}
/**
* Source for providing integer values.
*/
interface TryGenerationSource extends AutoCloseable {
int next();
boolean hasNext();
/**
* Will be called when no more values are necessary for
* generating the parameters of the current try.
*/
@Override
default void close() {
// Optional
}
} |
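To check how the three methods would interact, here is a sketch of a driver loop around the interfaces above. It is not jqwik engine code: `GuidedRunSketch`, `FixedValuesGuidance` and the single-int "parameter generation" are assumptions made for illustration, and the interfaces are repeated only so that the snippet compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.IntPredicate;

interface TryGenerationSource extends AutoCloseable {
    int next();
    boolean hasNext();
    @Override default void close() { }
}

interface TryExecutionResult {
    enum Status { SATISFIED, FALSIFIED, INVALID }
    Status status();
    Optional<Throwable> throwable();
}

interface GenerationGuidance {
    TryGenerationSource nextTry();
    boolean hasNextTry();
    void handleResult(TryExecutionResult result);
}

/** Trivial guidance replaying fixed values, one per try; a real one would use the feedback. */
final class FixedValuesGuidance implements GenerationGuidance {
    private final int[] values;
    private int index = 0;
    final List<TryExecutionResult.Status> observed = new ArrayList<>();

    FixedValuesGuidance(int... values) { this.values = values; }

    @Override public boolean hasNextTry() { return index < values.length; }

    @Override public TryGenerationSource nextTry() {
        if (!hasNextTry()) throw new IllegalStateException("no next try available");
        int value = values[index++];
        return new TryGenerationSource() {
            @Override public int next() { return value; }
            @Override public boolean hasNext() { return true; }
        };
    }

    @Override public void handleResult(TryExecutionResult result) { observed.add(result.status()); }
}

/** Hypothetical driver: asks for a source, runs one try, reports the result, repeats. */
final class GuidedRunSketch {
    /** Returns the number of tries executed; stops at the first falsified try. */
    static int run(GenerationGuidance guidance, IntPredicate property) {
        int tries = 0;
        while (guidance.hasNextTry()) {
            boolean satisfied = executeTry(guidance.nextTry(), property);
            tries++;
            guidance.handleResult(result(satisfied
                ? TryExecutionResult.Status.SATISFIED
                : TryExecutionResult.Status.FALSIFIED));
            if (!satisfied) break;
        }
        return tries;
    }

    private static boolean executeTry(TryGenerationSource source, IntPredicate property) {
        // close() tells the guidance that no more values are needed for this try
        try (TryGenerationSource s = source) {
            return property.test(s.next()); // stand-in for real parameter generation
        }
    }

    private static TryExecutionResult result(TryExecutionResult.Status status) {
        return new TryExecutionResult() {
            @Override public TryExecutionResult.Status status() { return status; }
            @Override public Optional<Throwable> throwable() { return Optional.empty(); }
        };
    }
}
```

With `new FixedValuesGuidance(1, 2, 3, 4)` and the property `v -> v < 3`, the loop runs three tries and stops at the falsified one. A coverage-guided implementation would differ only inside `nextTry` and `handleResult`.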
LGTM |
I wonder if TryGenerationSource could be simplified to /**
* Source for providing integer values.
*/
interface TryGenerationSource extends AutoCloseable {
/** There must always be another value to feed generation */
int more();
/**
* Will be called when no more values are necessary for
* generating the parameters of the current try.
*/
@Override
default void close() {
// Optional
}
} |
I guess both of them are OK provided there's the following factory method :) interface TryGenerationSource extends AutoCloseable {
static TryGenerationSource of(IntStream source);
} |
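A static method in an interface needs a body, so the factory could be sketched roughly as follows. This assumes the `next()`/`hasNext()` variant of `TryGenerationSource`, and that closing the source should also close the stream; both are choices made for this sketch, not something settled in the thread.

```java
import java.util.PrimitiveIterator;
import java.util.stream.IntStream;

interface TryGenerationSource extends AutoCloseable {
    int next();
    boolean hasNext();

    @Override
    default void close() {
        // Optional
    }

    /** Adapts an IntStream (finite or infinite) to a source of integer values. */
    static TryGenerationSource of(IntStream source) {
        PrimitiveIterator.OfInt iterator = source.iterator();
        return new TryGenerationSource() {
            @Override public int next() { return iterator.nextInt(); }
            @Override public boolean hasNext() { return iterator.hasNext(); }
            @Override public void close() { source.close(); } // assumption: source owns the stream
        };
    }
}
```

An infinite stream such as `new Random(seed).ints()` fits the same adapter; `hasNext()` then always returns true.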
However, I agree the random generator should better be treated as an infinite source. |
@vlsi Are you still interested in the feature and would have time to experiment with it? If so I'll move (a branch with) the basic mechanism towards the top of my todo list. |
@jlink , that's on my list, however, I don't think I would be able to pick it in the nearest month :-/ |
@vlsi OK. Then I'll leave it in my plans for the summer. |
This paper advocates coverage-guided property testing as a more effective variant of what AFL does: https://www.cs.umd.edu/~mwh/papers/fuzzchick.pdf |
Any news on this issue? |
Implementing the basic hook in jqwik is probably possible with reasonable effort. But someone (not me, since I’m short on time) would then have to use it to add AFL-like mutation and coverage to it. Any volunteers? I might push it up in the prioritised todo list. |
While I do not have immediate plans to implement coverage-guided generation, the following looks useful and relevant: https://tiemoko.com/blog/diff-fuzz/ |
Here's an article that suggests to split random generators that affect structure of the input from the ones that affect data. |
Any more thoughts on this? I think this would be very cool in jqwik. Reading the papers and the resources, the Zest algorithm sounds particularly interesting. I'm just wondering what the best way is to get something like this implemented. It requires instrumentation of the classes being tested, so that coverage can actually be tracked. For reference I'm attaching a picture here of the Zest algo.
This might be a fun exercise to implement at some point.... We could probably rip the instrumentation project from JQF and then implement the above logic in a hook that intercepts each try. Some questions come up though, How would we deal with |
There's already a Java PBT lib that supports coverage-guided generation: https://github.com/quicktheories/QuickTheories. QT also shows that instrumentation through an agent is feasible. I don't understand the point about |
I guess #424 might be relevant as it would be hard to introduce "coverage-guided" as long as jqwik uses |
@jlink Cool. Last time I played with it, it didn't have this feature. As for |
Hi,
There's https://github.com/rohanpadhye/jqf by @rohanpadhye
The idea here is that a randomized-input generation can be guided based on the feedback from the test execution.
For instance, guidance can observe code coverage and it could attempt to produce inputs that explore uncovered branches.
Another example might be to measure the performance (e.g. performance counters or just heap allocation) and attempt to produce an input that triggers performance issues.
Relevant information
https://www.fuzzingbook.org/html/MutationFuzzer.html#Guiding-by-Coverage (and other parts of the book)
Suggested Solution
If we take Zest (which is coverage-guided fuzzer), then ZestGuidance provides an input stream that can be considered as the source of randomness: https://github.com/rohanpadhye/jqf/blob/48d3b663ad68a7b615c2a8b9716da0ca8b6ef4e6/fuzz/src/main/java/edu/berkeley/cs/jqf/fuzz/junit/quickcheck/FuzzStatement.java#L136
As far as I can see, the current RandomizedShrinkablesGenerator never exposes a Random instance, so there's no way to take a random stream input from the guidance and pass it to RandomizedShrinkablesGenerator (see https://github.com/jlink/jqwik/blob/2517d6d8cd612137fcff730a9114169260fad4bf/engine/src/main/java/net/jqwik/engine/execution/CheckedProperty.java#L155 )
As far as I understand, jqwik does not provide a way to plug guided fuzzing: https://github.com/jlink/jqwik/blob/5632ef8ca3d51ff083380257fb0b2b9bd7383920/engine/src/main/java/net/jqwik/engine/properties/GenericProperty.java#L38
So I would like to hear your opinion on the way to integrate guided fuzzing.
It looks like it would be nice to be able to use pluggable guidances, so what do you think if the random in question was taken from a Guidance instance and the guidance got notified of the test outcomes?
PS. I wonder if the jqwik engine can be integrated with JUnit5's default one.
I know JUnit5's default engine does not suit very well for property-based tests, however, can you please clarify what are the major issues that prevent the use of JUnit5 engine for jqwik?
For instance, JUnit5 (e.g. TestTemplate) keeps all the test outcomes (which results in OOM), however, in property-based tests we don't need that, and we need to just count the number of passed tests.
On the other hand, it looks like that can be improved in JUnit (e.g. by adding @AggregateSuccessfulExecutionsAsCount or something like that).
I just thought that the default JUnit5 engine would make the tests easier to write as the documentation would be consolidated.
For instance, there's net.jqwik.api.Disabled, and there's org.junit.jupiter.api.Disabled, which are more or less the same thing. It is unfortunate that regular annotations do not work with jqwik. Of course, it would be illegal to mix @jupiter.api.Test with @Property, however, that looks like a corner case that can be validated.