Commit
Carbon fuzzing 3/3: added actual fuzzer implementation and a fuzzverter utility for investigating crashing protos (#1156)

* finished fuzzer and added fuzzverter util
* fixed typo
* renamed cmd line params
* fixed libproto_mutator download path
* small fixes
* small fixes
* small fixes
* renamed sample corpus proto
* small fixes
* try building on github with LIBCPP_DEBUG enabled
* temporarily marked proto fuzzer as a manual test
* code review
* use a dedicated proto-fuzzer feature to work around LIBCPP_DEBUG=1 crash in proto code
* code review comments, added README.md
* minor fixes to the text
* Update bazel/cc_toolchains/clang_cc_toolchain_config.bzl (Co-authored-by: Jon Meow)
* use Carbon source representation for "empty Main()" instead of text format proto representation
* fixed typo
* made FuzzerUtil produce the full carbon source (proto converted + Main if needed) to decrease code duplication a bit
* typo
* switched to text proto format per code review
* Update executable_semantics/prelude.h (Co-authored-by: Jon Meow)
* Update executable_semantics/fuzzing/README.md (Co-authored-by: Jon Meow)
* Update executable_semantics/fuzzing/README.md (Co-authored-by: Jon Meow)
* Update executable_semantics/fuzzing/README.md (Co-authored-by: Jon Meow)
* Update executable_semantics/fuzzing/README.md (Co-authored-by: Jon Meow)
* Update executable_semantics/fuzzing/README.md (Co-authored-by: Jon Meow)
* review comments
* removed unnecessary file mode variables
* Update executable_semantics/fuzzing/fuzzverter.cpp (Co-authored-by: Jon Meow)
* Update executable_semantics/fuzzing/README.md (Co-authored-by: Jon Meow)
* Update executable_semantics/fuzzing/README.md (Co-authored-by: Jon Meow)
* code review comments
* Update executable_semantics/syntax/BUILD (Co-authored-by: Jon Meow)
* buildifier

Co-authored-by: Jon Meow
1 parent db0de10
commit 22462a0
Showing 18 changed files with 497 additions and 20 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,89 @@ | ||
# Executable semantics structured fuzzer | ||
|
||
<!-- | ||
Part of the Carbon Language project, under the Apache License v2.0 with LLVM | ||
Exceptions. See /LICENSE for license information. | ||
SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
--> | ||
|
||
## Overview | ||
|
||
Fuzz testing is based on generating a large number of random inputs for a | ||
software component in order to trigger bugs and unexpected behavior. Basic | ||
fuzzing uses randomly generated arrays of bytes as inputs, which works great for | ||
some applications but is problematic for testing the logic that operates on | ||
highly structured data, as most random inputs are immediately rejected as | ||
invalid before any interesting parts of the code get a chance to run. | ||
|
||
Structured fuzzing addresses this issue by ensuring the randomly generated data | ||
is itself structured, and as such has a high chance of being a valid input. | ||
|
||
`executable_semantics_fuzzer` is a structured fuzzer based on | ||
[libprotobuf-mutator](https://github.com/google/libprotobuf-mutator), which is a | ||
library to randomly mutate | ||
[protocol buffers](https://github.com/protocolbuffers/protobuf). | ||
|
||
The input to the fuzzer is an instance of the `Carbon::Fuzzing::Carbon` proto | ||
randomly generated by the `libprotobuf-mutator` framework. | ||
`executable_semantics_fuzzer` converts the proto to a Carbon source code string, | ||
and tries to parse and execute the code using the `executable_semantics` | ||
implementation. | ||
|
||
## Fuzzer data format | ||
|
||
`libprotobuf-mutator` supports fuzzer inputs in either text or binary protocol | ||
buffer format. `executable_semantics_fuzzer` uses the text proto format, with | ||
the `Carbon` proto message defined in `common/fuzzing/carbon.proto`. | ||
|
||
## Running the fuzzer | ||
|
||
The fuzzer can be run in 'unit test' mode, where the fuzzer executes on each | ||
input file from the `fuzzer_corpus/` folder, or in 'fuzzing' mode, where the | ||
fuzzer will keep generating random inputs and executing the logic on them until | ||
a crash is triggered, or forever in a bug-free program ;). | ||
|
||
To run in 'unit test' mode: | ||
|
||
```bash | ||
bazel test --config=proto-fuzzer --test_output=all //executable_semantics/fuzzing:executable_semantics_fuzzer | ||
``` | ||
|
||
To run in 'fuzzing' mode: | ||
|
||
```bash | ||
bazel build --config=proto-fuzzer //executable_semantics/fuzzing:executable_semantics_fuzzer | ||
|
||
bazel-bin/executable_semantics/fuzzing/executable_semantics_fuzzer | ||
``` | ||
|
||
It's also possible to run the fuzzer on a single input: | ||
|
||
```bash | ||
bazel-bin/executable_semantics/fuzzing/executable_semantics_fuzzer /tmp/crash.textproto | ||
``` | ||
|
||
## Investigating a crash | ||
|
||
To reproduce a crash, run the fuzzer on the crashing input as described above. | ||
|
||
A separate tool called `fuzzverter` can be used for things like converting a | ||
crashing input to Carbon source code for running `executable_semantics` on the | ||
code directly. | ||
|
||
To convert a `Fuzzing::Carbon` text proto to Carbon source: | ||
|
||
```bash | ||
bazel-bin/executable_semantics/fuzzing/fuzzverter --mode proto_to_carbon --input /tmp/crash.textproto | ||
``` | ||
|
||
## Generating new fuzzer corpus entries | ||
|
||
The ability of the fuzzing framework to generate 'interesting' inputs can be | ||
improved by providing 'seed' inputs known as the fuzzer corpus. Each input needs | ||
to be a `Fuzzing::Carbon` text proto. | ||
|
||
To generate a text proto from Carbon source: | ||
|
||
```bash | ||
bazel-bin/executable_semantics/fuzzing/fuzzverter --mode carbon_to_proto --input /tmp/crash.carbon --output /tmp/crash.textproto | ||
``` |
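The contrast drawn in the README's Overview between byte-level and structured fuzzing can be illustrated with a toy example. The sketch below is not how `libprotobuf-mutator` works internally (it mutates protos rather than expanding a grammar); it only assumes a hypothetical "language" of balanced parentheses, with `IsBalanced` standing in for a parser. All names here are invented for illustration.

```cpp
#include <cstddef>
#include <random>
#include <string>

// Toy validity check standing in for a real parser (hypothetical): accepts
// only strings of balanced parentheses.
auto IsBalanced(const std::string& text) -> bool {
  int depth = 0;
  for (const char c : text) {
    if (c == '(') {
      ++depth;
    } else if (c == ')') {
      if (--depth < 0) {
        return false;
      }
    } else {
      return false;
    }
  }
  return depth == 0;
}

// Byte-level generation: every byte is drawn independently, so nearly all
// outputs are rejected by `IsBalanced` before any deeper logic would run.
auto RandomBytes(std::mt19937& rng, std::size_t length) -> std::string {
  std::uniform_int_distribution<int> byte(0, 255);
  std::string text;
  for (std::size_t i = 0; i < length; ++i) {
    text.push_back(static_cast<char>(byte(rng)));
  }
  return text;
}

// Structured generation: randomness only chooses between the grammar rules
// S -> "" and S -> "(" S ")" S, so every output is valid by construction.
auto RandomBalanced(std::mt19937& rng, int max_depth) -> std::string {
  std::bernoulli_distribution expand(0.6);
  if (max_depth == 0 || !expand(rng)) {
    return "";
  }
  const std::string inner = RandomBalanced(rng, max_depth - 1);
  const std::string rest = RandomBalanced(rng, max_depth - 1);
  return "(" + inner + ")" + rest;
}
```

Feeding `RandomBalanced` output to the validator always succeeds, whereas `RandomBytes` output almost never does, which is the motivation for generating a structured proto and converting it to Carbon source.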
42 changes: 42 additions & 0 deletions
executable_semantics/fuzzing/executable_semantics_fuzzer.cpp
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
// Part of the Carbon Language project, under the Apache License v2.0 with LLVM | ||
// Exceptions. See /LICENSE for license information. | ||
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
|
||
#include <google/protobuf/text_format.h> | ||
#include <libprotobuf_mutator/src/libfuzzer/libfuzzer_macro.h> | ||
|
||
#include "common/fuzzing/carbon.pb.h" | ||
#include "executable_semantics/fuzzing/fuzzer_util.h" | ||
#include "executable_semantics/interpreter/exec_program.h" | ||
#include "executable_semantics/syntax/parse.h" | ||
#include "executable_semantics/syntax/prelude.h" | ||
#include "llvm/Support/raw_ostream.h" | ||
|
||
namespace Carbon { | ||
|
||
// Parses and executes a fuzzer-generated program. | ||
void ParseAndExecute(const Fuzzing::CompilationUnit& compilation_unit) { | ||
const std::string source = ProtoToCarbonWithMain(compilation_unit); | ||
|
||
Arena arena; | ||
ErrorOr<AST> ast = ParseFromString(&arena, "Fuzzer.carbon", source, | ||
/*trace=*/false); | ||
if (!ast.ok()) { | ||
llvm::errs() << "Parsing failed: " << ast.error().message() << "\n"; | ||
return; | ||
} | ||
AddPrelude("executable_semantics/data/prelude.carbon", &arena, | ||
&ast->declarations); | ||
const ErrorOr<int> result = ExecProgram(&arena, *ast, /*trace=*/false); | ||
if (!result.ok()) { | ||
llvm::errs() << "Execution failed: " << result.error().message() << "\n"; | ||
return; | ||
} | ||
llvm::outs() << "Executed OK: " << *result << "\n"; | ||
} | ||
|
||
} // namespace Carbon | ||
|
||
DEFINE_TEXT_PROTO_FUZZER(const Carbon::Fuzzing::Carbon& input) { | ||
Carbon::ParseAndExecute(input.compilation_unit()); | ||
} |
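For readers unfamiliar with libFuzzer, the following sketch shows the general shape of a fuzz target that a macro such as `DEFINE_TEXT_PROTO_FUZZER` builds on: libFuzzer repeatedly calls `LLVMFuzzerTestOneInput` with a byte buffer, the harness tries to parse it into a structured value, and only successfully parsed inputs reach the code under test. This is a simplified illustration, not the macro's actual expansion. `ToyInput`, `ParseToyInput`, `RunToyInput`, and the `name: "..."` format are hypothetical stand-ins for the proto message, protobuf's text parser, and `ParseAndExecute`.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical structured input; stands in for `Carbon::Fuzzing::Carbon`.
struct ToyInput {
  std::string function_name;
};

// Toy text parser (an assumption, not protobuf's TextFormat): accepts inputs
// of the form `name: "<text>"` and rejects everything else.
auto ParseToyInput(const std::string& text) -> std::optional<ToyInput> {
  const std::string prefix = "name: \"";
  if (text.size() < prefix.size() + 1 ||
      text.compare(0, prefix.size(), prefix) != 0 || text.back() != '"') {
    return std::nullopt;
  }
  return ToyInput{text.substr(prefix.size(), text.size() - prefix.size() - 1)};
}

// Stand-in for the code under test; counts the inputs that reached it.
int g_executed_inputs = 0;

void RunToyInput(const ToyInput& input) {
  if (!input.function_name.empty()) {
    ++g_executed_inputs;
  }
}

// libFuzzer entry point: called once per generated or corpus input.
// Unparseable inputs are skipped; returning 0 is the conventional result.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  const std::string text(reinterpret_cast<const char*>(data), size);
  if (const std::optional<ToyInput> input = ParseToyInput(text)) {
    RunToyInput(*input);
  }
  return 0;
}
```

In the real fuzzer the parse step rarely fails, because `libprotobuf-mutator` mutates the proto itself rather than raw bytes; the toy parser above only makes the "parse, then execute" flow visible.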
Empty file.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
// Part of the Carbon Language project, under the Apache License v2.0 with LLVM | ||
// Exceptions. See /LICENSE for license information. | ||
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
|
||
#include "executable_semantics/fuzzing/fuzzer_util.h" | ||
|
||
#include <algorithm> | ||
|
||
#include "common/check.h" | ||
#include "common/fuzzing/proto_to_carbon.h" | ||
|
||
namespace Carbon { | ||
|
||
// Appended to fuzzer-generated Carbon source when the source is missing a | ||
// `Main()` definition, to prevent an early error return in semantic analysis. | ||
static constexpr char EmptyMain[] = R"( | ||
fn Main() -> i32 { | ||
return 0; | ||
} | ||
)"; | ||
|
||
auto ProtoToCarbonWithMain(const Fuzzing::CompilationUnit& compilation_unit) | ||
-> std::string { | ||
const bool has_main = std::any_of( | ||
compilation_unit.declarations().begin(), | ||
compilation_unit.declarations().end(), | ||
[](const Fuzzing::Declaration& decl) { | ||
return decl.kind_case() == Fuzzing::Declaration::kFunction && | ||
decl.function().name() == "Main"; | ||
}); | ||
return Carbon::ProtoToCarbon(compilation_unit) + (has_main ? "" : EmptyMain); | ||
} | ||
|
||
} // namespace Carbon |
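The "append `Main()` only when missing" logic of `ProtoToCarbonWithMain` can be tried out without the generated proto classes. The sketch below is a standalone approximation under stated assumptions: `ToyDeclaration` and `ToySourceWithMain` are hypothetical stand-ins for `Fuzzing::Declaration` and the real function, and the `source` field pretends to be the output of the proto-to-Carbon conversion.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for a generated declaration message; the real code
// inspects `Fuzzing::Declaration` from `carbon.pb.h`.
struct ToyDeclaration {
  bool is_function;
  std::string name;
  std::string source;  // Pretend result of converting this declaration.
};

// Fallback entry point, mirroring `EmptyMain` in the diff.
constexpr char kEmptyMain[] = R"(
fn Main() -> i32 {
  return 0;
}
)";

// Concatenates the declarations' source and appends `kEmptyMain` only when no
// function named "Main" is present.
auto ToySourceWithMain(const std::vector<ToyDeclaration>& declarations)
    -> std::string {
  const bool has_main =
      std::any_of(declarations.begin(), declarations.end(),
                  [](const ToyDeclaration& decl) {
                    return decl.is_function && decl.name == "Main";
                  });
  std::string source;
  for (const ToyDeclaration& decl : declarations) {
    source += decl.source;
  }
  if (!has_main) {
    source += kEmptyMain;
  }
  return source;
}
```

Note that a non-function declaration named `Main` does not count as an entry point, which matches the `kind_case()` check in the original predicate.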
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
// Part of the Carbon Language project, under the Apache License v2.0 with LLVM | ||
// Exceptions. See /LICENSE for license information. | ||
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
|
||
#ifndef EXECUTABLE_SEMANTICS_FUZZING_FUZZER_UTIL_H_ | ||
#define EXECUTABLE_SEMANTICS_FUZZING_FUZZER_UTIL_H_ | ||
|
||
#include "common/fuzzing/carbon.pb.h" | ||
|
||
namespace Carbon { | ||
|
||
// Converts `compilation_unit` to Carbon source. Adds a default `Main()` | ||
// definition if one is not present in the proto. | ||
auto ProtoToCarbonWithMain(const Fuzzing::CompilationUnit& compilation_unit) | ||
-> std::string; | ||
|
||
} // namespace Carbon | ||
|
||
#endif // EXECUTABLE_SEMANTICS_FUZZING_FUZZER_UTIL_H_ |