Database-centric architecture for communication, persistence, and autoscaling #3
I'm taking a look at this. What are the details of the db interactions? (1) Each worker stores metadata in the database specified by the test settings, with key (2) Each worker stores metadata in a completely separate hypofuzz db, for-now hardcoded at Or is it (3) a secret third thing? Either way, dashboard will have to be told all keys
We want to be able to host the dashboard on a separate server to the fuzzing workers, so it'll need to be the database specified by the test settings. No multiplexing though; we can have a As you say, we'll use
The Status Quo
This is going to be a substantial architectural overhaul, so let's start with how things currently work. A HypoFuzz run has three basic parts:
In the current design, this is fundamentally a run-it-on-one-box kind of system. The tests are divided up between workers at startup time (or maybe run on every worker concurrently; the workers are fine with this, though the dashboard isn't). The workers can reload the previous examples, but everything else behaves as if it were the first run ever, at some cost to efficiency and to the clarity of the statistics.
Goal: support a system where workers can come and go, for example to soak up idle CPU time as a low-priority autoscaling group on a cluster, and the fuzzing system overall keeps humming along.
Solution: lean on the database
If our problem is that information is neither persisted nor well distributed, let's solve that with the Hypothesis database! This is a very simple key-value store where keys are bytestrings and values are sets of bytestrings, with create/read/delete operations. The most common implementation is on the user's local filesystem, but there's also a Redis backend and it's trivial to write more.
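The data model described above can be sketched in a few lines. This is a toy in-memory stand-in, not the Hypothesis implementation: the class name `InMemoryStore` is made up for illustration, and only the `save` / `fetch` / `delete` shape (bytestring keys, sets of bytestring values) is taken from the description.

```python
from collections import defaultdict
from typing import Iterable


class InMemoryStore:
    """Toy key-value store: each bytes key maps to a set of bytes values."""

    def __init__(self) -> None:
        self._data: dict[bytes, set[bytes]] = defaultdict(set)

    def save(self, key: bytes, value: bytes) -> None:
        # "Create": adding the same value twice is a no-op, because values form a set.
        self._data[key].add(value)

    def fetch(self, key: bytes) -> Iterable[bytes]:
        # "Read": return a copy so callers cannot mutate the stored set.
        return set(self._data.get(key, ()))

    def delete(self, key: bytes, value: bytes) -> None:
        # "Delete": removing a missing value is silently ignored.
        self._data.get(key, set()).discard(value)


db = InMemoryStore()
db.save(b"test_key", b"example-1")
db.save(b"test_key", b"example-1")  # duplicate, deduplicated by the set
db.save(b"test_key", b"example-2")
before_delete = set(db.fetch(b"test_key"))
db.delete(b"test_key", b"example-1")
after_delete = set(db.fetch(b"test_key"))
```

Because the interface is this small, a filesystem, Redis, or other backend only has to provide the same three operations.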
What problems does this solve, and create?
... MultiplexedDatabase) and keep going
... git; other VCS systems can be supported as demand arises.
Action Items
The MVP is to ditch HTTP and communicate all state through the database.
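As a rough illustration of what "all state through the database" could look like, the sketch below has workers publish small JSON reports under a per-test key, and a dashboard-side function read them back. Everything here is an assumption made for the example: the key layout (`sketch.reports.<test id>`), the report fields (`worker`, `ninputs`, `branches`), and the `DictStore` stand-in are hypothetical and not part of HypoFuzz or Hypothesis.

```python
import json
from collections import defaultdict


class DictStore:
    """Stand-in store with bytes keys and sets of bytes values (not a real backend)."""

    def __init__(self) -> None:
        self._data: dict[bytes, set[bytes]] = defaultdict(set)

    def save(self, key: bytes, value: bytes) -> None:
        self._data[key].add(value)

    def fetch(self, key: bytes) -> set[bytes]:
        return set(self._data.get(key, ()))


def reports_key(test_id: str) -> bytes:
    # Hypothetical key layout, chosen only for this sketch.
    return b"sketch.reports." + test_id.encode()


def publish_report(db, test_id: str, worker_id: str, ninputs: int, branches: int) -> None:
    """Worker side: serialise a progress report and save it under the test's key."""
    payload = json.dumps(
        {"worker": worker_id, "ninputs": ninputs, "branches": branches},
        sort_keys=True,
    )
    db.save(reports_key(test_id), payload.encode())


def load_reports(db, test_id: str) -> list[dict]:
    """Dashboard side: read every report for a test, ordered by worker and progress."""
    reports = [json.loads(value) for value in db.fetch(reports_key(test_id))]
    return sorted(reports, key=lambda r: (r["worker"], r["ninputs"]))


db = DictStore()
publish_report(db, "test_a", "worker-1", ninputs=100, branches=12)
publish_report(db, "test_a", "worker-2", ninputs=50, branches=9)
reports = load_reports(db, "test_a")
```

With this shape the dashboard needs only read access to the same database as the workers, so it can run on a separate machine and workers can join or leave without any direct connection to it.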
Better dashboard means we can get a little fancier about what we're displaying (mostly to keep these ideas out of the MVP):
The full version is going to be an ongoing project. Once we get here, I'll aim to close this and split out more specific issues.