Skip to content

Commit

Permalink
PVF worker: Add seccomp restrictions (block networking)
Browse files Browse the repository at this point in the history
We're already working on sandboxing by [blocking all unneeded syscalls](#882). However, due to the wide scope it will take a while longer. This PR starts with a much smaller scope, only blocking network-related syscalls until the above is ready.

For security we block the following with `seccomp`:

- creation of new sockets - these are unneeded in PVF jobs, and we can safely block them without affecting consensus.

- `io_uring` - as discussed [here](paritytech/polkadot#7334 (comment)), io_uring allows for networking and needs to be blocked. See below for a discussion on the safety of doing this.

- `connect`ing to sockets - the above two points are enough for networking and is what birdcage does (or [used to do](phylum-dev/birdcage#47)) to restrict networking. However, it is possible to [connect to abstract unix sockets](https://lore.kernel.org/landlock/[email protected]/T/#u) to do some kinds of sandbox escapes, so we also block the `connect` syscall.

(Intentionally left out of implementer's guide because it felt like too much detail.)

`io_uring` is just a way of issuing system calls in an async manner, and there is nothing stopping wasmtime from legitimately using it. Fortunately, at the moment it does not. Generally, not many applications use `io_uring` in production yet, because of the numerous kernel CVEs discovered. It's still under a lot of development. Android outright banned `io_uring` for these reasons.

Considering `io_uring`'s status, and that it very likely would get detected either by our [recently-added static analysis](#1663) or by testing, I think it is fairly safe to block it.

If execution hits an edge case code path unique to a given machine, it's already taken a non-deterministic branch anyway. After all, we just care that the majority of validators reach the same result and preserve consensus. So worst-case scenario, there's a dispute, and we can always admit fault and refund the wrong validator. On the other hand, if all validators take the code path that results in a seccomp violation, then they would all vote against the current candidate, which is also fine. The violation would get logged (in big scary letters) and hopefully some validator reports it to us.

Actually, a worst-worse-case scenario is that 50% of validators vote against, so that there is no consensus. But so many things would have to go wrong for that to happen:

1. An update to `wasmtime` is introduced that uses io_uring (unlikely as io_uring is mainly for IO-heavy applications)
2. The new syscall is not detected by our static analysis
3. It is never triggered in any of our tests
4. It then gets triggered on some super edge case in production on 50% of validators causing a stall (bad but very unlikely)
5. Or, it triggers on only a few validators causing a dispute (more likely but not as bad?)

Considering how many things would have to go wrong here, we believe it's safe to block `io_uring`.

Closes #619
Original PR in Polkadot repo: paritytech/polkadot#7334
  • Loading branch information
mrcnski committed Oct 24, 2023
1 parent 39c04fd commit 5e100f6
Show file tree
Hide file tree
Showing 14 changed files with 875 additions and 545 deletions.
11 changes: 11 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 2 additions & 1 deletion polkadot/node/core/candidate-validation/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -149,7 +149,8 @@ async fn run<Context>(
exec_worker_path,
),
pvf_metrics,
);
)
.await;
ctx.spawn_blocking("pvf-validation-host", task.boxed())?;

loop {
Expand Down
2 changes: 2 additions & 0 deletions polkadot/node/core/pvf/common/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,8 @@ sp-tracing = { path = "../../../../../substrate/primitives/tracing" }

[target.'cfg(target_os = "linux")'.dependencies]
landlock = "0.3.0"
seccompiler = "0.3.0"
thiserror = "1.0.31"

[dev-dependencies]
assert_matches = "1.4.0"
Expand Down
2 changes: 2 additions & 0 deletions polkadot/node/core/pvf/common/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -50,6 +50,8 @@ pub mod tests {
pub struct SecurityStatus {
/// Whether the landlock features we use are fully available on this system.
pub can_enable_landlock: bool,
/// Whether the seccomp features we use are fully available on this system.
pub can_enable_seccomp: bool,
// Whether we are able to unshare the user namespace and change the filesystem root.
pub can_unshare_user_namespace_and_change_root: bool,
}
Expand Down
30 changes: 28 additions & 2 deletions polkadot/node/core/pvf/common/src/worker/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -75,6 +75,13 @@ macro_rules! decl_worker_main {
let status = -1;
std::process::exit(status)
},
"--check-can-enable-seccomp" => {
#[cfg(target_os = "linux")]
let status = if security::seccomp::check_is_fully_enabled() { 0 } else { -1 };
#[cfg(not(target_os = "linux"))]
let status = -1;
std::process::exit(status)
},
"--check-can-unshare-user-namespace-and-change-root" => {
#[cfg(target_os = "linux")]
let status = if let Err(err) = security::unshare_user_namespace_and_change_root(
Expand Down Expand Up @@ -119,6 +126,7 @@ macro_rules! decl_worker_main {
let mut worker_dir_path = None;
let mut node_version = None;
let mut can_enable_landlock = false;
let mut can_enable_seccomp = false;
let mut can_unshare_user_namespace_and_change_root = false;

let mut i = 2;
Expand All @@ -137,6 +145,7 @@ macro_rules! decl_worker_main {
i += 1
},
"--can-enable-landlock" => can_enable_landlock = true,
"--can-enable-seccomp" => can_enable_seccomp = true,
"--can-unshare-user-namespace-and-change-root" =>
can_unshare_user_namespace_and_change_root = true,
arg => panic!("Unexpected argument found: {}", arg),
Expand All @@ -151,6 +160,7 @@ macro_rules! decl_worker_main {
let worker_dir_path = std::path::Path::new(worker_dir_path).to_owned();
let security_status = $crate::SecurityStatus {
can_enable_landlock,
can_enable_seccomp,
can_unshare_user_namespace_and_change_root,
};

Expand Down Expand Up @@ -307,6 +317,22 @@ pub fn worker_event_loop<F, Fut>(
let landlock_status =
security::landlock::enable_for_worker(worker_kind, worker_pid, &worker_dir_path);
if !matches!(landlock_status, Ok(landlock::RulesetStatus::FullyEnforced)) {
// We previously were able to enable, so this should never happen.
gum::error!(
target: LOG_TARGET,
%worker_kind,
%worker_pid,
"could not fully enable landlock: {:?}. This should not happen, please report to the Polkadot devs",
landlock_status
);
}
}

#[cfg(target_os = "linux")]
if security_status.can_enable_seccomp {
let seccomp_status =
security::seccomp::enable_for_worker(worker_kind, worker_pid, &worker_dir_path);
if !matches!(seccomp_status, Ok(())) {
// We previously were able to enable, so this should never happen.
//
// TODO: Make this a real error in secure-mode. See:
Expand All @@ -315,8 +341,8 @@ pub fn worker_event_loop<F, Fut>(
target: LOG_TARGET,
%worker_kind,
%worker_pid,
"could not fully enable landlock: {:?}. This should not happen, please report to the Polkadot devs",
landlock_status
"could not fully enable seccomp: {:?}. This should not happen, please report to the Polkadot devs",
seccomp_status
);
}
}
Expand Down
Loading

0 comments on commit 5e100f6

Please sign in to comment.