Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SGX loader is not statically linked #1261

Open
mythi opened this issue Mar 31, 2023 · 13 comments
Open

SGX loader is not statically linked #1261

mythi opened this issue Mar 31, 2023 · 13 comments
Assignees

Comments

@mythi
Copy link
Contributor

mythi commented Mar 31, 2023

Description of the problem

v1.4 release notes noted that all SGX tools are now statically linked but looks like loader is not.

---- note by @mkow ---
This turned out to be a serious issue, see my comment below: #1261 (comment).
----

Steps to reproduce

$ dpkg -L gramine | grep loader | xargs ldd
/usr/lib/x86_64-linux-gnu/gramine/sgx/loader:
	linux-vdso.so.1 (0x00007ffe6fe74000)
	libprotobuf-c.so.1 => /lib/x86_64-linux-gnu/libprotobuf-c.so.1 (0x00007fa9b2149000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fa9b1f68000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fa9b219c000)
/usr/lib/x86_64-linux-gnu/gramine/direct/loader:
	statically linked

Expected results

/usr/lib/x86_64-linux-gnu/gramine/sgx/loader to be statically linked

Actual results

No response

Gramine commit hash

988e6b8

@dimakuv
Copy link

dimakuv commented Mar 31, 2023

all SGX tools are now statically linked

By tools, we meant shared libraries that can be used inside Gramine (for better/smaller TCB inside SGX enclave).

looks like loader is not

That's right. This is because of the dependency on libprotobuf-c. This dependency is required to talk to the AESM daemon (to do several tasks, but mainly to get the SGX quote).

Why is that a problem? Could you suggest a solution?

UPDATE: Actually, looks like we could statically link libprotobuf-c into Gramine. Maybe we should do it? (I looked at https://packages.debian.org/sid/amd64/libprotobuf-c-dev/filelist, I can see .a version there.)

@mythi
Copy link
Contributor Author

mythi commented Mar 31, 2023

This dependency is required to talk to the AESM daemon (to do several tasks, but mainly to get the SGX quote).

Is this needed for non-EPID? IIRC this launch sequence got changed after we dropped the .token thing for DCAP so perhaps non-EPID can go without the protobuf completely?

@dimakuv
Copy link

dimakuv commented Mar 31, 2023

We always need to get the SGX Quote, no matter EPID or DCAP attestation.

And the SGX Quote is generated by the Quoting Enclave (QE), with which Gramine needs to communicate (send the request "Please produce the SGX quote for this enclave" and receive the Quote back).

For this communication with QE, we use AESM daemon. There is another way, IIRC, but it's linking against some QE-wrapping shared library or something like this, which has its own problems.

@dimakuv
Copy link

dimakuv commented Mar 31, 2023

So a quick experiment gives me this:

  • I statically link libprotobuf-c, that's easy:
$ git diff
diff --git a/meson.build b/meson.build
index 6f87cf47..01d5a324 100644
--- a/meson.build
+++ b/meson.build
@@ -254,7 +254,7 @@ if sgx

     cjson_dep = cjson_proj.get_variable('cjson_dep')

-    protobuf_dep = dependency('libprotobuf-c')
+    protobuf_dep = dependency('libprotobuf-c', static: true)

     if dcap
         sgx_dcap_quoteverify_dep = cc.find_library('sgx_dcap_quoteverify')
  • Let's check what the SGX loader depends on now:
~/gramineproject/gramine/built-debug/lib/x86_64-linux-gnu/gramine/sgx$ ldd loader

        linux-vdso.so.1 (0x00007ffdb2b84000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fb9ff687000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fb9ff8df000)
  • Could we get rid of libc.so (by specifying -nostdlib GCC flag)? Doesn't look like so:
~/gramineproject/gramine/built-debug/lib/x86_64-linux-gnu/gramine/sgx$ nm --undefined-only loader

                 U __assert_fail@@GLIBC_2.2.5
                 w __cxa_finalize@@GLIBC_2.2.5
                 U __environ@@GLIBC_2.2.5
                 U environ@@GLIBC_2.2.5
                 U free@@GLIBC_2.2.5
                 U __libc_start_main@@GLIBC_2.2.5
                 U malloc@@GLIBC_2.2.5
                 U realpath@@GLIBC_2.3

So the SGX loader is still using environment variables, asserts, malloc/free and even realpath from Glibc. I think these all could be gotten rid of, though I still don't see what would be the reason for this?

@mythi
Copy link
Contributor Author

mythi commented Mar 31, 2023

though I still don't see what would be the reason for this?

I was hoping to be able to run Gramine without depending on any of the distro specific libraries.

@dimakuv
Copy link

dimakuv commented Mar 31, 2023

I was hoping to be able to run Gramine without depending on any of the distro specific libraries.

Not even on Glibc?

@mythi
Copy link
Contributor Author

mythi commented Mar 31, 2023

The result after

-    protobuf_dep = dependency('libprotobuf-c')
+    protobuf_dep = dependency('libprotobuf-c', static: true)

is not quite enough but I now realize my original ask may have issues. Let me think about it.

@woju
Copy link
Member

woju commented Mar 31, 2023

I was hoping to be able to run Gramine without depending on any of the distro specific libraries.

Beware what you're hoping for. If we really linked everything statically, then if there's a security vulnerability in any of our dependencies, the only option for us is to do a release, probably incompatible one (for example, 1.5 will be a breaking change because of #1256). Static linking increases amount of breakage in the ecosystem. If they're dynamically linked, then you can just update the dependency and leave Gramine alone.

@mythi
Copy link
Contributor Author

mythi commented Apr 3, 2023

I was hoping to be able to run Gramine without depending on any of the distro specific libraries.

Beware what you're hoping for. If we really linked everything statically, then if there's a security vulnerability in any of our dependencies, the only option for us is to do a release, probably incompatible one (for example, 1.5 will be a breaking change

That's a valid comment. Making this a configure/build time option for users would probably work for starters (it's different from what I originally reporter but I can update the issue description).

Do you see any blockers with this approach?

@woju
Copy link
Member

woju commented Apr 3, 2023 via email

@dimakuv dimakuv added feature request P: 3 needs discussion Needs to be discussed on the maintainers call labels Sep 25, 2024
@mkow
Copy link
Member

mkow commented Oct 14, 2024

It turns out this is a much more serious issue than we thought - glibc silently assumes that all the threads created in a process were spawned using pthread_create, which is not true in our case, as we spawn untrusted PAL threads using clone(). One example problem this causes is that glibc tracks whether the app is single threaded (https://github.com/bminor/glibc/blob/glibc-2.31/nptl/libc_multiple_threads.c#L26C5-L26C28) and if it is, it skips all the locking (example: https://github.com/bminor/glibc/blob/glibc-2.31/malloc/malloc.c#L3224-L3233), leading to races in e.g. malloc()/free() if clone-spawned threads use these functions.

Random notes:

  • Reproduced on Ubuntu 20.04, glibc 2.31 and also on glibc 2.40 (most recent as of writing this).
  • Note that it doesn't matter whether we use glibc's clone wrapper or an inline syscall - both ways don't work.
  • It's not enough to just set __libc_multiple_threads to 1, there seem to be more issues (see the workaround below - it reduced the number of crashes significantly, but didn't reduce them to zero). I suspect this is about tcaches.
  • This is not a security issue, as it's in the untrusted PAL, but it's still a significant reliability issue for multithreaded apps which issue a lot of OCALLs concurrently.

Possible solutions:

  • Get rid of the glibc dependency. We'd need to implement a few (~10) libc functions ourselves to satisfy protobuf's requirements.
  • Change clone() to pthread_create - IMO not a good solution, there are more issues with glibc, e.g. that we require a carefully crafted memory layout and don't want glibc to do stuff behind our backs.

See also:

@mkow mkow added bug P: 0 and removed feature request needs discussion Needs to be discussed on the maintainers call P: 3 labels Oct 14, 2024
@mkow mkow self-assigned this Nov 5, 2024
@kailun-qin kailun-qin moved this to Working on it in Gramine Roadmap Dec 19, 2024
@kailun-qin kailun-qin moved this from Working on it to Coming in next release (v1.9) in Gramine Roadmap Dec 19, 2024
@benldrmn
Copy link

Is this issue being worked on?

CC @mkow

@mkow
Copy link
Member

mkow commented Jan 12, 2025

Yes, we plan to have it for the next release (in ~2 months), but may land it on master much earlier, we'll see.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Coming in next release (v1.9)
Development

No branches or pull requests

5 participants