boringssl --> OpenSSL 3.2 #59870
One of the reasons for going to BoringSSL was the performance speedup (30x) for encryption functions: #11844 (comment)
The azure submodule worked with boringssl only by chance, and the 1.8 upgrade no longer builds at all with boringssl. That's a blocker. I assume that the performance measurements were done against OpenSSL 1.x; perhaps runtimes improved in the meantime.
Status update:
Not true. Encryption works and produces the right result; decryption throws an error with a generic error message
Aforementioned manpage states in section
Calling |
I am pretty sure that the error handling in our existing code calling OpenSSL is wrong (this leads to different actual and expected error codes for a negative test in integration test On the other hand, doing
Two more integration tests with wrong expected error codes are fixed; now trying to restore the non-x86 builds.
For ARM, I tried the cross-compiling instructions mentioned in #43991 (comment). I'll also try on my EC2 ARM instance.
ARM builds are fixed now, checking now why integration test |
Need to work on other stuff with a deadline until Friday. |
PPC builds should work now. Tried to fix RISC-V builds locally but when I try to cross-compile ClickHouse, I run into:
plus the build is ultra slow locally. I also tried to fix the s390x builds, but my attempt to cross-compile locally (without the boring-to-openssl transition) failed with weird linker errors. Additionally, a standalone OpenSSL (which is needed to generate platform-specific files) refuses to compile on LE systems (like x86) for BE systems (s390x). Luckily, it generated at least a few critical files; the other ones (which seem very similar across platforms) I could copy from other platforms.
EDIT: The RISC-V build should be working now. s390x can hopefully be fixed by looking at CI output.
Yes, RISC-V works. Darwin ARM builds now compile and link successfully locally (with actual
which does not look related but I'll make a
I also generated similar platform files for darwin-x86_64 and there is a chance it works, but I did not test locally yet (cross-compiling for Mac is troublesome locally).
EDIT: The same error appears for standard
EDIT EDIT: We also have compat builds for x86_64 (no SSE3 or higher) and ARM (v8.0). Both builds are green, but since quite some assembly is used (see build descriptions), there is a chance that either of them will nevertheless fail at runtime with ILLEGAL INSTRUCTION because the assembly contains more recent instructions. Unfortunately, this is difficult for me to test: I don't have access to such ancient hardware, and disabling specific SIMD instruction sets on AWS EC2 instances does not seem possible.
Darwin builds are now really green. Regarding compat check:
Regarding s390x (here):
Will continue working on this PR once #59516 is merged. |
Temporarily disabled s390/x. |
@yakov-olkhovskiy is the one that hardcoded mold |
I guess you mean the mold linker was hardcoded here: https://github.com/ClickHouse/ClickHouse/pull/59870/files#diff-cc10102f8b1db393176bb4ef3536718e0d91ca8b6c086e088953b0d951732c9bR82 |
Linux s390x build was added here: |
Fixed now (--> fb00fa6). Interestingly, the same problem with old glibc-s appeared already earlier: openssl/openssl#22036 |
the rest LGTM. Awesome work!
@@ -1,5 +1,5 @@
 #!/usr/bin/env bash
-# Tags: deadlock
+# Tags: deadlock, no-tsan
It seems we have suppressed tsan issues here: https://github.com/ClickHouse/ClickHouse/pull/59870/files#diff-3cdbc9c2a0c3de2ee8e6fd929808e956ed2b0a27d183a9040a89863da09d272fR333 Why do we tag no-tsan in the tests?
I remember I could not get this test and 01393_benchmark_secure_port.sh to run locally. I just saw that they consistently failed in tsan builds. The log files from CI were also not helpful ... so I disabled them for now :-( Not great, but I'll check as a follow up.
01c55e6
FYI, I'm seeing tons of errors when building from scratch:
Verbose:
Do we need a specific version of Perl? Do we even need to generate those files instead of pushing them directly to the submodule?
It seems it needs CC to be declared externally:
I guess cmake should take care of that |
Has anybody tried to build the latest with
|
Hm, indeed, this reproduces on x86 with a minimal CMake invocation: |
Regarding the CH build issue with OpenSSL dynamic linking, a PR has been raised: 62888
…e sentry)

The problem is that OpenSSL registers OPENSSL_cleanup() as an atexit handler, which is called before SentryWriter is destroyed. To avoid this problem, let's destroy it explicitly.

<details>
<summary>stack trace example</summary>

Thread 2 (Thread 0x7ffff54006c0 (LWP 24847) "clickhouse-serv"):
0 ___pthread_rwlock_rdlock (rwlock=0x0) at pthread_rwlock_rdlock.c:26
1 0x00000000164c18a9 in CRYPTO_THREAD_read_lock (lock=0x0) at threads_pthread.c:93
2 0x000000001642e6b9 in int_err_get_item (d=0x7ffff53f74e0) at err.c:192
...
7 ossl_connect_common (cf=0x7ffff7812c80, data=0x7ffff70a4c00, nonblocking=bool_true, done=0x7ffff53f834c) at openssl.c:4486
...
17 curl_easy_perform (data=data@entry=0x7ffff70a4c00) at easy.c:787
18 0x000000000b4c3854 in sentry__curl_send_task (_envelope=<optimized out>, _state=0x7ffff7074300) at sentry_transport_curl.c:225
19 0x000000000b4ba880 in worker_thread (data=0x7ffff70e5500) at sentry_sync.c:262

Thread 1 (Thread 0x7ffff7cb2c80 (LWP 24842) "clickhouse-serv"):
5 0x000000000b4bb0e2 in sentry__cond_wait_timeout (cv=0x7ffff70e5540, mutex=0x7ffff70e5570, msecs=250) at sentry_sync.h:332
6 sentry__bgworker_shutdown (bgw=0x7ffff70e5500, timeout=2000) at sentry_sync.c:412
7 0x000000000b4b3e95 in sentry_close () at sentry_core.c:238
8 0x000000000b4a5f1f in SentryWriter::~SentryWriter (this=0x7ffff71a1240) at SentryWriter.cpp:147
9 std::__1::default_delete<SentryWriter>::operator()[abi:v15000](SentryWriter*) const (this=0x7ffff70e5568, __ptr=0x7ffff71a1240) at unique_ptr.h:48
10 std::__1::unique_ptr<SentryWriter, std::__1::default_delete<SentryWriter> >::reset[abi:v15000](SentryWriter*) (this=0x7ffff70e5568, __p=0x0) at unique_ptr.h:305
11 std::__1::unique_ptr<SentryWriter, std::__1::default_delete<SentryWriter> >::~unique_ptr[abi:v15000]() (this=0x7ffff70e5568) at unique_ptr.h:259
12 0x00007ffff7de62e6 in __run_exit_handlers (status=0, listp=<optimized out>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:108
13 0x00007ffff7de642e in __GI_exit (status=<optimized out>) at exit.c:138
14 0x00007ffff7dccd51 in __libc_start_call_main (main=main@entry=0x6111c20 <main(int, char**)>, argc=argc@entry=13, argv=argv@entry=0x7fffffffb718) at libc_start_call_main.h:74
15 0x00007ffff7dcce0c in __libc_start_main_impl (main=0x6111c20 <main(int, char**)>, argc=13, argv=0x7fffffffb718, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffb708) at libc-start.c:360

(gdb) p req.body
$7 = 0x7ffff7816000 "{\"dsn\":\"...\"}\n{\"type\":\"session\",\"length\":190}\n{\"init\":true,\"sid\":\"...\",\"status\":\"exited\",\"errors\":0,\"started\":\"2024-05-08T20:29:23.253Z\",\"duration\":17.213,\"attrs\":{\"release\":\"24.5\",\"environment\":\"test\"}}"
</details>

P.S. This likely started happening after the conversion to OpenSSL (ClickHouse#59870).

Signed-off-by: Azat Khuzhin <[email protected]>
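The shutdown-order problem from the commit message above can be modeled in a few lines of C++. This is only a sketch, not the actual ClickHouse or OpenSSL code: `FakeTls`, `FakeWriter`, `ExitList` and `simulate_shutdown` are hypothetical names, and `ExitList` is an explicit model of the process exit list rather than a call to the real `std::atexit`. It relies on one standard guarantee: atexit handlers and static destructors run in reverse order of registration, so a cleanup handler registered late (on lazy library initialization) runs before the destructor of an object created earlier.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a library with global state (OpenSSL in the commit message).
struct FakeTls
{
    bool alive = true;
    void cleanup() { alive = false; }
};

// Hypothetical stand-in for SentryWriter: its destructor still needs the TLS library.
struct FakeWriter
{
    FakeWriter(FakeTls & tls_, std::vector<std::string> & log_) : tls(tls_), log(log_) {}
    ~FakeWriter() { log.push_back(tls.alive ? "flush ok" : "flush after TLS cleanup"); }

    FakeTls & tls;
    std::vector<std::string> & log;
};

// Explicit model of the exit list: like atexit handlers and static destructors,
// entries run in reverse order of registration.
struct ExitList
{
    std::vector<std::function<void()>> handlers;

    void add(std::function<void()> handler) { handlers.push_back(std::move(handler)); }

    void run()
    {
        for (auto it = handlers.rbegin(); it != handlers.rend(); ++it)
            (*it)();
    }
};

std::vector<std::string> simulate_shutdown(bool destroy_writer_explicitly)
{
    std::vector<std::string> log;
    FakeTls tls;
    ExitList exit_list;

    // 1. The writer is created first, so its destructor is registered first.
    auto writer = std::make_unique<FakeWriter>(tls, log);
    exit_list.add([&] { writer.reset(); });

    // 2. The TLS library is initialized lazily, so its cleanup is registered later
    //    and therefore runs earlier at exit.
    exit_list.add([&] { tls.cleanup(); log.push_back("tls cleanup"); });

    // 3. The fix from the commit message: destroy the writer before exit handlers run.
    if (destroy_writer_explicitly)
        writer.reset();

    exit_list.run();
    assert(!tls.alive);
    return log;
}
```

Calling `simulate_shutdown(false)` records the TLS cleanup before the writer's flush, which corresponds to the crash in the stack trace (the worker thread uses a lock that has already been freed). With `simulate_shutdown(true)` the flush happens while the library is still alive, and the later `reset()` on the already-empty pointer is a no-op.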
Like #56398, but OpenSSL 3.2. Made a new PR so that it doesn't become too messy.
Background:
messconstraint, i.e. all data stored with AES_128_GCM_SIV and AES_256_GCM_SIV codecs would no longer be decrypt-able.
Details:
ENABLE_OPENSSL) via a submodule or dynamically (ENABLE_OPENSSL AND ENABLE_OPENSSL_DYNAMIC) against the OS-provided OpenSSL, usually a FIPS-certified one.
Changelog category (leave one):