Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segfault in check_session_buf_not_used #449

Closed
patcable opened this issue Nov 10, 2023 · 15 comments
Closed

Segfault in check_session_buf_not_used #449

patcable opened this issue Nov 10, 2023 · 15 comments
Assignees
Labels

Comments

@patcable
Copy link

Likely related to the use-after-free fix (#400, #417)

Describe the bug
I'm getting an occasional segfault with my OpenVPN server. There's nothing obvious in the logs that's setting this off yet but I'm now running the OpenVPN server with --verb 4 to see if I get any extra information before it happens.

To Reproduce
Not clear yet.

Expected behavior
The OpenVPN process doesn't segfault.

Version information (please complete the following information):

  • OS: Ubuntu 20.04
  • OpenVPN version: 2.6.7

Additional context

#0  0x0000aaaad95de984 in check_session_buf_not_used (to_link=0xaaab10aa4c40, to_link=0xaaab10aa4c40, session=<optimized out>) at ssl.c:3195
#1  tls_multi_process (multi=0xaaab10989c80, to_link=to_link@entry=0xaaab10aa4c40, to_link_addr=to_link_addr@entry=0xaaab10aa4958, to_link_socket_info=0xaaab10a87dd8, wakeup=wakeup@entry=0xffffd3a6c340) at ssl.c:3312
#2  0x0000aaaad957f3b4 in check_tls (c=0xaaab10aa3be0) at forward.h:310
#3  pre_select (c=c@entry=0xaaab10aa3be0) at forward.c:2001
#4  0x0000aaaad959f50c in multi_process_post (m=m@entry=0xffffd3a6c4e8, mi=0xaaab10aa3a20, flags=flags@entry=5) at multi.c:3049
#5  0x0000aaaad95a04bc in multi_process_timeout (m=m@entry=0xffffd3a6c4e8, mpp_flags=mpp_flags@entry=5) at multi.c:3685
#6  0x0000aaaad959aaa4 in tunnel_server_udp (top=0xffffd3a6d868) at mudp.c:512
#7  0x0000aaaad95a7c00 in openvpn_main (argc=14, argv=0xffffd3a6eb98) at openvpn.c:317
#8  0x0000ffff8192fe10 in __libc_start_main (main=0xaaaad956b930 <main>, argc=14, argv=0xffffd3a6eb98, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=<optimized out>) at ../csu/libc-start.c:308
#9  0x0000aaaad956b968 in _start () at openvpn.c:394
@cron2
Copy link
Contributor

cron2 commented Nov 10, 2023

Thanks a lot for the crash dump. Someone else reported crashes on the openvpn-devel mailing list, but he had no crashdump yet (but a log file, hinting at "this happens if a TLS (re)negotiation expires, possibly because the other side is not answering").

check_session_buf_not_used() is brand new, introduced in commit cd4d819 (in branch release/2.6), and it's purpose in life is "ensure that bad things are not happening", but it seems "commit suicide" is not the right approach here... @schwabe has been pinged.

@schwabe
Copy link
Contributor

schwabe commented Nov 10, 2023

I am not entirely sure what happens but it might be that we get called with already freed key states. Are you able to patch OpenVPN with this patch?

diff --git a/src/openvpn/ssl.c b/src/openvpn/ssl.c
index 210a70165..9c88b9e84 100644
--- a/src/openvpn/ssl.c
+++ b/src/openvpn/ssl.c
@@ -3258,6 +3258,11 @@ check_session_buf_not_used(struct buffer *to_link, struct tls_session *session)
     for (int i = 0; i < KS_SIZE; i++)
     {
         struct key_state *ks = &session->key[i];
+        if (ks->state == S_UNDEF)
+        {
+            continue;
+        }
+
         for (int j = 0; j < ks->send_reliable->size; j++)
         {
             if (ks->send_reliable->array[i].buf.data == dataptr)
             
             

@patcable
Copy link
Author

Just deployed a build with that patch in it, will let you know!

@patcable
Copy link
Author

So far so good -- but there hasn't been a ton of people connected over the weekend. I think the real test will come Monday.

@patcable
Copy link
Author

Yeah, had another couple core dumps this morning, but frustratingly apport didn't capture them because i hadnt moved the old one out of /var/crash. I'll wait for one more to happen, but then i've got to downgrade because our dev vpn is also where our development happens.

@patcable
Copy link
Author

Alright, here's the latest one. Wish i knew what caused it so i could reproduce it, alas i've downgraded.

Reading symbols from ../openvpn-2.6.7/usr/sbin/openvpn...
[New LWP 747988]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/sbin/openvpn --daemon ovpn-server --status /run/openvpn/server.status 10 -'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000aaaabec3e984 in check_session_buf_not_used (to_link=0xaaaae68209f0, to_link=0xaaaae68209f0, session=<optimized out>)
    at ssl.c:3195
3195	ssl.c: No such file or directory.

@cron2
Copy link
Contributor

cron2 commented Nov 13, 2023

Is this unmodified 2.6.7 or with the S_UNDEF patch applied? It really shouldn't crash anymore, then, according to the debug infos we've received so far... scratch head

@patcable
Copy link
Author

thats with 2.6.7 with S_UNDEF patch applied yeah.

cron2 pushed a commit to cron2/openvpn that referenced this issue Nov 15, 2023
When a key_state is in S_UNDEF the send_reliable is not initialised. So
checking it might access invalid memory or null pointers.

Github: fixes OpenVPN/openvpn#449

Change-Id: I226a73d47a2b1b29f7ec175ce23a806593abc2ac
[[email protected]: add check for !send_reliable and message]
Signed-off-by: Arne Schwabe <[email protected]>
Acked-by: Gert Doering <[email protected]>
Message-Id: <[email protected]>
URL: https://www.mail-archive.com/[email protected]/msg27401.html
Signed-off-by: Gert Doering <[email protected]>
(cherry picked from commit a903ebe)
@cron2 cron2 closed this as completed in a903ebe Nov 15, 2023
vishwin pushed a commit to vishwin/freebsd-ports that referenced this issue Nov 15, 2023
Add two patches cherry-picked from upstream Git repository:

OpenVPN 2.6.7 regressed and experienced crashes in some situations,
OpenVPN/openvpn#449
Reported by:	Vladimir Druzenko (vvd@)
Reported by:	Patrick Cable (upstream)
Obtained from:	OpenVPN/openvpn@b90ec6d

Also, some typos in the documentation are fixed,
Obtained from:	OpenVPN/openvpn@457f468

Bump PORTREVISION.
PR:		275055
MFH:		2023Q4
netgate-git-updates pushed a commit to pfsense/FreeBSD-ports that referenced this issue Nov 15, 2023
Related to:

PR:		275055
Security:	2fe004f5-83fd-11ee-9f5d-31909fb2f495
Security:	CVE-2023-46849
Security:	CVE-2023-46850

This specifically documents < 2.6.7_1 in order to collect the
regression fix for OpenVPN/openvpn#449
which was a bug newly introduced into 2.6.7.
freebsd-git pushed a commit to freebsd/freebsd-ports that referenced this issue Nov 15, 2023
Add two patches cherry-picked from upstream Git repository:

OpenVPN 2.6.7 regressed and experienced crashes in some situations,
OpenVPN/openvpn#449
Reported by:	Vladimir Druzenko (vvd@)
Reported by:	Patrick Cable (upstream)
Obtained from:	OpenVPN/openvpn@b90ec6d

Also, some typos in the documentation are fixed,
Obtained from:	OpenVPN/openvpn@457f468

Bump PORTREVISION.
PR:		275055
MFH:		2023Q4

(cherry picked from commit 8d2e9d9)
netgate-git-updates pushed a commit to pfsense/FreeBSD-ports that referenced this issue Nov 17, 2023
Add two patches cherry-picked from upstream Git repository:

OpenVPN 2.6.7 regressed and experienced crashes in some situations,
OpenVPN/openvpn#449
Reported by:	Vladimir Druzenko (vvd@)
Reported by:	Patrick Cable (upstream)
Obtained from:	OpenVPN/openvpn@b90ec6d

Also, some typos in the documentation are fixed,
Obtained from:	OpenVPN/openvpn@457f468

Bump PORTREVISION.
PR:		275055
MFH:		2023Q4

(cherry picked from commit 8d2e9d9)
@cron2
Copy link
Contributor

cron2 commented Nov 18, 2023

Could you re-test with 2.6.8 please? We're reasonably sure that we've fixed the issue - at least the way we could reproduce is fully gone.

@cron2 cron2 reopened this Nov 18, 2023
@braindead-bf
Copy link

Is this the real issue? The for loop here is using j as an iterator but checking the array at index i (then reporting j as the array index when it matches dataptr):

for (int j = 0; j < ks->send_reliable->size; j++)
        {
            if (ks->send_reliable->array[i].buf.data == dataptr)
            {
                msg(M_INFO, "Warning buffer of freed TLS session is still in"
                    " use (session->key[%d].send_reliable->array[%d])",
                    i, j);
...

@schwabe
Copy link
Contributor

schwabe commented Nov 22, 2023

That is a good catch. Fortunatily i is always between 0 and 2 and ks->send_reliable->size is (when it is defined) always 6 (TLS_RELIABLE_N_SEND_BUFFERS). So while the check is stupid with j, it at least is not crashing or anything similar.

@cron2
Copy link
Contributor

cron2 commented Nov 22, 2023

Is this the real issue?

No, that's just broken code... the crash-causing issue (that we could track down) was that ks->send_reliable can be NULL, which wasn't checked in 2.6.7.

But the one you found need fixing as well as it renders the whole check a bit useless (it should never trigger, but if the check is not checking correctly in the first place, this ensures it never will...)

@braindead-bf
Copy link

Thanks, I submitted PR #463 for this fix.

@patcable
Copy link
Author

Just to check back in -- i've been running 2.6.8 for a while without issue which is a good sign.

cron2 pushed a commit that referenced this issue Dec 2, 2023
The inner loop used i instead of j when iterating through the buffers.

Since i is always between 0 and 2 and ks->send_reliable->size is
(when it is defined) always 6 (TLS_RELIABLE_N_SEND_BUFFERS) this does not
cause an index of out bounds.  So while the check was not doing anything
really useful with i instead of j, at least it was not crashing or
anything similar.

Noticed-By: Jon Williams (braindead-bf) on Github issue #449
Change-Id: Ia3d5b4946138df322ebcd9e9e77d04328dacbc5d
Signed-off-by: Arne Schwabe <[email protected]>
Acked-by: Frank Lichtenheld <[email protected]>
Message-Id: <[email protected]>
URL: https://www.mail-archive.com/[email protected]/msg27576.html
Signed-off-by: Gert Doering <[email protected]>
(cherry picked from commit 59551b9)
cron2 pushed a commit that referenced this issue Dec 2, 2023
The inner loop used i instead of j when iterating through the buffers.

Since i is always between 0 and 2 and ks->send_reliable->size is
(when it is defined) always 6 (TLS_RELIABLE_N_SEND_BUFFERS) this does not
cause an index of out bounds.  So while the check was not doing anything
really useful with i instead of j, at least it was not crashing or
anything similar.

Noticed-By: Jon Williams (braindead-bf) on Github issue #449
Change-Id: Ia3d5b4946138df322ebcd9e9e77d04328dacbc5d
Signed-off-by: Arne Schwabe <[email protected]>
Acked-by: Frank Lichtenheld <[email protected]>
Message-Id: <[email protected]>
URL: https://www.mail-archive.com/[email protected]/msg27576.html
Signed-off-by: Gert Doering <[email protected]>
@cron2
Copy link
Contributor

cron2 commented Dec 2, 2023

So thanks for all the testing and feedback. I have now committed & pushed the fix for the i/j loop variable confusion, and I think this can now be closed for good. Such a haunted function.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants