grpc client tls1.3 bug #28

Open
a3sroot opened this issue Nov 10, 2022 · 27 comments
@a3sroot

a3sroot commented Nov 10, 2022

Describe the bug
A clear and concise description of what the bug is.

To Reproduce
Steps to reproduce the behavior:

Connecting to the gRPC server fails with a TLS error.
The Go client connects over TLS 1.3, while the Python client uses TLS 1.2.
Expected behavior
A clear and concise description of what you expected to happen.

Screenshots
If applicable, add screenshots to help explain your problem.
image
image

Desktop (please complete the following information):

  • OS: macOS 12.6
  • Version: v0.0.18
  • LibreSSL 2.8.3

Additional context
Add any other context about the problem here.

@litobro

litobro commented Nov 27, 2022

Also running into this problem on parrotos.

@a3sroot
Author

a3sroot commented Nov 30, 2022

I forced the server to downgrade to tls1.2

daddycocoaman self-assigned this Dec 14, 2022
@daddycocoaman
Collaborator

I think this issue comes from the gRPC library, which doesn't expose the ability to select or force TLS 1.2 or 1.3. I'll look into this more.

@moloch--
Owner

It may be a limitation of the underlying Python version's OpenSSL version?

@daddycocoaman
Collaborator

daddycocoaman commented Dec 27, 2022

import ssl
print(ssl.OPENSSL_VERSION)
print(ssl.HAS_TLSv1_3)

@a3sroot The last line should print True, but it does not with LibreSSL 2.8.3. It looks like TLS 1.3 might have been added with LibreSSL 3.4.0. You can try upgrading LibreSSL, but it might be safer and easier to just update to Python 3.10, which installs OpenSSL >1.1 and does support TLS 1.3.

We might need to raise the minimum Python requirement to 3.10. I work on this using 3.10 on Windows (Python on Windows ships with its own OpenSSL DLLs), and I was trying to leave the minimum version at 3.8, but it looks like it might be inconsistent between OSes.

But as far as I'm aware, ParrotOS uses OpenSSL. @litobro Can you provide the output from above?

@litobro

litobro commented Dec 27, 2022

I ultimately switched to Kali and had no issues with Python 3.10. I did reinstall ParrotOS to check this again, and with a clean install at Python 3.9 it appears to work flawlessly. I can attempt to restore one of my older backups from my previous Parrot VM, but it does seem that older libraries are to blame here.

>>> import ssl
>>> print(ssl.OPENSSL_VERSION)
OpenSSL 1.1.1n 15 Mar 2022
>>> print(ssl.HAS_TLSv1_3)
True

@tabinfl

tabinfl commented Dec 28, 2022

I'm having the same problem. Have tried on multiple systems including Ubuntu 22.04 with python 3.10. All show the same error (I'm running a simple connect pytest):

grpc.aio._call.AioRpcError: <AioRpcError of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "failed to connect to all addresses; last error: UNKNOWN: ipv4:127.0.0.1:31337: Ssl handshake failed: SSL_ERROR_SSL: error:10000410:SSL routines:OPENSSL_internal:SSLV3_ALERT_HANDSHAKE_FAILURE"

On that Ubuntu system, in my venv:

>>> print(ssl.OPENSSL_VERSION)
OpenSSL 3.0.2 15 Mar 2022
>>> print(ssl.HAS_TLSv1_3)
True

Sliver server 1.5.31 (I did try a couple of older versions as well), have tried running it on both Ubuntu & MacOS, and connecting from each to itself and to the other. I've also tried several prior versions of grpcio, no change.

I can run Sliver client using the same operator config file my test is using, and it works fine all around. I've checked in the debugger to make sure the op config is actually being parsed (it is).

This was working for me on my MacOS (python 3.8) ~a week ago, it's possible other things have changed on there, but the one I know for sure is I created a new venv. I believe it had also worked previously on the Ubuntu, but also a new venv there. I'd never run it before on the Kali (2021.4a).

Please let me know if there's anything I can do to assist in tracking this down, I'm stumped at the moment.

EDIT:
Also just did a full apt update/upgrade on the Ubuntu 22.04, recreated my venv, no change

EDIT 2:
A colleague just tried it on their own ubuntu 22.04 with python 3.10 and also got the same error.

I have verified under debugger that credentials are in place and all else looks normal at the time of the call to grpc.aio.secure_channel() from client.py's connect() method.
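
For anyone triaging similar reports, here is a minimal sketch that pulls the target address and the TLS alert name out of a gRPC details string like the one quoted above. The function name `summarize_tls_failure` is hypothetical, and the patterns are based only on the message format seen in this thread, so other grpcio versions may word it differently.

```python
import re

# Patterns derived from the single error message quoted in this thread.
# They are an assumption about the format, not a documented grpcio contract.
_ADDR_RE = re.compile(r"ipv[46]:(?P<addr>\[[0-9a-fA-F:]+\]:\d+|[\d.]+:\d+)")
_ALERT_RE = re.compile(r"SSL routines:[^:]*:(?P<alert>[A-Z0-9_]+)")


def summarize_tls_failure(details: str):
    """Return (address, alert) if `details` describes a TLS handshake failure.

    Returns None when the text does not mention a failed handshake, so other
    UNAVAILABLE causes (server down, wrong port) are not misreported.
    """
    if "handshake failed" not in details.lower():
        return None
    addr = _ADDR_RE.search(details)
    alert = _ALERT_RE.search(details)
    return (
        addr.group("addr") if addr else None,
        alert.group("alert") if alert else None,
    )
```

In an `except grpc.aio.AioRpcError as err:` block this could be called as `summarize_tls_failure(err.details())`.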

@daddycocoaman
Collaborator

@tabinfl Does your connection work outside of pytest? Can you provide minimal code?

@tabinfl

tabinfl commented Dec 28, 2022

My sliver client connection works fine, with the same .cfg file.

I'm getting this error running the sliver-py client tests.

I added a "connect" tag to the get version test so it's less spammy: @test("Client can get version", tags=["client","connect"]) on line 48 of test_client.py, then ward --tags connect

EDIT:
Same error on MacOS 10.15.7 / python 3.8, ubuntu 18.04 / python 3.8, ubuntu 22.04 / python 3.10

UPDATE:
Same error running on windows 10, python 3.10

On windows, I used the following code, installing the sliver-py package normally, instead of using the unit tests:

import argparse
import asyncio

from pathlib import Path
from sliver import SliverClient, SliverClientConfig

async def test_connect(cfg):
    config = SliverClientConfig.parse_config_file(cfg)
    client = SliverClient(config)
    await client.connect()
    ver = await client.version()

    print(f"Got Sliver version {ver}")


if __name__ == '__main__':
    parser = argparse.ArgumentParser(prog="simple_sliver")
    parser.add_argument(
        "--cfg",
        nargs="?",
        required=True,
        help="Operator config file",
    )

    args = parser.parse_args()
    cfg_file = Path(args.cfg)

    print(f"Using operator config file {cfg_file}")
    asyncio.run(test_connect(cfg_file))

@daddycocoaman
Collaborator

I spun up a fresh 22.04 VM with Python 3.10/OpenSSL 3.0.2 and was able to successfully run the tests against a local and remote sliver server. Same on Windows and MacOS. Can you provide a packet capture of traffic from your client to server? I'd expect to see the TLSv1.3 Client/Server Hello. Would like to see where things are going wrong in the handshake.

image
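
If a packet capture is awkward to collect, a rough alternative is to probe the listener with Python's `ssl` module pinned to one protocol version at a time. This is only a sketch: `make_probe_context` and `probe` are hypothetical helpers, and the probe goes through the interpreter's OpenSSL rather than the TLS library inside grpcio, so it tells you what the server accepts, not what grpcio offers. As far as I understand, the Sliver listener expects a client certificate, so without `certfile`/`keyfile` a failure does not necessarily mean the protocol version was refused.

```python
import socket
import ssl
from typing import Optional


def make_probe_context(version: ssl.TLSVersion,
                       certfile: Optional[str] = None,
                       keyfile: Optional[str] = None) -> ssl.SSLContext:
    """Build a client context that can only negotiate exactly `version`."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # The probe only looks at protocol negotiation, so server certificate
    # verification is switched off. Do not reuse this context for real traffic.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    ctx.minimum_version = version
    ctx.maximum_version = version
    if certfile is not None:
        # Optional client certificate (PEM files), for servers using mutual TLS.
        ctx.load_cert_chain(certfile, keyfile)
    return ctx


def probe(host: str, port: int, version: ssl.TLSVersion,
          timeout: float = 5.0, **cert_kwargs) -> str:
    """Return the negotiated protocol name, or a short failure description."""
    ctx = make_probe_context(version, **cert_kwargs)
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with ctx.wrap_socket(raw, server_hostname=host) as tls:
                return tls.version() or "unknown"
    except (ssl.SSLError, OSError) as exc:
        return f"failed: {exc}"
```

For example, `probe("127.0.0.1", 31337, ssl.TLSVersion.TLSv1_2)` and the same call with `ssl.TLSVersion.TLSv1_3` could be compared between a working and a broken server.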

@tabinfl

tabinfl commented Dec 29, 2022

I'm getting TLS 1.2. Ubuntu 22.04, python 3.10, openssl 3.0.2, sliver server v1.5.31 on localhost

Screen Shot 2022-12-28 at 7 38 09 PM
Screen Shot 2022-12-28 at 7 39 27 PM
Screen Shot 2022-12-28 at 7 39 49 PM
Screen Shot 2022-12-28 at 7 40 55 PM

pcap:

sliver-py-connect.pcapng.zip

Will try fresh ubuntu install next.

EDIT:

To recap:

I've personally run both sliver-py master branch and pypi package on 4 different VMs (ubuntu 18.04/22.04, Kali 2021.4, MacOS 10.15, Windows 10) with sliver server on 3 different VMs (ubuntu 18.04/22.04 & MacOS, local & remote) with same results.

Two colleagues have run the sliver-py master branch unit tests on their own (naive to sliver) VMs, with sliver server 1.5.31, and encountered the same error.

@tabinfl

tabinfl commented Dec 29, 2022

AHA

sliver-server running on fresh ubuntu 22.04 VM works, with both local sliver-py test and with previously broken client. All show TLS 1.3 in pcap.

So something is forcing the sliver-server on the "old" systems to drop down to TLS 1.2? I've run the server on ubuntu 22.04 (before & after apt update/upgrade), ubuntu 18.04, and MacOS, with the same error message resulting. "Fresh" ubuntu 22.04 shows the same openssl and python versions as the "old" one, but running the server on there with the Python client anywhere else works.

A few other notes along the way (can put these in new issues if you'd prefer):

  • sliver-py unit tests are failing with sys:1: RuntimeWarning: coroutine 'UnaryUnaryCall._invoke' was never awaited but it's entirely possible I'm not set up right somehow
  • the sliver server does not show the sliver-py "operator" connecting as it did previously, even if I add another task after the version query
  • pip install sliver-py does not install dependencies (grpcio, protobuf) -- looks like no [dependencies] section in pyproject.toml

@daddycocoaman
Collaborator

I'm kind of at a loss as to why the older VMs might default to 1.2, but good to hear you found at least one working case. What's even more interesting is that none of your instances or your colleagues' worked, Windows included.

Since we can't force TLS versions on gRPC yet (grpc/grpc#28382), I think the most we can do is verify that users have the right OpenSSL versions (1.1.1+). Still doesn't explain the other VMs though. I don't think the server is at fault, since your Wireshark pic shows the client trying to connect to TLS 1.2 in the Client Hello and the server rejecting it, but I don't have any other ideas in that area.

For your notes, feel free to make a new issue about the failing test and the dependencies. First time writing tests so I may have missed something. And I see what the issue might be with the dependencies, so I'll address that real quick. Surprised it's been missed for this long.
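
A minimal sketch of the version check mentioned above (OpenSSL 1.1.1+ and TLS 1.3 support), written as a pure function so it can be fed the version strings posted in this thread. The name `tls_preflight` is hypothetical. Note that this only describes the interpreter's `ssl` module; as far as I know, prebuilt grpcio wheels ship their own TLS library, so a clean result here does not prove gRPC will negotiate TLS 1.3.

```python
import re
import ssl

_VERSION_RE = re.compile(r"^(OpenSSL|LibreSSL)\s+(\d+)\.(\d+)\.(\d+)")


def tls_preflight(version_string: str = ssl.OPENSSL_VERSION,
                  has_tls13: bool = ssl.HAS_TLSv1_3) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    match = _VERSION_RE.match(version_string)
    if match is None:
        problems.append(f"unrecognized TLS library: {version_string!r}")
    elif match.group(1) == "OpenSSL":
        version = tuple(int(part) for part in match.group(2, 3, 4))
        if version < (1, 1, 1):
            problems.append(f"{version_string}: older than OpenSSL 1.1.1")
    # LibreSSL uses its own version numbering, so rely on the feature flag there.
    if not has_tls13:
        problems.append("ssl.HAS_TLSv1_3 is False for this interpreter")
    return problems
```

Calling `tls_preflight()` with no arguments checks the running interpreter.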

@tabinfl

tabinfl commented Dec 29, 2022

Have confirmed that the only difference between working & not is whether the sliver server is running on an old or “clean” VM. Since sliver server/client are both using golang grpc bindings, this may indicate some subtle difference in the Python vs Go grpc bindings. Haven’t gotten too deep into it, debugging cython… ☹️

I’ll make an issue for the deps tomorrow with suggested fix (one-liner, more or less), happy to give you a PR for tests if you’d like — you’ve turned us on to ward which is excellent.

@daddycocoaman
Collaborator

The dependencies have been fixed in v0.0.19. Not sure how it got missed before but whoops.

The OS of the server is interesting. I'm curious if there's any difference between the openssl.cfg files on your old server and the new VM, mostly the SECLEVEL or minimum/maximum versions. Still doesn't explain Windows but it's the last thing I can think of.

I dunno if Ward is gonna stay maintained but it's pretty great. Feel free to PR more tests.
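
To make the config comparison suggested above concrete, here is a rough sketch that extracts the protocol-related directives from two OpenSSL config files and reports the differences. `tls_settings` and `diff_settings` are hypothetical helpers, the parser ignores section headers, and the file location (commonly /etc/ssl/openssl.cnf on Debian-based systems) is an assumption that varies by OS.

```python
import re

# Directives that commonly constrain protocol versions or the security level.
_KEYS = ("MinProtocol", "MaxProtocol", "CipherString", "Ciphersuites")
_LINE_RE = re.compile(r"^\s*(?P<key>\w+)\s*=\s*(?P<value>.*?)\s*$")


def tls_settings(cnf_text: str) -> dict:
    """Collect protocol-related directives from config text (last one wins)."""
    found = {}
    for line in cnf_text.splitlines():
        line = line.split("#", 1)[0]  # drop trailing comments
        match = _LINE_RE.match(line)
        if match and match.group("key") in _KEYS:
            found[match.group("key")] = match.group("value")
    return found


def diff_settings(old: dict, new: dict) -> dict:
    """Map each differing key to its (old, new) pair; a missing key shows as None."""
    return {
        key: (old.get(key), new.get(key))
        for key in sorted(set(old) | set(new))
        if old.get(key) != new.get(key)
    }
```

Reading each machine's config file into a string and passing both through `tls_settings` before `diff_settings` would show, for instance, a `MinProtocol` line present on only one of them.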

@tabinfl

tabinfl commented Dec 29, 2022

Gets weirder by the minute. Started futzing with sliver server logs and deleting ~/.sliver to see if I could get more insight into the operator config creation process. And then on one run, it worked with a server it was failing with yesterday. And then it started working on another one (after deleting ~/.sliver, running the server, and generating a new operator config file). But it is still broken on another system, even after doing the same steps.

I am baffled.

Things I know:

  • it's where the server runs that breaks it
  • when broken, the initial client hello is TLS 1.2 (vs TLS 1.3 when working)
  • the only way the client knows anything about the server prior to the initial TLS hello is in the operator config file
  • the sliver client (using Go grpc) connects successfully even when sliver-py (using Python grpc) fails
  • a brand new OS install seems to work every time, while "used" systems may work or not

I'm going to keep poking at this when I can, will update with anything else I find.

@Lcys

Lcys commented Feb 7, 2023

Is there any solution?

@litobro

litobro commented Feb 7, 2023

I realize I never did update my status on this. Unfortunately, I was unable to restore from an old enough backup due to my retention policies.

If I have time over the weekend I may try to find an old install image and install clean without updating it and see if I can reproduce the issue.

@Marshall-Hallenbeck
Contributor

I got this working with a workaround (at least it connects right now; not sure if the protobuf version is going to affect Sliver usage).

What worked was building grpc from source with the GRPC_PYTHON_BUILD_SYSTEM_OPENSSL=True flag (thanks @0b5cur17y from Slack); downgrading protobuf also seems to be necessary for grpc, I guess.

# build dependencies
sudo apt-get install build-essential python3-dev libssl-dev

# install sliver-py first, we will be over-writing the grpc version from source
pip install sliver-py

# install grpc from source
git clone https://github.com/grpc/grpc
cd grpc
git submodule update --init
pip install -r requirements.txt
pip uninstall protobuf
pip install protobuf==3.20.*
GRPC_PYTHON_BUILD_SYSTEM_OPENSSL=True GRPC_PYTHON_BUILD_WITH_CYTHON=1 pip install .

Now at least my connection to the server works!

@moloch--
Owner

@daddycocoaman do you know if there's any way to set the env vars during the pip install?

@Marshall-Hallenbeck
Contributor

I got a better fix!

GRPC_PYTHON_BUILD_SYSTEM_OPENSSL=True pip install --no-binary :all: --force-reinstall grpcio

Works perfectly, but the install takes 4-5 minutes on my VM.

@Marshall-Hallenbeck
Contributor

Marshall-Hallenbeck commented Mar 19, 2023

So apparently --no-binary is deprecated, so you need to use --use-pep517 instead (see pypa/pip#11451). That makes the full command on Linux with pip:

GRPC_PYTHON_BUILD_SYSTEM_OPENSSL=True pip install --use-pep517 --force-reinstall grpcio

On Windows you need to use set to set environment variables, and I think you need to run set SETUPTOOLS_USE_DISTUTILS="stdlib"; first as well (testing this takes like 5 minutes a run so it's annoying to test).

If you are using Poetry, you can run the following to force a rebuild (again, testing this is a PITA, so YMMV):
poetry config --local installer.no-binary grpcio

@Magier

Magier commented Oct 19, 2023

Hi, I was wondering if there are any plans to fix this issue? Imho, downgrading is not a viable long-term solution.
I am currently working on Manjaro with Python 3.11.

I did some more research on the issue, and I think this problem is the result of gRPC issue#22442. Sadly, it has been open since March 2020 with no fix in sight. However, this comment points out that use of the curve secp384r1 might be the cause.
Inspecting the generated config for the operator confirms that this unsupported curve is used:

Would a possible solution be to support some configuration of the generated certificate on the server side, like using a specific curve or RSA instead of ECC?
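
For reference, a stdlib-only sketch of how the curve in an operator config's certificate could be checked. It scans the decoded certificate bytes for the DER-encoded OIDs of the three NIST curves, which is a crude heuristic rather than a real X.509 parse. The function names are hypothetical, and the JSON field name `certificate` is an assumption about the operator config layout.

```python
import json
import ssl

# DER encodings (tag, length, value) of the named-curve object identifiers.
_CURVE_OIDS = {
    bytes.fromhex("06082a8648ce3d030107"): "secp256r1",  # 1.2.840.10045.3.1.7
    bytes.fromhex("06052b81040022"): "secp384r1",        # 1.3.132.0.34
    bytes.fromhex("06052b81040023"): "secp521r1",        # 1.3.132.0.35
}


def curves_in_pem(pem: str) -> set:
    """Return the names of known curve OIDs found in a PEM certificate."""
    der = ssl.PEM_cert_to_DER_cert(pem.strip())
    return {name for oid, name in _CURVE_OIDS.items() if oid in der}


def curves_in_operator_config(path: str, field: str = "certificate") -> set:
    """Load the config as JSON and inspect the PEM stored under `field`."""
    with open(path, encoding="utf-8") as handle:
        config = json.load(handle)
    return curves_in_pem(config[field])
```

A library such as `cryptography` would give a proper answer by parsing the public key; the byte scan is only meant as a quick check without extra dependencies.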

@moloch--
Owner

moloch-- commented Oct 19, 2023

I'm not sure Go allows you to disable specific curves in TLS 1.3. IIRC the cipher config is only respected for TLS 1.0 - 1.2.

@Magier

Magier commented Oct 19, 2023

I thought the generation of the operator cert is done by the server? At least there is a GenerateECCCertificate in sliver/server/certs/certs.go.

I've just generated several certs to get all 3 curves (521, 384 and 256) and the problem was the same with all. So my comment might have been a dead end anyways. 😅

@moloch--
Owner

Ah okay, yes, we can select curves in the certificate. I was thinking of the service cipher suite config.

@spenceradolph

spenceradolph commented Mar 24, 2024

Just to update this, still running into this issue. I'm trying to run this within a container so that I can quickly replicate a working environment elsewhere and finely control the versions of things when they (hopefully) eventually work.

Edit: sliver-script appears to work, but that repo seems very outdated.
