allow TLSv1.3 in ZK for FIPS client connection in testCertificates #10726

Merged
merged 2 commits into from
Oct 24, 2024

Conversation

showuon
Member

@showuon showuon commented Oct 17, 2024

Type of change

Select the type of your PR

  • Bugfix

Description

In RHEL 9, there is a new policy forced in FIPS mode:
The Extended Master Secret TLS Extension is now enforced on FIPS-enabled systems

The result is that:
Legacy clients that do not support EMS or TLS 1.3 now cannot connect to FIPS servers running on RHEL 9.

This is the case we saw in the test.

So, to allow the client to connect to the FIPS server, we need to enable TLSv1.3 in ZooKeeper.

ZooKeeper v3.8.x enables only TLSv1.2 by default, so TLSv1.3 has to be enabled manually through configuration. The default value of ssl.protocol and ssl.quorum.protocol is TLSv1.2, so we need to update both.

For ssl.ciphersuites, we also need to add the TLSv1.3 ciphers manually. Comparing v3.8.4 with the master branch, we can see that TLSv13Ciphers is missing from the v3.8.4 branch. I added them to the config, too.

With this change, SecurityST#testCertificates passes on a FIPS-enabled cluster.

Checklist

Please go through this checklist and make sure all applicable tasks have been done

  • Write tests
  • [V] Make sure all tests pass
  • Update documentation
  • Check RBAC rights for Kubernetes / OpenShift roles
  • Try your changes from Pod inside your Kubernetes and OpenShift cluster, not just locally
  • Reference relevant issue(s) and close them after merging
  • Update CHANGELOG.md
  • Supply screenshots for visual changes, such as Grafana dashboards

Comment on lines 25 to 26
ssl.ciphersuites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA,TLS_AES_256_GCM_SHA384,TLS_AES_128_GCM_SHA256,TLS_CHACHA20_POLY1305_SHA256
ssl.quorum.ciphersuites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA,TLS_AES_256_GCM_SHA384,TLS_AES_128_GCM_SHA256,TLS_CHACHA20_POLY1305_SHA256
Member Author

This list is the result of getSupportedCiphers(getGCMCiphers(), getCBCCiphers(), getTLSv13Ciphers()) in this line.
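
For illustration, here is a minimal sketch of how such a list could be assembled with plain JDK calls. It is not ZooKeeper's actual implementation: the class `CipherSupportSketch` and its methods are hypothetical names, and the split of the suites into GCM, CBC and TLSv1.3 groups is inferred from the suite names in the list above.

```java
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import javax.net.ssl.SSLContext;

// Hypothetical helper for illustration only; not taken from ZooKeeper.
final class CipherSupportSketch {

    // Walks the candidate groups in order and keeps each suite that the
    // given "supported" set contains, skipping duplicates.
    static List<String> keepSupported(Set<String> supported, List<List<String>> groups) {
        List<String> result = new ArrayList<>();
        for (List<String> group : groups) {
            for (String suite : group) {
                if (supported.contains(suite) && !result.contains(suite)) {
                    result.add(suite);
                }
            }
        }
        return result;
    }

    // Cipher suites that the default SSLContext of this JVM reports as supported.
    static Set<String> runtimeSupported() throws NoSuchAlgorithmException {
        return new LinkedHashSet<>(Arrays.asList(
            SSLContext.getDefault().getSupportedSSLParameters().getCipherSuites()));
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // Grouping assumed from the suite names in the config lines above.
        List<String> gcm = Arrays.asList(
            "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256", "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
            "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384", "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384");
        List<String> cbc = Arrays.asList(
            "TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256", "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256",
            "TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA", "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA",
            "TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384", "TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384",
            "TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA", "TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA");
        List<String> tls13 = Arrays.asList(
            "TLS_AES_256_GCM_SHA384", "TLS_AES_128_GCM_SHA256", "TLS_CHACHA20_POLY1305_SHA256");

        List<String> usable = keepSupported(runtimeSupported(), Arrays.asList(gcm, cbc, tls13));
        System.out.println(String.join(",", usable));
    }
}
```

The printed list depends on the Java runtime, so it may differ from the value in the config if the JVM does not support some of the suites.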

Member

@scholzj scholzj left a comment

Is this related to the system test issue @im-konge had? Is this really related to UBI9 (versus RHEL9)? I think we need to consider how many things this might break.

@scholzj
Member

scholzj commented Oct 17, 2024

Also, this is the wrong place to put this as these fields are user configurable. If we really want this, I think it should be handled in the operator.

@showuon
Member Author

showuon commented Oct 17, 2024

> Also, this is the wrong place to put this as these fields are user configurable. If we really want this, I think it should be handled in the operator.

OK, @im-konge , could you follow up with this PR? You can take it over since I'm going to log off and it is blocking some testing. Thank you.

@showuon showuon changed the title from "allow TLSv1.3 in ZK for FIPS client connection" to "allow TLSv1.3 in ZK for FIPS client connection in testCertificates" Oct 18, 2024
@showuon
Member Author

showuon commented Oct 18, 2024

Since we only observe the testCertificates failure in FIPS mode, I've updated the PR to only enable TLSv1.3 in ZK in testCertificates. @scholzj @im-konge , FYI.

@scholzj
Member

scholzj commented Oct 18, 2024

@showuon I'm totally fine with this change if the QE folks are fine with it.

But I think it indicates larger issues somewhere. As far as I understood from @im-konge everything works in FIPS but this particular test is not passing. What does that mean for the FIPS support? I guess it means that this policy you mentioned does not affect Java in any way and affects only OpenSSL used to gather the certificates? Is the FIPS support in Java broken? Or is it just behind? If these settings are needed only for ZooKeeper does it mean ZooKeeper configuration does not respect some Java settings and is not FIPS compatible?

@scholzj scholzj requested a review from im-konge October 18, 2024 08:43
@im-konge
Member

All other tests (and basically everything else) work fine on the FIPS-enabled clusters. However, this test uses the certificates that are available for connecting to the Kafka broker and ZK. For the Kafka broker, everything is fine. For ZK, there is an error like this:

80DB224B717F0000:error:1C8000E9:Provider routines:kdf_tls1_prf_derive:ems not enabled:providers/implementations/kdfs/tls1_prf.c:200:
80DB224B717F0000:error:0A08010C:SSL routines:tls1_PRF:unsupported:ssl/t1_enc.c:83:
command terminated with exit code 1

Maybe it's just an issue with OpenSSL in our tests, and we should customize it somehow. But if it were a bigger issue, I think all the other tests that use ZK would be failing.

And it's an issue just with ZK, in KRaft everything works fine.

@scholzj
Member

scholzj commented Oct 18, 2024

Right, so the way I read it:

  • OpenSSL in FIPS mode supports TLSv1.3 only (by default? Can this be tuned? Or is 1.2 completely disabled in FIPS?)
  • Java does not care and is able to use TLS v1.2
  • ZooKeeper seems to enable only TLS v1.2 (suggested by https://zookeeper.apache.org/doc/r3.8.4/zookeeperAdmin.html as the default for ssl.protocol and ssl.enabledProtocols and their quorum counterparts)

So, instead of changing this for one particular test, should we enable TLSv1.2 and TLSv1.3 by default by setting ssl.enabledProtocols and ssl.quorum.enabledProtocols?

That should probably:

  • Prepare us for any possible future where Java will require TLS v1.3 as well
  • Keep both Kafka-ZooKeeper and ZooKeeper-ZooKeeper connections under TLS v1.2 for the time being
  • Allow OpenSSL to connect under TLS v1.3

Knowing how Java FIPS support worked in the past, it is not unlikely that it decides to disable TLSv1.2 support from one day to another. So maybe this is worth doing even if only for a few months?

However, I'm not entirely sure how the cipher suites fit into it. Is that really needed for the test to pass? If ZooKeeper somewhere hardcodes the cipher suites instead of selecting them based on the enabled protocols, it deserves to die in a very special hell. I do not think we want to set any cipher suites by default. So that might be best addressed in the test only, as done in this PR.
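
To make "selecting them based on the enabled protocols" concrete, here is a rough sketch of what such a selection could look like. It is purely illustrative and not how ZooKeeper or the JDK behave. The class `SuiteByProtocolSketch` is a hypothetical name, only TLSv1.2 and TLSv1.3 are modelled, and TLSv1.3 suites are recognised by a naming heuristic (their names have no `_WITH_` part, unlike the older suites).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; only TLSv1.2 and TLSv1.3 are modelled.
final class SuiteByProtocolSketch {

    // Naming heuristic: TLSv1.3 suites such as TLS_AES_128_GCM_SHA256 have no
    // "_WITH_" segment. Signalling values ending in "_SCSV" are excluded.
    static boolean isTls13Suite(String suite) {
        return suite.startsWith("TLS_")
            && !suite.contains("_WITH_")
            && !suite.endsWith("_SCSV");
    }

    // Keeps the candidate suites that match at least one enabled protocol.
    static List<String> suitesFor(List<String> enabledProtocols, List<String> candidates) {
        boolean tls13Enabled = enabledProtocols.contains("TLSv1.3");
        boolean tls12Enabled = enabledProtocols.contains("TLSv1.2");
        List<String> result = new ArrayList<>();
        for (String suite : candidates) {
            boolean usable = isTls13Suite(suite) ? tls13Enabled : tls12Enabled;
            if (usable) {
                result.add(suite);
            }
        }
        return result;
    }
}
```

With a selection like this, enabling TLSv1.3 in ssl.enabledProtocols alone would be enough, and no explicit cipher-suite list would be needed.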

@showuon
Member Author

showuon commented Oct 18, 2024

> OpenSSL in FIPS mode supports TLSv1.3 only (by default? Can this be tuned? Or is 1.2 completely disabled in FIPS?)

The new FIPS mode supports TLSv1.2 with Extended Master Secret (EMS), or TLSv1.3. Enabling EMS for TLSv1.2 sounds complicated, so I chose to enable TLSv1.3.

> Java does not care and is able to use TLS v1.2

Java connects with the highest version supported by both the client and the server.

> ZooKeeper seems to enable only TLS v1.2 (suggested by https://zookeeper.apache.org/doc/r3.8.4/zookeeperAdmin.html as the default for ssl.protocol and ssl.enabledProtocols and their quorum counterparts)

Correct!

> So, instead of changing this for one particular test, should we enable TLSv1.2 and TLSv1.3 by default by setting ssl.enabledProtocols and ssl.quorum.enabledProtocols?

Of course, allowing ZK to support both TLSv1.3 and TLSv1.2 is the better solution. But we might need complete tests for this change to make sure everything works.

> However, I'm not entirely sure how the cipher suites fit into it. Is that really needed for the test to pass?

I agree this is a bad design in ZooKeeper. The docs say:

> ssl.ciphersuites and ssl.quorum.ciphersuites : Default: Enabled cipher suites depend on the Java runtime version being used.

And from the code in v3.8.4, you can see that it assigns a different cipher order depending on whether it runs on Java 8 or Java 9+. In the master branch, they added the TLSv1.3 ciphers at the end. So we need to add them to the config manually in v3.8.4.
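
The point about connecting with the highest version supported on both sides can be sketched as a toy model. This is a simplification under stated assumptions, not the real TLS handshake: the class `VersionNegotiationSketch` is hypothetical, only TLSv1.2 and TLSv1.3 are modelled, and the FIPS client is treated as effectively TLSv1.3-only, which follows the reading in this thread rather than any statement from OpenSSL.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Toy model for illustration; the real handshake is more involved.
final class VersionNegotiationSketch {

    // Preference order, highest version first.
    private static final List<String> HIGHEST_FIRST = Arrays.asList("TLSv1.3", "TLSv1.2");

    // Returns the highest version enabled on both sides, or empty if there is none.
    static Optional<String> negotiate(List<String> client, List<String> server) {
        for (String version : HIGHEST_FIRST) {
            if (client.contains(version) && server.contains(version)) {
                return Optional.of(version);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> zkDefault = Arrays.asList("TLSv1.2");            // ZooKeeper 3.8.x default
        List<String> zkPatched = Arrays.asList("TLSv1.3", "TLSv1.2"); // with this PR's config
        List<String> javaClient = Arrays.asList("TLSv1.3", "TLSv1.2");
        List<String> fipsClient = Arrays.asList("TLSv1.3");           // assumption, see lead-in

        System.out.println(negotiate(javaClient, zkDefault)); // Java client still connects
        System.out.println(negotiate(fipsClient, zkDefault)); // no common version
        System.out.println(negotiate(fipsClient, zkPatched)); // connects after the change
    }
}
```

In this model the Java clients keep working against the default ZooKeeper settings, while the FIPS client only finds a common version once TLSv1.3 is enabled on the ZooKeeper side.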

@scholzj
Member

scholzj commented Oct 18, 2024

Manipulating the exact list of cipher suites is really dangerous. We should definitely not do it by default anywhere. Without it it probably doesn't make sense to enable TLS v1.3 by default :-(.

@showuon
Member Author

showuon commented Oct 21, 2024

> Manipulating the exact list of cipher suites is really dangerous. We should definitely not do it by default anywhere. Without it it probably doesn't make sense to enable TLS v1.3 by default :-(.

I'm fine with fixing only the test. At least if users run into this problem, we can ask them to enable TLSv1.3 in the ZooKeeper config the way we did in this PR.

@im-konge
Member

im-konge commented Oct 21, 2024

In that case, it makes sense to me to fix it just in the test. Should we add a note about it to the documentation?
And thanks @showuon for the investigation and the fix!

@@ -124,7 +124,21 @@ void testCertificates() {
 KafkaNodePoolTemplates.controllerPoolPersistentStorage(testStorage.getNamespaceName(), testStorage.getControllerPoolName(), testStorage.getClusterName(), 3).build()
 )
 );
-resourceManager.createResourceWithWait(KafkaTemplates.kafkaEphemeral(testStorage.getNamespaceName(), testStorage.getClusterName(), 3).build());
+resourceManager.createResourceWithWait(KafkaTemplates.kafkaEphemeral(testStorage.getNamespaceName(), testStorage.getClusterName(), 3)
Member

I think you cannot do it like this, as the test runs in both ZK and KRaft mode.
So I would do something like this instead:

        KafkaBuilder kafkaBuilder = KafkaTemplates.kafkaEphemeral(testStorage.getNamespaceName(), testStorage.getClusterName(), 3);

        if (!Environment.isKRaftModeEnabled()) {
            // in order to make the connection work on FIPS-enabled cluster, we need to enable TLSv1.3 on the ZooKeeper side
            kafkaBuilder = kafkaBuilder
                .editSpec()
                    .editOrNewZookeeper()
                        .addToConfig("ssl.protocol", "TLSv1.3")
                        .addToConfig("ssl.quorum.protocol", "TLSv1.3")
                        .addToConfig("ssl.enabledProtocols", "TLSv1.3,TLSv1.2")
                        .addToConfig("ssl.quorum.enabledProtocols", "TLSv1.3,TLSv1.2")
                        .addToConfig("ssl.ciphersuites", "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256," +
                            "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256," +
                            "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA," +
                            "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384," +
                            "TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA,TLS_AES_256_GCM_SHA384,TLS_AES_128_GCM_SHA256,TLS_CHACHA20_POLY1305_SHA256"
                        )
                        .addToConfig("ssl.quorum.ciphersuites", "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256," +
                            "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256," +
                            "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA," +
                            "TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA," +
                            "TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA,TLS_AES_256_GCM_SHA384,TLS_AES_128_GCM_SHA256,TLS_CHACHA20_POLY1305_SHA256"
                        )
                    .endZookeeper()
                .endSpec();
        }

        resourceManager.createResourceWithWait(kafkaBuilder.build());
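
The two cipher-suite strings in the suggestion above are identical, so one way to keep them in sync is to build the six ZooKeeper settings from a single list. The following is only a sketch with plain JDK types. The class `ZkTlsConfigSketch` and its members are hypothetical and not part of the Strimzi test code.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; keys and values are copied from the suggestion above.
final class ZkTlsConfigSketch {

    // Single source of truth for both ssl.ciphersuites and ssl.quorum.ciphersuites.
    static final List<String> CIPHER_SUITES = Arrays.asList(
        "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256", "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
        "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384", "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384",
        "TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256", "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256",
        "TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA", "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA",
        "TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384", "TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384",
        "TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA", "TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA",
        "TLS_AES_256_GCM_SHA384", "TLS_AES_128_GCM_SHA256", "TLS_CHACHA20_POLY1305_SHA256");

    // Builds the ZooKeeper TLS settings in insertion order.
    static Map<String, String> tls13Config() {
        String suites = String.join(",", CIPHER_SUITES);
        Map<String, String> config = new LinkedHashMap<>();
        config.put("ssl.protocol", "TLSv1.3");
        config.put("ssl.quorum.protocol", "TLSv1.3");
        config.put("ssl.enabledProtocols", "TLSv1.3,TLSv1.2");
        config.put("ssl.quorum.enabledProtocols", "TLSv1.3,TLSv1.2");
        config.put("ssl.ciphersuites", suites);
        config.put("ssl.quorum.ciphersuites", suites);
        return config;
    }

    public static void main(String[] args) {
        // Prints each setting as key=value.
        tls13Config().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Each entry could then be applied with the same addToConfig(key, value) call used in the suggestion.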

Member Author

You're right! Updated.

@showuon showuon force-pushed the allowFIPSZK branch 2 times, most recently from c9cde07 to 14f7d42 Compare October 22, 2024 06:35
Member

@im-konge im-konge left a comment

Thanks! :) LGTM

@im-konge im-konge added this to the 0.45.0 milestone Oct 22, 2024
@scholzj
Member

scholzj commented Oct 22, 2024

/azp run zookeeper-regression


Azure Pipelines successfully started running 1 pipeline(s).

@scholzj
Member

scholzj commented Oct 24, 2024

/azp run zookeeper-regression


Azure Pipelines successfully started running 1 pipeline(s).

@scholzj scholzj merged commit 8a17029 into strimzi:main Oct 24, 2024
23 checks passed
@scholzj
Member

scholzj commented Oct 24, 2024

Thanks for the PR.
