Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[STS web identity] TLS negotiation timeout too small #169

Closed
grrtrr opened this issue Jul 16, 2022 · 2 comments
Closed

[STS web identity] TLS negotiation timeout too small #169

grrtrr opened this issue Jul 16, 2022 · 2 comments
Labels
bug This issue is a bug. needs-review This issue or pull request needs review from a core team member. p2 This is a standard priority issue

Comments

@grrtrr
Copy link
Contributor

grrtrr commented Jul 16, 2022

This was observed in the STS Web Identity provider, but could be a general problem affecting other credential providers as well.

Problem description

In production we experiences STS Web Identity TLS negotiation timeout of more than 10 seconds (the default).

Here is one of the traces we captured:

3/11/2022, 1:34:11 PM UTC	stdout	[ERROR] 2022-03-11 13:34:11.117 file-utils [140518474917760] static: Failed to open file /root/.aws/config with errno 2
3/11/2022, 1:34:11 PM UTC	stdout	[WARN] 2022-03-11 13:34:11.117 AuthProfile [140518474917760] Failed to read file at "/root/.aws/config"
3/11/2022, 1:34:11 PM UTC	stdout	[ERROR] 2022-03-11 13:34:11.117 file-utils [140518474917760] static: Failed to open file /root/.aws/credentials with errno 2
3/11/2022, 1:34:11 PM UTC	stdout	[WARN] 2022-03-11 13:34:11.117 AuthProfile [140518474917760] Failed to read file at "/root/.aws/credentials"
3/11/2022, 1:34:11 PM UTC	stdout	[ERROR] 2022-03-11 13:34:11.117 AuthCredentialsProvider [140518474917760] static: Profile credentials parser could not load or parse a credentials or config file.
3/11/2022, 1:34:21 PM UTC	stdout	[ERROR] 2022-03-11 13:34:21.164 http-connection [140518184900352] static: Client connection failed with error 1067 (AWS_IO_TLS_NEGOTIATION_TIMEOUT).
3/11/2022, 1:34:21 PM UTC	stdout	[WARN] 2022-03-11 13:34:21.164 connection-manager [140518184900352] id=0x7fcd00063b00: Failed to obtain new connection from http layer, error 1067(Channel shutdown due to tls negotiation timeout)
3/11/2022, 1:34:21 PM UTC	stdout	[WARN] 2022-03-11 13:34:21.164 connection-manager [140518184900352] id=0x7fcd00063b00: Failed to complete connection acquisition with error_code 1067(Channel shutdown due to tls negotiation timeout)
3/11/2022, 1:34:21 PM UTC	stdout	[WARN] 2022-03-11 13:34:21.164 AuthCredentialsProvider [140518184900352] id=0x7fcce7d4afe0: STS_WEB_IDENTITY provider failed to acquire a connection, error code 1067(Channel shutdown due to tls negotiation timeout)
3/11/2022, 1:34:21 PM UTC	stdout	[WARN] 2022-03-11 13:34:21.164 AuthCredentialsProvider [140518184900352] (id=0x7fcce7d4afe0) STS_WEB_IDENTITY credentials provider failed to query credentials

We decided to use a TLS negotiation timeout of multiples of the TCP timeout, since TLS handshakes consume 2 extra RTTs.

Potential fix

This is how we solved the problem in our code:

--- a/source/credentials_provider_sts_web_identity.c
+++ b/source/credentials_provider_sts_web_identity.c
@@ -1213,6 +1213,12 @@ struct aws_credentials_provider *aws_credentials_provider_new_sts_web_identity(
     }

     aws_tls_connection_options_init_from_ctx(&tls_connection_options, options->tls_ctx);
+    /*
+     * Override the TLS timeout. The default of 10s has led to timeouts.
+     * Since TLS has a higher overhead than TCP, use multiples of the connection timeout.
+     */
+    tls_connection_options.timeout_ms = 3 * STS_WEB_IDENTITY_CONNECT_TIMEOUT_DEFAULT_IN_SECONDS * 1000 /*msec*/;
+
     struct aws_byte_cursor host = aws_byte_cursor_from_buf(&parameters->endpoint);
     if (aws_tls_connection_options_set_server_name(&tls_connection_options, allocator, &host)) {
         AWS_LOGF_ERROR(
grrtrr added a commit to grrtrr/aws-c-auth that referenced this issue Jul 16, 2022
@yasminetalby yasminetalby added the needs-review This issue or pull request needs review from a core team member. label Jun 23, 2023
@jmklix jmklix added bug This issue is a bug. p2 This is a standard priority issue labels Jan 10, 2024
@jmklix
Copy link
Member

jmklix commented Feb 12, 2024

Thanks for this suggestion, but we don't want to change the default timeout for sts. Please let us know if you are still running into any problems with the default timeouts

@jmklix jmklix closed this as completed Feb 12, 2024
@grrtrr
Copy link
Contributor Author

grrtrr commented Feb 13, 2024

Yes, increasing the timeout to accommodate for TLS overhead is very much recommended, to avoid the above described Channel shutdown due to tls negotiation timeout problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug This issue is a bug. needs-review This issue or pull request needs review from a core team member. p2 This is a standard priority issue
Projects
None yet
Development

No branches or pull requests

3 participants