Skip to content
This repository has been archived by the owner on Mar 4, 2024. It is now read-only.

Add additional async getaddrinfo step to outgoing tcp connections #301

Merged
merged 2 commits into from
Aug 31, 2022

Conversation

NorbertHeusser
Copy link
Contributor

As first step to allow usage of hostnames instead of static IPs I added the required additional async getaddrinfo step to the outgoing tcp connection to the other cluster members.
Unfortunately the libuv async getaddrinfo is not connected to the TCP handle. As a consequence abort the connect got a little bit more complicated as the getaddrinfo may be in progress and cannot be stopped. In this case the data structures need to be valid until the getaddrinfo callback is finished.

In the current first implementation the getaddrinfo is restricted to IPv4 addresses in a string representation as before. This way it should behave excactly as before.

@freeekanayaka
Copy link
Contributor

I've never used uv_getaddrinfo, but I suspect the extra dance in this PR is not needed. If we can't cancel a getaddrinfo request, we simply just discard any result if t->closing is non-zero (as in the uvTcpConnectUvWriteCb and uvTcpConnectUvConnectCb callbacks).

Basically something like this (declaration order inverted and extra comments added just for clarity):

int UvTcpConnect(...) {
    ...
    assert(!t->closing);  /* Note that t->closing is false at the beginning */
    ...
    QUEUE_PUSH(&t->connecting, &r->queue); /* So we know there's a connection attempt in progress, 
                  and we abort it in UvTcpConnectClose() if uvTcpClose() gets
                  called because the engine is shutting down. */.
    ...
    uvTcpConnectStart(r, address);
}

static int uvTcpConnectStart(struct uvTcpConnect *r, const char *address) {
    ...
    rv = uv_tcp_init(r->t->loop, r->tcp); /* we'll need to close this handle in case of failure,
                           see uvTcpConnectAbort */
    assert(rv == 0);
    ...
    rv = uv_getaddrinfo(r->t->loop, &r->getaddrinfo, &uvGetAddrInfoCb, hostname,
                        service, &hints);
    ...
}

static void uvGetAddrInfoCb(uv_getaddrinfo_t *req,
                            int status,
                            struct addrinfo *res) {
    ...
    if (t->closing) { /* This basically checks if uvTcpClose() was called, see uv_tcp.c.
        Note that uvTcpConnectAbort() has been already called too, by UvTcpConnectClose */
        connect->status = RAFT_CANCELED;
        return;
    }

    if (status != 0) { /* The DNS resolution failed, explicitly abort the whole attempt */
        assert(status != UV_ECANCELED); /* t->closing would have been true, because
               we attempt cancelling in uvTcpAbort()  */
        connect->status = RAFT_NOCONNECTION;
        ...
        goto err;
    }

    rv = uv_tcp_connect(&r->connect, r->tcp, (const struct sockaddr *)res->ai_addr
                        uvTcpConnectUvConnectCb);
    if (rv != 0) {
        /* UNTESTED: since parsing succeed, this should fail only because of
         * lack of system resources */
        connect->status = RAFT_NOCONNECTION;
        goto err;
    }

    return;

err:
    uvTcpConnectAbort(connect);
}

static void uvTcpConnectUvConnectCb(struct uv_connect_s *req, int status) {
    ... /* No change should be needed here */
}

static void uvTcpConnectUvWriteCb(struct uv_write_s *write, int status) {
    ... /* No change should be needed here */
}

static void uvTcpConnectFinish(struct uvTcpConnect *connect) {
   ...
    /* The only change needed here should be freeing the addr structure, since we
        don't need it anymore. Note that this function will be always invoked,
        either by uvTcpConnectUvWriteCb or by uvTcpConnectUvCloseCb, so no leaking will happen.  */
   uv_freeaddrinfo(connect->getaddrinfo.addrinfo);
   ...
}

static void uvTcpConnectAbort(struct uvTcpConnect *connect)
{
    QUEUE_REMOVE(&connect->queue);
    QUEUE_PUSH(&connect->t->aborting, &connect->queue);
    uv_cancel((struct uv_req_s *)&connect->getaddrinfo); /* Best effort, it doesn't matter 
          if it actually cancels or not, in any case the uvGetAddrInfoCb callback will
          be invoked if the request is still in-flight  */
    uv_close((struct uv_handle_s *)connect->tcp, uvTcpConnectUvCloseCb);
}

@freeekanayaka
Copy link
Contributor

Just to be extra clear: the call to uv_cancel() is entirely optional, the code should work just fine if we remove it. In fact, we don't even try to cancel the uv_write() call that happens after the connect, we just let it finish and check if we possibly need to abort.

@NorbertHeusser
Copy link
Contributor Author

NorbertHeusser commented Aug 29, 2022

Not canceling the uv_write is totally fine as the uv_write is dependent on the same tcp handle as the tcp_connect. In this case it is fine and totally safe to simply close the tcp handle and all pending or currently processed on this handle will be stopped before the callback give to the uv_close will be invoked. This hold for any of-of-band invocation from a thread outside the libuv loop as well.
The code you proposed was my first attempt as well. But this way we would introduce a race condition, with the libuv library working on invalid (freed) memory. As the getaddrinfo itself (in contrast to network operations based on libepoll) is not really asynchronous it is wrapped by some threads inside the libuv. If the cancellation of the async getaddrinfo fails this may have two reasons:

  1. The getaddrinfo was already performend an the callback was already invoked. We could have introduced an additional member/flag to indicate this, but simply used the state of the tcp handle to detect this case.
  2. The getaddrinfo request is currently being processed internally inside the libuv and cannot be canceled anymore. The given callback maybe invoked any time or currently is in execution.

So if we simply invoke the uv_close on the tcp handle right after we have tried to cancel the getaddrinfo request the libuv will immediately close the tcp handle (as it is not used for anything up to now) and invoke the give cb. And this callback will invoke the uvTcpConnectFinish, which will cleanup all the memory structures. But depending on the current execution stack of the getaddrinfo request in the libuv our getaddrinfo callback was accessing freed memory. Additionally the finish cb will try to free memory (the getaddrinfo result), which may not been allocated.
Happily this was easy to reproduce running very detailed unit test resulting in SEGFAULT and core dump...
This was the reason to sped the additional effort on the extra dance.

@NorbertHeusser
Copy link
Contributor Author

So the uv_cancel is not really necessary to stop the connect, but we need to make sure the structures used in uvGetAddrInfoCb are still valid until the callback execution has finished.

@freeekanayaka
Copy link
Contributor

Ah, you're right, I now see what you mean.

What about adding a flag that tracks whether the getaddrinfo request is in-flight or not, basically along the lines of what you had suggested at the very beginning I believe.

Same code as above with a little modification:

struct uvTcpConnect {
    bool resolving; /* will be initialized to false */
}


static int uvTcpConnectStart(struct uvTcpConnect *r, const char *address) {
    ...
    rv = uv_tcp_init(r->t->loop, r->tcp); /* we'll need to close this handle in case of failure,
                           see uvTcpConnectAbort */
    assert(rv == 0);
    ...
    rv = uv_getaddrinfo(r->t->loop, &r->getaddrinfo, &uvGetAddrInfoCb, hostname,
                        service, &hints);
    if (rv != 0) {...}
    r->resolving = true; /* this means that the uv_getaddrinfo call is in progress */
    ...
}

static void uvGetAddrInfoCb(uv_getaddrinfo_t *req,
                            int status,
                            struct addrinfo *res) {
    ...

    connect->resolving = false; /* run this before anything else */

    if (t->closing) { /* This basically checks if uvTcpClose() was called, see uv_tcp.c.
        Note that uvTcpConnectAbort() has been already called too, by UvTcpConnectClose,
        but uv_close() was not called yet because the connect->resolving flag was true */
        connect->status = RAFT_CANCELED;

        /* call uv_close() now that the getaddrinfo request has completed */
        uv_close((struct uv_handle_s *)connect->tcp, uvTcpConnectUvCloseCb);

        return;
    }
    ... /* the rest is the same as above */
}

static void uvTcpConnectAbort(struct uvTcpConnect *connect)
{
    QUEUE_REMOVE(&connect->queue);
    QUEUE_PUSH(&connect->t->aborting, &connect->queue);
    uv_cancel((struct uv_req_s *)&connect->getaddrinfo); /* Best effort, it doesn't matter 
          if it actually cancels or not, in any case the uvGetAddrInfoCb callback will
          be invoked if the request is still in-flight  */

    /* Only call uv_close() if there's no getaddrinfo request in flight. Otherwise, we
        delay it until it is completed. */ 
    if (!connect->resolving) {
        uv_close((struct uv_handle_s *)connect->tcp, uvTcpConnectUvCloseCb);
    }
}

@NorbertHeusser
Copy link
Contributor Author

The solution you proposed was one I though about as well. But decided not to use it for several reasons:

  1. From the design perspective the code sections checking the t->closing flag are the same in all callbacks. With the proposed solution thios design would have been broken and in behavior would be different for this callback.
  2. Second design perspective: Right now the uv_close on the tcp handle is executed either right before everything is handed of to the libuv (due to a failure in the uvTcpConnectStart). Or initiated from the uvTcpConnectAbort and not from any of the callbacks, which belong to the normal processing flow. This makes the close flow very transparent in my view. The newly introduced check cb with the close is in my perspective directly related to the abort and less harming in transparency.
  3. The most important point: With 23 years of experience in implementing multi-threaded solutions I learned it's always hard to make sure to really hit all kind of concurrent execution with the introduction of these kind of check flags. And this kind of race conditions are sometimes hard to detect in unit tests (normally running single threaded). In real concurrent execution it may work well 99,9% of the time and hit you hard, if it crashes in production. e.g in the code you proposed the libuv may be right in the execution of uvGetAddrInfoCb. And the execution stack right after the if t->closing section. So starting at the code section you described with /* the rest is the same as above */. The current LWP maybe out aside by the system scheduler for whatever reason. Now the main thread is executing it's uvTcpConnectAbort. The uv_cancel will fail (as the CB is in execution), the connect->resolving will be false and the uv_close will be executed and the given uv_close cb will be invoked immediately by the libuv as the connect on the tcp handle has not been started. All structs wills be removed and viola the continuation of the uvGetAddrInfoCb will continue to work on stale memory....

This was the reason for the solution I proposed. As the current solution in the PR is based on the guarantees given by the libuv.

@NorbertHeusser
Copy link
Contributor Author

Is there a way to retrigger the CLA check as I filed the contributor agreement now ?

@MathieuBordere
Copy link
Contributor

MathieuBordere commented Aug 30, 2022

Is there a way to retrigger the CLA check as I filed the contributor agreement now ?

It failed again, possibly because there is some delay between you signing the agreement and the system actually knowing that you signed. I'll keep an eye on it if the problem would persist.

@freeekanayaka
Copy link
Contributor

freeekanayaka commented Aug 30, 2022

The solution you proposed was one I though about as well. But decided not to use it for several reasons:

  1. From the design perspective the code sections checking the t->closing flag are the same in all callbacks. With the proposed solution thios design would have been broken and in behavior would be different for this callback.

This is actually a very common design throughout this raft library (and libuv itself). Basically the two callbacks that we have now can be aborted at the same time for the reason you mentioned (because they both depend on the tcp handle). This third callback we are introducing is an independent one that introduces an additional ordering requirement. Using the flag we make sure that ordering is preserved: first we let the uv_getaddrinfo call to finish and the uvGetAddrInfoCb callback to be triggered and only then we proceed with uv_close().

We already do this kind of callback ordering in various places.

  1. Second design perspective: Right now the uv_close on the tcp handle is executed either right before everything is handed of to the libuv (due to a failure in the uvTcpConnectStart). Or initiated from the uvTcpConnectAbort and not from any of the callbacks, which belong to the normal processing flow. This makes the close flow very transparent in my view. The newly introduced check cb with the close is in my perspective directly related to the abort and less harming in transparency.

The uv_close() call due to a failure of uvTcpConnectStart is pretty trivial and it's basically just a direct rollback.

The role of uvTcpConnectAbort() is to trigger an orderly shutdown. The fact that we don't call uv_close() right away actually makes things very explicit in my view: there is a shutdown/abort order to honor and at this time we can't yet call uv_close().

  1. The most important point: With 23 years of experience in implementing multi-threaded solutions I learned it's always hard to make sure to really hit all kind of concurrent execution with the introduction of these kind of check flags. And this kind of race conditions are sometimes hard to detect in unit tests (normally running single threaded). In real concurrent execution it may work well 99,9% of the time and hit you hard, if it crashes in production. e.g in the code you proposed the libuv may be right in the execution of uvGetAddrInfoCb. And the execution stack right after the if t->closing section. So starting at the code section you described with /* the rest is the same as above */. The current LWP maybe out aside by the system scheduler for whatever reason. Now the main thread is executing it's uvTcpConnectAbort. The uv_cancel will fail (as the CB is in execution), the connect->resolving will be false and the uv_close will be executed and the given uv_close cb will be invoked immediately by the libuv as the connect on the tcp handle has not been started. All structs wills be removed and viola the continuation of the uvGetAddrInfoCb will continue to work on stale memory....

This can't happen because of the single-threaded event-driven design of raft/libuv. There is just one thread, and any callback that is invoked by libuv is guaranteed to run from start to end without any other libuv or user callback code being executed, regardless of the OS scheduling. It's very different from multi-threaded designs were I totally agree that flags like this are usually a bad idea, or should at least be protected with very careful mutexes. There are no race conditions of that kind in event-loop based designs.

In this case if libuv is right in the execution of uvGetAddrInfoCb, in particular starting the t->closing section that you mention, then there is no way that the "main thread" executes uvTcpConnectAbort in parallel. There is only one main thread, and it's executing uvGetAddrInfoCb, so nothing else can't execute until the uvGetAddrInfoCb function has returned.

Again, this kind of design pattern is very common in both libraft and libuv, and I believe very solid.

I'd recommend to stick with the t->resolving flag approach since it makes ordering explicit and has less moving parts (like the additional uv_check handle).

@NorbertHeusser
Copy link
Contributor Author

Got it. As the libraft is totally running single threaded and using the same thread to invoke the uv_run periodically there is no concurrency at all. So will I will replace the uv_check with the flag you prefer and update the PR.

@freeekanayaka
Copy link
Contributor

Got it. As the libraft is totally running single threaded and using the same thread to invoke the uv_run periodically there is no concurrency at all. So will I will replace the uv_check with the flag you prefer and update the PR.

Exactly! Thank you for bearing with me, that's appreciated :)

Signed-off-by: Norbert Heusser <[email protected]>
@NorbertHeusser
Copy link
Contributor Author

Appreciate it as well. Always interested to find a good solution, which fit well into the existing code base and keep the code consistent, stable and maintainable.
Replaced the check now with the bool flag. All tests still ran fine locally.

int rv;

rv = uvIpParse(address, &addr);
connect->resolving =
false; /* Indicate we are in the name resolving phase */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably better to write "Indicate that we are done with the resolving phase"

@freeekanayaka
Copy link
Contributor

Looks good to me, thanks again.

Just in case you are interested, since you mentioned that "there is no concurrency at all", there is actually concurrency in this kind of single-threaded event-based systems. However there is no parallelism, which is different. This is a nice talk about it: https://www.youtube.com/watch?v=tIrVLcUq4xE (it's not really Go-specific).

@MathieuBordere
Copy link
Contributor

MathieuBordere commented Aug 31, 2022

@NorbertHeusser Is [email protected] the email address you signed up with at Launchpad? The cla-check uses that address to check if you signed.

@NorbertHeusser
Copy link
Contributor Author

Yes. This is the e-mail adress I used for the launchpad account as well.
Signed the contributor agreement now once again. Maybe something went wrong the first time as the launchpad Account was new as well.
Is there a way to see signing the agreement was successfull somewhere in the launchpad account ?
Did not get any kind of confirmation e-mail about signing the contributor agreement.

service_ptr = strchr(address, colon);
}
if (!service_ptr || *service_ptr == 0 || *(++service_ptr) == 0) {
service_ptr = "8080";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's special about 8080? Maybe document this somewhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a question for @freeekanayaka as I kept it fully compatible with the existing uvIpParse (which was used before). And he introduced this default with the first version of the file visible to me in the repo.

struct uv_connect_s connect; /* TCP connection request */
struct uv_write_s write; /* TCP handshake request */
int status; /* Returned to the request callback */
bool resolving; /* Indicate name resolving in progress */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aligning with the comments above would be a bit more aesthetically pleasing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the project has a .clang-format file in it's root folder I ran a clang-format on all files I touched. My expectation was all format will be fine after that.

Copy link
Contributor Author

@NorbertHeusser NorbertHeusser Aug 31, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting: Using the clang-format option AlignTrailingComments: true seems not to work on comment blocks, but just on single line comments //
Not sure, if it you are willing to use // for single line comments, which would solve this problem based on clang-format. But some people don't like using the c++ originated sigle line comment in C file (even knowing this is allowed since C99).

/* Initialize the handshake buffer. */
rv = uvTcpEncodeHandshake(t->id, t->address, &r->handshake);
if (rv != 0) {
assert(rv == RAFT_NOMEM);
ErrMsgOom(r->t->transport->errmsg);
ErrMsgOom(t->transport->errmsg);
goto err;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

handshake.base is freed after err, but never allocated. It looks safe right now, but it's not very pretty and can lead to an erroneous free in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initialise the pointer to NULL in the beginning of the function. As the raft_free (like the posix free is safe to be invoked on a NULL pointer) it safe from my perspective.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just the default port if none is specified. We should indeed document it, or perhaps return an error instead.

@MathieuBordere MathieuBordere merged commit 09a821f into canonical:master Aug 31, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants