Feature hostnames #309

NorbertHeusser · 2022-09-08T13:59:35Z

This is the second PR to finalize hostname support for the raft library. This will close #298

NorbertHeusser · 2022-09-08T19:51:40Z

OK. Fixed all build issues now. And signed all commits.

Re-signed the contributor agreement the third time now :-(

Signed-off-by: Norbert Heusser <[email protected]>

test case parameter Signed-off-by: Norbert Heusser <[email protected]>

number of test case with listen failures will increase Signed-off-by: Norbert Heusser <[email protected]>

Signed-off-by: Norbert Heusser <[email protected]>

https://bugzilla.redhat.com/show_bug.cgi?id=496300 Signed-off-by: Norbert Heusser <[email protected]>

codecov-commenter · 2022-09-09T15:38:27Z

Codecov Report

Merging #309 (0a3a323) into master (09a821f) will increase coverage by 0.06%.
The diff coverage is 93.85%.

@@            Coverage Diff             @@
##           master     #309      +/-   ##
==========================================
+ Coverage   83.54%   83.60%   +0.06%     
==========================================
  Files          49       49              
  Lines        9268     9339      +71     
  Branches     2477     2499      +22     
==========================================
+ Hits         7743     7808      +65     
- Misses        969      970       +1     
- Partials      556      561       +5

Impacted Files	Coverage Δ
src/uv_tcp_connect.c	`89.94% <90.90%> (+1.18%)`	⬆️
src/uv_tcp_listen.c	`91.12% <93.54%> (+1.18%)`	⬆️
src/uv_ip.c	`78.37% <100.00%> (+0.32%)`	⬆️
src/uv_tcp.c	`94.52% <100.00%> (+1.18%)`	⬆️
src/heap.c	`96.15% <0.00%> (-3.85%)`	⬇️
src/uv_send.c	`88.25% <0.00%> (-1.52%)`	⬇️
src/uv_append.c	`90.78% <0.00%> (-0.02%)`	⬇️
src/uv_writer.c	`77.44% <0.00%> (ø)`
src/uv_segment.c	`88.36% <0.00%> (ø)`
src/uv.c	`84.64% <0.00%> (+0.21%)`	⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

Signed-off-by: Norbert Heusser <[email protected]>

NorbertHeusser · 2022-09-12T09:06:49Z

Did an additional force push with just a whitespace formatting correction to force a new cla-check.

MathieuBordere · 2022-09-12T09:39:11Z

OK. Fixed all build issues now. And signed all commits.

Re-signed the contributor agreement the third time now :-(

The cla-check finally recognized you ... hooray for that :-) Will review soon.

NorbertHeusser · 2022-09-12T13:25:04Z

I am happy the cla-check works now for :-)

Perhaps a few comments on the design. As the getaddrinfo may result into multiple IP addresses I need to deal with that in the different parts of the solution. This holds for the synchronous system call as well as for the asynchronous lbuv based invocation. The solution design I have chosen are:

For the outgoing connection I need to try one IP after the other until I find a first one working. This will cover the case one of the IP addresses is not open for listening. Once I am able to connect to the port I expect the handshake to work as well. Another solution would have been to try the next IP additionally in case the handshake is failing. But I did not consider this to be a common error scenario. As I need to close the tcp handle after each attempt and reopen it I added an additional bool flag retry to react on it in the abort (otherwise we could try to close the handle twice). It would be possible to combine the two flags resolving and retry in a single one as never both will be set at the same time, but I find the current solution better maintainable.
For the listen socket I have to bind all the resolved IP addresses. But to my experience I have seen a bind address resolving to multiple IP addresses very rarely. For this reason I have chosen to keep the current dedicated listener and rename it to default_listener as this single listener will be used in most cases. As I know after the name resolve wether I need multiple listeners I will create multiple listener only in case they are needed and in this rare case don't use the default_listener. This ways it's easier to close everything in a clear way as well.
Due to an old glibc behavior to reply with duplicate entries (which is fixed in some linux distros like SLES) I added a small workaround for this as well. Otherwise I would end up in an address already in use error.
For testing purpose I implement a kind of getaadrinfo/freeaddrinfo mock. By providing a local implementation of these in function in the testing library. As they will be bound by the linker first. The behavior of this helper file is described in the header. To avoid the need to provide all getaddrinfo mock responses for all test cases and to make sure everything works with the real system getaddrinfo the mock function will forward the calls to the real implementation, if it is not enabled. And the mock functionality is used in the tests cases which want to test some special cases like multiple IPs only.

src/uv_tcp_connect.c

src/uv_ip.c

freeekanayaka · 2022-09-13T08:09:35Z

Looks overall good to me at first sight. Before I dig deeper: how much important is it to support multiple listeners? Could we instead return an error if the hostname resolves to multiple IPs? Or choose the first IP.

I know that's sub-optimal, just wondering if it is worth the cost of additional complexity or if it's something that will probably never be used in practice. For example is it relevant for your own use case?

Since you already did all the work it's a pity to simplify it, but if there's no immediate need for multiple listeners we could keep this code around in a branch and only introduce this support when there's a request for it.

I guess same reasoning for outgoing connections.

test/lib/addrinfo.c

test/lib/addrinfo.h

NorbertHeusser · 2022-09-13T12:11:58Z

Regarding the listen sockets dealing with multiple IP addresses immediately comes into play, if we would introduce handling of IPv6 addresses. As systems with IPv6 only are still extremely rare today most systems have dual mode using IPv4 and IPV6 at the same time. So with activation of IPv6 additionally to IPv4 most hostnames will resolve one (or more) IPv6 addresses and an IPv4 address.

For outgoing TCP connections the most common use case I have seen in the past is the configuration of multiple network interfaces and IP addresses to achieve fail safety. And this is beside the cluster scale-out concept one of the most common use cases to install a cluster as a HA solution. This might be less common over the last year due to the cheaper availability of network components supporting network bonding.

Right now I need the hostname feature to be able to run our cluster in a docker swarm environment. And this does not support static IPs. But the customer plans to use multiple different virtual networks over his docker swarm. So I donÄt know right now, how fast we need it. But sooner or later we will have a customer asking for it.

And in general the I don't like implementing half of a solution (was already thinking about adding IPv6 support straight away as well). Hostnames resolving to multiple IPs is nothing new, but has been the case since the beginning of the IP protocol definition. And to my opinion dealing with this should be default implementation in all IP network base applications.

freeekanayaka · 2022-09-13T12:32:51Z

Fair enough, thanks for explaining. I'll give a deeper look at the PR.

NorbertHeusser · 2022-09-14T09:29:41Z

Fixed the requested typos/errors in comments. And applied a clang-format (except for struct comments) to align the formatting.

freeekanayaka · 2022-09-14T10:23:57Z

Looks good overall.

The main bit I'm not entirely convinced about is the differentiation between default_listener and listeners. In my mind it'd be simpler to actually have just listeners, because in any case you have to handle all the intricacies of the multiple-listeners case no matter what. It's true that listeners will have only 1 item most of the times, but that's fine, the logic will stay the same.

NorbertHeusser · 2022-09-14T12:59:55Z

The main bit I'm not entirely convinced about is the differentiation between default_listener and listeners. In my mind it'd be simpler to actually have just listeners, because in any case you have to handle all the intricacies of the multiple-listeners case no matter what. It's true that listeners will have only 1 item most of the times, but that's fine, the logic will stay the same.

What you describe was the first attempt I tried to implement, but I ran into problems with the close/destruction order of some corner cases, when I tried to do this. Will try to describe what I observed. The root cause, if I manage the listeners totally dynamically the listeners array (even for a single listener) will be created after the successful response from the name resolving. In case the name resolving does not work we will end up without any listener tcp handle. But in this case we don't have any handle to close in the UvTcpListenClose and therefor don't trigger the uvTcpListenCloseCbListener. And therefore the uv_tcp_listen will not trigger the UvTcpMaybeFireCloseCb. This all work fine as long as there are outgoing connection, which where connected successfully. In this case the outgoing connections will trigger the UvTcpMaybeFireCloseCb and all structures will get removed. But without any outgoing connections the structures will not be freed. I tried to solve this by direct invocation of UvTcpMaybeFireCloseCb from the UvTcpListenClose in case the listener array is not created. But this will fail, if there are outgoing connections, because they expect the whole tcp object to exists until their close_cb functions are finished.

More generally description:
So there is a destructions dependency between the tcp_listen and tcp_connect part of the code, which work fine right now as both code parts will always have at least have one handle to close. So the code flow will allways close all handles. All close-CB will be invoced and the last one will finally trigger the destruction of the tcp object grouping both.
This clear order of close/destruction no longer works , if one source code does not have a handle to close any more.

Any suggestion welcome how to solve this problem. So how do I trigger the UvTcpMaybeFireCloseCb from the tcp_listen part of the code without knowing, if there are outgoing connections or not.

freeekanayaka · 2022-09-14T13:31:11Z

I see, thanks. I'll have to look more closely myself as well then, perhaps there's a way out.

freeekanayaka · 2022-09-15T10:41:13Z

Ok, after reading the code I understand the situation better. I believe we should handle this ordering issue with the same pattern that we use throughout the code: queues (which is what libuv itself does, and basically most asynchronous programs, in way form or another).

My suggestion would be

Replace the struct uv_tcp_s listener; field of struct UvTcp with a queue listeners queue object. This queue will contain all active listeners (see the next point).
Add a new struct uvTcpListener to uv_tcp_listen.c, it should look roughly like this:

/* Hold information about a single listening socket. */
struct uvTcpListener
{
    struct UvTcp *t;                 /* Transport implementation */
    struct uv_tcp_s tcp;             /* Listening TCP socket handle */
    queue queue;                     /* Current queue (either listeners or aborting) */
};

Modify UvTcpListen to create an uvTcpListener object for each required address, initialize its tcp object and call uv_tcp_bind/uv_listen on it, and finally insert it in the listeners queue. This is basically a slight modification of what you do in UvTcpListenOnMultipleIP
Modify UvTcpListenClose to start closing all active listeners, something like:

void UvTcpListenClose(struct UvTcp *t)
{
    queue *head;
    assert(t->closing);
    assert(!QUEUE_IS_EMPTY(&t->listeners));

    while (!QUEUE_IS_EMPTY(&t->accepting)) {
        struct uvTcpIncoming *incoming;
        head = QUEUE_HEAD(&t->accepting);
        incoming = QUEUE_DATA(head, struct uvTcpIncoming, queue);
        uvTcpIncomingAbort(incoming);
    }

    while (!QUEUE_IS_EMPTY(&t->listeners)) {
        struct uvTcpListener *listener;
        head = QUEUE_HEAD(&t->listener);
        listener = QUEUE_DATA(head, struct uvTcpListener, queue);
        uvTcpListenerAbort(listener);
    }
}

Implement uvTcpListenerAbort with something like:

/* The listening TCP handle has been closed, release all memory
 * associated with the listener object. */
static void uvTcpListenerCloseCb(struct uv_handle_s *handle)
{
    struct uvTcpListener *listener = handle->data;
    struct UvTcp *t = listener->t;
    QUEUE_REMOVE(&listener->queue);
    RaftHeapFree(listener);
    UvTcpMaybeFireCloseCb(t);
}

/* Stop listening for incoming connection on the given listener. */
static void uvTcpListenerAbort(struct uvTcpListener *listener)
{
    struct UvTcp *t = listener->t;
    /* After uv_close() returns we are guaranteed that no more listen callback
     * will be called. */
    QUEUE_REMOVE(&listener->queue);
    QUEUE_PUSH(&t->aborting, &listener->queue);
    uv_close((struct uv_handle_s *)&listener->tcp, uvTcpListenerCloseCb);
}

Modify UvTcpMaybeFireCloseCb so it checks only the aborting queue and not t->listener.data (which does not exist anymore):

void UvTcpMaybeFireCloseCb(struct UvTcp *t)
{
    if (!t->closing) {
        return;
    }

    assert(QUEUE_IS_EMPTY(&t->listeners));
    assert(QUEUE_IS_EMPTY(&t->accepting));
    assert(QUEUE_IS_EMPTY(&t->connecting));

    if (!QUEUE_IS_EMPTY(&t->aborting)) {
        return;
    }

    if (t->close_cb != NULL) {
        t->close_cb(t->transport);
    }
}

I believe that should work neatly, unless I'm missing something.

One important note to add is that in the case you mentioned, where the name resolving does not work for the listening address, then the UvTcpListen function should just return an error and the closing sequence shouldn't even be triggered (essentially this error will bubble up and raft_start() will return it, before the event loop is even started).

I'm even not entirely sure why you raised the issue of the name resolving not working, because as I said in that case things are simple, it's just error propagation before the loop is even started, so no actual closing sequence is involved (and certainly there won't be any outgoing connections attempts ongoing).

Hope it all makes sense.

NorbertHeusser · 2022-09-19T10:29:27Z

After reading your long comment about adding an additional queue I was not really sure, if this is really what we need.
The problem I was facing was not about the UvTcpMaybeFireCloseCb being invoked to ealry, but instead nto being invoked at all. Especially your comments about everything around listen is synchronous and should not be a problem let me get back one more time and check, if the problems I was facing were about the unit tests only.

So I removed the default listener from the code, which made the code around the listener much more straight forward. To avoid the UvTcpMaybeFireCloseCb not being invoked at all I now invoke it directly from UvTcpListenClose in case there is no listener.
I got it all working by replacing a assert in the UvTcpMaybeFireCloseCb with an if check (as the UvTcpMaybeFireCloseCb maybe invoked one more time now). And remove one assertion from the unit test (as the tcp close cb maybe invoked in some unit test now directly from the close).

Please take a look, if the PR maybe accepted this way.

test/integration/test_uv_tcp_listen.c

src/uv_tcp_listen.c

freeekanayaka · 2022-09-20T09:12:25Z

This approach looks good to me, I've left a few minor comments, I'll give it another pass later.

freeekanayaka

Another round of comments.

src/uv_tcp.c

src/uv_tcp.h

src/uv_tcp_listen.c

added assertions back into UvTcpMaybeFireCloseCb

unsigned

freeekanayaka

I've added just one small nit.

Looks good to me, thanks for your work and for bearing with review feedback :)

src/uv_tcp.c

freeekanayaka · 2022-09-21T11:32:20Z

@stgraber @MathieuBordere this can be merged as far as I'm concerned.

stgraber · 2022-09-21T13:49:16Z

@MathieuBordere LGTM, feel free to merge if happy with it

MathieuBordere · 2022-09-21T16:11:34Z

Thanks a lot for your contribution!

NorbertHeusser added 7 commits September 9, 2022 15:32

Add helper addrinfo.c/.h to inject getaddrinfo responses into tests

9409f01

Signed-off-by: Norbert Heusser <[email protected]>

Add support for hostnames in outgoing tcp connections

1dc0308

Signed-off-by: Norbert Heusser <[email protected]>

Refactoring: Replace explicit raft_uv_tcp_set_bind_address unit test by

1f6d889

test case parameter Signed-off-by: Norbert Heusser <[email protected]>

Refactoring: Moved f->transport.list from setup into own macro as the

37db22a

number of test case with listen failures will increase Signed-off-by: Norbert Heusser <[email protected]>

Add support for hostnames in tcp listen sockets

86df2c1

Signed-off-by: Norbert Heusser <[email protected]>

Fixed memory leak and clang error in addrinfo inject helper

13fdeaf

Signed-off-by: Norbert Heusser <[email protected]>

Fix handling of duplicate IP entries of getaddrinfo in glibc due to

210fa15

https://bugzilla.redhat.com/show_bug.cgi?id=496300 Signed-off-by: Norbert Heusser <[email protected]>

NorbertHeusser force-pushed the feature-hostnames branch from beb4c9c to 1d1767e Compare September 9, 2022 15:34

NorbertHeusser force-pushed the feature-hostnames branch from 1d1767e to 07039e2 Compare September 10, 2022 08:10

Fixed ordering of libraries in linkage of executables (18.04 build fix)

0c370f0

Signed-off-by: Norbert Heusser <[email protected]>

NorbertHeusser force-pushed the feature-hostnames branch from 07039e2 to 0c370f0 Compare September 12, 2022 09:05

freeekanayaka reviewed Sep 13, 2022

View reviewed changes

src/uv_tcp_connect.c Outdated Show resolved Hide resolved

src/uv_tcp_connect.c Outdated Show resolved Hide resolved

src/uv_ip.c Outdated Show resolved Hide resolved

MathieuBordere reviewed Sep 13, 2022

View reviewed changes

test/lib/addrinfo.c Show resolved Hide resolved

test/lib/addrinfo.h Show resolved Hide resolved

Applied clang-format and fixed typos in comments

641de84

NorbertHeusser force-pushed the feature-hostnames branch from 8d3b93b to 641de84 Compare September 14, 2022 09:02

Removed default_listener

891d34a

freeekanayaka reviewed Sep 20, 2022

View reviewed changes

Added requested changes from review.

8c1dcd1

freeekanayaka reviewed Sep 20, 2022

View reviewed changes

src/uv_tcp.c Show resolved Hide resolved

src/uv_tcp.c Outdated Show resolved Hide resolved

src/uv_tcp.h Outdated Show resolved Hide resolved

src/uv_tcp_listen.c Outdated Show resolved Hide resolved

NorbertHeusser added 2 commits September 21, 2022 06:48

Moved UvTcpMaybeFireCloseCb from UvTcpListenClose to uvTcpClose and

9b440dc

added assertions back into UvTcpMaybeFireCloseCb

Renamed num_listeners to n_listeners in struct UvTcp and changed type to

0a3a323

unsigned

freeekanayaka approved these changes Sep 21, 2022

View reviewed changes

src/uv_tcp.c Outdated Show resolved Hide resolved

Made comparsion of listeners to NULL explicit in uv_tcp.c

528b00d

MathieuBordere merged commit 5b2d180 into canonical:master Sep 21, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature hostnames #309

Feature hostnames #309

NorbertHeusser commented Sep 8, 2022

NorbertHeusser commented Sep 8, 2022 •

edited

Loading

codecov-commenter commented Sep 9, 2022 •

edited

Loading

NorbertHeusser commented Sep 12, 2022

MathieuBordere commented Sep 12, 2022

NorbertHeusser commented Sep 12, 2022 •

edited

Loading

freeekanayaka commented Sep 13, 2022

NorbertHeusser commented Sep 13, 2022

freeekanayaka commented Sep 13, 2022

NorbertHeusser commented Sep 14, 2022

freeekanayaka commented Sep 14, 2022

NorbertHeusser commented Sep 14, 2022 •

edited

Loading

freeekanayaka commented Sep 14, 2022

freeekanayaka commented Sep 15, 2022

NorbertHeusser commented Sep 19, 2022

freeekanayaka commented Sep 20, 2022

freeekanayaka left a comment

freeekanayaka left a comment

freeekanayaka commented Sep 21, 2022

stgraber commented Sep 21, 2022

MathieuBordere commented Sep 21, 2022

Feature hostnames #309

Feature hostnames #309

Conversation

NorbertHeusser commented Sep 8, 2022

NorbertHeusser commented Sep 8, 2022 • edited Loading

codecov-commenter commented Sep 9, 2022 • edited Loading

Codecov Report

NorbertHeusser commented Sep 12, 2022

MathieuBordere commented Sep 12, 2022

NorbertHeusser commented Sep 12, 2022 • edited Loading

freeekanayaka commented Sep 13, 2022

NorbertHeusser commented Sep 13, 2022

freeekanayaka commented Sep 13, 2022

NorbertHeusser commented Sep 14, 2022

freeekanayaka commented Sep 14, 2022

NorbertHeusser commented Sep 14, 2022 • edited Loading

freeekanayaka commented Sep 14, 2022

freeekanayaka commented Sep 15, 2022

NorbertHeusser commented Sep 19, 2022

freeekanayaka commented Sep 20, 2022

freeekanayaka left a comment

Choose a reason for hiding this comment

freeekanayaka left a comment

Choose a reason for hiding this comment

freeekanayaka commented Sep 21, 2022

stgraber commented Sep 21, 2022

MathieuBordere commented Sep 21, 2022

NorbertHeusser commented Sep 8, 2022 •

edited

Loading

codecov-commenter commented Sep 9, 2022 •

edited

Loading

NorbertHeusser commented Sep 12, 2022 •

edited

Loading

NorbertHeusser commented Sep 14, 2022 •

edited

Loading