Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

producer_test.go:54: should lazily connect - dial tcp 127.0.0.2:0->127.0.0.1:4150: bind: can't assign requested address #161

Closed
glycerine opened this issue Sep 20, 2015 · 8 comments

Comments

@glycerine
Copy link
Contributor

I pulled to latest go-nsq, and find that when running test.sh, the TestProducerConnection fails. I'm on OSX 10.10.2, go 1.5.1, nsqd v0.3.6-alpha (built w/go1.5.1).

=== RUN   TestProducerConnection
--- FAIL: TestProducerConnection (0.00s)
        producer_test.go:54: should lazily connect - dial tcp 127.0.0.2:0->127.0.0.1:4150: bind: can't assign requested address
@glycerine
Copy link
Contributor Author

Hi Matt @mreiferson, any update on this?

The test error message about "should lazily connect" doesn't make much sense to me given the context.

@mreiferson
Copy link
Member

Hi @glycerine - sorry, I have not circled back to this and it works properly on linux (the travis environment) so it's not incredibly high priority to fix.

@glycerine
Copy link
Contributor Author

I can fix this and I'm happy to do so, but reading the test doesnt convent its intent--so I don't know what the right test should actually be. If you could just take 20 seconds and explain what it is trying to test, I'll happily put in a pull request.

@mreiferson
Copy link
Member

Part of this test is attempting to assert that when asking go-nsq to bind to a specific interface that we can't then connect over a different interface.

@glycerine
Copy link
Contributor Author

From that description it sounds like operating system functionality. If the OS lets interfaces bleed one into another, I'm not sure there's anything application level can do.

@mreiferson
Copy link
Member

Well, specifically, it's the LocalAddr option that it's trying to test, which go-nsq passes over to the connection.

@glycerine
Copy link
Contributor Author

I dug into this.

Background: what 127.0.0.2 means: it is defined by RFC3330[1] as a loopback address, so it should always refer to the local host. However OSX, without additional sudo commands, only implements 127.0.0.1 by default[2] as a loopback address.

Background for the failing test: LocalAddr is defined in the net.Dialer struct as being the local network endpoint (ip address + port) that the connection should come from when connecting from a local address to a remote (IP) address. It makes sense that this local ip address must always be a valid address. However as OSX does not support 127.0.0.2 as a loopback address without additional superuser configuration, the test is failing on OSX machines.

Intent of the test: the LocalAddr option used should reflect the parameter passed to it. This can be evaluated by checking either the IP address passed in, or by checking the port passed in. It would be quite bizarre that any implementation would bind an IP address that was not specified, and I doubt that this is the subject of the tests scrutiny. As the port is chosen at random by requesting a 0 port, in the net.ResolveTCPAddr("tcp", laddr+":0") (see github.com/nsqio/go-nsq/producer_test.go:47), verifying that this is indeed the port in use is indeed quite a strong test that the Producer is actually using the LocalAddr that was specified; as it would be a low-odds difficult proposition for code to have randomly selected the same port as the one the operating system chooses randomly at runtime.

I conclude that it does not materially weaken the test to restrict it to verifying that the port found in the conn.conn.LocalAddr() matches the specified port. Moreover OSX and other linux servers may find these alternative loopback addresses such as 127.0.0.2 already in use or reserved for other purposes. Hence it is a fragile assumption that 127.0.0.2 is available, and the coverage of the test is maintained by verifying that 127.0.0.1 plus the randomly chosen port is in fact still being used.

In conclusion, converting the line github.com/nsqio/go-nsq/producer_test.go:45 to read
laddr := "127.0.0.1"
as suggested by #160
does, it turns out, appear to be the appropriate course of action.

[1] http://superuser.com/questions/458875/how-do-you-get-loopback-addresses-other-than-127-0-0-1-to-work-on-os-x

[2] https://books.google.com/books?id=_jQFCAAAQBAJ&pg=PA714&lpg=PA714&dq=osx+127.0.0.+loopback&source=bl&ots=WxRRUPzDTv&sig=1vnJGYZUD7Qf5LU0ur6hlyvkGo4&hl=en&sa=X&ved=0CC8Q6AEwA2oVChMIrqTH97-hyAIVyqSICh0cZg55#v=onepage&q=osx%20127.0.0.%20loopback&f=false

@mreiferson
Copy link
Member

@glycerine nice research! Thanks for taking the time to dig into this. Unfortunately, I can't re-open #160, so if you want to open a similar PR, we can land it.

💯

glycerine added a commit to glycerine/go-nsq that referenced this issue Oct 1, 2015
mreiferson added a commit that referenced this issue Oct 3, 2015
TestProducerConnection() now green on OSX. fixes #161.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants