net/http: Client.Do blocks on DNS AAAA record until timeout even after A record succeeds #57697

Closed
islinwb opened this issue Jan 9, 2023 · 11 comments
Labels
NeedsFix The path to resolution is known, but the work has not been done.

@islinwb

islinwb commented Jan 9, 2023

What version of Go are you using (go version)?

$ go version
go version go1.18.7 linux/amd64

Does this issue reproduce with the latest release?

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/root/.cache/go-build"
GOENV="/root/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/root/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/root/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.18.7"
GCCGO="gccgo"
GOAMD64="v1"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/root/gop/test/go.mod"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build2745863489=/tmp/go-build -gno-record-gcc-switches"

What did you do?

I use func (c *Client) Do(req *Request) to make an HTTP GET request with the pure Go DNS resolver.

/etc/resolv.conf:

# Generated by NetworkManager
search openstacklocal novalocal
nameserver A.A.A.A 
nameserver B.B.B.B
nameserver C.C.C.C

The A record query succeeds against nameserver A.A.A.A on the first try.
However, the AAAA record query is sent to all 3 nameservers with 2 attempts each, and it keeps failing until the timeout.

I checked the source code and found that it always queries both A and AAAA records.

// LookupIPAddr looks up host using the local resolver.
// It returns a slice of that host's IPv4 and IPv6 addresses.
func (r *Resolver) LookupIPAddr(ctx context.Context, host string) ([]IPAddr, error) {
	return r.lookupIPAddr(ctx, "ip", host)
}

func (r *Resolver) goLookupIPCNAMEOrder(ctx context.Context, network, name string, order hostLookupOrder) (addrs []IPAddr, cname dnsmessage.Name, err error) {
//...
	qtypes := []dnsmessage.Type{dnsmessage.TypeA, dnsmessage.TypeAAAA}
	switch ipVersion(network) {
	case '4':
		qtypes = []dnsmessage.Type{dnsmessage.TypeA}
	case '6':
		qtypes = []dnsmessage.Type{dnsmessage.TypeAAAA}
	}
// ...
}

Is this expected?

What did you expect to see?

It returns after getting the A record successfully (or is there a way to configure this?).

What did you see instead?

It keeps trying to query the AAAA record until the timeout.

@islinwb islinwb changed the title from "net.http Do() query DNS A recrod succeeds but try to query AAAA record until timeout" to "net/http Do() query DNS A recrod succeeds but try to query AAAA record until timeout" on Jan 9, 2023
@bcmills
Contributor

bcmills commented Jan 9, 2023

Does the call to Do itself succeed?

(Is it just the AAAA DNS request that is blocked, or the entire call to Do?)

(CC @neild)

@bcmills bcmills added this to the Backlog milestone Jan 9, 2023
@bcmills bcmills added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Jan 9, 2023
@bcmills bcmills changed the title from "net/http Do() query DNS A recrod succeeds but try to query AAAA record until timeout" to "net/http: Client.Do queries DNS AAAA record until timeout even though A record succeeds" on Jan 9, 2023
@islinwb
Author

islinwb commented Jan 10, 2023

@bcmills The call to Do itself blocks. It keeps trying to query the AAAA record until the timeout.
More detail on my case:

  1. Nameserver A.A.A.A returns the A record successfully but refuses the AAAA record query.
  2. Nameservers B.B.B.B and C.C.C.C are not accessible.
  3. There is no AAAA record for the domain.

It takes 20+ seconds for Do to finish (2 attempts * 2 nameservers * 5 seconds).
In the end, Do returns the expected GET result, but it takes far too long.

@bcmills bcmills added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. and removed WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. labels Jan 10, 2023
@bcmills bcmills changed the title from "net/http: Client.Do queries DNS AAAA record until timeout even though A record succeeds" to "net/http: Client.Do blocks on DNS AAAA record until timeout even after A record succeeds" on Jan 10, 2023
@bcmills
Contributor

bcmills commented Jan 10, 2023

In the current implementation, Dialer.DialContext first resolves all addresses for the given host, then partitions them by IPv4 / IPv6, then dials those in parallel per the configured FallbackDelay:
https://cs.opensource.google/go/go/+/master:src/net/dial.go;l=422-438;drc=8232a09e3ed7d315a90ac059ee542ecaf0f6b4c2

In order to use the A record without waiting for the AAAA records to resolve, we would have to instead dial each address as it is returned. That would require a very different internal concurrency model from what is implemented today.

@jfesler

jfesler commented Jan 10, 2023

That would also push a lot of traffic towards IPv4 (possibly NAT'd) instead of using IPv6.

Any DNS server giving zero response to the AAAA lookup is broken. Fix it.

The more sophisticated models of Happy Eyeballs will wait at least a /little/ on the slower response; not necessarily giving up but trying to still ensure the currently favored protocol is used first.

Happy Eyeballs: Success with Dual-Stack Hosts [RFC 6555, 2012]
Happy Eyeballs Version 2: Better Connectivity Using Concurrency [RFC 8305, 2017]

@leosocy

leosocy commented May 18, 2023

> In the current implementation, Dialer.DialContext first resolves all addresses for the given host, then partitions them by IPv4 / IPv6, then dials those in parallel per the configured FallbackDelay: https://cs.opensource.google/go/go/+/master:src/net/dial.go;l=422-438;drc=8232a09e3ed7d315a90ac059ee542ecaf0f6b4c2
>
> In order to use the A record without waiting for the AAAA records to resolve, we would have to instead dial each address as it is returned. That would require a very different internal concurrency model from what is implemented today.

But according to the implementation of the partition method, the primaries after partitioning depend on whether the first record in the addrs slice is IPv4 or IPv6. If the first record is IPv4, the primaries are the IPv4 addresses; otherwise, the primaries are the IPv6 addresses. I don't know whether this is expected.

@mateusz834
Member

From #64783, this seems to happen when using non-recursive DNS resolvers in resolv.conf.

@gopherbot
Contributor

Change https://go.dev/cl/550435 mentions this issue: net: Prevent unintended retries upon receiving an empty answer response from the DNS server.

@gopherbot
Contributor

Change https://go.dev/cl/565295 mentions this issue: net: prevent unintended retries upon receiving an empty answer response from the DNS server.

@gopherbot
Contributor

Change https://go.dev/cl/565296 mentions this issue: net: prevent unintended retries upon receiving an empty answer response from the DNS server.

@dmitshur dmitshur added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Mar 1, 2024
@dmitshur dmitshur modified the milestones: Backlog, Go1.23 Mar 1, 2024
@kkHAIKE
Contributor

kkHAIKE commented Mar 1, 2024

@gopherbot please consider this for backport to 1.22

@gopherbot
Contributor

Backport issue(s) opened: #66050 (for 1.22).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://go.dev/wiki/MinorReleases.
