When the cgo DNS resolver fails in a statically linked binary, it should fall back to the pure-Go DNS resolver, if I understand correctly. This fallback mechanism does not work, at least on Linux for x86_64 and ppc64le. The variable ok in lookupIP in go/net/lookup_unix.go seems to be always set to true, so the fallback is never called.

Here is example code that demonstrates the problem.

$ gccgo -v
Using built-in specs.
COLLECT_GCC=gccgo
COLLECT_LTO_WRAPPER=/usr/local/gccgo-216834/libexec/gcc/x86_64-unknown-linux-gnu/5.0.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../src/configure --enable-threads=posix --enable-shared --enable-__cxa_atexit --enable-languages=c,c++,go --enable-secureplt --enable-checking=yes --with-long-double-128 --enable-decimal-float --disable-bootstrap --disable-alsa --disable-multilib --prefix=/usr/local/gccgo-216834
Thread model: posix
gcc version 5.0.0 20141029 (experimental) (GCC)

$ cat lookup.go
package main

import (
	"fmt"
	"net"
)

func main() {
	addrs, err := net.LookupHost("gcc.gnu.org")
	if err != nil {
		fmt.Println(err)
	} else {
		for i := 0; i < len(addrs); i++ {
			fmt.Println(addrs[i])
		}
	}
}

$ gccgo lookup.go
$ ./a.out
209.132.180.131

$ gccgo -static lookup.go
/usr/local/gccgo-216834/lib/gcc/x86_64-unknown-linux-gnu/5.0.0/../../../../lib64/libgo.a(net.o): In function `net.cgoLookupPort':
/home/yohei/gccgo.216834/bld/x86_64-unknown-linux-gnu/libgo/../../../src/libgo/go/net/cgo_unix.go:83: warning: Using 'getaddrinfo' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
$ ./a.out
lookup gcc.gnu.org: Name or service not known
$ LD_LIBRARY_PATH=/lib/x86_64-linux-gnu ./a.out
209.132.180.131
Created attachment 33966 [details] Fix the DNS lookup problem for statically linked binaries I created a patch file that fixes this problem.
I tested this issue with the latest GC trunk again, and noticed that GC always links programs that contain DNS lookups dynamically, even if -static is specified.

$ go version
go version devel +ae495517bd72 Fri Nov 14 11:43:01 2014 +1100 linux/amd64
$ go build -ldflags '-extldflags "-static"' lookup.go
$ ./lookup
209.132.180.131
$ file lookup
lookup: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), not stripped
$ ldd lookup
	linux-vdso.so.1 => (0x00007fff81550000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f50b943d000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f50b9078000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f50b9677000)

Any suggestions? Should I report this issue to GC?
The gc linker does not use the external linker merely because you link against the net package. You need to also pass -linkmode external in ldflags.
Thank you for the correction. I need to use the following instructions to enable netgo with GC.

$ go version
go version devel +ae495517bd72 Fri Nov 14 11:43:01 2014 +1100 linux/amd64
$ go build -ldflags '-linkmode external -extldflags -static' -a -tags netgo lookup.go
$ ./lookup
209.132.180.131

Now, I have a question about how to do the same thing with GCCGO. I think the -a option for the go command is essential to enable netgo, but the -a option does not seem to work with the standard Go library of GCCGO.

$ go build -compiler gccgo -gccgoflags '-static' -a -tags netgo lookup.go
# command-line-arguments
/usr/local/gccgo-216834/lib/gcc/x86_64-unknown-linux-gnu/5.0.0/../../../../lib64/libgo.a(net.o): In function `net.cgoLookupPort':
/home/yohei/gccgo.216834/bld/x86_64-unknown-linux-gnu/libgo/../../../src/libgo/go/net/cgo_unix.go:83: warning: Using 'getaddrinfo' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
$ ./lookup
lookup gcc.gnu.org: Name or service not known
$ LD_LIBRARY_PATH=/lib/x86_64-linux-gnu ./lookup
209.132.180.131

I confirmed that the source code exists under $GOROOT/src, where $GOROOT is the value reported by go env GOROOT. I also put libgo/go from the GCC source tree into $(PREFIX)/src/pkg.
You are correct. The go command does not rebuild the standard library for gccgo. There is at present no reasonable way to use the netgo tag with gccgo.
I understand why some functions like getaddrinfo don't work with static linking unless the LD_LIBRARY_PATH is set to find libc.so, but then what should happen if it isn't?
That's really a question for the glibc maintainers. It's entirely a glibc issue. My understanding is that it's not actually libc.so that needs to be found. Glibc implements /etc/nsswitch.conf by loading a shared library for each entry listed there ("db", "files", etc.). That is a very flexible mechanism that allows nsswitch.conf to be extended on a system-specific basis. However, it only works if those shared libraries are available. It is those libraries (and libdl.so) that must be found at runtime. If the libraries can not be found, /etc/nsswitch.conf can not be implemented, and name lookups of various sorts fail by returning "not found."
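
To make the nsswitch.conf mechanism described above concrete, here is a minimal sketch in Go that lists the sources on the "hosts:" line and the shared library glibc would conventionally try to load for each one. The helper name hostsSources, the sample configuration, and the libnss_<name>.so.2 naming are assumptions for illustration; the exact library names depend on the glibc version, and some glibc releases build certain sources into libc itself.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// hostsSources is a hypothetical helper: it returns the source names on the
// "hosts:" line of nsswitch.conf-style text. Bracketed actions such as
// [NOTFOUND=return] are skipped. It assumes "hosts:" is followed by a space.
func hostsSources(conf string) []string {
	sc := bufio.NewScanner(strings.NewReader(conf))
	for sc.Scan() {
		line := sc.Text()
		// Drop trailing comments.
		if i := strings.IndexByte(line, '#'); i >= 0 {
			line = line[:i]
		}
		fields := strings.Fields(line)
		if len(fields) == 0 || fields[0] != "hosts:" {
			continue
		}
		var srcs []string
		inAction := false
		for _, f := range fields[1:] {
			if strings.HasPrefix(f, "[") {
				inAction = true
			}
			if inAction {
				if strings.HasSuffix(f, "]") {
					inAction = false
				}
				continue
			}
			srcs = append(srcs, f)
		}
		return srcs
	}
	return nil
}

func main() {
	// Sample text only; a real program would read /etc/nsswitch.conf.
	sample := "passwd: compat\nhosts: files [NOTFOUND=return] dns # example\n"
	for _, s := range hostsSources(sample) {
		// Assumed naming convention for glibc NSS modules.
		fmt.Printf("%s -> libnss_%s.so.2\n", s, s)
	}
}
```

If none of the listed libraries can be loaded at run time, a statically linked program has no way to carry out any of the configured lookups, which matches the "not found" failures seen in this report.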
If we can distinguish between the "not found" errors due to invalid host names and no nss, we can fall back to netgo only when nss is not available. I noticed that errno is set to 0 for invalid hostnames, and is set to ENOENT when nss is not available. However, this behavior is not documented, so it is probably implementation dependent. I think a practical solution to this issue is to enable the -a option of the go command for the standard library in GCCGO. I know Lynn is working on the go command for GCCGO. https://groups.google.com/forum/#!topic/gofrontend-dev/1Gm87ieI47Q What do you think?
My question was: what is supposed to happen on fallback? Sounds like some code gets rebuilt and used instead?
https://code.google.com/p/go/source/browse/src/net/lookup_unix.go#63

func lookupIP(host string) (addrs []IP, err error) {
	addrs, err, ok := cgoLookupIP(host)
	if !ok {
		addrs, err = goLookupIP(host)
	}
	return
}

This code shows how fallback works. If cgoLookupIP returns ok == false, then the pure-Go goLookupIP is called.

When the netgo tag is set, ok is always false.
https://code.google.com/p/go/source/browse/src/net/cgo_stub.go#5

When the netgo tag is not set, ok is always true.
https://code.google.com/p/go/source/browse/src/net/cgo_unix.go#147
https://code.google.com/p/go/source/browse/src/net/cgo_unix.go#84

If cgoLookupIP could also return ok == false when no nss is available, goLookupIP would be called. This was my first idea.
https://code.google.com/p/go/source/browse/src/net/cgo_unix.go#116

However, I realized that we cannot easily distinguish the "not found" errors. getaddrinfo returns "not found" both for invalid host names and for static binaries without nss. If cgoLookupIP returned ok == false even when an invalid host name is looked up with nss available, goLookupIP would look it up again, so the IP lookup would occur twice and both attempts would fail with a "not found" error. I don't think this is desired behavior.
What I was asking is: what does it mean to call the pure-Go goLookupIP?
"pure-Go goLookupIP" means that goLookupIP is written in Go as usual.
https://code.google.com/p/go/source/browse/src/net/dnsclient_unix.go#364

cgoLookupIP uses getaddrinfo via CGO unless the netgo tag is set.
https://code.google.com/p/go/source/browse/src/net/cgo_unix.go#147

Fallback means that goLookupIP is called when cgoLookupIP returns ok == false in this code.
https://code.google.com/p/go/source/browse/src/net/lookup_unix.go#63

func lookupIP(host string) (addrs []IP, err error) {
	addrs, err, ok := cgoLookupIP(host)
	if !ok {
		addrs, err = goLookupIP(host)
	}
	return
}

When code that uses LookupHost is compiled with "go build -ldflags '-linkmode external -extldflags -static' -a -tags netgo", the Go standard library including cgoLookupIP is rebuilt. In this case, cgoLookupIP always returns ok == false as defined here.
https://code.google.com/p/go/source/browse/src/net/cgo_stub.go#19

func cgoLookupIP(name string) (addrs []IP, err error, completed bool) {
	return nil, nil, false
}

This leads to calling goLookupIP from lookupIP. I called this mechanism "fallback".
Then my question is: why isn't this "fallback" code always built for GO and available to run instead of waiting until it hits this situation and then built and run? In the situation you are describing it sounds like it is related to whether or not there is a static libnss available which could be determined at GO build time.
I am not the original author, so my description may be inaccurate.

> why isn't this "fallback" code always built for GO and available
> to run instead of waiting until it hits this situation and then
> built and run?

The fallback code (goLookupIP) is always built, but it is called at run time only when the code is built with the netgo tag.

When the netgo tag is not set at build time, both cgoLookupIP and goLookupIP are built. However, only cgoLookupIP is called, and goLookupIP is never called at run time. So the lookup fails when the program is statically linked.

When the netgo tag is set at build time, cgoLookupIP is an empty stub and goLookupIP is built. So only goLookupIP does any work at run time.

lookupIP looks like it has a run time fallback mechanism, but it does not. The current code only selects cgoLookupIP or goLookupIP at build time, depending on the netgo tag setting.

If we could distinguish the "not found" errors of getaddrinfo as described earlier, lookupIP could have a run time fallback mechanism from cgoLookupIP to goLookupIP. However, it is impossible to distinguish the "not found" errors as far as I know, so I guess the current code depends on the netgo tag to select cgoLookupIP or goLookupIP at build time.

> In the situation you are describing it sounds like it is related to
> whether or not there is a static libnss available which could be determined
> at GO build time.

The existence of libnss at build time does not affect the behavior at run time. The netgo tag affects which lookup function is called at run time.
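
The build-time selection described above can be simulated in a single file. This is a minimal sketch, not the net package's code: cgoStub, cgoWithoutNSS, goResolver and the placeholder address 192.0.2.1 are all hypothetical stand-ins, and the real package chooses between cgo_stub.go and cgo_unix.go with the netgo build tag rather than by passing a function.

```go
package main

import (
	"errors"
	"fmt"
)

var errNotFound = errors.New("Name or service not known")

// cgoLookup mirrors the shape of cgoLookupIP: the last result says whether
// the cgo path handled the request at all.
type cgoLookup func(host string) (addrs []string, err error, completed bool)

// cgoStub stands in for the netgo build: it never completes the lookup.
func cgoStub(host string) ([]string, error, bool) {
	return nil, nil, false
}

// cgoWithoutNSS stands in for the non-netgo build running as a static binary
// without NSS libraries: getaddrinfo fails, but the lookup counts as completed.
func cgoWithoutNSS(host string) ([]string, error, bool) {
	return nil, errNotFound, true
}

// goResolver stands in for goLookupIP and returns a placeholder address.
func goResolver(host string) ([]string, error) {
	return []string{"192.0.2.1"}, nil
}

// lookupIP follows the same control flow as the function quoted in this thread.
func lookupIP(cgo cgoLookup, host string) ([]string, error) {
	addrs, err, ok := cgo(host)
	if !ok {
		addrs, err = goResolver(host)
	}
	return addrs, err
}

func main() {
	for _, c := range []struct {
		name string
		cgo  cgoLookup
	}{
		{"built with netgo", cgoStub},
		{"built without netgo, static, no NSS", cgoWithoutNSS},
	} {
		addrs, err := lookupIP(c.cgo, "example.test")
		fmt.Printf("%s: addrs=%v err=%v\n", c.name, addrs, err)
	}
}
```

In the second case the Go resolver is never consulted, because the cgo path reports the lookup as completed even though it failed.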
I think what Ian is saying is that the mechanism to rebuild packages in this way doesn't work with gccgo (and probably never should?). Now I'm finally understanding this. Originally with gc the net package is built with netgo off, but the netgo tag says to rebuild the GO standard library with the netgo tag set on and then build the program. Yohei's original fix used the code that was built with netgo off but allowed the go resolver to be called if the call to getaddrinfo failed. There are probably a small set of errnos that could be checked to determine if the go resolver should be called after getaddrinfo failed, but is that important? I would expect the errnos are consistent across platforms but don't know for sure. To me it seems like Yohei's fix is probably OK for gccgo with some additional checks for errno if needed, but would change existing behavior for gc and because of that should not change there?
Gccgo and GCC act the same on glibc systems: if you choose static linking, DNS lookups only work if the dynamic libraries are available. The only difference between Go and C here is that the Go library happens to have code that will work for some people some of the time. I don't want the library to automatically fall back to the Go code in all cases, because in some cases it will turn a possibly-explicable failure into a completely inexplicable failure, and in a few cases it will turn a correct failure into an incorrect success. It's OK with me to make the go tool's -a option work with gccgo. It will only work if people have the library sources available. If we do that, this issue will be fixed. I don't know how hard that would be--it might be easy. I don't know of another good way to fix this.
Can you clarify how using -a -tags netgo actually works? I know it requires that the source be available, but it must mean that it rebuilds the package for the current link only, throws it away after using it, and has to rebuild it next time? Couldn't some of your concerns about unexpected failures be resolved by providing better error information when there were failures, or by providing other ways to indicate that the go resolver shouldn't be called (like an environment variable)? It just seems that rebuilding the package every time is a heavy hammer to resolve this problem. I think if someone wants to have their GO program use libnss if it is present and the go resolver if not, that should be an option, and currently it is not.
The -a option to "go build" means to rebuild all packages rather than using the installed versions (see http://golang.org/cmd/go for documentation). The "-tags netgo" option means to build with the build tag netgo (see the build constraints section in http://golang.org/pkg/go/build/). So, yes, it rebuilds the packages for the current link only. This is more reasonable with the gc compiler than with gccgo, since the gc compiler is so fast. I'm OK in principle with coming up with some other approach to direct the Go library to use the Go DNS lookup rather than calling getaddrinfo. I don't think that can be the default. I don't think we want a program to unpredictably sometimes use one and sometimes use the other. I don't think an environment variable would work well, since Yohei presumably wants the statically linked binary to work this way by default. Unfortunately all I can think of would be adding another function to the net package directing lookups to use pure DNS; this is unfortunate because the net package already has the most complex and confusing API of all the standard Go packages.
(In reply to Ian Lance Taylor from comment #18)
> The -a option to "go build" means to rebuild all packages rather than using
> the installed versions (see http://golang.org/cmd/go for documentation).
> The "-tags netgo" option means to build with the build tag netgo (see the
> build constraints section in http://golang.org/pkg/go/build/). So, yes, it
> rebuilds the packages for the current link only. This is more reasonable
> with the gc compiler than with gccgo, since the gc compiler is so fast.
>

Most of the examples in the documentation show that the built packages are put into the same directories as the source. I assume that for an official release with a binary distribution, that is not the way it works. That's how it would have to work with gccgo. In that case everyone must share the same copy of the source, but then if build options are used that would cause packages to be rebuilt, they must go somewhere that is only used for the current build. And I don't understand what 'go install' would mean in that case. The 'go install' command documentation has very little information on where built packages are stored or if there are cases when 'go install' can't be used.

> I'm OK in principle with coming up with some other approach to direct the Go
> library to use the Go DNS lookup rather than calling getaddrinfo. I don't
> think that can be the default. I don't think we want a program to
> unpredictably sometimes use one and sometimes use the other. I don't think
> an environment variable would work well, since Yohei presumably wants the
> statically linked binary to work this way by default. Unfortunately all I
> can think of would be adding another function to the net package directing
> lookups to use pure DNS; this is unfortunate because the net package already
> has the most complex and confusing API of all the standard Go packages.

I think providing another function that called the pure GO resolver would be best. Then the GO programmer can decide how to handle it if the first call failed.
I noticed a Docker issue saying GC 1.4 does not rebuild the standard library with -a. https://github.com/docker/docker/issues/9449 I think the problem is now not limited to GCCGO.
I'm confused by the description of -a in the go1.4 documentation. I asked about this before and the answer was that each invocation of 'go build' would create a copy of the built package which was then used for the current build but then thrown away. But that must not be the way it works?
I'm not sure why you say that it must not be the way it works. It is the way it works. The recent change to Go 1.4 is that the -a option does not apply to the standard library. I don't know whether that is a good idea or not.
If I look at this documentation: http://tip.golang.org/doc/go1.4#gocmd It says this: The behavior of the go build subcommand's -a flag has been changed for non-development installations. For installations running a released distribution, the -a flag will no longer rebuild the standard library and commands, to avoid overwriting the installation's files. When I read this it sounds like the previous behavior with the -a option was to rebuild the packages and put the newly built packages into the installed directories, including the standard library. If everyone who used 'go build' with -a generated their own copy of the built packages and then threw them away, how would the installation's files ever get overwritten?
They would not have been overwritten, unless you used "go install -a". That line in the doc may be misleading.
What is the recommended way to use the net package in a statically linked binary? My impression is that a statically linked binary should also call dlopen() (and thus we should export LD_LIBRARY_PATH), if the corresponding dynamically linked binary does so to resolve DNS. https://sourceware.org/glibc/wiki/FAQ#Even_statically_linked_programs_need_some_shared_libraries_which_is_not_acceptable_for_me.__What_can_I_do.3F Or, can we expect that netgo can be enabled with 'go build -a' again in Go 1.5? https://github.com/golang/go/issues/9369
Tatsushi: are you asking about gccgo, or about gc?
(In reply to Ian Lance Taylor from comment #26) > Tatsushi: are you asking about gccgo, or about gc? I'm asking about gccgo.
Currently there is no reasonable way to use the Go DNS resolver when using gccgo. Any program that uses the net package will call glibc for DNS resolution, meaning that you are limited to what glibc will do, which, as you say, means calling dlopen. go build -a does not work with gccgo. The problem is that gccgo uses its own copy of the Go library sources and they can not be built with go build -a. It would be nice to fix this but it is not at all a priority, since most people will use the installed libgo.
Yohei noted in comment 20 that this is also broken with gc in 1.4 when using static linking. That was a while ago -- is that no longer a problem?
The problem mentioned in comment #20 has nothing to do with gccgo. To get around that problem, use the -installsuffix option. See http://golang.org/issue/9344 . Note that the docker issue mentioned in comment #20 has been closed.
Here are two suggestions to solve this issue without having to use the -a and -tags netgo options to rebuild packages at build time. Since this is a common problem, it seems best to provide a way to have the packages built and provided with the gccgo build instead of requiring users to know how to rebuild the net package with netgo.

1) Since libgo is provided for both static (libgo.a) and dynamic (libgo.so.N) linking, and the problem only happens with static linking, we could build libgo.a with netgo enabled, so that would be the expected/default behavior with static linking with gccgo.

2) As part of the gccgo and libgo build, build just the net package with netgo enabled and install it somewhere that is easily found, using the normal directory conventions for GO. Then if someone builds a program for static linking and wants the GO DNS resolver, they could link in that package before they link in libgo.a.
I have a prototype working for #2. I am assuming #1 would not be accepted. This fix consists of building a library called libnetgo.a which consists of the net files that would be built if the netgo tag was used. This new library was installed into the same directory as libgo.a. Once this library has been built and installed in the correct location, I was able to get this to work by explicitly linking in this lib: go build -gccgoflags '-static -lnetgo' lookup.go I will attach a patch after some more testing.
Created attachment 35195 [details] "Fallback" netgo solution for gccgo This patch updates the libgo Makefile to build and install the library libnetgo.a into the same install directory as libgo.a. When libnetgo.a is available then the user can link it into their statically linked program and get the same result as when using the netgo fallback mechanism that is available with golang as mentioned in this bugzilla. Once libnetgo.a is built and installed then a statically linked program can link this into their program as follows: go build -gccgoflags '-static -lnetgo' lookup.go
Created a codereview: https://codereview.appspot.com/217620043
Author: ian
Date: Tue Apr 7 18:09:28 2015
New Revision: 221906

URL: https://gcc.gnu.org/viewcvs?rev=221906&root=gcc&view=rev
Log:
PR go/63731
libgo: Build and install libnetgo.a

libnetgo.a provides the net package built with the netgo tag enabled. This provides the netgo fallback solution for gccgo. This lib must be explicitly linked in using the -gccgoflags, so is not included by default.

Modified:
    trunk/libgo/Makefile.am
    trunk/libgo/Makefile.in
Lynn added a new facility. Some notes on docs:

As far as documentation, I tried to find some documentation on build tags in general and netgo specifically because it seems like this should be documented there, but did not find much. Here are some ideas:
- add something to the 'go help build' output about the use of the netgo tag in general and how it would be used to work around the static linking warning/problem against libnss and the workarounds for gc and gccgo
- if there is something somewhere else that describes the netgo tag and when to use it, then add something about how to achieve the same effect in gccgo
- seems like it would be good to have documentation describing the differences in using gccgo vs. gc and this would be included in that, however that is a bigger work item and would require more thought on what to include in such a document.
Created attachment 35260 [details] libgo/go/go/build/doc.go documentation update Adding comments about the use of the netgo tag and the equivalent method for use with gccgo.
This seems to be fixed, and the core problem has become less important now that the net package prefers to use the Go DNS library when possible.
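
For readers on a newer toolchain, a program can also ask for the Go resolver explicitly instead of relying on build tags. The sketch below assumes Go 1.9 or later, where net.Resolver has a PreferGo field; newGoResolver is a hypothetical helper name, and this has not been checked against gccgo in this thread. Setting the environment variable GODEBUG=netdns=go is another way to request the Go resolver for an unmodified program.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// newGoResolver is a hypothetical helper that returns a resolver asking the
// net package to use its built-in Go DNS client where that is supported.
func newGoResolver() *net.Resolver {
	return &net.Resolver{PreferGo: true}
}

func main() {
	// Bound the lookup so the example cannot hang without network access.
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	addrs, err := newGoResolver().LookupHost(ctx, "gcc.gnu.org")
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, a := range addrs {
		fmt.Println(a)
	}
}
```

This is the same lookup as in the original lookup.go, with the choice of resolver made in the program rather than at library build time.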