Created attachment 30206 [details] config.log.gz As mentioned on gcc-help (http://gcc.gnu.org/ml/gcc-help/2013-05/msg00120.html), bootstrap of gcc trunk (and 4.8 as well) has been failing for me on x86_64-darwin. System is running OS X 10.8.3, with newest XCode 4.6.1 and associated tools. I've run into the problem on two different machines. $ ../gcc-trunk/configure --enable-languages=c,c++,objc,obj-c++,fortran,lto --disable-checking --prefix=/usr/local/gcc-trunk $ make bootstrap-lean ... /Users/dara/Downloads/objdir/./prev-gcc/xg++ -B/Users/dara/Downloads/objdir/./prev-gcc/ -B/usr/local/gcc-trunk/x86_64-apple-darwin12.3.0/bin/ -nostdinc++ -B/Users/dara/Downloads/objdir/prev-x86_64-apple-darwin12.3.0/libstdc++-v3/src/.libs -B/Users/dara/Downloads/objdir/prev-x86_64-apple-darwin12.3.0/libstdc++-v3/libsupc++/.libs -I/Users/dara/Downloads/objdir/prev-x86_64-apple-darwin12.3.0/libstdc++-v3/include/x86_64-apple-darwin12.3.0 -I/Users/dara/Downloads/objdir/prev-x86_64-apple-darwin12.3.0/libstdc++-v3/include -I/Users/dara/Downloads/gcc-trunk/libstdc++-v3/libsupc++ -L/Users/dara/Downloads/objdir/prev-x86_64-apple-darwin12.3.0/libstdc++-v3/src/.libs -L/Users/dara/Downloads/objdir/prev-x86_64-apple-darwin12.3.0/libstdc++-v3/libsupc++/.libs -g -O2 -mdynamic-no-pic -gtoggle -DIN_GCC -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror -DHAVE_CONFIG_H -static-libstdc++ -static-libgcc -Wl,-no_pie -o cc1 c/c-lang.o c-family/stub-objc.o attribs.o c/c-errors.o c/c-decl.o c/c-typeck.o c/c-convert.o c/c-aux-info.o c/c-objc-common.o c/c-parser.o c-family/c-common.o c-family/c-cppbuiltin.o c-family/c-dump.o c-family/c-format.o c-family/c-gimplify.o c-family/c-lex.o c-family/c-omp.o c-family/c-opts.o c-family/c-pch.o c-family/c-ppoutput.o c-family/c-pragma.o c-family/c-pretty-print.o c-family/c-semantics.o c-family/c-ada-spec.o 
tree-mudflap.o i386-c.o darwin-c.o \ cc1-checksum.o libbackend.a main.o libcommon-target.a libcommon.a ../libcpp/libcpp.a ../libdecnumber/libdecnumber.a libcommon.a ../libcpp/libcpp.a ./../intl/libintl.a -liconv ../libbacktrace/.libs/libbacktrace.a ../libiberty/libiberty.a ../libdecnumber/libdecnumber.a -L/usr/local/gcc-trunk/lib -lcloog-isl -L/usr/local/gcc-trunk/lib -lisl -L/usr/local/gcc-trunk/lib -L/usr/local/gcc-trunk/lib -L/usr/local/gcc-trunk/lib -lmpc -lmpfr -lgmp -L../zlib -lz 0 0x102b49098 __assert_rtn + 144 1 0x102b60334 mach_o::relocatable::Parser<x86_64>::parse(mach_o::relocatable::ParserOptions const&) + 1044 2 0x102b4ff4b mach_o::relocatable::Parser<x86_64>::parse(unsigned char const*, unsigned long long, char const*, long, ld::File::Ordinal, mach_o::relocatable::ParserOptions const&) + 313 3 0x102b4dadc mach_o::relocatable::parse(unsigned char const*, unsigned long long, char const*, long, ld::File::Ordinal, mach_o::relocatable::ParserOptions const&) + 208 4 0x102b6f74c archive::File<x86_64>::makeObjectFileForMember(archive::File<x86_64>::Entry const*) const + 794 5 0x102b6f261 archive::File<x86_64>::justInTimeforEachAtom(char const*, ld::File::AtomHandler&) const + 139 6 0x102b7fb06 ld::tool::InputFiles::searchLibraries(char const*, bool, bool, bool, ld::File::AtomHandler&) const + 210 7 0x102b86978 ld::tool::Resolver::resolveUndefines() + 200 8 0x102b888a3 ld::tool::Resolver::resolve() + 75 9 0x102b49380 main + 370 A linker snapshot was created at: /tmp/cc1-2013-04-27-193733.ld-snapshot ld: Assertion failed: (cfiStartsArray[i] != cfiStartsArray[i-1]), function parse, file /SourceCache/ld64/ld64-136/src/ld/parsers/macho_relocatable_file.cpp, line 1555. collect2: error: ld returned 1 exit status make[3]: *** [cc1] Error 1 make[2]: *** [all-stage2-gcc] Error 2 make[1]: *** [stage2-bubble] Error 2 make: *** [bootstrap-lean] Error 2
This seems better reported to Apple than here, since it is Apple's ld that is crashing.
Are these failures limited to 'make bootstrap-lean' on your machines? What happens if you just use 'make' without arguments?
The trigger for this bug is the use of --disable-checking. The linker crash doesn't occur when --enable-checking=release or --enable-checking=yes is passed to configure instead.
Aha! I will try using plain make and leaving checking alone. I don't suppose this is documented anywhere? As to reporting the bug to Apple, is this in fact a linker bug, as opposed to a bad-code-generation bug?
(In reply to Dara Hazeghi from comment #4)
> Aha! I will try using plain make and leaving checking alone. I don't
> suppose this is documented anywhere?

Plain make (not bootstrap) with --enable-checking=release does work. I'll try again with bootstrap-lean to verify whether checking is the sole cause of the failure.
I've opened radar://14005298, "linker crash when building FSF gcc with --disable-checking" with a standalone test case of the failing linkage of cc1.
(In reply to Jack Howarth from comment #6)
> I've opened radar://14005298, "linker crash when building FSF gcc with
> --disable-checking" with a standalone test case of the failing linkage of
> cc1.

Thanks a bunch! make bootstrap-lean works fine with --enable-checking=release, so --disable-checking is definitely the trigger here.
The darwin linker developer's analysis of the failing linkage of cc1 is below...

The assertion is about the file libbackend.a(varasm.o). There are overlapping FDEs. If you run dwarfdump in verify mode, it will complain about it too:

[/tmp/newlinkerbug/lib]> dwarfdump --eh-frame --verify varasm.o
----------------------------------------------------------------------
 File: varasm.o (x86_64)
----------------------------------------------------------------------
Verifying EH Frame...
error: FDE row for address 0x0000000000005900 is not in the FDE address range.
0x000020e0: FDE
        length: 0x0000001c
   CIE_pointer: 0x00000000
    start_addr: 0x0000000000005900 __Z24default_no_named_sectionPKcjP9tree_node
    range_size: 0x0000000000000000 (end_addr = 0x0000000000005900)
                DW_CFA_nop DW_CFA_nop DW_CFA_nop DW_CFA_nop DW_CFA_nop DW_CFA_nop DW_CFA_nop
  Instructions: 0x0000000000005900: CFA=rsp+8 rip=[rsp]
1 errors found in EH frame for varasm.o (x86_64).

Dumping the whole file, there is an FDE for a zero-length function, so two FDEs have the same function start address:

0x000020e0: FDE
        length: 0x0000001c
   CIE_pointer: 0x00000000
    start_addr: 0x0000000000005900 __Z24default_no_named_sectionPKcjP9tree_node
    range_size: 0x0000000000000000 (end_addr = 0x0000000000005900)
                DW_CFA_nop DW_CFA_nop DW_CFA_nop DW_CFA_nop DW_CFA_nop DW_CFA_nop DW_CFA_nop
  Instructions: 0x0000000000005900: CFA=rsp+8 rip=[rsp]
0x00002100: FDE
        length: 0x0000006c
   CIE_pointer: 0x00000000
    start_addr: 0x0000000000005900 __Z24default_no_named_sectionPKcjP9tree_node
    range_size: 0x0000000000000154 (end_addr = 0x0000000000005a54)
  Instructions: 0x0000000000005900: CFA=rsp+8 rip=[rsp]
If you can attach the .s file for varasm.c that does result in the crash, that would be good. If this is a regression, identifying the change that broke it would be handy. Thanks.
Created attachment 30211 [details] varasm.s.gz varasm.s resulting in the crash
Curious, varasm.s has:

__Z24default_no_named_sectionPKcjP9tree_node:
LFB588:
        nop     # Required to be here, or the pair must be removed.
LFE588:

well, except for the nop. If I add the nop, we get a non-zero size and it works; if the nop is missing, the size is zero and it fails.

So, now the question is: what broke this, a tools upgrade on the OS or an update on the gcc trunk? If gcc, which update? If a tools update on the OS, then I think we need to add code to dwarf to find and remove the trivial bits.
Ok, new theory. Does this patch fix it for you:

Index: varasm.c
===================================================================
--- varasm.c (revision 199270)
+++ varasm.c (working copy)
@@ -6052,7 +6052,7 @@ default_no_named_section (const char *na
 {
   /* Some object formats don't support named sections at all.  The
      front-end should already have flagged this as an error.  */
-  gcc_unreachable ();
+  gcc_assert (0);
 }

 #ifndef TLS_SECTION_ASM_FLAG
(In reply to mrs@gcc.gnu.org from comment #12)
> Ok, new theory. Does this patch fix it for you:

Thanks for the patch. Just tried bootstrapping with it and checking disabled, and the same assertion still triggers.
Thanks, how about this one?

Index: target.def
===================================================================
--- target.def (revision 199270)
+++ target.def (working copy)
@@ -225,7 +225,7 @@ DEFHOOK
 (named_section,
  "",
  void, (const char *name, unsigned int flags, tree decl),
- default_no_named_section)
+ 0)

 /* Return preferred text (sub)section for function DECL.
    Main purpose of this function is to separate cold, normal and hot
Index: varasm.c
===================================================================
--- varasm.c (revision 199270)
+++ varasm.c (working copy)
@@ -6042,19 +6042,6 @@ have_global_bss_p (void)
   return bss_noswitch_section || targetm.have_switchable_bss_sections;
 }

-/* Output assembly to switch to section NAME with attribute FLAGS.
-   Four variants for common object file formats.  */
-
-void
-default_no_named_section (const char *name ATTRIBUTE_UNUSED,
-                          unsigned int flags ATTRIBUTE_UNUSED,
-                          tree decl ATTRIBUTE_UNUSED)
-{
-  /* Some object formats don't support named sections at all.  The
-     front-end should already have flagged this as an error.  */
-  gcc_unreachable ();
-}
-
(In reply to mrs@gcc.gnu.org from comment #14)
> Thanks, how about this one?

Seems to be the same - assert in the same spot. Shall I upload the varasm.s produced with the second patch?
Yes please. Also, if you can, run:

dwarfdump --eh-frame --verify file.o

on all the .o files and see if there are any more lurking in there. Any that fail verification will need to be fixed, one way or another.
(In reply to mrs@gcc.gnu.org from comment #16)
> Yes please. If you can run:
>
> dwarfdump --eh-frame --verify file.o
>
> on all the .o files and see if there are any more lurking in there. Any
> that fail verification will need to be fixed, one way, or another.

From gcc/ I see the following:

1 errors found in EH frame for dfp.o (x86_64).
1 errors found in EH frame for gengtype-state.o (x86_64).
1 errors found in EH frame for hooks.o (x86_64).
3 errors found in EH frame for i386.o (x86_64).
3 errors found in EH frame for insn-output.o (x86_64).
2 errors found in EH frame for langhooks.o (x86_64).
1 errors found in EH frame for sched-deps.o (x86_64).
9 errors found in EH frame for targhooks.o (x86_64).
1 errors found in EH frame for tree-profile.o (x86_64).
1 errors found in EH frame for tree-ssa-loop-im.o (x86_64).
2 errors found in EH frame for tree.o (x86_64).
1 errors found in EH frame for var-tracking.o (x86_64).

Shall I upload the object code or the assembly code?
Do we have any idea why this problem is latent with --enable-checking=release and --enable-checking=yes but is triggered by --disable-checking?
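One plausible explanation, offered as an assumption rather than something confirmed in this thread: with assertion checking enabled, gcc_unreachable () expands to a real abort-style call, so the function body is never empty, whereas with checking disabled it expands to __builtin_unreachable () and the body can vanish. The sketch below models that with hypothetical names (SKETCH_ASSERT_CHECKING, sketch_unreachable, sketch_no_named_section); it is not the text of GCC's system.h.

```c
#include <assert.h>
#include <stdlib.h>

/* SKETCH_ASSERT_CHECKING is a hypothetical stand-in for the
   configure-time checking setting; it defaults to "enabled".  */
#ifndef SKETCH_ASSERT_CHECKING
#define SKETCH_ASSERT_CHECKING 1
#endif

#if SKETCH_ASSERT_CHECKING
/* Checking enabled: a real call is emitted, so the body has code.  */
#define sketch_unreachable() abort ()
#else
/* Checking disabled: nothing is emitted, so the body may be empty.  */
#define sketch_unreachable() __builtin_unreachable ()
#endif

/* Hypothetical analogue of default_no_named_section: the whole body
   is a single "cannot get here" marker.  */
void sketch_no_named_section (void)
{
  sketch_unreachable ();
}

/* Reports which variant was compiled in.  */
int sketch_checking_enabled (void)
{
  return SKETCH_ASSERT_CHECKING;
}
```

Compiling this with -O2 -S, once as-is and once with -DSKETCH_ASSERT_CHECKING=0, and comparing the body of sketch_no_named_section in the two outputs should show whether the function ends up empty on a given toolchain.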
I'll build my own tree, thanks. I was hoping that it was a singular issue and we'd be done with it.
Still present at revision 203491 and the patch in comment #14 does not help.
(In reply to Dominique d'Humieres from comment #20)
> Still present at revision 203491 and the patch in comment #14 does not help.

Trivial reproducer:

=====
__attribute__((noinline))
void foo (void)
{
  __builtin_unreachable();
}

int main (int ac, char *av[])
{
  foo ();
  return 0;
}
=====

As Mike surmises, this is another case where we emit code that does not comply with the "atom model" that ld64 (and lld) uses. foo() and main() both end up empty for -O > 0.

====

Mike: any thoughts on this? It seems you were intending to take a look.
(It also breaks bootstrapping llvm with GCC in Release mode.)
OK, so this has been biting me some more. It might be another case where Darwin has thrown up a more general problem.

What's happening is that, where functions end up zero-sized, an FDE is still being emitted.

We get, for DWARF FDE:

        .globl foo
foo:
LFBxxx
LFExxx

and for .cfi_xxxx:

        .globl foo
foo:
LFBxxx
        .cfi_startproc
        .cfi_endproc
LFExxx

... both produce FDEs with 0 PC ranges. This upsets ld64.

1. GCC - it seems a waste of binary file space to emit FDEs with 0 PC range, since they can neither be the site of an exception, nor can they participate in unwinding; however, it might be rather intrusive for the current phase to fix that - if it's not causing any other port problems.

 * I haven't thought about it much harder than that - any reason anyone can see for wanting to emit an FDE with 0 PC range?

2. ld64 - should, perhaps, be more defensive, and discard 0-length FDEs when pulling in object files. I've patched my version to do this and am testing it - will post a revised version when it's done.

Meanwhile, are there any other thoughts from folks on the best way forward?
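To make option 2 concrete, here is a minimal sketch of the kind of defensive filtering described, using the two FDEs from the varasm.o dump as data. The struct and function names are hypothetical and this is not ld64's code; the distinctness check only mirrors the condition quoted in the assertion message, assuming the entries are ordered by start address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified FDE record: only the PC range matters here.  */
struct fde
{
  uint64_t start_addr;
  uint64_t range_size;
};

/* Drop FDEs whose PC range is empty, compacting the array in place.
   Returns the number of entries kept.  */
size_t discard_empty_fdes (struct fde *fdes, size_t count)
{
  size_t kept = 0;
  for (size_t i = 0; i < count; i++)
    if (fdes[i].range_size != 0)
      fdes[kept++] = fdes[i];
  return kept;
}

/* Mirrors the asserted condition: no two adjacent entries (assumed to
   be ordered by start address) may share a start address.
   Returns 1 if all starts are distinct, 0 otherwise.  */
int starts_are_distinct (const struct fde *fdes, size_t count)
{
  for (size_t i = 1; i < count; i++)
    if (fdes[i].start_addr == fdes[i - 1].start_addr)
      return 0;
  return 1;
}
```

With the pair { 0x5900, 0 } and { 0x5900, 0x154 }, the distinctness check fails on the raw input and passes once the empty entry has been discarded, which is the behaviour a more tolerant linker would need.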
On the platform, external symbols are defined to be 1 or more bytes. 0 is not one or more. Once that is fixed, then the problem goes away. If you want to have Apple update their abi for future systems to include zero byte objects, you will have to ask them to change their abi.
(In reply to mrs@gcc.gnu.org from comment #23)
> On the platform, external symbols are defined to be 1 or more bytes. 0 is
> not one or more. Once that is fixed, then the problem goes away. If you
> want to have Apple update their abi for future systems to include zero byte
> objects, you will have to ask them to change their abi.

Well, I'm very aware of the constraint that has been applied to our output to date (having implemented some of the "fixes" for it). However, it was my understanding that the constraint was one of the tools; i.e. ld64's ability to determine an unambiguous 'atom'. I can't find anything in the written ABI or assembler documentation that makes such a statement (although we accept that "what the other tools do" is the effective ABI).

It appears that (recently) ld64 [and the assembler] have been modified to support symbol aliases. Thus the constraint you mention has been amended/modified; newer versions of ld64 are not complaining about the 0-sized functions (or coincident symbols), only the 0-sized FDE.

It would be quite a useful step forward to support symbol aliases - since the absence has been a source of difficulty for us - but let's not get side-tracked from the actual problem.

----

1. FWIW, there is code in i386.c [12410 - 12438] that is supposed to ensure that functions on Darwin contain at least a NOP. However, it clearly isn't working in these cases.

----

2. The issue of whether Darwin can have 0-sized functions is actually a separate one from whether GCC should emit FDEs for 0-sized functions (since other platforms can clearly support them).
Created attachment 37324 [details]
Avoid empty function bodies V1

So... The #if TARGET_MACHO code in ix86_output_function_epilogue () is supposed to prevent trailing labels on Darwin functions (because that creates another problem if those are used in relocations).

However, the code doesn't work for multiple reasons - not least of which is that ix86_output_function_epilogue() is called before the last function labels are emitted.

Ironically, if it was working - it would have suppressed the current bug, since we typically get:

        .globl foo
foo:
LFBxxx
<=== ix86_output_function_epilogue() is called here.
LFExxx

and, in theory, the trailing LFBxxx should have fired the output of a nop. However, the presence of the barrier seems to undo this.

----

Given that the ifdef-d code cannot do what it intends (it would need to be called later), it might as well be removed.

We can, however, detect empty function bodies at this point and emit some instruction to avoid the circumstance. At present, there doesn't seem to be any legitimate case where an empty function body could be validly executed.

Note that the usual reason for function bodies to be completely empty is when the code in the function is made unreachable (with __builtin_unreachable()). The GCC manual says that reaching such code is UB, so we are free to do whatever seems most useful. In this case making the function one insn long and making that insn "hlt" seems useful - so that if such a function is actually called it does something that will provide a hint to the user.

Bootstrapped on trunk (and 5.3); testing as and when - folks, please comment/try out.
+  /* If we don't find any, we've got an empty function body; i.e.
+     completely empty - without a return or branch.  Reaching an
+     empty function body means UB.  Let's trap it.  */
+  if (insn == NULL)
+    fputs ("\thlt\n", file);

You probably want to use the ud2 instruction here.
(In reply to Uroš Bizjak from comment #26)
> + /* If we don't find any, we've got an empty function body; i.e.
> + completely empty - without a return or branch. Reaching an
> + empty function body means UB. Let's trap it. */
> + if (insn == NULL)
> + fputs ("\thlt\n", file);
>
> Probably you want to use ud2 instruction here.

yeah, hlt is a little drastic ;-)
For those people still running into this (problem still exists with GCC 6.2), the following workaround will do the job on OS X / Mac OS: simply add this definition to your compile commands: -D__builtin_unreachable=__builtin_trap
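As a rough illustration of what that define does, the sketch below puts the equivalent #define in the source instead of on the command line (an assumption made only to keep the example in one file). The function names are hypothetical. After the substitution, a body that consisted solely of __builtin_unreachable() contains a trap instruction instead of nothing.

```c
#include <assert.h>

/* Stand-in for passing -D__builtin_unreachable=__builtin_trap to the
   compiler; kept in the source here only to make the sketch
   self-contained.  */
#define __builtin_unreachable __builtin_trap

/* Hypothetical function whose body would otherwise be empty at -O1
   and above; with the macro it contains a trap.  It must not be
   called.  */
__attribute__((noinline))
void never_reached (void)
{
  __builtin_unreachable ();   /* expands to __builtin_trap () */
}

/* Hypothetical helper: the default branch is never taken for valid
   input, and now traps instead of being silently removed.  */
int sign_of (int kind)
{
  switch (kind)
    {
    case 0:
      return -1;
    case 1:
      return 1;
    default:
      __builtin_unreachable ();   /* expands to __builtin_trap () */
    }
}
```

The trade-off is that the optimizer can no longer treat those paths as impossible and discard them, so the workaround costs a little code size in exchange for non-empty functions.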
posted https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00541.html I have backports for 6.x and 5.x if wanted.
Author: iains
Date: Sun Nov 27 14:50:58 2016
New Revision: 242897

URL: https://gcc.gnu.org/viewcvs?rev=242897&root=gcc&view=rev
Log:
[Darwin] Fix PR57438 by avoiding empty function bodies and trailing labels.

A. Empty function bodies cause two problems for Darwin's linker (i) zero-length FDEs and (ii) coincident label addresses that might point to items of differing weakness.

B. Trailing local labels can be problematic when they end a function because similarly they might apparently point to a following weak function, leading to the linker concluding that there's a pointer-diff to a weak symbol (which is not allowed).

Both conditions arise from __builtin_unreachable() lowering to a barrier.

The solution for both is to emit some finite amount of code; in the case of A a trap is emitted, in the case of B a nop.

gcc/

2016-11-27  Iain Sandoe  <iain@codesourcery.com>

        PR target/57438
        * config/i386/i386.c (ix86_code_end): Note that we emitted code
        where the function might otherwise appear empty for picbase thunks.
        (ix86_output_function_epilogue): If we find a zero-sized function
        assume that reaching it is UB and trap.  If we find a trailing label
        append a nop.
        * config/rs6000/rs6000.c (rs6000_output_function_epilogue): If we
        find a zero-sized function assume that reaching it is UB and trap.
        If we find a trailing label, append a nop.

gcc/testsuite/

2016-11-27  Iain Sandoe  <iain@codesourcery.com>

        PR target/57438
        * gcc.dg/pr57438-1.c: New Test.
        * gcc.dg/pr57438-2.c: New Test.

Added:
    trunk/gcc/testsuite/gcc.dg/pr57438-1.c
    trunk/gcc/testsuite/gcc.dg/pr57438-2.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
    trunk/gcc/config/rs6000/rs6000.c
    trunk/gcc/testsuite/ChangeLog
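The two cases in the log above (A: empty body, emit a trap; B: trailing label, emit a nop) can be sketched as a small decision function. This is an illustration only, not the committed code: the helper name and its boolean parameters are hypothetical, and "ud2" is assumed as the x86 trap because that is what was suggested earlier in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: choose the assembly text an epilogue hook
   could print so that a function is never zero-sized and never ends
   on a label.  */
const char *darwin_epilogue_padding (bool body_is_empty,
                                     bool ends_with_label)
{
  if (body_is_empty)
    return "\tud2\n";   /* case A: reaching here is UB, so trap */
  if (ends_with_label)
    return "\tnop\n";   /* case B: keep the label inside the function */
  return "";            /* ordinary function: nothing to add */
}
```

Checking the empty-body case first means a function that is both empty and ends on a label gets the trap, which already provides the non-zero size that the nop would otherwise supply.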
On Nov 6, 2016, at 12:22 PM, iains at gcc dot gnu.org <gcc-bugzilla@gcc.gnu.org> wrote:
> I have backports for 6.x and 5.x if wanted.

Yes please. I think it is safe enough and the problem is really kinda nasty.
Author: iains
Date: Sun Dec 11 16:10:48 2016
New Revision: 243525

URL: https://gcc.gnu.org/viewcvs?rev=243525&root=gcc&view=rev
Log:
[Darwin] Back-port fix for PR57438.

gcc/

2016-12-11  Iain Sandoe  <iain@codesourcery.com>

        Backport from mainline
        2016-11-27  Iain Sandoe  <iain@codesourcery.com>

        PR target/57438
        * config/i386/i386.c (ix86_code_end): Note that we emitted code
        where the function might otherwise appear empty for picbase thunks.
        (ix86_output_function_epilogue): If we find a zero-sized function
        assume that reaching it is UB and trap.  If we find a trailing label
        append a nop.
        * config/rs6000/rs6000.c (rs6000_output_function_epilogue): If we
        find a zero-sized function assume that reaching it is UB and trap.
        If we find a trailing label, append a nop.

gcc/testsuite/

2016-12-11  Iain Sandoe  <iain@codesourcery.com>

        Backport from mainline
        2016-11-27  Iain Sandoe  <iain@codesourcery.com>

        PR target/57438
        * gcc.dg/pr57438-1.c: New Test.
        * gcc.dg/pr57438-2.c: New Test.

Added:
    branches/gcc-6-branch/gcc/testsuite/gcc.dg/pr57438-1.c
    branches/gcc-6-branch/gcc/testsuite/gcc.dg/pr57438-2.c
Modified:
    branches/gcc-6-branch/gcc/ChangeLog
    branches/gcc-6-branch/gcc/config/i386/i386.c
    branches/gcc-6-branch/gcc/config/rs6000/rs6000.c
    branches/gcc-6-branch/gcc/testsuite/ChangeLog
*** Bug 78077 has been marked as a duplicate of this bug. ***
So, since backports were requested for 5 and 6, and the 5 branch is closed now, and the backport for 6 is already done, I'm going to close this as FIXED now.