GCC 6.0 RC fails bootstrap on AIX due to comparison failure of all object files for --enable-checking=release. This appears to be an unstable sort issue.
Confirmed.
Created attachment 38299 [details] Stage2 sbitmap.s
Created attachment 38300 [details] Stage3 sbitmap.s
sbitmap.s was one of the smaller files. All of the debugging information seems to be present, but emitted in a different order in the stage2 versus stage3.
Strange, I thought the default bootstrap is to build stage2 without debug info, while you have stabs in both files. Have you looked at where the differences start first (build both the stage2 and stage3 with -fdump-tree-all-nouid -fdump-rtl-all-nouid --param min-nondebug-insn-uid=10000)? Are the stage2/stage3 sbitmap.c sources built with the same options? Do you know which sort is unstable, or is that just a guess?
Waiting for David to investigate.
I don't know that a sort is unstable, or which particular sort. It is a hunch based on similar failures in the past: some output in a different order. Why would stage2 and stage3 be built with different options, especially if their object files are compared?

I added -frandom-seed=0 to the options, but I still see differences in addresses, which makes comparison a little difficult. sbitmap.c.001.tu differ, with the @XXXX addresses.

The first difference where one dump file contains information not present in the other dump file is sbitmap.c.003t.original. Stage2 contains three lines of

;; Function constexpr bool std::_ImplicitlyConvertiblePair() [with _T1 = mem_usage*; _T2 = mem_usage*; _U1 = mem_usage*; _U2 = mem_usage*] (null)
;; enabled by -tree-original

return <retval> = 1;

while Stage3 contains only one line. The rest of the file contains identical statements but different labels.

sbitmap.c.006.omplower begins to show differences in funcdef_no, cgraph_uid and symbol_order.
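To get past the @XXXX noise when comparing the two dumps, one option is to mask those references before diffing. The following is a minimal sketch, not part of the original investigation: it assumes the varying tokens always have the form `@` followed by digits, and the helper names `normalize` and `first_real_difference` are hypothetical. Masking every reference can also hide a genuine difference in which node is referenced, so it only helps to locate the first structural divergence.

```python
import re

# Matches node references such as "@1234" (assumed format, based on the
# "@XXXX" description in the comment above).
NODE_REF = re.compile(r"@\d+")


def normalize(line):
    """Replace every @NNNN reference with a fixed placeholder."""
    return NODE_REF.sub("@N", line.rstrip("\n"))


def first_real_difference(stage2_lines, stage3_lines):
    """Return (line_number, stage2_line, stage3_line) for the first line that
    still differs after normalization, or None if the dumps match."""
    a = [normalize(line) for line in stage2_lines]
    b = [normalize(line) for line in stage3_lines]
    for number, (x, y) in enumerate(zip(a, b), start=1):
        if x != y:
            return number, x, y
    if len(a) != len(b):
        # One dump has extra trailing lines; report the first extra one.
        number = min(len(a), len(b)) + 1
        left = a[number - 1] if len(a) >= number else None
        right = b[number - 1] if len(b) >= number else None
        return number, left, right
    return None
```

Feeding it the lines of the stage2 and stage3 copies of the same dump file gives the first line where the content, rather than the numbering, diverges.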
Development branch prior to debug-early merge works.
r224187 works
This is starting to look like PR60984 all over again. Testing trunk with --enable-checking=release succeeds.
I have tried bootstrap on AIX with .../configure --prefix=`pwd` --enable-checking=release --with-gmp=/opt/cfarm/gmp-latest/ --with-mpc=/opt/cfarm/mpc-latest/ --with-mpfr=/opt/cfarm/mpfr-latest/ && gmake -j16 2>&1 | tee LOG and both trunk and gcc-6-branch work.
Current trunk works. I am testing gcc-6-branch now. But the RC itself does not work.
A source tree checked out from r235040 (the same as the tarball) works. It looks more likely that the problem is some difference between the repository and the tarball.
The problem likely is due to gcc/gengtype-lex.c distributed in the tarball. The AIX systems currently use flex 2.5.3, which produces working gengtype-lex.c on AIX.
gcc-6-20160410 snapshot tarball (without gengtype-lex.c) works.
So, what is the diff between a working and a non-working gengtype-lex.c? I don't have access right now to the WS I've built the RC1 on (travelling), but I guess it is flex 2.5.37.
Yes, WS1 is Flex 2.5.37. I will upload both. There are many differences.
Created attachment 38310 [details] gengtype-lex.c distributed in GCC-6.0.1-RC1
Created attachment 38311 [details] gengtype-lex.c generated on AIX
So lots of macro/code formatting and other minor changes, function names changed etc., but the actual table content looks the same to me. But the amount of changes is huge. Perhaps try some flex versions in between (~2.5.18 or so) etc., so that we narrow it down a little bit more? Does it matter whether gengtype-lex.c is built with or without optimizations? Say, diff the various gengtype created gt*.h headers between versions where gengtype-lex.c is from flex 2.5.3 (I'd expect that both stage1 and stage2 gt*.h are identical then) and when built with flex 2.5.37 (do those gt*.h headers at stage1, stage2 actually match the flex 2.5.3 built ones)?
The recent flex adds a number of its own C int type definitions and ranges.
I see flex 2.6 has been released (already in November last year), does that help? I could do the final release and/or rc2 with flex 2.6 instead of flex 2.5.37...
Older releases of Flex are no longer available as source code. Flex now is distributed through sourceforge, not gnu.org. Newer releases of Flex don't build on AIX.
Actually, I finally was able to convince Flex 2.6.0 to build. I'll try with that.
Looking at the history of flex, flex 2.5.3 is something pre-1997, then there used to be 2.5.4 and 2.5.4a, and at least RHL updated from 2.5.4a to 2.5.33 early in 2007, so the question is whether there have actually been any official flex releases between 2.5.4a and 2.5.33. I can find a tarball with 2.5.33 around; finding something older is already getting harder.
After some more tests, I don't believe that flex is the culprit. I removed gengtype-lex.c from GCC-6.0.1-RC and allowed the flex to rebuild it, but the build still failed with the miscompare. The problem remains that r235040 works and the tarball does not.
I performed a recursive diff of r235040 vs gcc-6.0.1-rc-20160415. Other than .svn directories, the only differences are:

Only in gcc6rc/INSTALL: binaries.html
Only in gcc6rc/INSTALL: build.html
Only in gcc6rc/INSTALL: configure.html
Only in gcc6rc/INSTALL: download.html
Only in gcc6rc/INSTALL: finalinstall.html
Only in gcc6rc/INSTALL: gfdl.html
Only in gcc6rc/INSTALL: index.html
Only in gcc6rc/INSTALL: old.html
Only in gcc6rc/INSTALL: prerequisites.html
Only in gcc6rc/INSTALL: specific.html
Only in gcc6rc/INSTALL: test.html
diff -r gcc6rc/LAST_UPDATED gcc-6-branch/LAST_UPDATED
1c1,2
< Obtained from SVN: branches/gcc-6-branch revision 235040
---
> Tue Apr 19 15:12:18 EDT 2016
> Tue Apr 19 19:12:18 UTC 2016 (revision 235040)
Only in gcc6rc: MD5SUMS
Only in gcc6rc: NEWS
Only in gcc-6-branch/gcc: REVISION
Only in gcc6rc/gcc/doc: aot-compile.1
Only in gcc6rc/gcc/doc: cpp.1
Only in gcc6rc/gcc/doc: cpp.info
Only in gcc6rc/gcc/doc: cppinternals.info
Only in gcc6rc/gcc/doc: fsf-funding.7
Only in gcc6rc/gcc/doc: g++.1
Only in gcc6rc/gcc/doc: gc-analyze.1
Only in gcc6rc/gcc/doc: gcc.1
Only in gcc6rc/gcc/doc: gcc.info
Only in gcc6rc/gcc/doc: gccinstall.info
Only in gcc6rc/gcc/doc: gccint.info
Only in gcc6rc/gcc/doc: gcj-dbtool.1
Only in gcc6rc/gcc/doc: gcj.1
Only in gcc6rc/gcc/doc: gcj.info
Only in gcc6rc/gcc/doc: gcov-tool.1
Only in gcc6rc/gcc/doc: gcov.1
Only in gcc6rc/gcc/doc: gfdl.7
Only in gcc6rc/gcc/doc: gfortran.1
Only in gcc6rc/gcc/doc: gij.1
Only in gcc6rc/gcc/doc: gpl.7
Only in gcc6rc/gcc/doc: grmic.1
Only in gcc6rc/gcc/doc: jcf-dump.1
Only in gcc6rc/gcc/doc: jv-convert.1
Only in gcc6rc/gcc/doc: rebuild-gcj-db.1
Only in gcc6rc/gcc/fortran: gfortran.info
Only in gcc6rc/gcc: gengtype-lex.c
Only in gcc6rc/gcc/po: be.gmo
Only in gcc6rc/gcc/po: da.gmo
Only in gcc6rc/gcc/po: de.gmo
Only in gcc6rc/gcc/po: el.gmo
Only in gcc6rc/gcc/po: es.gmo
Only in gcc6rc/gcc/po: fi.gmo
Only in gcc6rc/gcc/po: fr.gmo
Only in gcc6rc/gcc/po: hr.gmo
Only in gcc6rc/gcc/po: id.gmo
Only in gcc6rc/gcc/po: ja.gmo
Only in gcc6rc/gcc/po: nl.gmo
Only in gcc6rc/gcc/po: ru.gmo
Only in gcc6rc/gcc/po: sr.gmo
Only in gcc6rc/gcc/po: sv.gmo
Only in gcc6rc/gcc/po: tr.gmo
Only in gcc6rc/gcc/po: uk.gmo
Only in gcc6rc/gcc/po: vi.gmo
Only in gcc6rc/gcc/po: zh_CN.gmo
Only in gcc6rc/gcc/po: zh_TW.gmo
Only in gcc6rc/libcpp/po: be.gmo
Only in gcc6rc/libcpp/po: ca.gmo
Only in gcc6rc/libcpp/po: da.gmo
Only in gcc6rc/libcpp/po: de.gmo
Only in gcc6rc/libcpp/po: el.gmo
Only in gcc6rc/libcpp/po: eo.gmo
Only in gcc6rc/libcpp/po: es.gmo
Only in gcc6rc/libcpp/po: fi.gmo
Only in gcc6rc/libcpp/po: fr.gmo
Only in gcc6rc/libcpp/po: id.gmo
Only in gcc6rc/libcpp/po: ja.gmo
Only in gcc6rc/libcpp/po: nl.gmo
Only in gcc6rc/libcpp/po: pt_BR.gmo
Only in gcc6rc/libcpp/po: ru.gmo
Only in gcc6rc/libcpp/po: sr.gmo
Only in gcc6rc/libcpp/po: sv.gmo
Only in gcc6rc/libcpp/po: tr.gmo
Only in gcc6rc/libcpp/po: uk.gmo
Only in gcc6rc/libcpp/po: vi.gmo
Only in gcc6rc/libcpp/po: zh_CN.gmo
Only in gcc6rc/libcpp/po: zh_TW.gmo
Only in gcc6rc/libffi/doc: libffi.info
Only in gcc6rc/libgomp: libgomp.info
Only in gcc6rc/libitm: libitm.info
Only in gcc6rc/libjava/classpath/doc: cp-tools.info
Only in gcc6rc/libjava/classpath/doc: gappletviewer.1
Only in gcc6rc/libjava/classpath/doc: gjar.1
Only in gcc6rc/libjava/classpath/doc: gjarsigner.1
Only in gcc6rc/libjava/classpath/doc: gjavah.1
Only in gcc6rc/libjava/classpath/doc: gjdoc.1
Only in gcc6rc/libjava/classpath/doc: gkeytool.1
Only in gcc6rc/libjava/classpath/doc: gnative2ascii.1
Only in gcc6rc/libjava/classpath/doc: gorbd.1
Only in gcc6rc/libjava/classpath/doc: grmid.1
Only in gcc6rc/libjava/classpath/doc: grmiregistry.1
Only in gcc6rc/libjava/classpath/doc: gserialver.1
Only in gcc6rc/libjava/classpath/doc: gtnameserv.1
Only in gcc6rc/libquadmath: libquadmath.info
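For repeating this kind of tarball-versus-checkout comparison, here is a minimal sketch using Python's standard `filecmp` module. It is an illustration only, not the command used above (that was a recursive `diff`); the function name `compare_trees` is hypothetical. Common files are compared byte by byte so that timestamps do not influence the result.

```python
import filecmp
import os


def compare_trees(left, right, ignore=(".svn",)):
    """Walk two directory trees and report what differs.

    Returns three sorted lists of paths relative to the tree roots:
    entries only in `left`, entries only in `right`, and common files
    whose contents differ.
    """
    only_left, only_right, differing = [], [], []

    def walk(dc, rel):
        only_left.extend(os.path.join(rel, n) for n in dc.left_only)
        only_right.extend(os.path.join(rel, n) for n in dc.right_only)
        for name in dc.common_files:
            a = os.path.join(dc.left, name)
            b = os.path.join(dc.right, name)
            # shallow=False forces a content comparison instead of
            # trusting size and modification time.
            if not filecmp.cmp(a, b, shallow=False):
                differing.append(os.path.join(rel, name))
        for name, sub in dc.subdirs.items():
            walk(sub, os.path.join(rel, name))

    walk(filecmp.dircmp(left, right, ignore=list(ignore)), "")
    return sorted(only_left), sorted(only_right), sorted(differing)
```

Called as `compare_trees("gcc6rc", "gcc-6-branch")`, it would return the same three kinds of entries as the listing above: files only in the RC, files only in the checkout, and files present in both with different contents.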
I copied gcc/REVISION to the release candidate to remove one additional difference and tried bootstrap, but it still failed.
Flex 2.6.0 works.
Created attachment 38320 [details]
2.5.37 -> 2.6

So, can you please verify that the RC1 tarball bootstraps if you apply the attached patch (which should change the file as if I've created rc1 with flex 2.6 instead of 2.5.37)? Or do the #line filenames matter instead? I normally bootstrap with ../configure and therefore the paths are like ../../gcc/something, but in the RC tarballs it is /d/gcc-6.0.1-RC-20160415/gcc-6.0.1-RC-20160415/gcc/something (the /d is my dest dir symlink to make those as short as possible, the rest comes from gcc_release script).
I will test, but Flex and gengtype-lex.c do not appear to be the issue. If the change works, it will be coincidental. I have built the RC with gengtype-lex.c removed so that it is regenerated with the system Flex -- it still fails. I have built gcc-6-branch r235040 with gengtype-lex.c from the RC -- it works.
But if gengtype-lex.c is not it, what is it then? I can't see how the generated man pages or *.html files or *.gmo or *.info files could affect it, so is it the pathname? If you check out r235040 into the same directory as you tested the tarball in, does that work? Or is it the LAST_UPDATED file missing in the rc tarball?
I'm completely confused as well. The bits seem to be identical. The only other obvious difference is ordering of timestamps of the source files that would cause Make to build files in a different order.
The tarball contains LAST_UPDATED, although with different contents. I previously copied gcc/REVISION (which is referenced by the Makefile) from the svn checkout to the RC. That showed no difference.
Flex 2.6.0 works with --enable-checking=yes, but may not work with --enable-checking=release. I believe that Flex may be the culprit. If the current bootstrap confirms that, I am going to bootstrap with gengtype-lex.c compiled with -fsigned-char.
It definitely is Flex. gcc-6-branch r235040 and r235340 fail when built with Flex 2.6.0. gcc-6.0.1-RC-20160415 fails using the supplied gengtype-lex.c created with Flex 2.5.37.
Marek tried to reproduce this using the RC1 tarball, but it seems it went through comparison just fine; the configure line was:

/home/xxxxxxx/gcc-6.0.1-RC-20160415/configure --prefix=/home/xxxxxxx/rc --enable-checking=release --with-gmp=/opt/cfarm/gmp-latest/ --with-mpc=/opt/cfarm/mpc-latest/ --with-mpfr=/opt/cfarm/mpfr-latest/

In the failed build tree, can you please do:

cd gcc
for i in gt-* gtype*.[ch]; do diff ../stage1-gcc/$i $i; done
cd ../prev-gcc
for i in gt-* gtype*.[ch]; do diff ../stage1-gcc/$i $i; done

(or change cd gcc and cd ../prev-gcc to cd stage3-gcc and cd ../stage2-gcc if upon the failure the dirs stay named stage{1,2,3}-gcc instead of {stage1-,prev-,}gcc)?
The gt* files don't differ. I normally use --disable-werror --enable-languages=c,c++,fortran,objc --with-gmp=/opt/cfarm --with-libiconv-prefix=/opt/cfarm --disable-libstdcxx-pch --with-included-gettext --enable-checking=release
(In reply to David Edelsohn from comment #38)
> The gt* files don't differ.
>
> I normally use
>
> --disable-werror --enable-languages=c,c++,fortran,objc --with-gmp=/opt/cfarm
> --with-libiconv-prefix=/opt/cfarm --disable-libstdcxx-pch
> --with-included-gettext --enable-checking=release

Used

/home/jakub/gcc-6.0.1-RC-20160415/configure --disable-werror --enable-languages=c,c++,fortran,objc --with-gmp=/opt/cfarm --with-libiconv-prefix=/opt/cfarm --disable-libstdcxx-pch --with-included-gettext --enable-checking=release

myself now (inside of /home/jakub/rc1/) and it bootstrapped just fine too.
I see that you did not have /opt/freeware/bin in your path on AIX. How did it even build without GNU Make and other build requirements?
(In reply to David Edelsohn from comment #40)
> I see that you did not have /opt/freeware/bin in your path on AIX. How did
> it even build without GNU Make and other build requirements?

I've used gmake -j64 instead of make -j64. I can retry with PATH=/opt/freeware/bin:$PATH instead, sure. Do you use relative or absolute path from the build dir to the source dir btw? On my boxes I almost always create a subdir of the source dir and do ../configure ..., but on this box I've placed the build dir next to the source dir and used absolute pathname.
Even

PATH=/opt/freeware/bin/:$PATH /home/jakub/gcc-6.0.1-RC-20160415/configure --disable-werror --enable-languages=c,c++,fortran,objc --with-gmp=/opt/cfarm --with-libiconv-prefix=/opt/cfarm --disable-libstdcxx-pch --with-included-gettext --enable-checking=release
PATH=/opt/freeware/bin/:$PATH make -j48 > LOG 2>&1 &

got through comparison without failure (still finishing bootstrap, but compare file already exists).
I tried RC2 and it again failed. I configured again with your configure command and what appears to be your build command, and it succeeded. One difference is that my normal bootstrap script still uses the contrib/gcc_build shell script, which invokes "make -jX bootstrap". I am testing whether the "bootstrap" target specifically elicits the comparison failure.
Created attachment 38338 [details]
FLEX listens to M4 envvar, unset.

Does it help to set M4=/opt/freeware/bin/m4 in the environment, so configure is forced to take GNU m4? Once upon a time, I've had similar problems here, although with an older gcc, and use the attached patch to fix it; maybe it is of some help here as well. The problem is that flex does listen to the M4 environment variable. Depending on PATH, configure may find AIX m4 (/usr/bin) or GNU m4 (/opt/freeware/bin). When M4=AIX-m4, flex-generated code breaks.
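The selection described here can be pictured with a small model. This is a simplified sketch under the assumption that an explicitly set M4 variable wins and that otherwise the first `m4` on PATH is taken; `resolve_m4` is a hypothetical helper, not code from flex or configure.

```python
import shutil


def resolve_m4(env):
    """Return the m4 program that would be picked for the given environment
    mapping, assuming an explicit M4 setting overrides the PATH search."""
    explicit = env.get("M4")
    if explicit:
        return explicit
    # Fall back to the first executable named m4 found on PATH.
    return shutil.which("m4", path=env.get("PATH", ""))
```

With `/usr/bin` ahead of `/opt/freeware/bin` in PATH and no M4 set, the model returns the AIX m4; putting `M4=/opt/freeware/bin/m4` in the environment pins the GNU one regardless of PATH order.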
I don't think the M4 env var should make a difference in this case; in the release tarballs the gengtype-lex.c file is already built (on x86_64-linux) and nothing should be changing that. That said, I've managed to reproduce the miscompare on build/ggc-none.o as an example of a very small source file, even with -g0. But it looks really weird. First of all, it seems that the system g++ 4.8 and g++ 6.1-rc2 are probably ABI incompatible; at least my attempts to mix stage1 and stage2 objects into the same binary failed miserably. I've rebuilt stage2-gcc/ cc1plus with CXXFLAGS='-g -O0', thus most if not all *.o files linked into it should be (like in stage1) built without optimizations, and still see the differences. Trying to find out where the differences start now using parallel gdb sessions.
So, my current thinking is that this is related to make -jN bootstrap doing stage1 checking by default. I guess with --enable-stage1-checking=release it would bootstrap fine, but I haven't verified that. But what I see is that if I build build/ggc-none.o with stage1 cc1plus with additional -fno-checking -g0, the result is the same as when it is built with stage2 cc1plus -g0, and similarly if it is built with stage1 cc1plus with -g0, the result is the same as when it is built with stage2 cc1plus -g0 -fchecking. I've applied the https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70594#c20 hack (only the tree.c part of it) and I can see that with -fchecking there are different DECL_UIDs (more of them too) created.
So, I believe it is the:

  /* When checking, try to get a constant value for all non-dependent
     expressions in order to expose bugs in *_dependent_expression_p
     and constexpr.  */
  if (flag_checking && cxx_dialect >= cxx11
      /* Don't do this during nsdmi parsing as it can lead to
         unexpected recursive instantiations.  */
      && !parsing_nsdmi ())
    fold_non_dependent_expr (expr);

hunk in cp/pt.c (build_non_dependent_expr), which results both in some extra DECL_UIDs being created (in theory it could be fine), but also in funcdef_no differences etc.:

#0 _Z24allocate_struct_functionP9tree_nodeb (fndecl=0x7061b280, abstract_p=false) at /home/dje/src/gcc-6.0.1-RC2/gcc/function.c:4874
#1 0x10070e60 in _Z24start_preparsed_functionP9tree_nodeS0_i (decl1=0x7061b280, attrs=0x0, flags=1) at /home/dje/src/gcc-6.0.1-RC2/gcc/cp/decl.c:14059
#2 0x105feb9c in _Z16instantiate_declP9tree_nodeib (d=0x7061b280, defer_ok=0, expl_inst_class_mem_p=false) at /home/dje/src/gcc-6.0.1-RC2/gcc/cp/pt.c:21969
#3 0x1094fd38 in _Z9mark_usedP9tree_nodei (decl=0x7061b280, complain=0) at /home/dje/src/gcc-6.0.1-RC2/gcc/cp/decl2.c:5273
#4 0x1023262c in _ZL15build_over_callP11z_candidateii (cand=0x306323c0, flags=262145, complain=0) at /home/dje/src/gcc-6.0.1-RC2/gcc/cp/call.c:7707
#5 0x10221f8c in _Z23build_new_function_callP9tree_nodePP3vecIS0_5va_gc8vl_embedEbi (fn=0x7045e468, args=0x2ff200f0, koenig_p=false, complain=0) at /home/dje/src/gcc-6.0.1-RC2/gcc/cp/call.c:4152
#6 0x100dba64 in _Z16finish_call_exprP9tree_nodePP3vecIS0_5va_gc8vl_embedEbbi (fn=0x7045e468, args=0x2ff200f0, disallow_virtual=false, koenig_p=false, complain=0) at /home/dje/src/gcc-6.0.1-RC2/gcc/cp/semantics.c:2461
#7 0x105e30ac in _Z21tsubst_copy_and_buildP9tree_nodeS0_iS0_bb (t=0x7061e180, args=0x0, complain=0, in_decl=0x0, function_p=false, integral_constant_expression_p=true) at /home/dje/src/gcc-6.0.1-RC2/gcc/cp/pt.c:16629
#8 0x105a4bf8 in _Z39instantiate_non_dependent_expr_internalP9tree_nodei (expr=0x7061e180, complain=0) at /home/dje/src/gcc-6.0.1-RC2/gcc/cp/pt.c:5640
#9 0x114c1278 in _Z23fold_non_dependent_exprP9tree_node (t=0x7061e180) at /home/dje/src/gcc-6.0.1-RC2/gcc/cp/constexpr.c:4388
#10 0x10606e08 in _Z24build_non_dependent_exprP9tree_node (expr=0x7061e180) at /home/dje/src/gcc-6.0.1-RC2/gcc/cp/pt.c:23631

As stage1 is built with the default --enable-stage1-checking, the difference between the stage1 and stage2 auto-host.h is: CHECKING_P, ENABLE_GC_CHECKING, ENABLE_GIMPLE_CHECKING, ENABLE_RTL_FLAG_CHECKING, ENABLE_TREE_CHECKING, ENABLE_TYPES_CHECKING, and allocate_struct_function allocates a new cfun->funcdef_no, which is then emitted in the LFB..*/LFE..* labels, and seems to affect the debug info (which on AIX is emitted in both stage2 and stage3) etc.

Additionally, I wonder if retrieve_specialization's

  if (flag_checking)
    verify_unstripped_args (args);

couldn't also affect bootstrap, but I haven't seen that during bootstrap.

The question is what to do: comment out the fold_non_dependent_expr stuff altogether on the release branch (at least temporarily), guard it with some other option (which unlike -fchecking would be documented to possibly affect code generation and which would be based on some other configure checking flag (and the default for that flag would always be the same between stage1 and later stages)), or something different? I'm in any case starting a bootstrap now with the fold_non_dependent_expr call commented out.
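The mechanism can be illustrated with a toy model. This is a deliberately simplified sketch, not GCC code: `ToyCompiler` and its methods are hypothetical stand-ins that only borrow the names from the analysis above. It shows how a checking-only evaluation that instantiates a function consumes a funcdef_no, so every later LFB/LFE label shifts even though the generated code is otherwise the same.

```python
class ToyCompiler:
    """Toy model: each allocated function gets the next funcdef_no, and the
    number is baked into the emitted LFB/LFE labels."""

    def __init__(self, flag_checking):
        self.flag_checking = flag_checking
        self.next_funcdef_no = 0
        self.output = []

    def allocate_function(self, name, emit=True):
        number = self.next_funcdef_no
        self.next_funcdef_no += 1
        if emit:
            self.output.append(f"LFB{number}: {name}")
            self.output.append(f"LFE{number}:")
        return number

    def build_non_dependent_expr(self, callee):
        # Stand-in for the checking-only fold: evaluating the expression
        # instantiates `callee`, which uses up a funcdef_no although
        # nothing is emitted for it at this point.
        if self.flag_checking:
            self.allocate_function(callee, emit=False)

    def compile(self, functions):
        """`functions` is a list of (name, callee_or_None) pairs."""
        for name, callee in functions:
            if callee is not None:
                self.build_non_dependent_expr(callee)
            self.allocate_function(name)
        return self.output


unit = [("f", None), ("g", "helper<int>"), ("h", None)]
checked = ToyCompiler(flag_checking=True).compile(unit)
unchecked = ToyCompiler(flag_checking=False).compile(unit)
print(checked == unchecked)
```

In this model a compiler built with checking (like stage1) and one built without (like stage2) emit different label numbers for `g` and `h` from the same input, which is the kind of stage2 versus stage3 object difference the bootstrap comparison reports.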
Commenting out the fold_non_dependent_expr call seems to work for me using the build method that regularly was failing before.
Can we add some testcases to ensure that -fchecking and similar flags don't accidentally affect code generation due to future changes?
Author: jakub
Date: Tue Apr 26 06:08:20 2016
New Revision: 235429

URL: https://gcc.gnu.org/viewcvs?rev=235429&root=gcc&view=rev
Log:
	PR bootstrap/70704
	* pt.c (build_non_dependent_expr): Temporarily disable flag_checking guarded code.

Modified:
    branches/gcc-6-branch/gcc/cp/ChangeLog
    branches/gcc-6-branch/gcc/cp/pt.c
Author: jakub
Date: Tue Apr 26 06:10:43 2016
New Revision: 235430

URL: https://gcc.gnu.org/viewcvs?rev=235430&root=gcc&view=rev
Log:
	PR bootstrap/70704
	* configure.ac (--enable-stage1-checking): For --disable-checking or implicit --enable-checking, make sure extra flag matches in between stage1 and later checking.
	* configure: Regenerated.
gcc/
	* configure.ac (--enable-checking): Document extra flag, for non-release builds default to --enable-checking=yes,extra. If misc checking and extra checking, define CHECKING_P to 2 instead of 1.
	* common.opt (fchecking=): Add.
	* doc/invoke.texi (-fchecking=): Document.
	* doc/install.texi: Document --enable-checking changes.
	* configure: Regenerated.
	* config.in: Regenerated.
gcc/cp/
	* pt.c (build_non_dependent_expr): Use flag_checking > 1 instead of just flag_checking.

Modified:
    trunk/ChangeLog
    trunk/configure
    trunk/configure.ac
    trunk/gcc/ChangeLog
    trunk/gcc/common.opt
    trunk/gcc/config.in
    trunk/gcc/configure
    trunk/gcc/configure.ac
    trunk/gcc/cp/ChangeLog
    trunk/gcc/cp/pt.c
    trunk/gcc/doc/install.texi
    trunk/gcc/doc/invoke.texi
Fixed. As for adding testcases, that is not trivial; we'd need to add a new framework that would compile some tests both with -fchecking and -fno-checking, strip/sanitize the result and compare that.
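A rough sketch of what such a compare step might look like follows. It is an assumption-laden illustration, not an existing GCC testsuite facility: `sanitize` and `checking_changes_codegen` are hypothetical names, the compiler is passed in as a callable (for example a wrapper around `g++ -S`) so the sketch does not depend on any particular driver, and the set of directives dropped by the sanitizer (`.ident`, `.file`) is a guess that a real framework would need to tune per target.

```python
def sanitize(asm_text):
    """Drop lines assumed to vary legitimately between runs (.ident and
    .file directives, blank lines) and strip surrounding whitespace."""
    kept = []
    for line in asm_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith((".ident", ".file")):
            continue
        kept.append(stripped)
    return kept


def checking_changes_codegen(compile_to_asm, source):
    """Return True if -fchecking alters the sanitized assembly.

    compile_to_asm(source, flags) must return assembly text; it is a
    hypothetical callable supplied by the test harness.
    """
    with_checking = sanitize(compile_to_asm(source, ["-fchecking"]))
    without_checking = sanitize(compile_to_asm(source, ["-fno-checking"]))
    return with_checking != without_checking
```

A test would then fail whenever `checking_changes_codegen` returns True for any source file in the corpus.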
Author: jakub
Date: Sun May 1 10:49:25 2016
New Revision: 235692

URL: https://gcc.gnu.org/viewcvs?rev=235692&root=gcc&view=rev
Log:
	PR bootstrap/70704
	* configure.ac (--enable-stage1-checking): Add missing --enable-checking=.
	* configure: Regenerated.

Modified:
    trunk/ChangeLog
    trunk/configure
    trunk/configure.ac