Bug 18574 - [4.0 Regression] bootstrap comparison failed
Summary: [4.0 Regression] bootstrap comparison failed
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end
Version: 4.0.0
Importance: P2 critical
Target Milestone: 4.0.0
Assignee: Jeffrey A. Law
URL:
Keywords: build, wrong-code
Depends on:
Blocks:
 
Reported: 2004-11-20 00:42 UTC by H.J. Lu
Modified: 2004-11-21 18:12 UTC

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2004-11-20 14:12:48


Attachments
PPP (352 bytes, text/plain)
2004-11-21 15:03 UTC, Jeffrey A. Law

Description H.J. Lu 2004-11-20 00:42:48 UTC
With gcc 4.0 from CVS at Sat Nov 20 00:13:50 UTC 2004, I got

Bootstrap comparison failure!
./fold-const.o differs
cp/decl.o differs
make[3]: *** [gnucompare] Error 1
make[3]: Leaving directory `/export/build/gnu/gcc/build-x86_64-linux/gcc'
make[2]: *** [bootstrap] Error 2

The differences are in debug sections.
Comment 1 Eric Botcazou 2004-11-20 12:59:03 UTC
I can't reproduce with:

cat LAST_UPDATED
Sat Nov 20 11:33:24 CET 2004
Sat Nov 20 10:33:24 UTC 2004

Configured with: /home/eric/cvs/gcc/configure amd64-mandrake-linux-gnu
--prefix=/home/eric/install/gcc/native --enable-__cxa_atexit
--enable-languages=c,c++,objc,f95,java
--enable-checking=assert,misc,tree,rtl,rtlflag --disable-libmudflap
Thread model: posix
gcc version 4.0.0 20041120 (experimental)

However, I'm seeing a bootstrap comparison failure on sparc-sun-solaris2.5.1:

Bootstrap comparison failure!
./fold-const.o differs
cp/typeck.o differs

and sparc-sun-solaris2.6:

Bootstrap comparison failure!
./fold-const.o differs

but not on sparc*-sun-solaris2.x (x >=7).
Comment 2 Eric Botcazou 2004-11-20 13:40:48 UTC
> However, I'm seeing a bootstrap comparison failure on sparc-sun-solaris2.5.1:
> 
> Bootstrap comparison failure!
> ./fold-const.o differs
> cp/typeck.o differs
> 
> and sparc-sun-solaris2.6:
> 
> Bootstrap comparison failure!
> ./fold-const.o differs

The differences are only in debug sections too.  And both targets use STABS,
unlike sparc*-sun-solaris2.x (x>=7) that have switched to DWARF-2.
Comment 3 Serge Belyshev 2004-11-20 14:01:46 UTC
I can confirm this on i686-pc-linux-gnu:

Bootstrap comparison failure!
./fold-const.o differs
./loop.o differs
Comment 4 Eric Botcazou 2004-11-20 14:12:47 UTC
At this point we can say that there is a problem somewhere.
Comment 5 Andreas Tobler 2004-11-20 15:47:09 UTC
I get comparison failures too on powerpc-apple-darwin7.6.0, both with and without --enable-checking:
Bootstrap comparison failure!
./combine.o differs
./convert.o differs
./fold-const.o differs
cp/parser.o differs
cp/typeck.o differs
java/parse.o differs

These failures started just after my patch to tree-vectorizer.c (my patch
bootstraps successfully).

Either this patch:
http://gcc.gnu.org/ml/gcc-cvs/2004-11/msg00935.html

or this one seems responsible:

http://gcc.gnu.org/ml/gcc-cvs/2004-11/msg00936.html

I could reproduce it:

with cvs update -D to a date before these patches -> OK.
with cvs update -D to a date after these patches -> not OK.
Comment 6 H.J. Lu 2004-11-20 17:33:20 UTC
I have verified that this patch

http://gcc.gnu.org/ml/gcc-patches/2004-11/msg01624.html

is the cause.
Comment 7 H.J. Lu 2004-11-20 21:47:47 UTC
FYI, I saw the same problem on Linux/ia64, Linux/ia32 and Linux/x86_64. I am
using the latest binutils from CVS, if that matters.
Comment 8 Andrew Pinski 2004-11-20 21:56:53 UTC
I see it on i686-pc-linux-gnu, i686-pc-openbsd3.1 and powerpc-darwin.  My Linux box has 1GB of
memory; maybe this is a GC problem, but I really doubt it.
Comment 9 Jeffrey A. Law 2004-11-20 23:20:46 UTC
Subject: Re: [4.0 Regression] bootstrap comparison failed

On Sat, 2004-11-20 at 21:56 +0000, pinskia at gcc dot gnu dot org wrote:
> ------- Additional Comments From pinskia at gcc dot gnu dot org  2004-11-20 21:56 -------
> I see it on i686-pc-linux-gnu, i686-pc-openbsd3.1 and powerpc-darwin.  My Linux box has 1GB of
> memory; maybe this is a GC problem, but I really doubt it.
Actually, it has all the classic signs of a GC issue.

jeff


Comment 10 Jeffrey A. Law 2004-11-21 04:24:33 UTC

On Sat, 2004-11-20 at 21:56 +0000, pinskia at gcc dot gnu dot org wrote:
> ------- Additional Comments From pinskia at gcc dot gnu dot org  2004-11-20 21:56 -------
> I see it on i686-pc-linux-gnu, i686-pc-openbsd3.1 and powerpc-darwin.  My Linux box has 1GB of
> memory; maybe this is a GC problem, but I really doubt it.
Unable to reproduce on my PPC box either.

I'm going to try a couple more things tonight to try and ferret out
this problem.  But if those are not successful I'm going to need
someone to do a little legwork to help me nail this thing down.

jeff


Comment 11 Jeffrey A. Law 2004-11-21 05:15:25 UTC
I've found something that might be of interest.  It's certainly odd, but
I don't yet know whether the odd behavior I'm seeing could explain the
bootstrap comparison failures.  I'm still poking...

Jeff
Comment 12 Jeffrey A. Law 2004-11-21 06:31:54 UTC
OK.  I can see a way that we might be getting these comparison failures.
We're hashing on pointers, then doing table traversals.  If the memory
layout changes, the ordering of objects in the hash table can change.

Changing the order of objects in the hash table changes the order in
which they are presented to the callbacks during a hash table traversal.

That in turn can change the ordering of incoming edges in newly created
blocks.  That in turn can change the order of PHI arguments in those
newly created blocks.

Changing the order of arguments within the PHI nodes can change how
we coalesce during the out-of-ssa translation.

At least that's the best theory I've got.  I'm testing a patch which
loses the pointer hash (we can hash on the index of the destination block).
That ought to bring stability to the hash table traversals even if the
memory map changes.  It should also be slightly faster.

Of course since I haven't been able to actually reproduce the failure I
have no way of directly knowing if my theory is correct.  I'll have to
rely on y'all and the autotester to tell me if things are better.

Anyway, I'm hoping to get those changes in tonight if testing can complete
before I crash for the night.
Comment 13 Jeffrey A. Law 2004-11-21 15:03:15 UTC

I've been unable to reproduce the comparison failures.  However, as I
outlined in an earlier message, I have come up with a scenario in which
my patch might cause a comparison failure.

This patch changes the hashing routine to use block indices rather
than hash on pointers.  That ought to stabilize the hash (and thus the
hash table traversals and SSA_NAME coalescing) in cases where it
was unstable before.

Since I've been unable to trigger the failure here, I can't say for
certain whether or not this patch fixes the bootstrap failures others
have seen.

FWIW, this has been bootstrapped and regression tested on
i686-pc-linux-gnu.


Comment 14 Jeffrey A. Law 2004-11-21 15:03:16 UTC
Created attachment 7577
PPP
Comment 15 Diego Novillo 2004-11-21 15:22:28 UTC

On Sun, 2004-11-21 at 08:02 -0700, Jeffrey A Law wrote:

> Since I've been unable to trigger the failure here, I can't say for
> certain whether or not this patch fixes the bootstrap failures others
> have seen.
> 
If it helps, one of my i686 testers bootstraps with checking disabled,
the other one with checking enabled.  The bootstrap failure occurs with
checking disabled.  This is as of last night at 1am EST.

Bootstrap comparison failure!
./tree-ssa-loop-niter.o differs
cp/decl.o differs
cp/pt.o differs
cp/typeck.o differs
java/parse.o differs
make[1]: *** [gnucompare] Error 1
make[1]: Leaving directory `/notnfs/dnovillo/sbox/gcc/bld.tobiano/gcc'
make: *** [bootstrap] Error 2


Diego.

Comment 16 Eric Botcazou 2004-11-21 15:31:14 UTC
> Created an attachment (id=7577)

Testing it on sparc-sun-solaris2.5.1, sparc-sun-solaris2.6 and
i586-redhat-linux-gnu...
Comment 17 H.J. Lu 2004-11-21 15:53:16 UTC
gnucompare has passed on Linux/ia64, Linux/ia32 and Linux/x86_64 now.
Comment 18 Andrew Pinski 2004-11-21 15:55:41 UTC
I am testing it right now on ppc-darwin too.
Comment 19 Andrew Pinski 2004-11-21 17:11:14 UTC
Fixed, so closing.
Comment 20 Eric Botcazou 2004-11-21 18:12:49 UTC
> Testing it on sparc-sun-solaris2.5.1, sparc-sun-solaris2.6 and
> i586-redhat-linux-gnu...

OK on the 3 platforms.
Comment 21 Stan Shebs 2004-11-22 00:24:49 UTC

Jeffrey A Law wrote:

>I've been unable to reproduce the comparison failures.  However, as I
>outlined in an earlier message, I have come up with a scenario in which
>my patch might cause a comparison failure.
>
>
This fixes bootstrap compare failures on Darwin; thank you for saving
me from the horrible debugging hell I was dreading when I saw
miscompares last night... :-)

Stan

>This patch changes the hashing routine to use block indices rather
>than hash on pointers.  That ought to stabilize the hash (and thus the
>hash table traversals and SSA_NAME coalescing) in cases where it
>was unstable before.
>
>Since I've been unable to trigger the failure here, I can't say for
>certain whether or not this patch fixes the bootstrap failures others
>have seen.
>
>FWIW, this has been bootstrapped and regression tested on
>i686-pc-linux-gnu.
>
>
>
>
>------------------------------------------------------------------------
>
>	* tree-ssa-threadupdate.c (redirection_data_hash): Use the
>	index of the destination block for the hash value rather than
>	hashing a pointer.
>
>Index: tree-ssa-threadupdate.c
>===================================================================
>RCS file: /cvs/gcc/gcc/gcc/tree-ssa-threadupdate.c,v
>retrieving revision 2.15
>diff -c -p -r2.15 tree-ssa-threadupdate.c
>*** tree-ssa-threadupdate.c	20 Nov 2004 12:48:13 -0000	2.15
>--- tree-ssa-threadupdate.c	21 Nov 2004 15:00:33 -0000
>*************** static hashval_t
>*** 203,209 ****
>  redirection_data_hash (const void *p)
>  {
>    edge e = ((struct redirection_data *)p)->outgoing_edge;
>!   return htab_hash_pointer (e);
>  }
>  
>  static int
>--- 203,209 ----
>  redirection_data_hash (const void *p)
>  {
>    edge e = ((struct redirection_data *)p)->outgoing_edge;
>!   return e->dest->index;
>  }
>  
>  static int
>