Bug 35518 - [4.4 Regression] FAIL: gcc.c-torture/execute/20040709-1.c execution at -O2 and above
Summary: [4.4 Regression] FAIL: gcc.c-torture/execute/20040709-1.c execution at -O2 an...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 4.4.0
: P1 blocker
Target Milestone: 4.4.0
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2008-03-09 22:22 UTC by John David Anglin
Modified: 2008-11-22 20:58 UTC (History)
5 users (show)

See Also:
Host:
Target: hppa2.0w-hp-hpux11.11, spu-elf
Build:
Known to work:
Known to fail:
Last reconfirmed: 2008-03-31 06:30:07


Attachments
20040709-1.c (606 bytes, text/plain)
2008-05-29 18:37 UTC, dave
Details
20040709-2i.c (427 bytes, text/plain)
2008-06-22 20:04 UTC, dave
Details
patch (1.13 KB, patch)
2008-06-23 13:39 UTC, Richard Biener
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description John David Anglin 2008-03-09 22:22:21 UTC
Executing on host: /test/gnu/gcc/objdir/gcc/xgcc -B/test/gnu/gcc/objdir/gcc/ /te
st/gnu/gcc/gcc/gcc/testsuite/gcc.c-torture/execute/20040709-1.c  -w  -O2  -fno-s
how-column  -lm   -o /test/gnu/gcc/objdir/gcc/testsuite/gcc/20040709-1.x2    (ti
meout = 300)
PASS: gcc.c-torture/execute/20040709-1.c compilation,  -O2 
Setting LD_LIBRARY_PATH to :/test/gnu/gcc/objdir/gcc::/test/gnu/gcc/objdir/gcc
FAIL: gcc.c-torture/execute/20040709-1.c execution,  -O2 

Breakpoint 2, 0x7afea688 in abort () from /usr/lib/libc.2
(gdb) bt
#0  0x7afea688 in abort () from /usr/lib/libc.2
#1  0x00006384 in testN ()
    at /test/gnu/gcc/gcc/gcc/testsuite/gcc.c-torture/execute/20040709-1.c:103
#2  0x000078ec in main ()
    at /test/gnu/gcc/gcc/gcc/testsuite/gcc.c-torture/execute/20040709-1.c:133

Revision 132898 was ok.
Comment 1 H.J. Lu 2008-03-11 14:24:09 UTC
Which revision isn't OK?
Comment 2 Andrew Pinski 2008-03-30 15:33:51 UTC
I also see this failure on powerpc-darwin after changing MOVE_RATIO.
Comment 3 Andrew Pinski 2008-03-31 06:30:07 UTC
retmeN is being miscompiled by SRA, at least for powerpc-darwin with a changed MOVE_RATIO.
retmeN (x)
{
  <unnamed-unsigned:29> x$i;
  <unnamed-unsigned:23> x$j;
  long long unsigned int SR.21;

<bb 2>:
  x$j = x.j;
  x$i = x.i;
  x.j = x$j;
  x.i = x$i;
  SR.21 = VIEW_CONVERT_EXPR<long long unsigned int>(x);
  <retval>.i = x$i;
  <retval>.j = x$j;
  <retval> = VIEW_CONVERT_EXPR<struct N>((long long unsigned int) ((<unnamed-unsigned:12>) (SR.21 >> 52) >> 4) << 52 | VIEW_CONVERT_EXPR<long long unsigned int>(<retval>) & 4503599627370495);
  return <retval>;

}

This also produces way crappy code rather just doing a non bit-field copy.  I don't know what SRA is trying to prove here.  The Move ratio value I changed rs6000 to is 5 and this is definitely more than 5 words.
Comment 4 Andrew Pinski 2008-04-02 20:01:24 UTC
spu-elf also fails the same way.
Comment 5 John David Anglin 2008-04-27 18:01:31 UTC
I first saw this in 133010.
Comment 6 dave 2008-04-29 16:08:46 UTC
Subject: Re:  [4.4 Regression] FAIL: gcc.c-torture/execute/20040709-1.c execution at -O2 and above

> I first saw this in 133010.

It was introduced in 132969.

Dave
Comment 7 Richard Biener 2008-04-30 11:50:29 UTC
Can you reduce the testcase so I can try to analyze this with a cross?
Comment 8 dave 2008-05-29 18:37:16 UTC
Subject: Re:  [4.4 Regression] FAIL:
	gcc.c-torture/execute/20040709-1.c execution at -O2 and above

> Can you reduce the testcase so I can try to analyze this with a cross?

Here is a somewhat reduced testcase.  The problem is with long long.

Dave
Comment 9 dave 2008-05-29 18:37:16 UTC
Created attachment 15703 [details]
20040709-1.c
Comment 10 John David Anglin 2008-06-22 20:02:21 UTC
The same change gcc.c-torture/execute/20040709-2.c on all hppa targets
(both 32 and 64 bit).  On hppa64-hp-hpux11.11, testK is the first test
to fail.  In this case, it appears retmeK is miscompiled.
Comment 11 dave 2008-06-22 20:04:43 UTC
Subject: Re:  [4.4 Regression] FAIL:
	gcc.c-torture/execute/20040709-1.c execution at -O2 and above

> to fail.  In this case, it appears retmeK is miscompiled.

Simplified test attached.

Dave
Comment 12 dave 2008-06-22 20:04:43 UTC
Created attachment 15805 [details]
20040709-2i.c
Comment 13 John David Anglin 2008-06-22 20:44:19 UTC
retmeK (struct K x)
{
  <unnamed-unsigned:10> SR.14;
  <unnamed-unsigned:15> SR.13;
  <unnamed-unsigned:7> SR.12;
  <unnamed-unsigned:10> SR.11;
  <unnamed-unsigned:15> SR.10;
  <unnamed-unsigned:15> x$i;
  <unnamed-unsigned:10> x$j;
  <unnamed-unsigned:7> x$B0F7;
  struct K D.1258;

<bb 2>:
  x$B0F7_1 = 0;
  x$j_2 = 0;
  x$i_3 = 0;
  SR.13_4 = x.i;
  x$i_5 = SR.13_4;
  SR.14_6 = x.j;
  x$j_7 = SR.14_6;
  x$B0F7_8 = BIT_FIELD_REF <x, 7, 0>;
  D.1258.i ={v} x$i_5;
  D.1258.j ={v} x$j_7;
  SR.12_9 = x$B0F7_8 >> 1;
  BIT_FIELD_REF <D.1258, 7, 0> ={v} SR.12_9;
  return D.1258;

}

Don't understand...

At -O0, we just have

retmeK (struct K x)
{
  struct K D.1260;

<bb 2>:
  D.1260 = x;
  return D.1260;

}

and it works.
Comment 14 rguenther@suse.de 2008-06-22 21:13:44 UTC
Subject: Re:  [4.4 Regression] FAIL:
 gcc.c-torture/execute/20040709-1.c execution at -O2 and above

On Sun, 22 Jun 2008, danglin at gcc dot gnu dot org wrote:

> ------- Comment #13 from danglin at gcc dot gnu dot org  2008-06-22 20:44 -------
> retmeK (struct K x)
> {
>   <unnamed-unsigned:10> SR.14;
>   <unnamed-unsigned:15> SR.13;
>   <unnamed-unsigned:7> SR.12;
>   <unnamed-unsigned:10> SR.11;
>   <unnamed-unsigned:15> SR.10;
>   <unnamed-unsigned:15> x$i;
>   <unnamed-unsigned:10> x$j;
>   <unnamed-unsigned:7> x$B0F7;
>   struct K D.1258;
> 
> <bb 2>:
>   x$B0F7_1 = 0;
>   x$j_2 = 0;
>   x$i_3 = 0;
>   SR.13_4 = x.i;
>   x$i_5 = SR.13_4;
>   SR.14_6 = x.j;
>   x$j_7 = SR.14_6;
>   x$B0F7_8 = BIT_FIELD_REF <x, 7, 0>;
>   D.1258.i ={v} x$i_5;
>   D.1258.j ={v} x$j_7;
>   SR.12_9 = x$B0F7_8 >> 1;
>   BIT_FIELD_REF <D.1258, 7, 0> ={v} SR.12_9;
>   return D.1258;
> 
> }
> 
> Don't understand...
> 
> At -O0, we just have
> 
> retmeK (struct K x)
> {
>   struct K D.1260;
> 
> <bb 2>:
>   D.1260 = x;
>   return D.1260;
> 
> }
> 
> and it works.

Well, SRA is broken (cost-wise at least) since lxos changes.

Richard.
Comment 15 dave 2008-06-22 21:34:08 UTC
Subject: Re:  [4.4 Regression] FAIL: gcc.c-torture/execute/20040709-1.c execution at -O2 and above

> >   x$B0F7_8 = BIT_FIELD_REF <x, 7, 0>;
> >   D.1258.i ={v} x$i_5;
> >   D.1258.j ={v} x$j_7;
> >   SR.12_9 = x$B0F7_8 >> 1;
> >   BIT_FIELD_REF <D.1258, 7, 0> ={v} SR.12_9;
> >   return D.1258;

> Well, SRA is broken (cost-wise at least) since lxos changes.

Why the shift?  It seems incorrect.

SRA also seems to be manipulating the k and l fields together for
some reason.  Why not all the fields since K is packed and the total
bit length is less than a word?

Dave
Comment 16 rguenther@suse.de 2008-06-22 22:23:38 UTC
Subject: Re:  [4.4 Regression] FAIL:
 gcc.c-torture/execute/20040709-1.c execution at -O2 and above

On Sun, 22 Jun 2008, dave at hiauly1 dot hia dot nrc dot ca wrote:

> 
> 
> ------- Comment #15 from dave at hiauly1 dot hia dot nrc dot ca  2008-06-22 21:34 -------
> Subject: Re:  [4.4 Regression] FAIL: gcc.c-torture/execute/20040709-1.c
> execution at -O2 and above
> 
> > >   x$B0F7_8 = BIT_FIELD_REF <x, 7, 0>;
> > >   D.1258.i ={v} x$i_5;
> > >   D.1258.j ={v} x$j_7;
> > >   SR.12_9 = x$B0F7_8 >> 1;
> > >   BIT_FIELD_REF <D.1258, 7, 0> ={v} SR.12_9;
> > >   return D.1258;
> 
> > Well, SRA is broken (cost-wise at least) since lxos changes.
> 
> Why the shift?  It seems incorrect.

It looks like it is only assigning the 6-bit part of the k, l
combination.  Is the above after the SRA pass in question or after
some more optimizations?

Eventually this is a bit-field-ref vs. BITS_BIG_ENDIAN/BYTES_BIG_ENDIAN
issue -- it was never clear to me how the bit-field position in
a bit-field-ref relates to those.

> SRA also seems to be manipulating the k and l fields together for
> some reason.  Why not all the fields since K is packed and the total
> bit length is less than a word?

The cost metric is of course simply broken.

I'll try to investigate at some point with a cross.

Richard.
Comment 17 dave 2008-06-22 22:43:58 UTC
Subject: Re:  [4.4 Regression] FAIL: gcc.c-torture/execute/20040709-1.c execution at -O2 and above

> > ------- Comment #15 from dave at hiauly1 dot hia dot nrc dot ca  2008-06-22 21:34 -------
> > Subject: Re:  [4.4 Regression] FAIL: gcc.c-torture/execute/20040709-1.c
> > execution at -O2 and above
> > 
> > > >   x$B0F7_8 = BIT_FIELD_REF <x, 7, 0>;
> > > >   D.1258.i ={v} x$i_5;
> > > >   D.1258.j ={v} x$j_7;
> > > >   SR.12_9 = x$B0F7_8 >> 1;
> > > >   BIT_FIELD_REF <D.1258, 7, 0> ={v} SR.12_9;
> > > >   return D.1258;
> > 
> > > Well, SRA is broken (cost-wise at least) since lxos changes.
> > 
> > Why the shift?  It seems incorrect.
> 
> It looks like it is only assigning the 6-bit part of the k, l
> combination.  Is the above after the SRA pass in question or after
> some more optimizations?

It appears first in the esra pass dump:

;; Function retmeK (retmeK)

Initial instantiation for x
Using element-copy for x
  x$B0F7 -> x$B0F7
  x.j -> x$j
  x.i -> x$i
Initial instantiation for D.1258
Using block-copy for D.1258



Symbols to be put in SSA form

{ x x$B0F7 x$j x$i SR.10 SR.11 SR.12 SR.13 SR.14 }


Incremental SSA update started at block: 0

Number of blocks in CFG: 3
Number of blocks to update: 2 ( 67%)

Dave
Comment 18 Richard Biener 2008-06-23 13:39:03 UTC
Created attachment 15807 [details]
patch

This "fixes" it for me.  Can you check on hppa?
Comment 19 dave 2008-06-24 19:10:51 UTC
Subject: Re:  [4.4 Regression] FAIL: gcc.c-torture/execute/20040709-1.c execution at -O2 and above

> This "fixes" it for me.  Can you check on hppa?

It also works on hppa.  Tested on hhpa-unknown-linux-gnu and
hppa64-hp-hpux11.11.

Thanks,
Dave
Comment 20 Richard Biener 2008-06-25 08:41:58 UTC
Subject: Bug 35518

Author: rguenth
Date: Wed Jun 25 08:41:14 2008
New Revision: 137100

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=137100
Log:
2008-06-25  Richard Guenther  <rguenther@suse.de>

	PR tree-optimization/35518
	* fold-const.c (fold_ternary): Strip trivial BIT_FIELD_REFs.
	* tree-sra.c (instantiate_element): Use fold_build3 to build
	BIT_FIELD_REFs.
	(try_instantiate_multiple_fields): Likewise.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/fold-const.c
    trunk/gcc/tree-sra.c

Comment 21 Richard Biener 2008-06-25 10:05:39 UTC
Fixed.
Comment 22 Alexandre Oliva 2008-06-25 20:20:43 UTC
Sorry that it took me so long to look at this.

Richi, I have a feeling that your patch will just paper over the problem.

See, if we take a bit-range that's not the entire bit-field, it will emit the same shifts, and it will break in the same way.

The problem is that there's a disconnect between the width of the underlying mode/type and the width of the variable in which it is held.

When I introduced bit-fields in SRA, one of my goals was to introduce scalars for "remaining" fields (i.e., those that didn't get scalar versions of their own) in such a way that no extraneous shifting was needed: we'd just extract the bits without shifting them around, for this would be just wasted computation.  It sufficed to apply masks to recombine these remaining fields into their proper place.

Now, when you changed type from a mode-sized type to a nonstandard bitfield-sized type and replaced some bitfield references with plain conversions, you broke this property and removed the fix-ups that would have taken care of ensuring the shifts (if any) matched.

At least that's my impression from looking at the dumps posted here, your recent patch to work around the exposed problem and your earlier patch that introduced it.
Comment 23 rguenther@suse.de 2008-06-25 21:48:59 UTC
Subject: Re:  [4.4 Regression] FAIL:
 gcc.c-torture/execute/20040709-1.c execution at -O2 and above

On Wed, 25 Jun 2008, aoliva at gcc dot gnu dot org wrote:

> ------- Comment #22 from aoliva at gcc dot gnu dot org  2008-06-25 20:20 -------
> Sorry that it took me so long to look at this.
>
> Richi, I have a feeling that your patch will just paper over the problem.

I know.

> See, if we take a bit-range that's not the entire bit-field, it will emit the
> same shifts, and it will break in the same way.
>
> The problem is that there's a disconnect between the width of the underlying
> mode/type and the width of the variable in which it is held.
>
> When I introduced bit-fields in SRA, one of my goals was to introduce scalars
> for "remaining" fields (i.e., those that didn't get scalar versions of their
> own) in such a way that no extraneous shifting was needed: we'd just extract
> the bits without shifting them around, for this would be just wasted
> computation.  It sufficed to apply masks to recombine these remaining fields
> into their proper place.
>
> Now, when you changed type from a mode-sized type to a nonstandard
> bitfield-sized type and replaced some bitfield references with plain
> conversions, you broke this property and removed the fix-ups that would have
> taken care of ensuring the shifts (if any) matched.

The _result_ type is what I changed.  The result type of a BIT_FIELD_REF
needs to match the bitfield width.  If you can point out where I changed
the field type that is referenced then you have found the error.

Richard.
Comment 24 Alexandre Oliva 2008-06-26 00:33:58 UTC
It's not just the result type that changed.  You actually changed the type of the variable created to hold the group of bit fields, out of which we further extract members that were not mapped to separate variables.  This might affect bitfield simplifications based on mode size rather than type width.  I can't say that's it, but I know I may have based some code on this assumption that you broke.

It also seems to me that this change to the base type of the variable breaks sra_build_elt_assignment(), because the by-design conditions might no longer be met.  Finally, I don't see how you could assume that the else block for the if full-width bit-field could be extracted with as little as a cast.

This is what jumped at me at first.  I haven't actually built compilers based on the state before and after your patch to tell whether that's it, but these are the most likely culprits.

I hope this helps,
Comment 25 rguenther@suse.de 2008-06-26 09:33:37 UTC
Subject: Re:  [4.4 Regression] FAIL:
 gcc.c-torture/execute/20040709-1.c execution at -O2 and above

On Thu, 26 Jun 2008, aoliva at gcc dot gnu dot org wrote:

> ------- Comment #24 from aoliva at gcc dot gnu dot org  2008-06-26 00:33 -------
> It's not just the result type that changed.  You actually changed the type of
> the variable created to hold the group of bit fields, out of which we further
> extract members that were not mapped to separate variables.  This might affect
> bitfield simplifications based on mode size rather than type width.  I can't
> say that's it, but I know I may have based some code on this assumption that
> you broke.
>
> It also seems to me that this change to the base type of the variable breaks
> sra_build_elt_assignment(), because the by-design conditions might no longer be
> met.  Finally, I don't see how you could assume that the else block for the if
> full-width bit-field could be extracted with as little as a cast.
>
> This is what jumped at me at first.  I haven't actually built compilers based
> on the state before and after your patch to tell whether that's it, but these
> are the most likely culprits.
>
> I hope this helps,

No, sorry.  Please point me to the place where I changed the type of the
variable created to hold the group of bit fields.

Richard.
Comment 26 Alexandre Oliva 2008-06-26 18:13:49 UTC
The place where you said you were just changing the type of the result.  Note that it's that type that determines the type of the scalar variable created to hold the group.  If it differs from the expected type, the type of the BIT_FIELD_EXPR is adjusted *after* the temporary is created in instantiate_missing_elements_1.
Comment 27 Andrew Pinski 2008-08-16 23:41:34 UTC
Fixed for spu between:
-LAST_UPDATED: Thu May  8 15:05:11 UTC 2008 (revision 135088)
+LAST_UPDATED: Sun Jun 29 21:32:49 UTC 2008 (revision 137266)

And for PPC (with the cost change);
-LAST_UPDATED: Mon Jun 23 19:12:32 UTC 2008 (revision 137049)
+LAST_UPDATED: Tue Jul  8 16:22:03 UTC 2008 (revision 137638)

Any news on a patch that does not paper over the issue?
Comment 28 John David Anglin 2008-09-01 17:40:24 UTC
On hppa64-hp-hpux11.11, the test still fails at certain optimizations:

FAIL: gcc.c-torture/execute/20040709-1.c execution,  -O0 
FAIL: gcc.c-torture/execute/20040709-1.c execution,  -O1 

# ./xgcc -B./ -v
Reading specs from ./specs
Target: hppa64-hp-hpux11.11
Configured with: ../gcc/configure --with-gnu-as --with-as=/opt/gnu64/bin/as --with-ld=/usr/ccs/bin/ld --enable-shared --disable-nls --with-local-prefix=/opt/gnu64 --prefix=/opt/gnu64/gcc/gcc-4.4.0 --build=hppa64-hp-hpux11.11 --enable-threads=posix --disable-libmudflap --with-gmp=/opt/gnu64/gcc/gcc-4.4.0 --enable-languages=c++
Thread model: posix
gcc version 4.4.0 20080831 (experimental) [trunk revision 139820] (GCC) 
Comment 29 dave 2008-09-01 18:17:29 UTC
Subject: Re:  [4.4 Regression] FAIL: gcc.c-torture/execute/20040709-1.c execution at -O2 and above

> On hppa64-hp-hpux11.11, the test still fails at certain optimizations:
> 
> FAIL: gcc.c-torture/execute/20040709-1.c execution,  -O0 
> FAIL: gcc.c-torture/execute/20040709-1.c execution,  -O1 

I'm also seeing the following two fails which appear at first sight to
be related:

FAIL: gcc.c-torture/execute/920908-2.c execution,  -O0 
FAIL: gcc.c-torture/execute/920908-2.c execution,  -O1 

Dave
Comment 30 rguenther@suse.de 2008-09-02 09:16:20 UTC
Subject: Re:  [4.4 Regression] FAIL:
 gcc.c-torture/execute/20040709-1.c execution at -O2 and above

On Mon, 1 Sep 2008, dave at hiauly1 dot hia dot nrc dot ca wrote:

> ------- Comment #29 from dave at hiauly1 dot hia dot nrc dot ca  2008-09-01 18:17 -------
> Subject: Re:  [4.4 Regression] FAIL: gcc.c-torture/execute/20040709-1.c
> execution at -O2 and above
> 
> > On hppa64-hp-hpux11.11, the test still fails at certain optimizations:
> > 
> > FAIL: gcc.c-torture/execute/20040709-1.c execution,  -O0 
> > FAIL: gcc.c-torture/execute/20040709-1.c execution,  -O1 
> 
> I'm also seeing the following two fails which appear at first sight to
> be related:
> 
> FAIL: gcc.c-torture/execute/920908-2.c execution,  -O0 
> FAIL: gcc.c-torture/execute/920908-2.c execution,  -O1 

Interesting.  The -O0 failure hints at either wrong IL from the start,
problems with expansion or with the backend itself (expansion of
bitfield operations nowadays expects to be able to truncate intermediate
results to the respective bitfield precision, in former times this
was not done).

Please try to reduce these large testcases and analyze the -O0 failure.

Thanks,
Richard.
Comment 31 Jakub Jelinek 2008-10-23 15:13:08 UTC
Both testcases involve passing of small structures, so might as well be the PA small struct passing bug.
Comment 32 dave 2008-10-23 16:00:39 UTC
Subject: Re:  [4.4 Regression] FAIL: gcc.c-torture/execute/20040709-1.c execution at -O2 and above

> Both testcases involve passing of small structures, so might as well be the PA
> small struct passing bug.

Both testcases are fixed by Jakub's proposed fix for PR middle-end/37316.

Dave
Comment 33 Jakub Jelinek 2008-10-29 08:29:30 UTC
In this case they should be fixed with your backend change too.
Comment 34 dave 2008-10-29 15:00:58 UTC
Subject: Re:  [4.4 Regression] FAIL: gcc.c-torture/execute/20040709-1.c execution at -O2 and above

> In this case they should be fixed with your backend change too.

No, my change was only for the 64-bit targets.  32-bit targets
including hppa2.0w-hp-hpux11.11 pad downward.

The reason this PR is still open is comment #22.

Dave
Comment 35 Steven Bosscher 2008-11-21 21:29:14 UTC
So is there a test case with current top-of-trunk that fails?  This is marked as a P1 regression, but IIUC we don't even have a test case, after Jakub's fix for PR37316 ?
Comment 36 dave 2008-11-21 22:26:16 UTC
Subject: Re:  [4.4 Regression] FAIL: gcc.c-torture/execute/20040709-1.c execution at -O2 and above

> ------- Comment #35 from steven at gcc dot gnu dot org  2008-11-21 21:29 -------
> So is there a test case with current top-of-trunk that fails?  This is marked
> as a P1 regression, but IIUC we don't even have a test case, after Jakub's fix
> for PR37316 ?

As noted in comment #34, the original PR was for the 32-bit target
hppa2.0w-hp-hpux11.11.  The small struct fix that I applied for PR37316
doesn't affect this target.

Unless Alex can come up with a testcase to demonstrate his point, I
think the PR should be closed.

As a side note, I might have missed it, but I don't think Jakub's
change has been approved.  I can improve the backend code if it is
approved.

Dave
Comment 37 Jakub Jelinek 2008-11-21 23:18:58 UTC
It hasn't been, I've pinged it 10 days ago, will try again next Monday.
Comment 38 Jakub Jelinek 2008-11-22 20:58:12 UTC
Closing, if somebody comes up with a testcase, please reopen.