Bug 59470 - [4.8 Regression] libstdc++ miscompilation after r205709
[4.8 Regression] libstdc++ miscompilation after r205709
Status: RESOLVED FIXED
Product: gcc
Classification: Unclassified
Component: middle-end
4.8.3
: P1 normal
: 4.8.3
Assigned To: Not yet assigned to anyone
: wrong-code
Depends on:
Blocks: 58956
  Show dependency treegraph
 
Reported: 2013-12-11 14:13 UTC by Jakub Jelinek
Modified: 2014-01-08 10:23 UTC (History)
5 users (show)

See Also:
Host:
Target: i686-linux
Build:
Known to work: 4.8.2
Known to fail:
Last reconfirmed:


Attachments
11.ii.bz2 (134.64 KB, application/x-bzip2)
2013-12-11 14:15 UTC, Jakub Jelinek
Details
locale-inst.ii.bz2 (50.47 KB, application/x-bzip2)
2013-12-11 14:15 UTC, Jakub Jelinek
Details
The patch fixing incorrect code generation (1.80 KB, patch)
2013-12-12 02:00 UTC, Vladimir Makarov
Details | Diff
gcc49-pr59470.patch (3.31 KB, patch)
2013-12-12 09:54 UTC, Jakub Jelinek
Details | Diff
gcc48-pr59470-test.patch (3.70 KB, patch)
2013-12-12 12:56 UTC, Jakub Jelinek
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jakub Jelinek 2013-12-11 14:13:56 UTC
I've noticed recent regressions
+FAIL: 22_locale/num_put/put/char/11.cc execution test
+FAIL: 22_locale/num_put/put/char/20914.cc execution test
+FAIL: 22_locale/num_put/put/char/23953.cc execution test
+FAIL: 22_locale/num_put/put/char/3.cc execution test
+FAIL: 22_locale/num_put/put/char/9780-2.cc execution test
+FAIL: 22_locale/num_put/put/char/wrapped_env.cc execution test
+FAIL: 22_locale/num_put/put/char/wrapped_locale.cc execution test
+FAIL: 27_io/manipulators/basefield/char/1.cc execution test
on i686-linux in our vendor build, but apparently this is reproduceable also on the vanilla branches/gcc-4_8-branch.  I'll attach locale-inst.ii (the libstdc++ src/c++98 source) and 11.ii (the first of the above mentioned failing tests).

/usr/src/gcc-4.8/obj/gcc/cc1plus.205708 -O2 -fstack-protector -m32 -fPIC \
locale-inst.ii -o locale-inst.s  -march=i686 -quiet; \
gcc -o 11 11.ii -m32 locale-inst.s -lstdc++ -lm; ./11
works fine, while:
/usr/src/gcc-4.8/obj/gcc/cc1plus.205709 -O2 -fstack-protector -m32 -fPIC \
locale-inst.ii -o locale-inst.s  -march=i686 -quiet; \
gcc -o 11 11.ii -m32 locale-inst.s -lstdc++ -lm; ./11
11: /builddir/build/BUILD/gcc-4.8.2-20131209/libstdc++-v3/testsuite/22_locale/num_put/put/char/11.cc:52: void test01(): Assertion `result1 == "1,2,3,45,678"' failed.
Aborted (core dumped)
Comment 1 Jakub Jelinek 2013-12-11 14:15:19 UTC
Created attachment 31418 [details]
11.ii.bz2
Comment 2 Jakub Jelinek 2013-12-11 14:15:52 UTC
Created attachment 31419 [details]
locale-inst.ii.bz2
Comment 3 H.J. Lu 2013-12-11 14:35:31 UTC
I have been tracking 4.8 branch on Linux/i686. I didn't
see any libstdc++ failures on Fedora 19 today:

http://gcc.gnu.org/ml/gcc-testresults/2013-12/msg01024.html
Comment 4 Jakub Jelinek 2013-12-11 14:37:11 UTC
(In reply to H.J. Lu from comment #3)
> I have been tracking 4.8 branch on Linux/i686. I didn't
> see any libstdc++ failures on Fedora 19 today:
> 
> http://gcc.gnu.org/ml/gcc-testresults/2013-12/msg01024.html

Do you build libstdc++ with -fstack-protector -march=i686 though?
Comment 5 H.J. Lu 2013-12-11 14:42:48 UTC
(In reply to Jakub Jelinek from comment #4)
> (In reply to H.J. Lu from comment #3)
> > I have been tracking 4.8 branch on Linux/i686. I didn't
> > see any libstdc++ failures on Fedora 19 today:
> > 
> > http://gcc.gnu.org/ml/gcc-testresults/2013-12/msg01024.html
> 
> Do you build libstdc++ with -fstack-protector -march=i686 though?

No, I didn't.  How do I enable -fstack-protector for libstdc++?
Comment 6 Jakub Jelinek 2013-12-11 15:52:32 UTC
CXXFLAGS='-fstack-protector ...' ../configure ...
(at least I believe so, we override a bunch of other variables in the gcc.spec file).

Anyway, I've instrumented gcc so that based on env var it used the r205708 way of TER for some functions and r205709 for others and it seems for this testcase the only problematic one is
_ZNKSt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEE13_M_insert_intIlEES3_S3_RSt8ios_basecT_

On that function, I see that the r205709 change made _46, _47, _114, _119 and _126 not TERable.
That's:
  _46 = MEM[(const struct __cache_type *)prephitmp_130]._M_grouping_size;
  _47 = MEM[(const struct __cache_type *)prephitmp_130]._M_grouping;
  std::num_put<char>::_M_group_int (this_48(D), _47, _46, _45, __io_10(D), __cs_43, __cs_36, &__len);
...
  _114 = MEM[(const struct locale &)__io_10(D) + 108]._M_impl;
  std::locale::_Impl::_M_install_cache (_114, __tmp_113, __i_107);
...
  _119 = MEM[(int (*__vtbl_ptr_type) () *)_118 + 4B];
  OBJ_TYPE_REF(_119;__tmp_113->1) (__tmp_113);
...
  _126 = MEM[(int (*__vtbl_ptr_type) () *)_125 + 48B];
  _127 = OBJ_TYPE_REF(_126;__s$_M_sbuf_106->12) (__s$_M_sbuf_106, __cs_3, prephitmp_175);

All of these single use SSA_NAMEs.  I don't see why these shouldn't be TERable, function arguments necessarily are evaluated before the function is called and the aliasing stmt_may_clobber_ref_p talks about are either the side-effects of the call itself, or storing to the lhs.
So, for calls, can't we do only the first part of stmt_may_clobber_ref_p_1 handling and not the rest (i.e. the gimple_call_lhs handling there only), unless the SSA_NAME is used somewhere in the lhs of the call?
Similarly for ASMs, IMHO terring into inline asm is especially important, while your change pretty much disables it always:
  else if (gimple_code (stmt) == GIMPLE_ASM)
    return true;
What should be avoided is if SSA_NAME is used in some output operand (say as part of a reference), that I can understand why we shouldn't ter.  But otherwise?

Note, this of course doesn't explain why this PR breaks.
Comment 7 Jakub Jelinek 2013-12-11 16:28:19 UTC
For the inline asm, consider e.g.:
int a, b;
int
foo (void)
{
  int c;
  asm ("" : "=r" (c) : "rm" (a), "rm" (b) : "memory");
  return c;
}
where r205709 regresses expansion (not even combiner can fix it up, LRA cures it though), and:
int a, b, d[1024];
int
main ()
{
  asm ("movl $6, (%2); movl $1, %0" : "=r" (d[a]) : "rm" (b), "r" (&a) : "memory");
  if (d[0] != 1 || d[6] != 0)
    __builtin_abort ();
  return 0;
}
(which was miscompiled before PR58956, shall we add it into the testsuite?).
While at it, for trunk, I wonder if:
      if (is_gimple_call (stmt)
          && !((fndecl = gimple_call_fndecl (stmt))
               && DECL_BUILT_IN (fndecl)))
        cur_call_cnt++;
shouldn't also not increment for gimple_call_internal_p calls, they are builtins always.
Comment 8 Jakub Jelinek 2013-12-11 18:24:18 UTC
So, debugging and inspection shows that it is the
_ZNKSt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEE12_M_group_intEPKcjcRSt8ios_basePcS9_Ri
call in the _M_insert_int method that gets bogus arguments, in particular
the __new argument for it, which is supposed to be what the second alloca returned plus 2, has the same value as the next argument __cs, both are set to the result of the first alloca + 20 - __len (8 on this testcase).
In the assembly one can easily see it:
        movl    %ecx, 28(%esp)
        movl    -84(%ebp), %ecx
        movl    %edx, 24(%esp)  <--- here, correct value
        movl    %edx, 20(%esp)  <--- here, incorrect value
        movsbl  37(%edi), %edx
        movl    %eax, 8(%esp)
        movl    %ecx, 4(%esp)
        movl    -72(%ebp), %ecx
        movl    %edx, 12(%esp)
        movl    %ecx, (%esp)
        call    _ZNKSt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEE12_M_group_intEPKcjcRSt8ios_basePcS9_Ri@PLT
Comment 9 Jakub Jelinek 2013-12-11 18:42:56 UTC
So, it looks like a register allocation bug.  We have in -fdump-rtl-ira-slim IMHO correct:
   92: {sp:SI=sp:SI-0x30;clobber flags:CC;}
      REG_UNUSED flags:CC
   94: {r165:SI=sp:SI+0x2f;clobber flags:CC;}
      REG_UNUSED flags:CC
   95: NOTE_INSN_DELETED
   96: {r165:SI=r165:SI&0xfffffffffffffff0;clobber flags:CC;}
      REG_UNUSED flags:CC
...
  152: {r175:SI=r165:SI-r74:SI;clobber flags:CC;}
      REG_DEAD r165:SI
      REG_UNUSED flags:CC
  153: r59:SI=r175:SI+0x14
      REG_DEAD r175:SI
...
  164: {sp:SI=sp:SI-r185:SI;clobber flags:CC;}
      REG_DEAD r185:SI
      REG_UNUSED flags:CC
  166: {r189:SI=sp:SI+0x2f;clobber flags:CC;}
      REG_UNUSED flags:CC
  167: NOTE_INSN_DELETED
  168: {r189:SI=r189:SI&0xfffffffffffffff0;clobber flags:CC;}
      REG_UNUSED flags:CC
  170: {r82:SI=r189:SI+0x2;clobber flags:CC;}
      REG_DEAD r189:SI
      REG_UNUSED flags:CC
...
  174: [sp:SI+0x1c]=r190:SI
      REG_DEAD r190:SI
      REG_EQUAL frame:SI-0x8
  175: [sp:SI+0x18]=r59:SI
      REG_DEAD r59:SI
  176: [sp:SI+0x14]=r82:SI
  177: [sp:SI+0x10]=r139:SI
  178: r191:SI=sign_extend([r124:SI+0x25])
      REG_EQUIV [sp:SI+0xc]
  179: [sp:SI+0xc]=r191:SI
      REG_DEAD r191:SI
  180: [sp:SI+0x8]=r85:SI
      REG_DEAD r85:SI
  181: [sp:SI+0x4]=r86:SI
      REG_DEAD r86:SI
  182: [sp:SI]=r137:SI
  183: call [`_ZNKSt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEE12_M_group_intEPKcjcRSt8ios_basePcS9_Ri'] argc:0x20

Note that r165 pseudo is live across call to another function which is in between insn 96 and insn 152.  In -fdump-rtl-reload-slim we have incorrect:

   92: {sp:SI=sp:SI-0x30;clobber flags:CC;}
   94: {dx:SI=sp:SI+0x2f;clobber flags:CC;}
   95: NOTE_INSN_DELETED
   96: {dx:SI=dx:SI&0xfffffffffffffff0;clobber flags:CC;}
...
  441: [bp:SI-0x2c]=dx:SI
      REG_DEAD dx:SI
...
  442: dx:SI=[bp:SI-0x2c]
  152: {dx:SI=dx:SI-cx:SI;clobber flags:CC;}
      REG_DEAD dx:SI
  153: ax:SI=dx:SI+0x14
      REG_DEAD dx:SI
  461: [bp:SI-0x2c]=ax:SI
      REG_DEAD ax:SI
...
  164: {sp:SI=sp:SI-ax:SI;clobber flags:CC;}
      REG_DEAD ax:SI
  166: {ax:SI=sp:SI+0x2f;clobber flags:CC;}
  167: NOTE_INSN_DELETED
  168: {ax:SI=ax:SI&0xfffffffffffffff0;clobber flags:CC;}
  170: {ax:SI=ax:SI+0x2;clobber flags:CC;}
      REG_DEAD ax:SI
  423: dx:SI=ax:SI
      REG_DEAD ax:SI
...
  174: [sp:SI+0x1c]=cx:SI
      REG_DEAD cx:SI
      REG_EQUAL frame:SI-0x8
  460: dx:SI=[bp:SI-0x2c]
  175: [sp:SI+0x18]=dx:SI
      REG_DEAD dx:SI
  444: [bp:SI-0x2c]=dx:SI
  425: dx:SI=dx:SI
      REG_DEAD dx:SI
  176: [sp:SI+0x14]=dx:SI
      REG_DEAD dx:SI
  177: [sp:SI+0x10]=si:SI
  178: dx:SI=sign_extend([di:SI+0x25])
      REG_EQUIV [sp:SI+0xc]
  179: [sp:SI+0xc]=dx:SI
      REG_DEAD dx:SI
  180: [sp:SI+0x8]=ax:SI
      REG_DEAD ax:SI
  426: cx:SI=[bp:SI-0x54]
  181: [sp:SI+0x4]=cx:SI
      REG_DEAD cx:SI
  427: cx:SI=[bp:SI-0x48]
  182: [sp:SI]=cx:SI
      REG_DEAD cx:SI
  183: call [`_ZNKSt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEE12_M_group_intEPKcjcRSt8ios_basePcS9_Ri'] argc:0x20

The bug I see is in the 460/444 reloads for insn 175.  The correct value that insn 176 is supposed to store is live in edx register iup to insn 174,
but LRA? decides to throw away it's value when reloading insn 175 and loads there the value of former pseudo r59 from [bp-0x2c], stores that correctly into [sp+0x18] and saves to [bp-0x2c] again (why?  the value hasn't really changed).  But the old edx (pseudo r82) is lost.  Vlad, can you please have a look?
Comment 10 Jakub Jelinek 2013-12-11 19:45:58 UTC
Re: the #c7 second testcase, rth thinks it is undefined behavior since there is no sequence point.  So here is a better testcase that should have defined behavior, still before r205709 we miscompile it at -O2:

int a, b, d[1024];
int
main ()
{
  int c = a;
  asm ("movl $6, (%2); movl $1, %0" : "=r" (d[c])
       : "rm" (b), "r" (&a) : "memory");
  if (d[0] != 1 || d[6] != 0)
    __builtin_abort ();
  return 0;
}
Comment 11 Vladimir Makarov 2013-12-12 02:00:32 UTC
Created attachment 31423 [details]
The patch fixing incorrect code generation
Comment 12 Vladimir Makarov 2013-12-12 02:01:13 UTC
(In reply to Jakub Jelinek from comment #9)
>  In -fdump-rtl-reload-slim we have incorrect:
> 
>    92: {sp:SI=sp:SI-0x30;clobber flags:CC;}
>    94: {dx:SI=sp:SI+0x2f;clobber flags:CC;}
>    95: NOTE_INSN_DELETED
>    96: {dx:SI=dx:SI&0xfffffffffffffff0;clobber flags:CC;}
> ...
>   441: [bp:SI-0x2c]=dx:SI
>       REG_DEAD dx:SI
> ...
>   442: dx:SI=[bp:SI-0x2c]
>   152: {dx:SI=dx:SI-cx:SI;clobber flags:CC;}
>       REG_DEAD dx:SI
>   153: ax:SI=dx:SI+0x14
>       REG_DEAD dx:SI
>   461: [bp:SI-0x2c]=ax:SI
>       REG_DEAD ax:SI
> ...
>   164: {sp:SI=sp:SI-ax:SI;clobber flags:CC;}
>       REG_DEAD ax:SI
>   166: {ax:SI=sp:SI+0x2f;clobber flags:CC;}
>   167: NOTE_INSN_DELETED
>   168: {ax:SI=ax:SI&0xfffffffffffffff0;clobber flags:CC;}
>   170: {ax:SI=ax:SI+0x2;clobber flags:CC;}
>       REG_DEAD ax:SI
>   423: dx:SI=ax:SI
>       REG_DEAD ax:SI
> ...
>   174: [sp:SI+0x1c]=cx:SI
>       REG_DEAD cx:SI
>       REG_EQUAL frame:SI-0x8
>   460: dx:SI=[bp:SI-0x2c]
>   175: [sp:SI+0x18]=dx:SI
>       REG_DEAD dx:SI
>   444: [bp:SI-0x2c]=dx:SI
>   425: dx:SI=dx:SI
>       REG_DEAD dx:SI
>   176: [sp:SI+0x14]=dx:SI
>       REG_DEAD dx:SI
>   177: [sp:SI+0x10]=si:SI
>   178: dx:SI=sign_extend([di:SI+0x25])
>       REG_EQUIV [sp:SI+0xc]
>   179: [sp:SI+0xc]=dx:SI
>       REG_DEAD dx:SI
>   180: [sp:SI+0x8]=ax:SI
>       REG_DEAD ax:SI
>   426: cx:SI=[bp:SI-0x54]
>   181: [sp:SI+0x4]=cx:SI
>       REG_DEAD cx:SI
>   427: cx:SI=[bp:SI-0x48]
>   182: [sp:SI]=cx:SI
>       REG_DEAD cx:SI
>   183: call
> [`_ZNKSt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEE12_M_group_int
> EPKcjcRSt8ios_basePcS9_Ri'] argc:0x20
> 
> The bug I see is in the 460/444 reloads for insn 175.  The correct value
> that insn 176 is supposed to store is live in edx register iup to insn 174,
> but LRA? decides to throw away it's value when reloading insn 175 and loads
> there the value of former pseudo r59 from [bp-0x2c], stores that correctly
> into [sp+0x18] and saves to [bp-0x2c] again (why?  the value hasn't really
> changed).  But the old edx (pseudo r82) is lost.  Vlad, can you please have
> a look?

 The generated code is wrong.  The added patch fixes it but it does not fix libstdc++11 regressions.  The path results in the following code generation: 

  170: {ax:SI=ax:SI+0x2;clobber flags:CC;}
      REG_DEAD ax:SI
  423: dx:SI=ax:SI
      REG_DEAD ax:SI
  446: NOTE_INSN_DELETED
  171: ax:SI=[di:SI+0xc]
  172: cx:SI=[di:SI+0x8]
  424: [bp:SI-0x54]=cx:SI
      REG_DEAD cx:SI
  470: NOTE_INSN_DELETED
  445: NOTE_INSN_DELETED
  173: {cx:SI=bp:SI-0x20;clobber flags:CC;}
      REG_EQUIV frame:SI-0x8
  174: [sp:SI+0x1c]=cx:SI
      REG_DEAD cx:SI
      REG_EQUAL frame:SI-0x8
  460: cx:SI=[bp:SI-0x2c]
  175: [sp:SI+0x18]=cx:SI
      REG_DEAD cx:SI
  444: [bp:SI-0x2c]=dx:SI
  425: dx:SI=dx:SI
      REG_DEAD dx:SI
  176: [sp:SI+0x14]=dx:SI
      REG_DEAD dx:SI
  177: [sp:SI+0x10]=si:SI
  178: dx:SI=sign_extend([di:SI+0x25])
      REG_EQUIV [sp:SI+0xc]
  179: [sp:SI+0xc]=dx:SI
      REG_DEAD dx:SI
  180: [sp:SI+0x8]=ax:SI
      REG_DEAD ax:SI
  426: cx:SI=[bp:SI-0x54]
  181: [sp:SI+0x4]=cx:SI
      REG_DEAD cx:SI
  427: cx:SI=[bp:SI-0x48]
  182: [sp:SI]=cx:SI
      REG_DEAD cx:SI
  183: call [`_ZNKSt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEE12_M_group_intEPKcjcRSt8ios_basePcS9_Ri'] argc:0x20
  184: cx:SI=[bp:SI-0x20]
Comment 13 Jakub Jelinek 2013-12-12 07:14:11 UTC
Strange.  From my limited testing, it does fix the regressions.  I can fire off now full scratch rpm builds with your patch.
Comment 14 rguenther@suse.de 2013-12-12 07:49:43 UTC
"jakub at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org> wrote:
>http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59470
>
>--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
>CXXFLAGS='-fstack-protector ...' ../configure ...
>(at least I believe so, we override a bunch of other variables in the
>gcc.spec
>file).
>
>Anyway, I've instrumented gcc so that based on env var it used the
>r205708 way
>of TER for some functions and r205709 for others and it seems for this
>testcase
>the only problematic one is
>_ZNKSt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEE13_M_insert_intIlEES3_S3_RSt8ios_basecT_
>
>On that function, I see that the r205709 change made _46, _47, _114,
>_119 and
>_126 not TERable.
>That's:
>_46 = MEM[(const struct __cache_type *)prephitmp_130]._M_grouping_size;
>  _47 = MEM[(const struct __cache_type *)prephitmp_130]._M_grouping;
>std::num_put<char>::_M_group_int (this_48(D), _47, _46, _45,
>__io_10(D),
>__cs_43, __cs_36, &__len);
>...
>  _114 = MEM[(const struct locale &)__io_10(D) + 108]._M_impl;
>  std::locale::_Impl::_M_install_cache (_114, __tmp_113, __i_107);
>...
>  _119 = MEM[(int (*__vtbl_ptr_type) () *)_118 + 4B];
>  OBJ_TYPE_REF(_119;__tmp_113->1) (__tmp_113);
>...
>  _126 = MEM[(int (*__vtbl_ptr_type) () *)_125 + 48B];
>_127 = OBJ_TYPE_REF(_126;__s$_M_sbuf_106->12) (__s$_M_sbuf_106, __cs_3,
>prephitmp_175);
>
>All of these single use SSA_NAMEs.  I don't see why these shouldn't be
>TERable,
>function arguments necessarily are evaluated before the function is
>called and
>the aliasing stmt_may_clobber_ref_p talks about are either the
>side-effects of
>the call itself, or storing to the lhs.
>So, for calls, can't we do only the first part of
>stmt_may_clobber_ref_p_1
>handling and not the rest (i.e. the gimple_call_lhs handling there
>only),
>unless the SSA_NAME is used somewhere in the lhs of the call?

No, it's exactly the side.effect that matters here.  The call invalidates the address of the memory we store the call result to.  We can probably restrict this to calls with non.register lhs and asm statements with memory outputs though.  Of course that complicates the patch.

Richard.

>Similarly for ASMs, IMHO terring into inline asm is especially
>important, while
>your change pretty much disables it always:
>  else if (gimple_code (stmt) == GIMPLE_ASM)
>    return true;
>What should be avoided is if SSA_NAME is used in some output operand
>(say as
>part of a reference), that I can understand why we shouldn't ter.  But
>otherwise?
>
>Note, this of course doesn't explain why this PR breaks.
Comment 15 Jakub Jelinek 2013-12-12 09:54:03 UTC
Created attachment 31425 [details]
gcc49-pr59470.patch

Untested TER changes I've meant.  I believe that for gimple_assign_single_p (stmt) the current code pretty much matches the pre-r205709 code (assuming refs_may_alias_p_1 argument order doesn't matter), perhaps with some inspection
of assignment expansion code it could be improved to only the
gimple_assign_rhs1 (stmt) == use case (if the lhs and rhs of assignment expressions are evaluated always before the actual assignment, the problematic case would be if for assignments that require multiple stores we'd evaluate the
expressions after any of the stores), but that is not a regression from that patch.
For calls and inline asm it is IMHO desirable to avoid TER only when we really need, both to potentially generate better code or code that LRA doesn't have to fix up that much, and for 4.8 branch also to decrease the amount of code generation changes to decrease risks.
Comment 16 Jakub Jelinek 2013-12-12 12:56:08 UTC
Created attachment 31426 [details]
gcc48-pr59470-test.patch

Runtime testcase that shows the LRA problem.
Comment 17 Jakub Jelinek 2013-12-12 13:26:08 UTC
The #c11 fix has been successfully bootstrapped/regtested on x86_64-linux and i686-linux on trunk (--enable-checking=yes,rtl) and on 4.8 branch (also both targets, though regtest is still pending there).
Comment 18 Vladimir Makarov 2013-12-12 15:39:39 UTC
(In reply to Jakub Jelinek from comment #13)
> Strange.  From my limited testing, it does fix the regressions.  I can fire
> off now full scratch rpm builds with your patch.

Sorry. My bad. I did not rebuild the library (I rebuilt only compiler) when I tested it.

In general, the probability of this bug is quite tiny so many conditions must come up for this.  That is why we found it only recently.  Thanks for working on it, Jakub.  Your analysis helped me a lot.  You can commit it into gcc-4.8 and gcc-4.9.
Comment 19 Jakub Jelinek 2013-12-12 15:41:51 UTC
(In reply to Vladimir Makarov from comment #18)
> (In reply to Jakub Jelinek from comment #13)
> > Strange.  From my limited testing, it does fix the regressions.  I can fire
> > off now full scratch rpm builds with your patch.
> 
> Sorry. My bad. I did not rebuild the library (I rebuilt only compiler) when
> I tested it.
> 
> In general, the probability of this bug is quite tiny so many conditions
> must come up for this.  That is why we found it only recently.  Thanks for
> working on it, Jakub.  Your analysis helped me a lot.  You can commit it
> into gcc-4.8 and gcc-4.9.

Regtest finished on i686-linux/4.8 and looks fine, x86_64-linux/4.8 is still pending.
Comment 20 Vladimir Makarov 2013-12-12 15:48:25 UTC
Author: vmakarov
Date: Thu Dec 12 15:48:23 2013
New Revision: 205929

URL: http://gcc.gnu.org/viewcvs?rev=205929&root=gcc&view=rev
Log:
2013-12-12  Vladimir Makarov  <vmakarov@redhat.com>

	PR middle-end/59470
	* lra-coalesce.c (lra_coalesce): Invalidate inheritance pseudo
	values if necessary.


Modified:
    branches/gcc-4_8-branch/gcc/ChangeLog
    branches/gcc-4_8-branch/gcc/lra-coalesce.c
Comment 21 Vladimir Makarov 2013-12-12 15:51:51 UTC
Author: vmakarov
Date: Thu Dec 12 15:51:49 2013
New Revision: 205930

URL: http://gcc.gnu.org/viewcvs?rev=205930&root=gcc&view=rev
Log:
2013-12-12  Vladimir Makarov  <vmakarov@redhat.com>

	PR middle-end/59470
	* lra-coalesce.c (lra_coalesce): Invalidate inheritance pseudo
	values if necessary.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/lra-coalesce.c
Comment 22 Jakub Jelinek 2013-12-12 17:55:46 UTC
Author: jakub
Date: Thu Dec 12 17:55:44 2013
New Revision: 205934

URL: http://gcc.gnu.org/viewcvs?rev=205934&root=gcc&view=rev
Log:
	PR middle-end/59470
	* g++.dg/opt/pr59470.C: New test.

Added:
    trunk/gcc/testsuite/g++.dg/opt/pr59470.C
Modified:
    trunk/gcc/testsuite/ChangeLog
Comment 23 Jakub Jelinek 2013-12-12 17:56:54 UTC
Author: jakub
Date: Thu Dec 12 17:56:51 2013
New Revision: 205935

URL: http://gcc.gnu.org/viewcvs?rev=205935&root=gcc&view=rev
Log:
	PR middle-end/59470
	* g++.dg/opt/pr59470.C: New test.

Added:
    branches/gcc-4_8-branch/gcc/testsuite/g++.dg/opt/pr59470.C
Modified:
    branches/gcc-4_8-branch/gcc/testsuite/ChangeLog
Comment 24 Jakub Jelinek 2013-12-16 08:09:08 UTC
Author: jakub
Date: Mon Dec 16 08:09:05 2013
New Revision: 206009

URL: http://gcc.gnu.org/viewcvs?rev=206009&root=gcc&view=rev
Log:
	PR middle-end/58956
	PR middle-end/59470
	* gimple-walk.h (walk_stmt_load_store_addr_fn): New typedef.
	(walk_stmt_load_store_addr_ops, walk_stmt_load_store_ops): Use it
	for callback params.
	* gimple-walk.c (walk_stmt_load_store_ops): Likewise.
	(walk_stmt_load_store_addr_ops): Likewise.  Adjust all callback
	calls to supply the gimple operand containing the base tree
	as an extra argument.
	* tree-ssa-ter.c: Include gimple-walk.h.
	(find_ssaname, find_ssaname_in_store): New helper functions.
	(find_replaceable_in_bb): For calls or GIMPLE_ASM, only set
	same_root_var if USE is used somewhere in the stores of the stmt.
	* ipa-prop.c (visit_ref_for_mod_analysis): Remove name of the stmt
	argument and ATTRIBUTE_UNUSED, add another unnamed tree argument.
	* ipa-pure-const.c (check_load, check_store, check_ipa_load,
	check_ipa_store): Likewise.
	* gimple.c (gimple_ior_addresses_taken_1, check_loadstore): Likewise.
	* ipa-split.c (test_nonssa_use, mark_nonssa_use): Likewise.
	(verify_non_ssa_vars, visit_bb): Adjust their callers.
	* cfgexpand.c (add_scope_conflicts_1): Use
	walk_stmt_load_store_addr_fn type for visit variable.
	(visit_op, visit_conflict): Remove name of the stmt
	argument and ATTRIBUTE_UNUSED, add another unnamed tree argument.
	* tree-sra.c (asm_visit_addr): Likewise.  Remove name of the data
	argument and ATTRIBUTE_UNUSED.
	* cgraphbuild.c (mark_address, mark_load, mark_store): Add another
	unnamed tree argument.
	* gimple-ssa-isolate-paths.c (check_loadstore): Likewise.  Remove
	ATTRIBUTE_UNUSED from stmt parameter.

	* gcc.target/i386/pr59470.c: New test.

Added:
    trunk/gcc/testsuite/gcc.target/i386/pr59470.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/cfgexpand.c
    trunk/gcc/cgraphbuild.c
    trunk/gcc/gimple-ssa-isolate-paths.c
    trunk/gcc/gimple-walk.c
    trunk/gcc/gimple-walk.h
    trunk/gcc/gimple.c
    trunk/gcc/ipa-prop.c
    trunk/gcc/ipa-pure-const.c
    trunk/gcc/ipa-split.c
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-sra.c
    trunk/gcc/tree-ssa-ter.c
Comment 25 Jakub Jelinek 2014-01-07 16:49:25 UTC
Author: jakub
Date: Tue Jan  7 16:49:22 2014
New Revision: 206396

URL: http://gcc.gnu.org/viewcvs?rev=206396&root=gcc&view=rev
Log:
	Backported from mainline
	2013-12-16  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/58956
	PR middle-end/59470
	* gimple.h (walk_stmt_load_store_addr_fn): New typedef.
	(walk_stmt_load_store_addr_ops, walk_stmt_load_store_ops): Use it
	for callback params.
	* gimple.c (walk_stmt_load_store_ops): Likewise.
	(walk_stmt_load_store_addr_ops): Likewise.  Adjust all callback
	calls to supply the gimple operand containing the base tree
	as an extra argument.
	* tree-ssa-ter.c (find_ssaname, find_ssaname_in_store): New helper
	functions.
	(find_replaceable_in_bb): For calls or GIMPLE_ASM, only set
	same_root_var if USE is used somewhere in the stores of the stmt.
	* ipa-prop.c (visit_ref_for_mod_analysis): Remove name of the stmt
	argument and ATTRIBUTE_UNUSED, add another unnamed tree argument.
	* ipa-pure-const.c (check_load, check_store, check_ipa_load,
	check_ipa_store): Likewise.
	* gimple.c (gimple_ior_addresses_taken_1): Likewise.
	* ipa-split.c (test_nonssa_use, mark_nonssa_use): Likewise.
	(verify_non_ssa_vars, visit_bb): Adjust their callers.
	* cfgexpand.c (add_scope_conflicts_1): Use
	walk_stmt_load_store_addr_fn type for visit variable.
	(visit_op, visit_conflict): Remove name of the stmt
	argument and ATTRIBUTE_UNUSED, add another unnamed tree argument.
	* tree-sra.c (asm_visit_addr): Likewise.  Remove name of the data
	argument and ATTRIBUTE_UNUSED.
	* cgraphbuild.c (mark_address, mark_load, mark_store): Add another
	unnamed tree argument.

	* gcc.target/i386/pr59470.c: New test.

Added:
    branches/gcc-4_8-branch/gcc/testsuite/gcc.target/i386/pr59470.c
Modified:
    branches/gcc-4_8-branch/gcc/ChangeLog
    branches/gcc-4_8-branch/gcc/cfgexpand.c
    branches/gcc-4_8-branch/gcc/cgraphbuild.c
    branches/gcc-4_8-branch/gcc/gimple.c
    branches/gcc-4_8-branch/gcc/gimple.h
    branches/gcc-4_8-branch/gcc/ipa-prop.c
    branches/gcc-4_8-branch/gcc/ipa-pure-const.c
    branches/gcc-4_8-branch/gcc/ipa-split.c
    branches/gcc-4_8-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_8-branch/gcc/tree-sra.c
    branches/gcc-4_8-branch/gcc/tree-ssa-ter.c
Comment 26 Jakub Jelinek 2014-01-07 16:52:12 UTC
Fixed.
Comment 27 Hans-Peter Nilsson 2014-01-08 10:15:59 UTC
The commit and backport just above this comment caused PR59584.