Bug 47216 - [4.6 Regression] gcc.dg/torture/pr43360.c FAILs with -O -fPIC -fgcse -fgcse-sm -fnon-call-exceptions -fno-tree-dse
Summary: [4.6 Regression] gcc.dg/torture/pr43360.c FAILs with -O -fPIC -fgcse -fgcse-s...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization (show other bugs)
Version: 4.6.0
: P3 normal
Target Milestone: 4.6.0
Assignee: Richard Biener
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2011-01-07 17:41 UTC by Zdenek Sojka
Modified: 2011-01-19 09:55 UTC (History)
4 users (show)

See Also:
Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu
Build:
Known to work: 4.5.3
Known to fail: 4.6.0
Last reconfirmed: 2011-01-07 23:21:14


Attachments
preprocessed source (248 bytes, text/plain)
2011-01-07 17:41 UTC, Zdenek Sojka
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Zdenek Sojka 2011-01-07 17:41:48 UTC
Created attachment 22927 [details]
preprocessed source

Output:
$ gcc -O -fPIC -fgcse -fgcse-sm -fnon-call-exceptions -fno-tree-dse pr43360.i
$ valgrind -q ./a.out 
Aborted

I wasn't able to further reduce the testcase.

Tested revisions:
r168552 - fail
r161659 - fail
r158095 - OK
4.5 r168062 - OK
Comment 1 Zdenek Sojka 2011-01-07 21:50:30 UTC
At the assembly level:

func_1:
	mov	rax, QWORD PTR g_3@GOTPCREL[rip]
	mov	rdx, QWORD PTR l_5_5_2@GOTPCREL[rip]
	mov	ecx, DWORD PTR [rdx] # ecx = 4
	lea	edx, 7[rcx]          # edx = 4+7 = 11
	mov	DWORD PTR [rax], edx # g_3[0][0] = edx = 11
	cmp	edx, 7
	jg	.L4 # jump taken
.L3:
	mov	DWORD PTR [rax], edx
	jmp	.L3
.L4:
	# this instruction overwrites previously stored 11 by 4
	mov	DWORD PTR [rax], ecx # g_3[0][0] = ecx = 4
	ret
Comment 2 H.J. Lu 2011-01-07 23:21:14 UTC
It is caused by revision 161655:

http://gcc.gnu.org/ml/gcc-cvs/2010-07/msg00006.html
Comment 3 Richard Biener 2011-01-11 11:43:51 UTC
Mine then.
Comment 4 Richard Biener 2011-01-12 13:03:18 UTC
We have in store-motion.c:

static inline bool
store_killed_in_pat (const_rtx x, const_rtx pat, int after)
{
...
      /* Check for memory stores to aliased objects.  */
      if (MEM_P (dest)
          && !exp_equiv_p (dest, x, 0, true))
        {
          if (after)
            {
              if (output_dependence (dest, x))
                return true;
            }
          else
            {
              if (output_dependence (x, dest))
                return true;
            }
        }

and coming in with

gdb) call debug_rtx (dest)
(mem/s/j:SI (reg/f:DI 67) [0 g_3+0 S4 A32])
(gdb) call debug_rtx (x)
(mem/s/c:SI (reg/f:DI 67) [0 MEM[(int *)&g_3]+0 S4 A32])

and exp_equiv_p returns true.  But then we seem to completely ignore
that kill(!?), thus we happily sink the first store across the second.
Huh.  Doesn't make sense to me unless the SET_SRCs are also equivalent.

Of course it may be that this shouldn't happen because if it does then
the antic sets are not as expected?

I guess it makes sense if you only look at the description for
store_killed_in_pat and adjust it to "looking for any loads which
might make the store in X live" (loads don't kill anything).

Ultimatively this leads to a wrong answer from store_killed_after.

Simplified situation is like

 for (;;)
   {
     *x = 1;
     *x = 2;
   }

and store-motion sinks *x = 1 across *x = 2.

svn blame blames steven for store-motion (of course - he split it out).
Steven - do you by any chance have any idea how it works?  The code
dates back to

2001-04-09  Andrew MacLeod  <amacleod@redhat.com>
           Jeff Law  <law@redhat.com>

And I'd just do an uninformed

Index: gcc/store-motion.c
===================================================================
--- gcc/store-motion.c  (revision 168707)
+++ gcc/store-motion.c  (working copy)
@@ -367,8 +367,7 @@ store_killed_in_pat (const_rtx x, const_
        dest = XEXP (dest, 0);
 
       /* Check for memory stores to aliased objects.  */
-      if (MEM_P (dest)
-         && !exp_equiv_p (dest, x, 0, true))
+      if (MEM_P (dest))
        {
          if (after)
            {
Comment 5 Richard Biener 2011-01-18 15:50:59 UTC
Author: rguenth
Date: Tue Jan 18 15:50:55 2011
New Revision: 168951

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=168951
Log:
2011-01-18  Richard Guenther  <rguenther@suse.de>

	PR rtl-optimization/47216
	* emit-rtl.c: Include tree-flow.h.
	(set_mem_attributes_minus_bitpos): Use tree_could_trap_p instead
	of replicating it with different semantics.
	* Makefile.in (emit-rtl.o): Adjust.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/Makefile.in
    trunk/gcc/emit-rtl.c
Comment 6 Richard Biener 2011-01-18 15:53:22 UTC
Fixed.
Comment 7 stevenb.gcc@gmail.com 2011-01-18 21:42:50 UTC
Resolved alright -- but including tree-flow.h in emit-rtl.c??? :-(
Comment 8 rguenther@suse.de 2011-01-19 09:55:48 UTC
On Tue, 18 Jan 2011, stevenb.gcc at gmail dot com wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47216
> 
> --- Comment #7 from stevenb.gcc at gmail dot com <stevenb.gcc at gmail dot com> 2011-01-18 21:42:50 UTC ---
> Resolved alright -- but including tree-flow.h in emit-rtl.c??? :-(

We don't have tree-eh.h and tree-flow.h is yet another 
container-of-random-crap.   Btw, emit-rtl.c is part of expand, so it
definitely crosses the border and uses lots of tree stuff.

Richard.