
Bug 44838 - [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c
Summary: [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization
Version: 4.6.0
Importance: P3 normal
Target Milestone: 4.6.0
Assignee: Richard Biener
URL:
Keywords: alias, wrong-code
Depends on:
Blocks:
 
Reported: 2010-07-06 14:03 UTC by H.J. Lu
Modified: 2010-07-08 09:16 UTC
CC List: 4 users

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2010-07-07 13:58:10


Description H.J. Lu 2010-07-06 14:03:51 UTC
On Linux/ia32, revision 161849 gave:

FAIL: gcc.dg/pr39794.c execution test

Revision 161840 is OK.
Comment 1 H.J. Lu 2010-07-06 15:21:55 UTC
It is caused by revision 161844:

http://gcc.gnu.org/ml/gcc-cvs/2010-07/msg00198.html
Comment 2 Sandra Loosemore 2010-07-06 15:57:55 UTC
s/caused by/exposed by/ ?

The patch to ivopts likely results in it selecting a different/smaller set of loop induction variables, but I don't see how this change by itself could have introduced a wrong-code error.
Comment 3 Steven Bosscher 2010-07-06 16:58:39 UTC
Caused by, or exposed by ... in both cases your responsibility to investigate.
Comment 4 Sandra Loosemore 2010-07-06 21:10:37 UTC
Well, I'm *trying* to investigate....  but I haven't been able to reproduce the problem yet.  I checked out r161844 and built for i686-pc-linux-gnu, and the gcc.dg/pr39794.c execution test passes.  If this requires some other target and/or options to trigger the failure, can you be more specific about what they are?
Comment 5 H.J. Lu 2010-07-06 21:24:15 UTC
Looking closely at my results, this test will only fail with
"-m32 -O2 -funroll-loops" on Linux/x86-64.
Comment 6 Richard Biener 2010-07-06 21:40:24 UTC
Confirmed.  Fails with -m32 testing on x86_64.
Comment 7 Sandra Loosemore 2010-07-07 00:42:41 UTC
Hmmm.  It's possible I built my toolchain incorrectly, but I'm seeing that it aborts when compiled with -m64 but not with -m32.  The failure mode looks identical to that reported in PR39794:

(gdb) print a
$1 = {0, 1, 4, 2, 10, 12, 24, 44, 72, 92, 60, 34, 244, 47, 58, 291}
(gdb) print ref
$2 = {0, 1, 4, 2, 10, 12, 24, 44, 72, 136, 232, 416, 736, 1296, 2304, 2032}

This slightly modified version of the test case fails when compiled with -m64 -O2 -funroll-loops -fno-ivopts:

extern void abort ();

void
foo (int *a, int n)
{
  int *lasta = a + n;
  for (; a != lasta; a++)
    {
      *a *= 2;
      a[1] = a[-1] + a[-2];
    }
}

int a[16];
int ref[16] = { 0, 1, 4, 2, 10, 12, 24, 44,
		72, 136, 232, 416, 736, 1296, 2304, 2032 };

int
main ()
{
  int i;
  for (i = 0; i < 16; i++)
    a[i] = i;
  foo (a + 2, 16 - 3);
  for (i = 0; i < 16; i++)
    if (ref[i] != a[i])
      abort ();
  return 0;
}

So, not an ivopts problem at all?


Comment 8 H.J. Lu 2010-07-07 00:48:06 UTC
(In reply to comment #7)
> Hmmm.  It's possible I built my toolchain incorrectly, but I'm seeing that it
> aborts when compiled with -m64 but not with -m32.  The failure mode looks
> identical to that reported in PR39794:
> 

Please make sure that you aren't using Ubuntu since it is different
from other Linux/x86-64 OSes.
Comment 9 Sandra Loosemore 2010-07-07 01:09:02 UTC
Yes, this is on an Ubuntu system, but one of my co-workers says GCC multilibs work with Ubuntu now; the support is in gcc/config/i386/t-linux64.  Me, I'm clueless about anything configury-related.  :-(  I can try again on another machine, but this being my third try already, I'm not terribly confident I'll get it right the next time, either.  Frankly, I do not see what Ubuntu vs non-Ubuntu multilib arrangements would have to do with ivopts behavior anyway.

Can you try out my -fno-ivopts example in the configuration you found the original problem in?  That would rule out my cluelessness in configuring the toolchain as a source of differing behavior, at least.
Comment 10 H.J. Lu 2010-07-07 03:22:00 UTC
My mistake. gcc.dg/pr39794.c failed with -m64 on Linux/x86-64, not
on Linux/ia32. The testcase in comment #7 started to fail between
revision 161671 and 161840. I am doing a binary search. It may be
the real cause.
Comment 11 H.J. Lu 2010-07-07 04:53:05 UTC
(In reply to comment #10)
> My mistake. gcc.dg/pr39794.c failed with -m64 on Linux/x86-64, not
> on Linux/ia32. The testcase in comment #7 started to fail between
> revision 161671 and 161840. I am doing a binary search. It may be
> the real cause.
> 

It is caused by revision 161840:

http://gcc.gnu.org/ml/gcc-cvs/2010-07/msg00194.html
Comment 12 Richard Biener 2010-07-07 08:43:08 UTC
(In reply to comment #11)
> It is caused by revision 161840:
> 
> http://gcc.gnu.org/ml/gcc-cvs/2010-07/msg00194.html

How can that be the cause if the failure happens without IVOPTs?
Comment 13 Richard Biener 2010-07-07 08:51:03 UTC
It's a scheduling issue (and thus an alias issue).  -fno-schedule-insns2 fixes
the problem.  We mis-schedule the unrolled part.
Comment 14 Richard Biener 2010-07-07 09:01:11 UTC
Huh.  Unrolling preserves MEM_ATTRs even though it re-writes the RTXen.  That
causes scheduling to see just a bunch of repeated

(insn 218 309 219 18 t.c:9 (parallel [
            (set (mem:SI (reg/v/f:DI 1 dx [orig:100 a ] [100]) [2 *a_22+0 S4 A32])
                (ashift:SI (mem:SI (reg/v/f:DI 1 dx [orig:100 a ] [100]) [2 *a_22+0 S4 A32])
                    (const_int 1 [0x1])))
            (clobber (reg:CC 17 flags))
        ]) 490 {*ashlsi3_1} (expr_list:REG_UNUSED (reg:CC 17 flags)
        (expr_list:REG_EQUAL (ashift:SI (mem:SI (plus:DI (reg/v/f:DI 0 ax [orig:84 a ] [84])
                        (const_int 16 [0x10])) [2 *a_22+0 S4 A32])
                (const_int 1 [0x1]))
            (nil))))

(insn 220 219 221 18 t.c:10 (set (reg:SI 4 si [103])
        (mem:SI (plus:DI (reg/v/f:DI 1 dx [orig:100 a ] [100])
                (const_int -8 [0xfffffffffffffff8])) [2 MEM[(int *)a_22 + -8B]+0 S4 A32])) 63 {*movsi_internal} (nil))

(insn 221 220 222 18 t.c:10 (parallel [
            (set (reg:SI 4 si [103])
                (plus:SI (reg:SI 4 si [103])
                    (mem:SI (plus:DI (reg/v/f:DI 1 dx [orig:100 a ] [100])
                            (const_int -4 [0xfffffffffffffffc])) [2 MEM[(int *)a_22 + -4B]+0 S4 A32])))
            (clobber (reg:CC 17 flags))
        ]) 251 {*addsi_1} (expr_list:REG_UNUSED (reg:CC 17 flags)
        (nil)))

(insn 222 221 310 18 t.c:10 (set (mem:SI (plus:DI (reg/v/f:DI 1 dx [orig:100 a ] [100])
                (const_int 4 [0x4])) [2 MEM[(int *)a_22 + 4B]+0 S4 A32])
        (reg:SI 4 si [103])) 63 {*movsi_internal} (expr_list:REG_DEAD (reg:SI 4 si [103])
        (expr_list:REG_DEAD (reg/v/f:DI 1 dx [orig:100 a ] [100])
            (expr_list:REG_EQUAL (plus:SI (mem:SI (plus:DI (reg/v/f:DI 1 dx [orig:100 a ] [100])
                            (const_int -8 [0xfffffffffffffff8])) [2 MEM[(int *)a_22 + -8B]+0 S4 A32])
                    (mem:SI (plus:DI (reg/v/f:DI 1 dx [orig:100 a ] [100])
                            (const_int -4 [0xfffffffffffffffc])) [2 MEM[(int *)a_22 + -4B]+0 S4 A32]))
                (nil)))))

where, as far as the alias information is concerned, there are no conflicts
between the above patterns across different unrolled copies.

Who is supposed to magically deal with that?  (or what is supposed to prevent
this from happening?)
Comment 15 rakdver@kam.mff.cuni.cz 2010-07-07 09:37:52 UTC
Subject: Re:  [4.6 regression] RTL loop
	unrolling causes FAIL: gcc.dg/pr39794.c

> ------- Comment #14 from rguenth at gcc dot gnu dot org  2010-07-07 09:01 -------
> Huh.  Unrolling preserves MEM_ATTRs even though it re-writes the RTXen.  That
> causes scheduling to see just a bunch of repeated
> 
> 
> where, as far as the alias information is concerned, there are no conflicts
> between the above patterns across different unrolled copies.
> 
> Who is supposed to magically deal with that?  (or what is supposed to prevent
> this from happening?)

I am not sure what you mean -- I may be misunderstanding how rtl alias analysis
works, but as far as I can tell, what unroller does (just preserving the
MEM_ATTRs) is conservatively correct (so, potentially it may make us believe
that there are dependences that are not really present, but it should not cause
a wrong-code bug).
Comment 16 Alexander Monakov 2010-07-07 09:54:16 UTC
(In reply to comment #15)
> I am not sure what you mean -- I may be misunderstanding how rtl alias analysis
> works, but as far as I can tell, what unroller does (just preserving the
> MEM_ATTRs) is conservatively correct (so, potentially it may make us believe
> that there are dependences that are not really present, but it should not cause
> a wrong-code bug).

Consider this simplified example:

for (i ...)
  {
/*A*/  t = a[i];
/*B*/  a[i+1] = t;
  }
MEM_ATTRS would indicate that memory references in A and B do not alias.

Unrolling by 2 produces:
for (i ...)
  {
/*A */ t = a[i];
/*B */ a[i+1] = t;
/*A'*/ t = a[i+1];
/*B'*/ a[i+2] = t;
  }
Preserving MEM_ATTRS wrongly indicates that memory references in B and A' do not alias, and the scheduler then may happen to lift A' above B.
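
To make the hazard concrete, here is a minimal sketch in plain C (not GCC code) of the unrolled loop above, once in its original order and once with A' hoisted above B, which is the reordering a scheduler could pick if it believed the stale alias information. The function names and the 8-element array are assumptions made for illustration only.

```c
#include <assert.h>
#include <string.h>

/* Unrolled by 2, original statement order: A, B, A', B'. */
void copy_forward_in_order(int *a, int n)
{
    for (int i = 0; i + 2 < n; i += 2) {
        int t = a[i];       /* A  */
        a[i + 1] = t;       /* B  */
        t = a[i + 1];       /* A' reads what B just wrote */
        a[i + 2] = t;       /* B' */
    }
}

/* Same loop with A' hoisted above B (hypothetical mis-scheduled order). */
void copy_forward_hoisted(int *a, int n)
{
    for (int i = 0; i + 2 < n; i += 2) {
        int t = a[i];       /* A  */
        int t2 = a[i + 1];  /* A' now reads the value from before B */
        a[i + 1] = t;       /* B  */
        a[i + 2] = t2;      /* B' */
    }
}

/* Returns 1 if the two statement orders leave different array contents. */
int orders_differ(void)
{
    int x[8], y[8];
    for (int i = 0; i < 8; i++)
        x[i] = y[i] = i;
    copy_forward_in_order(x, 8);
    copy_forward_hoisted(y, 8);
    assert(x[0] == y[0]);   /* element 0 is never written by either version */
    return memcmp(x, y, sizeof x) != 0;
}
```

Calling orders_differ() from a small main returns 1: B and A' touch the same element, so their order matters.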


Comment 17 rakdver@kam.mff.cuni.cz 2010-07-07 10:09:56 UTC

> Consider this simplified example:
> 
> for (i ...)
>   {
> /*A*/  t = a[i];
> /*B*/  a[i+1] = t;
>   }
> MEM_ATTRS would indicate that memory references in A and B do not alias.

but this is clearly wrong, since B in iteration X aliases with A in iteration X+1.
So, not a problem in unroller.
Comment 18 Richard Biener 2010-07-07 10:30:56 UTC
(In reply to comment #17)
> but this is clearly wrong, since B in iteration X aliases with A in iteration
> X+1.
> So, not a problem in unroller.

It is not wrong.  You have the two identical pointers p = &a[i] and
q = p + 1.  *p and *q do not alias.  Never.

Now unrolling rewrites p and q but does not adjust MEM_ATTRs.  So
alias information still claims the same pointer bases are used for
every unrolled load/store, which is certainly not true.

(In the past we didn't preserve pointer bases at all, which is why
we didn't hit this before.  Starting with 4.5.0 and export of
points-to information we do, so passes need to fix MEM_ATTRs accordingly)
Comment 19 rakdver@kam.mff.cuni.cz 2010-07-07 10:35:32 UTC

> It is not wrong.  You have the two identical pointers p = &a[i] and
> q = p + 1.  *p and *q do not alias.  Never.

Well, then you have some other definition of aliasing than me.  For me, two
memory references M1 and M2 do not alias, if on every code path, the locations
accessed by M1 and M2 are different.  With this definition, *p and *q may alias,
as the example above shows.  What is your definition?
Comment 20 Richard Biener 2010-07-07 10:43:22 UTC
(In reply to comment #19)
> Well, then you have some other definition of aliasing than me.  For me, two
> memory references M1 and M2 do not alias, if on every code path, the locations
> accessed by M1 and M2 are different.  With this definition, *p and *q may
> alias,
> as the example above shows.  What is your definition?
> 

My definition is that the two statements in sequence A, B have a
true dependence if stmt B accesses memory written to by A.
Thus, in this context *p and *q do not "alias" (and this is what
the scheduler and every other optimization pass queries).
Comment 21 rakdver@kam.mff.cuni.cz 2010-07-07 10:47:19 UTC

> My definition is that the two statements in sequence A, B have a
> true dependence if stmt B accesses memory written to by A.
> Thus, in this context *p and *q do not "alias" (and this is what
> the scheduler and every other optimization pass queries).

what do you mean by "statements in sequence"?
Comment 22 Richard Biener 2010-07-07 10:48:49 UTC
(In reply to comment #21)
> what do you mean by "statements in sequence"?

statement B executes after A.

Note that the issue we run into here is partly (or completely?) due to
the fact that the pointer variables in MEM_ATTRs are SSA names and
that we still honor their single-definition (and thus trivial
equality) property on RTL.
Comment 23 rakdver@kam.mff.cuni.cz 2010-07-07 10:51:22 UTC

> statement B executes after A.

which means what?  In the example above, due to the loop, you cannot say which statement
executes after which.
Comment 24 Richard Biener 2010-07-07 11:06:05 UTC
In

   ...
   *p_1 = x;
   y = *(p_1 + 1);
   ...

I can say that *p_1 does not alias *(p_1 + 1) independent of what code
is around.  If it were

BB3:
  # p_1 = PHI <p_0, p_2(3)>
  *p_1 = x;
   y = *(p_1 + 1);
  p_2 = p_1 + 1;
  goto BB3;

that would still be correct (I can exchange those two statements).

For cross loop-iteration dependence after unrolling you would see
accesses based on different pointer SSA name bases.

Now on RTL we are not in SSA form and so yes, this change might be
a bit fishy (I, too, just discovered this side-effect and I assumed
passes would already do something here).

A way around this is to either adjust or clear MEM_OFFSET.
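
As a rough illustration of the two cases above, the following C sketch (ordinary C, not GIMPLE or RTL) runs the store/load loop in three forms: the original order, the two statements exchanged within an iteration, and an unrolled-by-2 variant in which the second copy's store is moved above the first copy's load. The function names, the checksum over loaded values, and the array contents are assumptions chosen for the demonstration.

```c
#include <assert.h>

/* Original order within an iteration: *p = x; y = *(p + 1); p++. */
long store_then_load(int *a, int n, int x)
{
    long sum = 0;
    for (int i = 0; i + 1 < n; i++) {
        int *p = a + i;
        *p = x;
        sum += *(p + 1);
    }
    return sum;
}

/* The two statements exchanged within the same iteration. */
long load_then_store(int *a, int n, int x)
{
    long sum = 0;
    for (int i = 0; i + 1 < n; i++) {
        int *p = a + i;
        sum += *(p + 1);
        *p = x;
    }
    return sum;
}

/* Unrolled by 2 with the second copy's store hoisted above the first
   copy's load.  Assumes n - 1 is even so no remainder loop is needed. */
long unrolled_store_hoisted(int *a, int n, int x)
{
    long sum = 0;
    for (int i = 0; i + 2 < n; i += 2) {
        int *p = a + i;
        *p = x;            /* store, copy 1 */
        *(p + 1) = x;      /* store, copy 2, moved up */
        sum += *(p + 1);   /* load, copy 1: sees x instead of the old value */
        sum += *(p + 2);   /* load, copy 2 */
    }
    return sum;
}

/* Self-check on a 9-element array holding 1..9, storing x = 100. */
void check_exchange_is_local(void)
{
    int a[9], b[9], c[9];
    for (int i = 0; i < 9; i++)
        a[i] = b[i] = c[i] = i + 1;
    long in_order = store_then_load(a, 9, 100);
    assert(load_then_store(b, 9, 100) == in_order);
    assert(unrolled_store_hoisted(c, 9, 100) != in_order);
}
```

Exchanging the statements inside one iteration leaves the checksum unchanged. Moving a store across what used to be the back edge changes it.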
Comment 25 Michael Matz 2010-07-07 11:15:32 UTC
Due to SSA form the alias information reflects dependencies only between
accesses as if it ignores back edges.  Hence any transformation that
transforms a back edge into a forward edge, or moves code over back edges
needs to do adjustment to the alias info (effectively doing something like
PHI translation, or making the alias info simply more imprecise).  Hmpf.
Comment 26 rakdver@kam.mff.cuni.cz 2010-07-07 11:19:57 UTC

> In
> 
>    ...
>    *p_1 = x;
>    y = *(p_1 + 1);
>    ...
> 
> I can say that *p_1 does not alias *(p_1 + 1) independent of what code
> is around.  If it were
> 
> BB3:
>   # p_1 = PHI <p_0, p_2(3)>
>   *p_1 = x;
>    y = *(p_1 + 1);
>   p_2 = p_1 + 1;
>   goto BB3;
> 
> that would still be correct (I can exchange those two statements).

Well, yes.  Still, I would like to hear your formal definition of what it means
for two memory references (not to) alias.  We certainly can modify the code to
ensure such a property, but just toying around without knowing precisely what
this property is definitely is not a good idea.
Comment 27 rakdver@kam.mff.cuni.cz 2010-07-07 11:31:53 UTC

> Due to SSA form the alias information reflects dependencies only between
> accesses as if it ignores back edges.

Well, this is closer to what I was asking for; so, the actual definition that
we use is:

Two memory references M1 and M2 (appearing in statements S1 and S2) do not
alias if, for every code execution path P and every appearance A1 of S1 and
A2 of S2 in P such that no backedge is taken between A1 and A2, the memory
locations accessed in A1 and A2 are different.

Still, this is somewhat ambiguous (in the presence of irreducible loops, it
is not quite clear what is a backedge).

> Hence any transformation that
> transforms a back edge into a forward edge, or moves code over back edges
> needs to do adjustment to the alias info (effectively doing something like
> PHI translation, or making the alias info simply more imprecise).  Hmpf.

It is kind of unpleasant that this affects optimizations like loop unrolling,
which should make scheduling better (but likely won't do as well if we have
to just throw away the results of alias analysis).
Comment 28 Richard Biener 2010-07-07 11:59:49 UTC
The following is a fix (or workaround) for the problem.

Index: gcc/tree-ssa-alias.c
===================================================================
--- gcc/tree-ssa-alias.c        (revision 161869)
+++ gcc/tree-ssa-alias.c        (working copy)
@@ -801,7 +780,8 @@ indirect_refs_may_alias_p (tree ref1 ATT
   /* If both bases are based on pointers they cannot alias if they may not
      point to the same memory object or if they point to the same object
      and the accesses do not overlap.  */
-  if (operand_equal_p (ptr1, ptr2, 0))
+  if (gimple_in_ssa_p (cfun)
+      && operand_equal_p (ptr1, ptr2, 0))
     {
       if (TREE_CODE (base1) == MEM_REF)
        offset1 += mem_ref_offset (base1).low * BITS_PER_UNIT;


In SSA form we are sure that if two SSA names are equal, their (same) definition
dominates them.  So if you ask whether the two memory references alias
while they are still in loop form, the answer is that they do not: within every
iteration they have a strict ordering with respect to the definition of their name.
Now if you unroll the loop and re-instantiate SSA form, you can't use the
previous alias query result to determine cross-loop-iteration dependences.

The above patch disables offset-based disambiguation for accesses via
pointers (technically a nice thing to have) once we are no longer in SSA form,
that is, on RTL.
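
The effect of the extra condition can be sketched with a toy model in C. This is not GCC's data structure or API: struct toy_ref, toy_refs_may_alias and the in_ssa_form flag are hypothetical stand-ins, and the fallback here simply answers "may alias" where the real function goes on to further checks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of a memory reference through a pointer. */
struct toy_ref {
    int  base_name;  /* stands in for the pointer name, e.g. 22 for a_22 */
    long offset;     /* byte offset from the base */
    long size;       /* access size in bytes */
};

/* True if [o1, o1 + s1) and [o2, o2 + s2) share at least one byte. */
bool toy_ranges_overlap(long o1, long s1, long o2, long s2)
{
    return o1 < o2 + s2 && o2 < o1 + s1;
}

/* Offset-based disambiguation is trusted only while in SSA form, where
   equal base names imply equal run-time values.  Otherwise the toy model
   answers conservatively. */
bool toy_refs_may_alias(const struct toy_ref *r1, const struct toy_ref *r2,
                        bool in_ssa_form)
{
    if (in_ssa_form && r1->base_name == r2->base_name)
        return toy_ranges_overlap(r1->offset, r1->size,
                                  r2->offset, r2->size);
    return true;  /* simplification: assume the references may alias */
}

/* Self-check using the shape of the accesses in the RTL dump:
   a 4-byte store at base+4 and a 4-byte load at base+0. */
void toy_self_check(void)
{
    struct toy_ref store = { 22, 4, 4 };
    struct toy_ref load  = { 22, 0, 4 };
    assert(!toy_refs_may_alias(&store, &load, true));
    assert(toy_refs_may_alias(&store, &load, false));
}
```

In the toy model the same pair of references is disambiguated while in SSA form and treated as possibly conflicting afterwards, which is the conservative behaviour the scheduler needs for the unrolled copies.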
Comment 29 Michael Matz 2010-07-07 12:10:50 UTC
[just for completeness to not lose the thought:]
Thinking about this some more (triggered by the problem of not having nice
back edges in irreducible loops), it's not really the back edges that are
interesting but the underlying property of SSA, namely the
correspondence between static single assignments and dynamic single
assignments: The alias oracle will give correct answers only for memory
references when it can infer runtime equality of values from syntactic
equality, which it can for a correct SSA program.

So, if M1 and M2 (two memrefs) contain mentions of syntactically the same
values, then A1/A2 (two accesses to M1/M2) have to be dominated by the
dynamically same definitions of those values.  For SSA form that's trivially
true, for RTL of course it isn't.
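
A small C sketch of that distinction, under the assumption that each executed access is modelled by the run-time value of its base register plus the constant offset written in the reference. The struct and function names are hypothetical. The numbers mirror the unrolled copies: a store at base+4 in one copy and a load at base+0 in the next copy, after the base register has advanced by 4 bytes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one executed memory access. */
struct dyn_access {
    long base_value;  /* run-time value of the base register at this access */
    long offset;      /* constant byte offset written in the reference */
    long size;        /* access size in bytes */
};

/* Syntactic view: the base names match, so only the offsets are compared. */
bool offsets_disjoint(const struct dyn_access *x, const struct dyn_access *y)
{
    return x->offset + x->size <= y->offset
        || y->offset + y->size <= x->offset;
}

/* Dynamic view: compare the addresses that are actually touched. */
bool addresses_overlap(const struct dyn_access *x, const struct dyn_access *y)
{
    long xs = x->base_value + x->offset;
    long ys = y->base_value + y->offset;
    return xs < ys + y->size && ys < xs + x->size;
}

/* Syntactic disjointness is sound only if both accesses see the same
   dynamic definition of the base. */
void dyn_self_check(void)
{
    struct dyn_access store     = { 1000, 4, 4 };  /* copy 1, base = 1000 */
    struct dyn_access load_same = { 1000, 0, 4 };  /* same definition */
    struct dyn_access load_next = { 1004, 0, 4 };  /* base redefined: +4 */

    assert(offsets_disjoint(&store, &load_same));
    assert(!addresses_overlap(&store, &load_same));  /* inference holds */

    assert(offsets_disjoint(&store, &load_next));
    assert(addresses_overlap(&store, &load_next));   /* inference fails */
}
```

When the base has a single dynamic definition the syntactic answer and the run-time answer agree. Once the register is redefined between the two accesses, they diverge.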
Comment 30 Richard Biener 2010-07-07 13:58:10 UTC
I'm going to test the patch in comment #28.
Comment 31 Richard Biener 2010-07-08 09:09:44 UTC
Subject: Bug 44838

Author: rguenth
Date: Thu Jul  8 09:09:15 2010
New Revision: 161945

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=161945
Log:
2010-07-08  Richard Guenther  <rguenther@suse.de>

	PR rtl-optimization/44838
	* tree-ssa-alias.c (indirect_refs_may_alias_p): When not in
	SSA form do not use pointer equivalence.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-ssa-alias.c

Comment 32 Richard Biener 2010-07-08 09:16:49 UTC
Fixed.
Comment 33 hjl@gcc.gnu.org 2010-07-08 13:40:43 UTC
Subject: Bug 44838

Author: hjl
Date: Thu Jul  8 13:40:24 2010
New Revision: 161953

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=161953
Log:
Add gcc.dg/pr44838.c.

2010-07-08  H.J. Lu  <hongjiu.lu@intel.com>

	PR rtl-optimization/44838
	* gcc.dg/pr44838.c: New.

Added:
    trunk/gcc/testsuite/gcc.dg/pr44838.c
Modified:
    trunk/gcc/testsuite/ChangeLog