This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [patch, rfc] Make store motion use alias oracle


Victor Kaplansky <VICTORK@il.ibm.com> writes:

  > > would make us replace "n" with a local variable, but not to move the
  > > invariants out of the loop).  I will try to prepare a patch that fixes
  > > these problems.
  > 
  > We have seen another case where alias oracle can help vectorizer and
  > provide significant improvement. Here is a simplified example:
  > 
  >   1 struct
  >   2 {
  >   3   int x;
  >   4   int y;
  >   5 } S[100];
  >   6
  >   7 int z[100];
  >   8
  >   9 void
  >  10 foo (void)
  >  11 {
  >  12   int i;
  >  13   int x, y;
  >  14
  >  15   S[5].x = 0;
  >  16   S[5].y = 0;
  >  17
  >  18   x = S[5].x;
  >  19   y = S[5].y;
  >  20
  >  21   x = x + z[0];
  >  22   y = y + z[0];
  >  23
  >  24   S[5].x = x;
  >  25   S[5].y = y;
  >  26
  >  27 }
  > 
  > If STORE_CCP can use extra aliasing info, it should be able
  > with help of DSE to get rid of stores and loads in lines 15-19.
  > 
  > Do you think your patch would/could cover this as well?  What is
  > the right way to deal with this?
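
For concreteness, here is a rough sketch of what the quoted function could
reduce to if the stores and loads in lines 15-19 were removed and the
constant zeros propagated. The name foo_after_dse is hypothetical, and the
sketch assumes only what the example shows: S and z are distinct global
objects, so a store to S[5] cannot change z[0].

```c
struct
{
  int x;
  int y;
} S[100];

int z[100];

/* Hypothetical reduced form of the quoted foo: the zero stores and the
   reloads are gone, and z[0] is loaded once because storing to S[5].x
   cannot modify z[0].  */
void
foo_after_dse (void)
{
  int t = z[0];

  S[5].x = t;
  S[5].y = t;
}
```

The two remaining stores write adjacent fields with the same value, which is
the shape a vectorizer would presumably want to see.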

IMHO the right way is probably to improve the tree level alias
analysis, which is still _weaker_ than the RTL one in solving 3
fundamental problems:

 - accesses to different fields of the same struct
 - accesses to different elements of the same array
 - restricted pointers

An example:

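#include <stdlib.h>

struct s { int a; int b; };

void foo (struct s *ps, int *p, int *__restrict__ rp, int *__restrict__ rq)
{
  ps->a = 0;
  ps->b = 1;                    /* different field of the same struct */
  if (ps->a != 0) abort ();
  p[0] = 0;
  p[1] = 1;                     /* different element of the same array */
  if (p[0] != 0) abort ();
  rp[0] = 0;
  rq[0] = 1;                    /* restricted pointers */
  if (rp[0] != 0) abort ();
}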

The tree optimizers don't do anything interesting with this function;
it is the RTL cse that eliminates all the ifs.
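
As a sketch of the expected result (not the output of any particular pass),
this is what the function amounts to once each reload is known to see the
value stored two statements earlier. The name foo_reduced is hypothetical.

```c
struct s { int a; int b; };

/* Hypothetical reduced form: only the six stores remain, because none
   of the intervening stores can clobber the location being re-read.  */
void
foo_reduced (struct s *ps, int *p, int *__restrict__ rp, int *__restrict__ rq)
{
  ps->a = 0;
  ps->b = 1;   /* other field: cannot change ps->a */
  p[0] = 0;
  p[1] = 1;    /* other element: cannot change p[0] */
  rp[0] = 0;
  rq[0] = 1;   /* restrict: rq cannot point to rp[0] */
}
```

This matches the shape of the 4.1 assembly shown further down: six stores and
no compare or call to abort.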

These 3 types of disambiguation are quite important for vectorization,
as in your example. They also matter for Fortran, where a lot of code
tends to do computations with arrays, and for C++, where a lot of memory
references are of the form this->var1.

[Oops, it looks like there's a regression on the code above!]
The 4.1 (and earlier) -O2 assembly looks like this:

bar:
        movl    4(%esp), %eax
        movl    8(%esp), %edx
        movl    $0, (%eax)
        movl    $1, 4(%eax)
        movl    12(%esp), %eax
        movl    $0, (%edx)
        movl    $1, 4(%edx)
        movl    $0, (%eax)
        movl    16(%esp), %eax
        movl    $1, (%eax)
        ret

whereas now SVN HEAD generates:

        subl    $12, %esp
        movl    16(%esp), %eax
        movl    20(%esp), %edx
        movl    24(%esp), %ecx
        movl    $0, (%eax)
        movl    $1, 4(%eax)
        movl    (%eax), %eax
        testl   %eax, %eax
        jne     .L20
        movl    $0, (%edx)
        movl    (%edx), %eax
        movl    $1, 4(%edx)
        testl   %eax, %eax
        jne     .L20
        movl    $0, (%ecx)
        movl    (%ecx), %ecx
        movl    28(%esp), %eax
        testl   %ecx, %ecx
        movl    $1, (%eax)
        jne     .L20
        addl    $12, %esp
        ret
.L20:
        call    abort

I'll file a bug report.

