This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Alias analysis - does base_alias_check still work ?


I wrote:

> f/com.c contains the following note, preceding the definition of
> 
> #define LANG_HOOKS_GET_ALIAS_SET hook_get_alias_set_0
> 
> /* We do not wish to use alias-set based aliasing at all.  Used in the
>    extreme (every object with its own set, with equivalences recorded) it
>    might be helpful, but there are problems when it comes to inlining.  We
>    get on ok with flag_argument_noalias, and alias-set aliasing does
>    currently limit how stack slots can be reused, which is a lose.  */
> 
> I do not know if all the facts mentioned here still actually hold, but I
> do have strong doubts that base_alias_check in alias.c still does its
> duty.
> 
> Consider the following Fortran source:
> 
>       SUBROUTINE SIMPLE(A, B)
>       B = 3.0
>       A = 2.0
>       B = A*B
>       END
> 
> one would assume that alias analysis at least once should check that the
> assignment to A in line 3 doesn't change the value of B set in line 2,
> which, with
> 
>         flag_argument_noalias > 1
> 
> [arguments don't alias] in effect, would be the case.
> 
> However, according to my experiments with setting breakpoints in
> base_alias_check, it never passes that point.

Sigh, that's just because it doesn't need to.  The code generated looks
like this:

        ...
        movl    $0x40400000, (%edx)     ! B=3.0
        movl    $0x40000000, (%eax)     ! A=2.0
        flds    (%edx)                  ! load B onto the x87 stack
        ...

which, of course, neatly sidesteps the question of whether the store into
A would change B: B is simply reloaded from memory after that store.
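
For readers more at home in C, here is a minimal sketch of why the
question never arises for this routine.  The function is a hypothetical C
analogue of SIMPLE (the names simple and simple_demo are mine, not
anything in g77).  Because it reloads B after the store to A, just as the
generated code above does, it gives a well-defined answer whether or not
the two arguments overlap, so nothing needs to be proven about aliasing:

```c
#include <assert.h>

/* Hypothetical C analogue of SUBROUTINE SIMPLE(A, B).  Fortran passes
   arguments by reference, so both become pointers here. */
void simple(float *a, float *b)
{
    *b = 3.0f;      /* B = 3.0 */
    *a = 2.0f;      /* A = 2.0 */
    *b = *a * *b;   /* B = A*B, with B reloaded from memory */
}

/* Exercises the two cases discussed in the text. */
void simple_demo(void)
{
    float a = 0.0f, b = 0.0f, c = 0.0f;

    simple(&a, &b);        /* distinct arguments, as Fortran requires */
    assert(b == 6.0f);     /* 2.0 * 3.0 */

    simple(&c, &c);        /* overlapping arguments, not legal Fortran */
    assert(c == 4.0f);     /* the store of 2.0 overwrote the 3.0 */
}
```

Only if the compiler wanted to keep the 3.0 in a register across the
store to A would it have to know that A and B are distinct.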

Now for the real issue.  To have register renaming be really effective,
alias analysis has to work well.

Take the following example:

      subroutine saxpy(n,sa,sx,sy)
      real sx(n),sy(n),sa
      integer i,n
      do i = 1,n
        sy(i) = sy(i) + sa*sx(i)
      enddo
      end

If we compile this with -O2 -march=pentium4 -mfpmath=sse -funroll-loops
-frename-registers, we get for the unrolled loop:

.L6:
        movaps  %xmm1, %xmm7
        movaps  %xmm1, %xmm6
        movaps  %xmm1, %xmm5
        mulss   (%edx), %xmm7
        movaps  %xmm1, %xmm4
        addss   (%eax), %xmm7
        movss   %xmm7, (%eax)
        mulss   4(%edx), %xmm6
        addss   4(%eax), %xmm6
        movss   %xmm6, 4(%eax)
        mulss   8(%edx), %xmm5
        addss   8(%eax), %xmm5
        movss   %xmm5, 8(%eax)
        mulss   12(%edx), %xmm4
        addl    $16, %edx
        addss   12(%eax), %xmm4
        movss   %xmm4, 12(%eax)
        addl    $16, %eax
        subl    $4, %ecx
        jns     .L6

Obviously, register renaming has done its work, but the (re-)scheduling of
instructions leaves something to be desired.  After much gdb'ing in
sched-deps.c and alias.c I believe I have found the cause: the
rescheduling of this loop after register renaming runs after register
allocation (hey, no surprise :-).  However, alias analysis is very
conservative about what these hard registers may contain, so
almost no instruction gets moved around.
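
To make "conservative" concrete, here is a toy model in C of the question
a scheduler has to answer before it may swap two memory accesses.  It is
an illustration under my own assumptions, not the code in alias.c or
sched-deps.c: struct mem_ref, the register enum, the bases_distinct flag
and may_conflict are all hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical register numbers, for illustration only. */
enum { REG_EAX = 0, REG_EDX = 2 };

/* Toy memory reference of the form offset(base), size bytes wide. */
struct mem_ref {
    int  base_reg;
    long offset;
    long size;
};

/* Returns true unless the two accesses provably do not overlap.
   bases_distinct says whether anything (for example "arguments don't
   alias") tells us that different base registers point into different
   objects.  Assumes the base registers are not modified between the
   two accesses. */
bool may_conflict(struct mem_ref x, struct mem_ref y, bool bases_distinct)
{
    assert(x.size > 0 && y.size > 0);

    if (x.base_reg == y.base_reg) {
        /* Same base: the byte ranges can be compared directly. */
        return x.offset < y.offset + y.size
            && y.offset < x.offset + x.size;
    }
    /* Different bases: without extra knowledge, assume the worst. */
    return !bases_distinct;
}
```

In this model the store `movss %xmm7, (%eax)` and the following load in
`mulss 4(%edx), %xmm6` conflict as long as nothing is known about %eax
and %edx, so the load cannot be hoisted above the store.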

OK, but what if we schedule instructions before register allocation, using
-fschedule-insns instead of -frename-registers?  (That only helps if the
floating point (pseudo) registers already have different "names", but
fortunately they do.)  The result:

.L6:
        movaps  %xmm4, %xmm0
        movaps  %xmm4, %xmm1
        movaps  %xmm4, %xmm2
        movaps  %xmm4, %xmm3
        mulss   (%edx), %xmm0
        mulss   4(%edx), %xmm1
        mulss   8(%edx), %xmm2
        mulss   12(%edx), %xmm3
        addss   (%eax), %xmm0
        addss   4(%eax), %xmm1
        addss   8(%eax), %xmm2
        addss   12(%eax), %xmm3
        movss   %xmm0, (%eax)
        movss   %xmm1, 4(%eax)
        movss   %xmm2, 8(%eax)
        movss   %xmm3, 12(%eax)
        addl    $16, %edx
        addl    $16, %eax
        subl    $4, %ecx
        jns     .L6

Bingo!  That's a lot better!
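
In source terms, the second schedule corresponds roughly to the
hand-written C sketch below: four multiplies, then four adds, then four
stores.  Moving the loads of sx ahead of the stores to sy is only valid
if the two arrays do not overlap; `restrict` states that assumption in C,
much as Fortran's rules for dummy arguments do.  The function names are
mine, and for brevity the sketch assumes n is a multiple of 4.

```c
#include <assert.h>
#include <stddef.h>

/* Straightforward loop, one element at a time. */
void saxpy_ref(size_t n, float sa, const float *sx, float *sy)
{
    for (size_t i = 0; i < n; i++)
        sy[i] = sy[i] + sa * sx[i];
}

/* Unrolled by four, with the operations grouped like the second
   listing.  Correct only when sx and sy do not overlap. */
void saxpy_grouped(size_t n, float sa,
                   const float *restrict sx, float *restrict sy)
{
    assert(n % 4 == 0);   /* simplifying assumption of this sketch */

    for (size_t i = 0; i < n; i += 4) {
        /* mulss: all four products first */
        float t0 = sa * sx[i];
        float t1 = sa * sx[i + 1];
        float t2 = sa * sx[i + 2];
        float t3 = sa * sx[i + 3];

        /* addss: then the four additions */
        t0 += sy[i];
        t1 += sy[i + 1];
        t2 += sy[i + 2];
        t3 += sy[i + 3];

        /* movss: finally the four stores */
        sy[i]     = t0;
        sy[i + 1] = t1;
        sy[i + 2] = t2;
        sy[i + 3] = t3;
    }
}
```

For non-overlapping arrays both functions compute the same values; the
grouped form just leaves the four chains independent of each other.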

Which raises the question: is there a reason -fschedule-insns isn't on by
default when using -O2?

Cheers,

-- 
Toon Moene - mailto:toon@moene.indiv.nluug.nl - phoneto: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
Maintainer, GNU Fortran 77: http://gcc.gnu.org/onlinedocs/g77_news.html
Join GNU Fortran 95: http://g95.sourceforge.net/ (under construction)

