This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: Alias analysis - does base_alias_check still work ?
- From: Toon Moene <toon at moene dot indiv dot nluug dot nl>
- To: gcc at gcc dot gnu dot org
- Date: Fri, 19 Jul 2002 16:40:26 +0200
- Organization: Moene Computational Physics, Maartensdijk, The Netherlands
- References: <3D346B28.47039CD9@moene.indiv.nluug.nl>
I wrote:
> f/com.c contains the following note, preceding the definition of
>
> #define LANG_HOOKS_GET_ALIAS_SET hook_get_alias_set_0
>
> /* We do not wish to use alias-set based aliasing at all. Used in the
> extreme (every object with its own set, with equivalences recorded) it
> might be helpful, but there are problems when it comes to inlining. We
> get on ok with flag_argument_noalias, and alias-set aliasing does
> currently limit how stack slots can be reused, which is a lose. */
>
> I do not know if all the facts mentioned here still actually hold, but I
> do have strong doubts that base_alias_check in alias.c still does its
> duty.
>
> Consider the following Fortran source:
>
>       SUBROUTINE SIMPLE(A, B)
>       B = 3.0
>       A = 2.0
>       B = A*B
>       END
>
> one would assume that alias analysis at least once should check that the
> assignment to A in line 3 doesn't change the value of B set in line 2,
> which, with
>
> flag_argument_noalias > 1
>
> [arguments don't alias] in effect, would be the case.
>
> However, according to my experiments with setting breakpoints in
> base_alias_check, it never passes that point.
Sigh, that's just because it doesn't need to. The code generated looks
like this:
        ...
        movl    $0x40400000, (%edx)     ! B=3.0
        movl    $0x40000000, (%eax)     ! A=2.0
        flds    (%edx)                  ! put B on stack
        ...
which, of course, gets neatly around the problem of whether the store into
A would change B.
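To make the point concrete, here is a rough C analogue of SIMPLE. It is only a sketch, not what g77 produces; the names simple_reload and simple_forward are made up for this illustration. The first function reads B back from memory after the store to A, as the generated code above does, so it is correct whether or not the two arguments overlap. The second forwards the known value 3.0 for B, which is valid only under the "arguments don't alias" assumption.

```c
/* Reload B after storing A: correct even when a and b name the same object. */
void simple_reload(float *a, float *b)
{
    *b = 3.0f;
    *a = 2.0f;
    *b = *a * *b;       /* B is read back from memory here */
}

/* Forward the known value of B: valid only if a and b do not alias. */
void simple_forward(float *a, float *b)
{
    *b = 3.0f;
    *a = 2.0f;
    *b = 2.0f * 3.0f;   /* assumes the store to A left B untouched */
}
```

Called with the same address for both arguments, simple_reload leaves 4.0 there while simple_forward leaves 6.0; with distinct arguments both store 6.0 in B. Only a transformation of the second kind needs an answer from alias analysis.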
Now for the real issue. To have register renaming be really effective,
alias analysis has to work well.
Take the following example:
      subroutine saxpy(n,sa,sx,sy)
      real sx(n),sy(n),sa
      integer i,n
      do i = 1,n
         sy(i) = sy(i) + sa*sx(i)
      enddo
      end
If we compile this with -O2 -march=pentium4 -mfpmath=sse -funroll-loops
-frename-registers, we get for the unrolled loop:
.L6:
        movaps  %xmm1, %xmm7
        movaps  %xmm1, %xmm6
        movaps  %xmm1, %xmm5
        mulss   (%edx), %xmm7
        movaps  %xmm1, %xmm4
        addss   (%eax), %xmm7
        movss   %xmm7, (%eax)
        mulss   4(%edx), %xmm6
        addss   4(%eax), %xmm6
        movss   %xmm6, 4(%eax)
        mulss   8(%edx), %xmm5
        addss   8(%eax), %xmm5
        movss   %xmm5, 8(%eax)
        mulss   12(%edx), %xmm4
        addl    $16, %edx
        addss   12(%eax), %xmm4
        movss   %xmm4, 12(%eax)
        addl    $16, %eax
        subl    $4, %ecx
        jns     .L6
Obviously, register renaming has done its work, but the (re-)scheduling of
instructions leaves something to be desired. After much gdb'ing in
sched-deps.c and alias.c I believe I have found the cause: the
rescheduling of this loop after register renaming runs after register
allocation (hey, no surprise :-). However, alias analysis makes only very
cautious assumptions about the contents of these hard registers, so
almost no instruction gets moved around.
OK, but what if we allow instruction scheduling before register allocation,
using -fschedule-insns instead of -frename-registers? That can only be
beneficial if the floating point (pseudo) registers already have different
"names", but fortunately, they do:
.L6:
        movaps  %xmm4, %xmm0
        movaps  %xmm4, %xmm1
        movaps  %xmm4, %xmm2
        movaps  %xmm4, %xmm3
        mulss   (%edx), %xmm0
        mulss   4(%edx), %xmm1
        mulss   8(%edx), %xmm2
        mulss   12(%edx), %xmm3
        addss   (%eax), %xmm0
        addss   4(%eax), %xmm1
        addss   8(%eax), %xmm2
        addss   12(%eax), %xmm3
        movss   %xmm0, (%eax)
        movss   %xmm1, 4(%eax)
        movss   %xmm2, 8(%eax)
        movss   %xmm3, 12(%eax)
        addl    $16, %edx
        addl    $16, %eax
        subl    $4, %ecx
        jns     .L6
Bingo ! That's a lot better !
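In C terms, the two listings correspond roughly to the two loop bodies below. This is a sketch under stated assumptions, not compiler output: the names saxpy4_serial and saxpy4_interleaved are invented for the illustration, any leftover elements when n is not a multiple of 4 are simply not processed, and C99 restrict stands in for the Fortran rule that dummy arguments do not alias. The second body moves every load from sx ahead of every store to sy, which is exactly the reordering that needs alias analysis to say the two arrays do not overlap.

```c
#include <stddef.h>

/* Order of the first listing: each element is finished (multiply, add,
   store) before the next one starts. */
void saxpy4_serial(size_t n, float sa, const float *sx, float *sy)
{
    for (size_t i = 0; i + 4 <= n; i += 4) {
        sy[i]     = sy[i]     + sa * sx[i];
        sy[i + 1] = sy[i + 1] + sa * sx[i + 1];
        sy[i + 2] = sy[i + 2] + sa * sx[i + 2];
        sy[i + 3] = sy[i + 3] + sa * sx[i + 3];
    }
}

/* Order of the second listing: all multiplies, then all adds, then all
   stores. restrict promises that sx and sy do not overlap. */
void saxpy4_interleaved(size_t n, float sa,
                        const float *restrict sx, float *restrict sy)
{
    for (size_t i = 0; i + 4 <= n; i += 4) {
        float t0 = sa * sx[i];
        float t1 = sa * sx[i + 1];
        float t2 = sa * sx[i + 2];
        float t3 = sa * sx[i + 3];
        t0 += sy[i];
        t1 += sy[i + 1];
        t2 += sy[i + 2];
        t3 += sy[i + 3];
        sy[i]     = t0;
        sy[i + 1] = t1;
        sy[i + 2] = t2;
        sy[i + 3] = t3;
    }
}
```

For non-overlapping arrays both functions compute the same values; the interleaved form just gives the four independent multiply/add chains room to overlap in the pipeline.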
Which raises the question: is there a reason -fschedule-insns isn't on by
default when using -O2?
Cheers,
--
Toon Moene - mailto:toon@moene.indiv.nluug.nl - phoneto: +31 346 214290
Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Maintainer, GNU Fortran 77: http://gcc.gnu.org/onlinedocs/g77_news.html
Join GNU Fortran 95: http://g95.sourceforge.net/ (under construction)