This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Induction variable elimination, was: Re: On the x86_64, does one have to zero a vector register before filling it completely ?


Toon Moene wrote:

I wrote:

OK, so it is an alignment issue (with -mtune=barcelona):

.L6:
        movups  0(%rbp,%rax), %xmm0
        movups  (%rbx,%rax), %xmm1
        incl    %ecx
        addps   %xmm1, %xmm0
        movaps  %xmm0, (%r8,%rax)
        addq    $16, %rax
        cmpl    %r10d, %ecx
        jb      .L6

Once this problem is solved (or at least once we have determined how it could be solved), we move on to the next one: the extraneous induction variable %ecx.


There are two ways to deal with it:

1. Eliminate it in favor of the other induction variable, %rax, which
   counts in the same direction (upwards, in steps of 16), and compare
   %rax against its own limit instead.

Just for completeness - gcc *does* know how to do this; it just doesn't work when vectorizing.


This is what I get when compiling with -O2 -S:

.L3:
        movss   (%rdi,%rax), %xmm0
        addss   (%rsi,%rax), %xmm0
        movss   %xmm0, (%rdx,%rax)
        addq    $4, %rax
        cmpq    %rcx, %rax
        jne     .L3

Note how %rax remains as the sole induction variable: the loop exit test compares it directly against the limit in %rcx.
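
For readers who prefer source code to assembly, the transformation can be sketched in C. This is only an illustration: the function names add_two_ivs and add_one_iv are hypothetical, and the code is not the test case that produced the listings above. The first function carries both an iteration counter and a byte offset, as the vectorized loop does with %ecx and %rax; the second drops the counter and tests the offset against a precomputed limit, as the -O2 loop does with %rax and %rcx.

```c
#include <assert.h>
#include <stddef.h>

/* Two induction variables: i counts iterations, off is the byte offset.
   Both advance in lockstep, so one of them is redundant. */
void add_two_ivs(const float *a, const float *b, float *c, size_t n)
{
    size_t off = 0;
    for (size_t i = 0; i < n; i++) {
        *(float *)((char *)c + off) =
            *(const float *)((const char *)a + off) +
            *(const float *)((const char *)b + off);
        off += sizeof(float);
    }
}

/* Counter eliminated: the exit test is rewritten in terms of off,
   using the limit n * sizeof(float) computed once before the loop. */
void add_one_iv(const float *a, const float *b, float *c, size_t n)
{
    const size_t limit = n * sizeof(float);
    for (size_t off = 0; off != limit; off += sizeof(float)) {
        *(float *)((char *)c + off) =
            *(const float *)((const char *)a + off) +
            *(const float *)((const char *)b + off);
    }
}

/* Checks that both forms compute the same sums on a small input. */
void self_check(void)
{
    const float a[5] = {1, 2, 3, 4, 5};
    const float b[5] = {10, 20, 30, 40, 50};
    float c1[5], c2[5];

    add_two_ivs(a, b, c1, 5);
    add_one_iv(a, b, c2, 5);
    for (size_t i = 0; i < 5; i++)
        assert(c1[i] == c2[i]);
}
```

In the vectorized case the same rewrite would use a step of 16 bytes and a limit of 16 times the vector iteration count, assuming that count is available before the loop starts.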

--
Toon Moene - e-mail: toon@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/
Progress of GNU Fortran: http://gcc.gnu.org/gcc-4.5/changes.html

