This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: Revision 166517 caused PR 46414


Hi,
It seems that the mail got stuck; I have sent it now.
The problem in the testcase is with:
void t3(void)
{
 int i;

 for (i = 0; i < 16; i++)
   r[i] = sqrtf (a[i]);
}
For the 32-bit target this gets vectorized into a loop that iterates twice:
.L9:
        vmovaps a(%eax), %ymm1
        vrsqrtps        %ymm1, %ymm0
        vcmpneqps       %ymm1, %ymm2, %ymm5
        vandps  %ymm5, %ymm0, %ymm0
        vmulps  %ymm1, %ymm0, %ymm1
        vmulps  %ymm0, %ymm1, %ymm0
        vmulps  %ymm3, %ymm1, %ymm1
        vaddps  %ymm4, %ymm0, %ymm0
        vmulps  %ymm1, %ymm0, %ymm1
        vmovaps %ymm1, r(%eax)
        addl    $32, %eax
        cmpl    $64, %eax
        jne     .L9
        leave
.LCFI8: 


whereas on the 64-bit target, because of better builtin costs, we unroll it completely:

.LCFI10:
        vmovaps a(%rip), %ymm1
        vmovaps .LC0(%rip), %ymm3
        vcmpneqps       %ymm1, %ymm4, %ymm2
        vrsqrtps        %ymm1, %ymm0
        vandps  %ymm2, %ymm0, %ymm0
        vmovaps .LC1(%rip), %ymm2
        vmulps  %ymm1, %ymm0, %ymm1
        vmulps  %ymm0, %ymm1, %ymm0
        vmulps  %ymm2, %ymm1, %ymm1
        vaddps  %ymm3, %ymm0, %ymm0
        vmulps  %ymm1, %ymm0, %ymm1
        vmovaps %ymm1, r(%rip)
        vmovaps a+32(%rip), %ymm1
        vcmpneqps       %ymm1, %ymm4, %ymm4
        vrsqrtps        %ymm1, %ymm0
        vandps  %ymm4, %ymm0, %ymm0
        vmulps  %ymm1, %ymm0, %ymm1
        vmulps  %ymm0, %ymm1, %ymm0
        vmulps  %ymm2, %ymm1, %ymm1
        vaddps  %ymm3, %ymm0, %ymm3
        vmulps  %ymm1, %ymm3, %ymm1
        vmovaps %ymm1, r+32(%rip)
        leave
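
For reference, the arithmetic in both listings looks like a reciprocal square root estimate (vrsqrtps) followed by one Newton-Raphson step. A minimal scalar sketch of that sequence follows. The function name is hypothetical, the estimate is passed in as a parameter instead of coming from the hardware instruction, and the constants -0.5 and -3.0 are assumptions, since the listings load them from registers or .LC labels whose values are not shown:

```c
/* Scalar sketch of one vector lane of the listings above.
   x0 is an estimate of 1/sqrt(a), as vrsqrtps would produce.
   The name sqrt_from_rsqrt is hypothetical.  */
float sqrt_from_rsqrt(float a, float x0)
{
  float e0, e1;

  /* vcmpneqps + vandps: force the estimate to 0 when a == 0,
     so that 0 * inf does not turn into a NaN.  */
  if (a == 0.0f)
    x0 = 0.0f;

  e1 = a * x0;       /* vmulps: a * x0, already close to sqrt(a) */
  e0 = e1 * x0;      /* vmulps: a * x0 * x0, close to 1 */
  e1 = e1 * -0.5f;   /* vmulps by a constant, assumed to be -0.5 */
  e0 = e0 + -3.0f;   /* vaddps of a constant, assumed to be -3.0 */
  return e0 * e1;    /* vmulps: -0.5 * a * x0 * (a * x0 * x0 - 3.0) */
}
```

With a = 4 and an exact estimate x0 = 0.5 this returns 2; with a slightly inaccurate estimate the refinement step pulls the result closer to the true square root.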


The complete unrolling seems sane here.
It seems like a good idea to me, and I am not quite sure why the 32-bit compilation does not do the
transform. I will check now.

Since you are the author of the testcase, can you please update it to be more robust?
Perhaps by making the loop iterate more times, so that complete unrolling would be too expensive.

Honza
> Hi,
> 
> Revision 166517:
> 
> http://gcc.gnu.org/ml/gcc-cvs/2010-11/msg00405.html
> 
> caused
> 
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46414
> 
> Also I couldn't find the patch in the gcc-patches mailing list archive.
> 
> -- 
> H.J.

