This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug tree-optimization/31079] 20% difference between ifort/gfortran, missed vectorization



------- Comment #11 from jv244 at cam dot ac dot uk  2008-08-19 06:09 -------
Created an attachment (id=16095)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16095&action=view)
new testcase 

This testcase (PR31079_11.f90) is meant to replace the one in comment #4, and
illustrates the vectorizer issue.

> gfortran -O3 -ftree-vectorize -ffast-math -march=native PR31079_11.f90
> ./a.out
   4.0282512

> ifort -O3 -xT PR31079_11.f90
PR31079_11.f90(52): (col. 13) remark: LOOP WAS VECTORIZED.
PR31079_11.f90(52): (col. 13) remark: BLOCK WAS VECTORIZED.
PR31079_11.f90(52): (col. 13) remark: LOOP WAS VECTORIZED.
PR31079_11.f90(52): (col. 13) remark: LOOP WAS VECTORIZED.
PR31079_11.f90(17): (col. 8) remark: LOOP WAS VECTORIZED.
PR31079_11.f90(24): (col. 5) remark: BLOCK WAS VECTORIZED.
PR31079_11.f90(30): (col. 7) remark: LOOP WAS VECTORIZED.
PR31079_11.f90(31): (col. 7) remark: LOOP WAS VECTORIZED.
> ./a.out
   2.640165

The inner loop looks like:

    DO i=1,N
      s(1:2)=s(1:2)+pxy(i)%a(:)*dpy(i)%a(1)
      s(3:4)=s(3:4)+pxy(i)%a(:)*dpy(i)%a(2)
    ENDDO

which ifort vectorizes (I will attach the full asm):

..B3.4:                         # Preds ..B3.4 ..B3.3
        movddup   collocate_core_2_2_0_0_$DPY.0.1(%rax), %xmm2  #30.33
        movddup   8+collocate_core_2_2_0_0_$DPY.0.1(%rax), %xmm4 #31.33
        movaps    collocate_core_2_2_0_0_$PXY.0.1(%rax), %xmm3  #30.7
        mulpd     %xmm3, %xmm2                                  #30.32
        incq      %rdx                                          #29.5
        addq      $16, %rax                                     #29.5
        addpd     %xmm2, %xmm1                                  #30.7
        cmpq      $1000, %rdx                                   #29.5
        mulpd     %xmm3, %xmm4                                  #31.32
        addpd     %xmm4, %xmm0                                  #31.7
        jl        ..B3.4        # Prob 99%                      #29.5
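
For readers who prefer not to decode the SSE, here is a rough C rendering of
what the loop above computes. It is only a sketch, not part of the attached
testcase: the names pair_t and collocate_inner are made up for illustration,
and the layout (two doubles per element) is inferred from the 16-byte stride
in the assembly. Each movddup broadcasts one component of dpy(i)%a into both
lanes, the movaps loads pxy(i)%a(1:2), and the two mulpd/addpd pairs update
the two halves of s.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout, inferred from the 16-byte stride in the assembly:
   each element holds two double-precision values, like a(2) in the
   Fortran derived type. The name pair_t is hypothetical. */
typedef struct { double a[2]; } pair_t;

/* Scalar C rendering of the Fortran inner loop:
     s(1:2) = s(1:2) + pxy(i)%a(:) * dpy(i)%a(1)
     s(3:4) = s(3:4) + pxy(i)%a(:) * dpy(i)%a(2)
   s[] is accumulated into, not overwritten. */
void collocate_inner(size_t n, const pair_t *pxy, const pair_t *dpy,
                     double s[4])
{
    assert(pxy != NULL && dpy != NULL && s != NULL);

    for (size_t i = 0; i < n; i++) {
        /* s(1:2): both lanes scaled by dpy(i)%a(1), the first movddup */
        s[0] += pxy[i].a[0] * dpy[i].a[0];
        s[1] += pxy[i].a[1] * dpy[i].a[0];
        /* s(3:4): both lanes scaled by dpy(i)%a(2), the second movddup */
        s[2] += pxy[i].a[0] * dpy[i].a[1];
        s[3] += pxy[i].a[1] * dpy[i].a[1];
    }
}
```

Written this way, each pair of statements is a 2-wide multiply-add whose
second operand is one scalar broadcast to both lanes, which appears to be the
pattern ifort recognizes here.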


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31079

