This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: vectorizer question


2008/8/18 VandeVondele Joost <vondele@pci.uzh.ch>:
>
> The attached testcase yields (on a core2 duo, gcc trunk):
>
>> gfortran -O3 -ftree-vectorize -ffast-math -march=native test.f90
>> time ./a.out
>
> real    0m3.414s
>
>> ifort -xT -O3  test.f90
>> time ./a.out
>
> real    0m1.556s
>
> The assembly contains:
>
>        ifort   gfortran
> mulpd     140          0
> mulsd       0        280
>
> so the reason seems to be that ifort vectorizes the following code (full
> testcase attached):
>
> SUBROUTINE collocate_core_6(res,coef_xyz,pol_x,pol_y,pol_z,cmax,kg,jg)
>
>  IMPLICIT NONE
>  INTEGER, PARAMETER :: wp = SELECTED_REAL_KIND ( 14, 200 )
>  integer, PARAMETER :: lp=6
>    real(wp), INTENT(OUT)    :: res
>    integer, INTENT(IN)     :: cmax,kg,jg
>    real(wp), INTENT(IN)    :: pol_x(0:lp,-cmax:cmax)
>    real(wp), INTENT(IN)    :: pol_y(1:2,0:lp,-cmax:0)
>    real(wp), INTENT(IN)    :: pol_z(1:2,0:lp,-cmax:0)
>    real(wp), INTENT(IN)    :: coef_xyz(((lp+1)*(lp+2)*(lp+3))/6)
>    real(wp) ::  coef_xy(2,(lp+1)*(lp+2)/2)
>    real(wp) ::  coef_x(4,0:lp)
>
> [...]
>    coef_x(1:2,4)=coef_x(1:2,4)+coef_xy(1:2,12)*pol_y(1,1,jg)
>    coef_x(3:4,4)=coef_x(3:4,4)+coef_xy(1:2,12)*pol_y(2,1,jg)
>    coef_x(1:2,5)=coef_x(1:2,5)+coef_xy(1:2,13)*pol_y(1,1,jg)
>    coef_x(3:4,5)=coef_x(3:4,5)+coef_xy(1:2,13)*pol_y(2,1,jg)
>    coef_x(1:2,0)=coef_x(1:2,0)+coef_xy(1:2,14)*pol_y(1,2,jg)
>    coef_x(3:4,0)=coef_x(3:4,0)+coef_xy(1:2,14)*pol_y(2,2,jg)
>    coef_x(1:2,1)=coef_x(1:2,1)+coef_xy(1:2,15)*pol_y(1,2,jg)
>    coef_x(3:4,1)=coef_x(3:4,1)+coef_xy(1:2,15)*pol_y(2,2,jg)
>    coef_x(1:2,2)=coef_x(1:2,2)+coef_xy(1:2,16)*pol_y(1,2,jg)
>    coef_x(3:4,2)=coef_x(3:4,2)+coef_xy(1:2,16)*pol_y(2,2,jg)
>    coef_x(1:2,3)=coef_x(1:2,3)+coef_xy(1:2,17)*pol_y(1,2,jg)
>    coef_x(3:4,3)=coef_x(3:4,3)+coef_xy(1:2,17)*pol_y(2,2,jg)
>    coef_x(1:2,4)=coef_x(1:2,4)+coef_xy(1:2,18)*pol_y(1,2,jg)
>    coef_x(3:4,4)=coef_x(3:4,4)+coef_xy(1:2,18)*pol_y(2,2,jg)
>    coef_x(1:2,0)=coef_x(1:2,0)+coef_xy(1:2,19)*pol_y(1,3,jg)
>    coef_x(3:4,0)=coef_x(3:4,0)+coef_xy(1:2,19)*pol_y(2,3,jg)
> [...]
>
> either it is able to interpret the short vectors as such, or it realizes
> that these very short implicit loops are nevertheless favourable for
> vectorization.
>
> Is there a trick to get gcc to vectorize these loops, or is some
> technology missing for this?
>
> Should I file a PR for this (this is somewhat similar to PR31079 and
> PR31021)?

It would be nice to have a stand-alone testcase for this, so please
file a bug report.
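
A reduced stand-alone testcase could look roughly like the sketch below.
It is not the attached test.f90: the module name short_vec, the routine
name update, the shrunken array bounds, the input values and the
iteration count are all made up for illustration. Only the form of the
two-element array-section updates is taken from the quoted code.

```fortran
! Hypothetical reduced testcase; names and sizes are assumptions,
! only the statement pattern follows the quoted collocate_core_6.
MODULE short_vec
  IMPLICIT NONE
  INTEGER, PARAMETER :: wp = SELECTED_REAL_KIND ( 14, 200 )
CONTAINS
  SUBROUTINE update(coef_x,coef_xy,pol_y)
    REAL(wp), INTENT(INOUT) :: coef_x(4,0:1)
    REAL(wp), INTENT(IN)    :: coef_xy(2,2)
    REAL(wp), INTENT(IN)    :: pol_y(1:2,0:1)
    ! Each statement is an implicit loop over just two elements,
    ! scaled by a scalar taken from pol_y.
    coef_x(1:2,0)=coef_x(1:2,0)+coef_xy(1:2,1)*pol_y(1,0)
    coef_x(3:4,0)=coef_x(3:4,0)+coef_xy(1:2,1)*pol_y(2,0)
    coef_x(1:2,1)=coef_x(1:2,1)+coef_xy(1:2,2)*pol_y(1,1)
    coef_x(3:4,1)=coef_x(3:4,1)+coef_xy(1:2,2)*pol_y(2,1)
  END SUBROUTINE update
END MODULE short_vec

PROGRAM test
  USE short_vec
  IMPLICIT NONE
  REAL(wp) :: coef_x(4,0:1), coef_xy(2,2), pol_y(2,0:1)
  INTEGER :: i
  coef_x  = 0.0_wp
  coef_xy = RESHAPE((/1.0_wp,2.0_wp,3.0_wp,4.0_wp/),(/2,2/))
  pol_y   = RESHAPE((/0.5_wp,0.25_wp,2.0_wp,4.0_wp/),(/2,2/))
  ! Repeat the update so the run time is measurable.
  DO i=1,10000000
     CALL update(coef_x,coef_xy,pol_y)
  ENDDO
  ! Print a checksum so the work is not optimized away.
  WRITE(*,*) SUM(coef_x)
END PROGRAM test
```

Compiling this with the options quoted above plus -S and counting mulpd
versus mulsd in the resulting assembly should show whether the short
array sections get vectorized. Because the caller sits in the same file,
the compiler may inline update into the loop, so it is the instruction
mix that matters rather than the exact counts.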

Thanks,
Richard.

