*From*: Steve Kargl <sgk at troutmask dot apl dot washington dot edu>
*To*: roger at eyesopen dot com
*Cc*: gcc-patches at gcc dot gnu dot org, fortran at gcc dot gnu dot org
*Date*: Mon, 18 Dec 2006 17:08:16 -0800
*Subject*: Re: [fortran PATCH] Implement a(:,:) = 0.0 using memset
*References*: <4568.208.41.78.162.1166463384.squirrel@mail.eyesopen.com>

On Mon, Dec 18, 2006 at 10:36:24AM -0700, roger@eyesopen.com wrote:
>
> The following patch makes use of the recently added gfc_full_array_ref_p
> function to provide the optimization of using memset when assigning an
> entire array to zero.  Currently, the source code below:
>
>         integer :: a(20)
>         a(:) = 0;
>
> we currently generate the following with -fdump-tree-original
>
>     int8 S.0;
>
>     S.0 = 1;
>     while (1)
>       {
>         if (S.0 > 20) goto L.1; else (void) 0;
>         (*a)[NON_LVALUE_EXPR <S.0> + -1] = 0;
>         S.0 = S.0 + 1;
>       }
>     L.1:;
>
> with the patch below, we now generate this instead.
>
>     (void) __builtin_memset ((void *) a, 0, 80);
>
>
> This can then take advantage of GCC's intrinsic expansion machinery,
> including Jan's recent improvements for x86.  I'm keen to hear if there
> are any corner cases that I've overlooked and aren't covered by the
> gfortran testsuite.  Perhaps if someone could run NIST, polyhedron and
> the usual suspects to confirm there are no issues.
>

Roger,

This patch appears to have very little effect on Polyhedron.

troutmask:sgk[203] cd work/pb05/
troutmask:sgk[204] rm *.original
troutmask:sgk[205] gfc4x -c -O2 -fdump-tree-original -w *.f90
troutmask:sgk[206] grep memset *.original | grep void
rnflow.f90.003t.original:  (void) __builtin_memset ((void *) mtrsbt, 0, 262144);
rnflow.f90.003t.original:  (void) __builtin_memset ((void *) mtrsrt, 0, 262144);
rnflow.f90.003t.original:  (void) __builtin_memset ((void *) ptrst, 0, 262144);
troutmask:sgk[207] wc -l *.f90
     810 ac.f90
   51885 aermod.f90
    1701 air.f90
     626 capacita.f90
     296 channel.f90
    6065 doduc.f90
    1680 fatigue.f90
    2428 gas_dyn.f90
    6635 induct.f90
     696 linpk.f90
    2170 mdbx.f90
     309 nf.f90
    2190 protein.f90
    4620 rnflow.f90
    4447 test_fpu.f90
     416 tfft.f90
   86974 total

There are more than 3 instances of "x(:) = 0." in the code.

In trying to understand what was going on, I noticed that your
optimization is not used in

  subroutine a
    integer x(20)
    x = 0
  end subroutine a

  a ()
  {
    int4 x[20];

    {
      int8 S.0;

      S.0 = 1;
      while (1)
        {
          if (S.0 > 20) goto L.1; else (void) 0;
          x[S.0 + -1] = 0;
          S.0 = S.0 + 1;
        }
      L.1:;
    }
  }

If the routine was declared as "a(x)" in the above, then things work
as expected.  More importantly, it has no effect on arrays in the main
program where one may anticipate initialization to occur.  It is not
uncommon to see something like

  program b
    integer x(20)
    call init
    contains
      subroutine init
        x = 0
      end subroutine init
  end program b

-- 
Steve

