This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [Patch fortran] PR41113 and PR41117 - Unnecessary invocations of internal_pack
- From: Steve Kargl <sgk at troutmask dot apl dot washington dot edu>
- To: Paul Richard Thomas <paul dot richard dot thomas at gmail dot com>
- Cc: fortran at gcc dot gnu dot org, gcc-patches <gcc-patches at gcc dot gnu dot org>
- Date: Sun, 27 Dec 2009 14:12:50 -0800
- Subject: Re: [Patch fortran] PR41113 and PR41117 - Unnecessary invocations of internal_pack
- References: <339c37f20912220513v66ac9cf9ge4a037970d080a62@mail.gmail.com>
On Tue, Dec 22, 2009 at 02:13:18PM +0100, Paul Richard Thomas wrote:
> This patch should effect speedups of some calls with array valued
> actual arguments. I have yet to test any of the standard benchmarks.
> It does this by removing the need to call internal_pack and _unpack
> when array sections are contiguous or are full array components of
> derived types.
>
> Bootstrapped and regtested on X86_64/FC9 - OK for trunk?
>
Paul,
I applied the patch and ran my version of the Polyhedron
Benchmark (simple average of 3 runs). I did not see any
improvements or regressions. My flags were -w -O2 -pipe
-march=native -funroll-loops -ftree-vectorize
laptop:kargl[209] more zxc
Benchmark      Unpatched   Patched
ac.f90             17.48     17.47
aermod.f90         47.46     47.47
air.f90            18.61     18.35
capacita.f90       93.22     93.73
channel.f90        13.83     13.75
doduc.f90          53.37     53.28
fatigue.f90        13.27     13.35
gas_dyn.f90        14.35     14.26
induct.f90         32.54     32.51
linpk.f90          29.49     29.18
mdbx.f90           21.53     21.87
nf.f90             39.48     39.99
protein.f90        61.83     61.69
rnflow.f90         47.46     47.55
test_fpu.f90       26.29     26.37
tfft.f90            9.18      9.67
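
For reference, the per-benchmark relative change can be derived from the
table above. The following is only a minimal sketch in Python: the numbers
are copied from the table, while the names `timings` and `relative_change`
are made up for illustration. Since each figure is already an average of 3
runs, the script says nothing about run-to-run noise.

```python
# Timings copied from the table above: name -> (unpatched, patched).
timings = {
    "ac.f90": (17.48, 17.47),
    "aermod.f90": (47.46, 47.47),
    "air.f90": (18.61, 18.35),
    "capacita.f90": (93.22, 93.73),
    "channel.f90": (13.83, 13.75),
    "doduc.f90": (53.37, 53.28),
    "fatigue.f90": (13.27, 13.35),
    "gas_dyn.f90": (14.35, 14.26),
    "induct.f90": (32.54, 32.51),
    "linpk.f90": (29.49, 29.18),
    "mdbx.f90": (21.53, 21.87),
    "nf.f90": (39.48, 39.99),
    "protein.f90": (61.83, 61.69),
    "rnflow.f90": (47.46, 47.55),
    "test_fpu.f90": (26.29, 26.37),
    "tfft.f90": (9.18, 9.67),
}


def relative_change(unpatched, patched):
    """Percent change of patched relative to unpatched (positive = slower)."""
    return 100.0 * (patched - unpatched) / unpatched


# Hypothetical helper table: benchmark name -> percent change.
changes = {name: relative_change(u, p) for name, (u, p) in timings.items()}

# Print from largest speedup to largest slowdown.
for name, pct in sorted(changes.items(), key=lambda kv: kv[1]):
    print(f"{name:<14s} {pct:+6.2f}%")
```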
I got curious and added -fdump-tree-original to the flags. None
of the dumps contained an internal_pack or internal_unpack string.
Perhaps Joost can run cp2k against the patch.
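
The dump check described above can be repeated with a small script. This
is a sketch only, in Python, and it rests on two assumptions: that the
-fdump-tree-original output files sit in one directory, and that their
names end in ".original". The function name `dumps_with_pack_calls` is
hypothetical.

```python
from pathlib import Path

# Substrings to look for in the dumps.
NEEDLES = ("internal_pack", "internal_unpack")


def dumps_with_pack_calls(directory="."):
    """Return {dump file name: needles found} for dumps that mention them.

    Assumes the dump file names end in ".original".
    """
    hits = {}
    for path in sorted(Path(directory).glob("*.original")):
        text = path.read_text(errors="replace")
        found = [needle for needle in NEEDLES if needle in text]
        if found:
            hits[path.name] = found
    return hits


if __name__ == "__main__":
    results = dumps_with_pack_calls()
    if not results:
        print("no internal_pack/internal_unpack references found")
    for name, found in results.items():
        print(name, ", ".join(found))
```

An empty result matches the observation above: the benchmark sources
produce no pack/unpack calls to begin with, so the patch has nothing to
remove there.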
--
Steve