This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [fortran PATCH] Implement a(:,:) = 0.0 using memset (take 2)
On Mon, Dec 18, 2006 at 06:55:22PM -0700, roger@eyesopen.com wrote:
>
> The use of __builtin_memset allows the compiler to generate the most
> efficient idiom for clearing a block of memory. It turns out that the
> above assignment is in a critical function of fatigue, and optimizing
> it using memset shows an observable speed-up on x86_64-unknown-linux-gnu.
>
> Time Before: 21.76s 21.67s 21.73s
> Time After: 20.30s 20.03s 20.04s
>
> which is approximately a 7% performance improvement.
>
            w/o       patche    patche2
ac          17.1825   17.1825   17.1725
aermod      43.3275   43.3625   44.265
air         18.23     18.31     18.0975
capacita    105.573   105.36    105.532
channel     13.5675   13.6825   13.865
doduc       52.075    52.025    52.0375
fatigue     25.0275   24.975    23.155
gas_dyn     15.0975   14.905    14.8875
induct      67.855    67.835    67.56
linpk       29.3725   28.715    29.025
mdbx        22.325    22.3725   22.2975
nf          40.93     41.01     40.9275
protein     64.9025   65.0525   64.9025
rnflow      40.445    40.535    40.59
test_fpu    26.19     26.185    25.95
tfft        9.2475    9.2025    9.175
The above numbers are the average of 4 consecutive runs of
each program. Roger's patche2 indeed improves fatigue.
Other programs change by roughly 1 to 2% in either direction,
but I'd need to look at the statistical significance of this
simple benchmark to determine whether there is a real effect.
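
For anyone who wants to redo the arithmetic, here is a minimal sketch that
computes the relative change from the timings quoted above. The helper names
(mean, speedup_percent) are made up for illustration, and only a few rows of
the table are copied in. It says nothing about statistical significance,
which would need the individual run times rather than the averages.

```python
# Timings in seconds, copied from Roger's quoted message.
quoted_before = [21.76, 21.67, 21.73]
quoted_after = [20.30, 20.03, 20.04]

# (w/o, patche, patche2) averages for a few rows of the table above.
table = {
    "fatigue": (25.0275, 24.975, 23.155),
    "aermod": (43.3275, 43.3625, 44.265),
    "channel": (13.5675, 13.6825, 13.865),
    "linpk": (29.3725, 28.715, 29.025),
}


def mean(xs):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(xs) / len(xs)


def speedup_percent(before, after):
    """Percent reduction in run time; a negative value means slower."""
    return 100.0 * (before - after) / before


quoted = speedup_percent(mean(quoted_before), mean(quoted_after))
print(f"quoted fatigue timings: {quoted:+.2f}%")

for name, (base, _patche, patche2) in table.items():
    change = speedup_percent(base, patche2)
    print(f"{name:8s} patche2 vs w/o: {change:+.2f}%")
```

With these inputs the quoted timings and the fatigue row both come out a
little above 7%, while aermod and channel come out slightly slower with
patche2.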
--
Steve