This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [fortran PATCH] Implement a(:,:) = 0.0 using memset (take 2)


On Mon, Dec 18, 2006 at 06:55:22PM -0700, roger@eyesopen.com wrote:
> 
> The use of __builtin_memset allows the compiler to generate the most
> efficient idiom for clearing a block of memory.  It turns out that the
> above assignment is in a critical function of fatigue, and optimizing
> it using memset shows an observable speed-up on x86_64-unknown-linux-gnu.
> 
> Time Before:  21.76s  21.67s  21.73s
> Time After:   20.30s  20.03s  20.04s
> 
> which is approximately a 7% performance improvement.
> 
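
As a rough illustration of the idea in the quoted text (not the code from
the patch itself), the following C sketch clears a contiguous block of
doubles in two ways: element by element, and with a single memset call.
The function names are hypothetical, and the memset variant assumes an
IEEE 754 style floating-point format in which all-zero bits represent 0.0.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Clear a contiguous rows x cols block one element at a time,
   which is roughly what a scalarized array assignment does. */
void zero_with_loops(double *a, size_t rows, size_t cols)
{
    assert(a != NULL);
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            a[i * cols + j] = 0.0;
}

/* Clear the same block with one memset call.  This is only valid
   because the block is contiguous and (assumption) all-zero bits
   represent 0.0 in the floating-point format in use. */
void zero_with_memset(double *a, size_t rows, size_t cols)
{
    assert(a != NULL);
    memset(a, 0, rows * cols * sizeof *a);
}

/* Return 1 if every one of the n elements compares equal to 0.0. */
int all_zero(const double *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (a[i] != 0.0)
            return 0;
    return 1;
}
```

Both functions leave the array in the same state; the second simply hands
the compiler and C library one call that they can expand into whatever
block-clear sequence suits the target.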

             w/o       patche   patche2
ac         17.1825    17.1825   17.1725
aermod     43.3275    43.3625   44.265
air        18.23      18.31     18.0975
capacita  105.573    105.36    105.532
channel    13.5675    13.6825   13.865
doduc      52.075     52.025    52.0375
fatigue    25.0275    24.975    23.155
gas_dyn    15.0975    14.905    14.8875
induct     67.855     67.835    67.56
linpk      29.3725    28.715    29.025
mdbx       22.325     22.3725   22.2975
nf         40.93      41.01     40.9275
protein    64.9025    65.0525   64.9025
rnflow     40.445     40.535    40.59
test_fpu   26.19      26.185    25.95
tfft        9.2475     9.2025    9.175

The numbers above are the averages of 4 consecutive runs of
each program.  Roger's patche2 indeed improves fatigue.
Other programs show changes of +-1 to 2%, but I'd need
to look at the statistical significance of this simple benchmark
to determine if there is a real effect.
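
To make the arithmetic behind such comparisons concrete, here is a minimal
C sketch that computes the mean of a set of timings, the percentage
improvement between two means, and a very crude separation check (whether
every "after" timing is below every "before" timing).  The function names
are hypothetical, and the range check is only a sanity test, not a proper
significance test.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n timings; n must be positive. */
double mean_time(const double *t, size_t n)
{
    assert(t != NULL && n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += t[i];
    return sum / (double)n;
}

/* Percentage reduction in run time going from before to after. */
double percent_improvement(double before, double after)
{
    assert(before > 0.0);
    return 100.0 * (before - after) / before;
}

/* Crude check: 1 if the slowest "after" run is still faster than
   the fastest "before" run, 0 otherwise. */
int ranges_separated(const double *before, size_t nb,
                     const double *after, size_t na)
{
    assert(before != NULL && after != NULL && nb > 0 && na > 0);
    double min_before = before[0];
    double max_after = after[0];
    for (size_t i = 1; i < nb; i++)
        if (before[i] < min_before)
            min_before = before[i];
    for (size_t i = 1; i < na; i++)
        if (after[i] > max_after)
            max_after = after[i];
    return max_after < min_before;
}
```

Fed the three before/after timings from the quoted message (21.76, 21.67,
21.73 and 20.30, 20.03, 20.04), the means are 21.72 and about 20.12, which
gives an improvement of roughly 7.35%, and the two ranges do not overlap.
Differences of 1 to 2% between columns of the table would need the per-run
timings, not just the averages, before a check like this says anything.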

-- 
Steve

