[RFC PATCH]: Reciprocal sqrt (rsqrt) conversion pass
Uros Bizjak
ubizjak@gmail.com
Wed Jun 13 20:47:00 GMT 2007
Hello!
The attached patch now implements a fully functional rsqrt pass,
including all target-dependent parts (for the i386 target). The patch
generates rcpss, rcpps, rsqrtss and rsqrtps, together with an NR
(Newton-Raphson) step.
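To illustrate what an NR step does to a low-precision reciprocal, here
is a minimal portable sketch. It is not code from the patch:
coarse_recip() is a hypothetical stand-in for the rcpss estimate (it
just truncates the exact reciprocal to roughly 11 mantissa bits), and
recip_nr() is an illustrative name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the rcpss estimate (assumption, not the
   real instruction): the exact reciprocal with its low 12 mantissa
   bits cleared, which leaves roughly 11 bits of precision. */
static float coarse_recip(float a)
{
    float r = 1.0f / a;
    uint32_t bits;

    memcpy(&bits, &r, sizeof bits);
    bits &= 0xFFFFF000u;            /* drop the low mantissa bits */
    memcpy(&r, &bits, sizeof r);
    return r;
}

/* One Newton-Raphson step for 1/a: x1 = x0 * (2 - a * x0).
   The relative error of x1 is about the square of that of x0. */
float recip_nr(float a)
{
    assert(a != 0.0f);              /* sketch ignores zero, inf, NaN */
    float x0 = coarse_recip(a);
    return x0 * (2.0f - a * x0);
}
```

Since one step roughly doubles the number of correct bits, an estimate
of about 11-12 bits ends up close to full single precision.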
The most notable change from the previous version is that sqrt(a) is
now converted at RTL expansion time, so the sqrt(a) -> a*rsqrt(a)
transformation is no longer needed at the tree level. This change is
due to the expansion of sqrt(), where:
sqrt(a) = 0.5 * a * rsqrtss(a) * (3.0 - a * rsqrtss(a) * rsqrtss(a))
We can CSE [a*rsqrtss(a)] from the expression above, saving one
multiply. And a*rsqrt(a) didn't fold at the tree level...
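The saved multiply can be seen in a small portable sketch of the
expansion above. This is an illustration, not code from the patch:
coarse_rsqrt() is a hypothetical stand-in for rsqrtss that uses the
well-known bit-level estimate (much coarser than the hardware
instruction), and the two function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for rsqrtss (assumption): the classic
   bit-level estimate of 1/sqrt(a), accurate to a few percent. */
static float coarse_rsqrt(float a)
{
    uint32_t bits;
    float y;

    memcpy(&bits, &a, sizeof bits);
    bits = 0x5f3759dfu - (bits >> 1);
    memcpy(&y, &bits, sizeof y);
    return y;
}

/* Direct form of the expansion: five multiplies. */
float sqrt_nr_direct(float a)
{
    assert(a > 0.0f);               /* sketch handles positive inputs only */
    float y = coarse_rsqrt(a);
    return 0.5f * a * y * (3.0f - a * y * y);
}

/* Same expansion with t = a * y computed once: four multiplies. */
float sqrt_nr_cse(float a)
{
    assert(a > 0.0f);
    float y = coarse_rsqrt(a);
    float t = a * y;                /* the common subexpression */
    return 0.5f * t * (3.0f - t * y);
}
```

Both functions compute the same value up to rounding. With this coarse
stand-in the result is only good to a fraction of a percent; a more
accurate starting estimate gives a correspondingly more accurate
result after the single step.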
I'll write ChangeLog and testcases tomorrow.
gas_dyn.f90 now finishes the test run without problems, showing a
_nice_ speedup. The timings are:
-O3 -ffast-math -funroll-loops: 0m14.993s
-O3 -ffast-math -funroll-loops -mrecip: 0m7.512s
-O3 -ffast-math -funroll-loops -mrecip -ftree-vectorize 0m6.348s
-O3 -ffast-math -funroll-loops -mrecip -ftree-vectorize gas_dyn.f90 0m4.488s
Yes, this is now with correct output results. Patched gcc creates 30% -
50% faster code. The timings were measured on a C2DEE.
As proof, here are the last lines of gas_dyn.out, without -mrecip:
HYDRODYNAMIC FRONT AT X= 1.74086E-02 IN CELL 555
CELL SETTING DT IS 50, V= 1.93338E+04, CS= 2.36582E+03
HYDRODYNAMIC FRONT AT X= 1.74095E-02 IN CELL 555
CELL SETTING DT IS 50, V= 1.93338E+04, CS= 2.36582E+03
HYDRODYNAMIC FRONT AT X= 1.74103E-02 IN CELL 555
CELL SETTING DT IS 50, V= 1.93338E+04, CS= 2.36582E+03
HYDRODYNAMIC FRONT AT X= 1.74112E-02 IN CELL 555
CELL SETTING DT IS 50, V= 1.93338E+04, CS= 2.36582E+03
-----> CYCLE=20000 TIME= 8.67749E-07 DT= 4.34575E-11
NOZZLE CONDITIONS AT BEGINNING OF TIME STEP:
PRESSURE= 4.00000E+03
DENSITY = 1.00000E-03
ENERGY = 1.00000E+07
VELOCITY= 1.93330E+04
HYDRODYNAMIC FRONT AT X= 1.74121E-02 IN CELL 555
CELL SETTING DT IS 50, V= 1.93338E+04, CS= 2.36582E+03
and with -mrecip:
HYDRODYNAMIC FRONT AT X= 1.74078E-02 IN CELL 555
CELL SETTING DT IS 27, V= 1.93322E+04, CS= 2.36749E+03
HYDRODYNAMIC FRONT AT X= 1.74087E-02 IN CELL 555
CELL SETTING DT IS 27, V= 1.93322E+04, CS= 2.36749E+03
HYDRODYNAMIC FRONT AT X= 1.74096E-02 IN CELL 555
CELL SETTING DT IS 27, V= 1.93322E+04, CS= 2.36749E+03
HYDRODYNAMIC FRONT AT X= 1.74104E-02 IN CELL 555
CELL SETTING DT IS 27, V= 1.93322E+04, CS= 2.36749E+03
-----> CYCLE=20000 TIME= 8.67748E-07 DT= 4.34573E-11
NOZZLE CONDITIONS AT BEGINNING OF TIME STEP:
PRESSURE= 4.00000E+03
DENSITY = 1.00000E-03
ENERGY = 1.00000E+07
VELOCITY= 1.93330E+04
HYDRODYNAMIC FRONT AT X= 1.74113E-02 IN CELL 555
CELL SETTING DT IS 27, V= 1.93322E+04, CS= 2.36749E+03
Uros.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: gcc-recip-3.diff.txt
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20070613/65f101a8/attachment.txt>