[RFC PATCH]: Reciprocal sqrt (rsqrt) conversion pass

Uros Bizjak ubizjak@gmail.com
Wed Jun 13 20:47:00 GMT 2007


Hello!

The attached patch now implements a fully functional rsqrt pass, 
including all the target-dependent parts (for the i386 target). The 
patch generates rcpss, rcpps, rsqrtss and rsqrtps, each followed by a 
Newton-Raphson (NR) step.

The most notable change from the previous version is that sqrt(a) is 
now converted at RTL expansion time, so the sqrt(a) -> a*rsqrt(a) 
transformation is no longer needed at the tree level. The reason is the 
expansion of sqrt(), where:

  sqrt(a) = 0.5 * a * rsqrtss(a) * (3.0 - a * rsqrtss(a) * rsqrtss(a))

we can CSE [a*rsqrtss(a)] from the expression above, saving one 
multiply. And a*rsqrt(a) didn't fold at the tree level...

I'll write ChangeLog and testcases tomorrow.

gas_dyn.f90 now finishes the test run without problems, showing a _nice_ 
speedup. The timings are:

-O3 -ffast-math -funroll-loops: 0m14.993s
-O3 -ffast-math -funroll-loops -mrecip: 0m7.512s
-O3 -ffast-math -funroll-loops -mrecip -ftree-vectorize: 0m6.348s
-O3 -ffast-math -funroll-loops -mrecip -ftree-vectorize gas_dyn.f90 0m4.488s

Yes, this time the output results are correct. The patched gcc creates 
30% - 50% faster code. The timings were measured on a C2DEE.

As proof, here are the last lines of gas_dyn.out, without -mrecip:

 HYDRODYNAMIC FRONT AT X= 1.74086E-02 IN CELL   555
 CELL SETTING DT IS    50, V= 1.93338E+04, CS= 2.36582E+03
 HYDRODYNAMIC FRONT AT X= 1.74095E-02 IN CELL   555
 CELL SETTING DT IS    50, V= 1.93338E+04, CS= 2.36582E+03
 HYDRODYNAMIC FRONT AT X= 1.74103E-02 IN CELL   555
 CELL SETTING DT IS    50, V= 1.93338E+04, CS= 2.36582E+03
 HYDRODYNAMIC FRONT AT X= 1.74112E-02 IN CELL   555
 CELL SETTING DT IS    50, V= 1.93338E+04, CS= 2.36582E+03
 ----->  CYCLE=20000 TIME= 8.67749E-07 DT= 4.34575E-11


 NOZZLE CONDITIONS AT BEGINNING OF TIME STEP:
   PRESSURE= 4.00000E+03
   DENSITY = 1.00000E-03
   ENERGY  = 1.00000E+07
   VELOCITY= 1.93330E+04

 HYDRODYNAMIC FRONT AT X= 1.74121E-02 IN CELL   555
 CELL SETTING DT IS    50, V= 1.93338E+04, CS= 2.36582E+03

and with -mrecip:

 HYDRODYNAMIC FRONT AT X= 1.74078E-02 IN CELL   555
 CELL SETTING DT IS    27, V= 1.93322E+04, CS= 2.36749E+03
 HYDRODYNAMIC FRONT AT X= 1.74087E-02 IN CELL   555
 CELL SETTING DT IS    27, V= 1.93322E+04, CS= 2.36749E+03
 HYDRODYNAMIC FRONT AT X= 1.74096E-02 IN CELL   555
 CELL SETTING DT IS    27, V= 1.93322E+04, CS= 2.36749E+03
 HYDRODYNAMIC FRONT AT X= 1.74104E-02 IN CELL   555
 CELL SETTING DT IS    27, V= 1.93322E+04, CS= 2.36749E+03
 ----->  CYCLE=20000 TIME= 8.67748E-07 DT= 4.34573E-11


 NOZZLE CONDITIONS AT BEGINNING OF TIME STEP:
   PRESSURE= 4.00000E+03
   DENSITY = 1.00000E-03
   ENERGY  = 1.00000E+07
   VELOCITY= 1.93330E+04

 HYDRODYNAMIC FRONT AT X= 1.74113E-02 IN CELL   555
 CELL SETTING DT IS    27, V= 1.93322E+04, CS= 2.36749E+03

Uros.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: gcc-recip-3.diff.txt
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20070613/65f101a8/attachment.txt>