[PATCH, i386] Unify TARGET_SSE_MATH for trunc* patterns
Uros Bizjak
uros@kss-loka.si
Fri Dec 17 10:39:00 GMT 2004
Hello!
This patch fixes TARGET_SSE_MATH for trunc* patterns. It prevents
generation of cvtsd2ss insn for -mfpmath=387. Patch was bootstrapped on
pentium4-pc-linux-gnu, regtesting is in progress for c and c++.
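For readers who want to look at the affected conversion themselves, here is a minimal sketch. It is not part of the patch, and the function name narrow_to_float is made up for illustration. A double-to-float narrowing like this one is the operation the truncdfsf2 patterns expand; which instruction is actually emitted depends on the compiler version and flags.

```c
/* Hypothetical test function, not taken from the patch.
   The (float) cast of a double is the DFmode -> SFmode truncation
   that the truncdfsf2 patterns in i386.md handle. */
float narrow_to_float(double x)
{
    return (float) x;
}
```

Compiling this file to assembly with something like `gcc -O2 -march=pentium4 -mfpmath=387 -S` and again with `-mfpmath=sse`, then searching the output for cvtsd2ss, should show which unit performs the conversion under each setting.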
In addition, povray-3.50c was built for various architectures (pentium,
pentium3, pentium4) with all combinations of -mfpmath. A benchmark.pov
picture was generated and visually checked for correctness.
Here are the results of generating benchmark.pov picture for -O3
-march=pentium4 -ffast-math and various -mfpmath:
pentium4 3.2G
gcc version 4.0.0 20041217 (experimental)
PovRay-3.50c
sse:
Time For Parse: 0 hours 0 minutes 1.0 seconds (1 seconds)
Time For Photon: 0 hours 0 minutes 37.0 seconds (37 seconds)
Time For Trace: 0 hours 5 minutes 31.0 seconds (331 seconds)
Total Time: 0 hours 6 minutes 9.0 seconds (369 seconds)
387:
Time For Parse: 0 hours 0 minutes 1.0 seconds (1 seconds)
Time For Photon: 0 hours 0 minutes 34.0 seconds (34 seconds)
Time For Trace: 0 hours 5 minutes 16.0 seconds (316 seconds)
Total Time: 0 hours 5 minutes 51.0 seconds (351 seconds)
mixed:
Time For Parse: 0 hours 0 minutes 2.0 seconds (2 seconds)
Time For Photon: 0 hours 0 minutes 38.0 seconds (38 seconds)
Time For Trace: 0 hours 5 minutes 36.0 seconds (336 seconds)
Total Time: 0 hours 6 minutes 16.0 seconds (376 seconds)
As can be seen, -mfpmath=387 now beats -mfpmath=sse by 18 seconds,
that is ~5%. I guess this is quite good :) .
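The arithmetic behind these figures can be checked with a small sketch. The totals are copied from the timings above; the helper names are hypothetical and exist only for this illustration.

```c
/* Total run times in seconds, copied from the timing results above. */
enum { TOTAL_SSE = 369, TOTAL_387 = 351, TOTAL_MIXED = 376 };

/* Seconds saved by `other` relative to `baseline`. */
int seconds_saved(int baseline, int other)
{
    return baseline - other;
}

/* The same saving expressed as a percentage of `baseline`. */
double percent_saved(int baseline, int other)
{
    return 100.0 * (double) (baseline - other) / (double) baseline;
}
```

With the sse total as the baseline, the 387 run saves 369 - 351 = 18 seconds, which is a little under 5% of 369.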
TARGET_SSE_MATH issues remain with fix* patterns and fp
compare patterns (this is PR target/19009). These issues will be fixed
sometime next week.
2004-12-17 Uros Bizjak <uros@kss-loka.si>
* config/i386/i386.md (truncdfsf2, *truncdfsf2_2,
*truncdfsf2_2_nooverlap, *truncdfsf2_sse_only_nooverlap):
Unify enable constraint with respect to TARGET_SSE2,
TARGET_80387, TARGET_SSE_MATH and TARGET_MIX_SSE_I387.
(define_split): Same for corresponding split patterns.
(truncdfsf2): Use TARGET_SSE_MATH for SSE pattern generation.
(truncdfsf2_noop): Rename to truncdfsf2_i387_noop.
(*truncdfsf2_1_sse): Rename to *truncdfsf2_2.
(*truncdfsf2_1_sse_nooverlap): Rename to *truncdfsf2_2_nooverlap.
(*truncdfsf2_2): Rename to *truncdfsf2_mixed.
(*truncdfsf2_2_nooverlap): Rename to *truncdfsf2_mixed_nooverlap.
(*truncdfsf2_3): Rename to *truncdfsf2_i387.
(truncdfsf2_sse_only): Rename to truncdfsf2_sse.
(*truncdfsf2_sse_only_nooverlap): Rename to
*truncdfsf2_sse_nooverlap.
(truncxf{s,d}f2_noop): Rename to truncxf{s,d}f2_i387_noop.
(*truncxfsf2_2): Rename to *truncxfsf2_i387.
(truncxf{s,d}f2, fmod{s,d}f3, drem{s,d}f3, log1p{s,d}f2,
rint{s,d}f2, floor{s,d}f2, ceil{s,d}f2, btrunc{s,d}f2,
nearbyint{s,d}f2): Use renamed patterns.
Uros.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fpmath.diff
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20041217/3f758675/attachment.ksh>