[PATCH, i386] Unify TARGET_SSE_MATH and TARGET_MIX_SSE_I387 in insn constraints
Uros Bizjak
uros@kss-loka.si
Wed Dec 8 16:20:00 GMT 2004
Hello!
The handling of TARGET_SSE_MATH and TARGET_MIX_SSE_I387 in i386.md is a
big mess. Practically every pattern handles these macros in its own way,
and it is very difficult to figure out the conditions under which a
given pattern is enabled.
I have encountered a problem with the float* patterns, which have a wrong
enable condition defined, so the SSE cvtsi2s{s,d} instruction was
generated unconditionally for -mfpmath=387,sse and -mfpmath=387. This is
an int->float conversion instruction, and the i387 insn is considerably
faster than the SSE one. An additional problem with this insn is that it
converts from an integer to an SSE register, so additional sse->mem->fp
moves were needed to get the value into an FP register. The produced asm
code was horrible, something like this:
...
cvtsi2ss 12(%ebp), %xmm0
movss %xmm0, -8(%ebp)
flds -8(%ebp)
...
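The source that produced this listing is not included in the message. As
a rough sketch only, a function of the following shape would be expected
to go through the SI->SF float pattern; the function name and its
two-argument signature are assumptions, chosen so that the converted
integer would sit at 12(%ebp) under the usual 32-bit stack frame layout.

```c
/* Hypothetical reproducer, not taken from the original message.
   The first parameter only pads the argument list so that the value
   being converted is the second stack argument. */
float int_to_float(int unused, int value)
{
    (void) unused;
    /* int -> float conversion; this is the operation that either an
       i387 fild sequence or an SSE cvtsi2ss can implement. */
    return (float) value;
}
```

Compiling something like this for an SSE-capable -march with
-mfpmath=387 and inspecting the asm output is one way to check which
conversion instruction was chosen.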
To avoid these problems, I would like to propose unified handling of the
TARGET_SSE_MATH and TARGET_MIX_SSE_I387 macros in the following way:
An SF pattern that can be implemented with either sse or i387 code would
have the constraint:
"TARGET_80387 || TARGET_SSE_MATH"
When an SF i387 insn should be generated, its constraint would be:
"TARGET_80387 && !TARGET_SSE_MATH"
SF SSE pattern:
"TARGET_SSE_MATH && !TARGET_MIX_SSE_I387"
SF mixed pattern:
"TARGET_80387 && TARGET_MIX_SSE_I387"
DF pattern that can be implemented with either sse or i387 code:
"TARGET_80387 || (TARGET_SSE2 && TARGET_SSE_MATH)"
DF i387 pattern:
"TARGET_80387 && !(TARGET_SSE2 && TARGET_SSE_MATH)"
DF SSE pattern:
"TARGET_SSE2 && TARGET_SSE_MATH && !TARGET_MIX_SSE_I387"
DF mixed pattern:
"TARGET_80387 && TARGET_SSE2 && TARGET_MIX_SSE_I387"
In this way, a TARGET_64BIT "float" pattern could easily be added as:
SF enabled:
"TARGET_80387 || (TARGET_64BIT && TARGET_SSE_MATH)"
SF i387:
"TARGET_80387 && !(TARGET_64BIT && TARGET_SSE_MATH)"
SF sse:
"TARGET_64BIT && TARGET_SSE_MATH && !TARGET_MIX_SSE_I387"
SF mixed:
"TARGET_80387 && TARGET_64BIT && TARGET_MIX_SSE_I387"
DF 64bit:
DF enabled:
"TARGET_80387 || (TARGET_64BIT && TARGET_SSE2 && TARGET_SSE_MATH)"
DF i387:
"TARGET_80387 && !(TARGET_64BIT && TARGET_SSE2 && TARGET_SSE_MATH)"
DF sse:
"TARGET_64BIT && TARGET_SSE2 && TARGET_SSE_MATH && !TARGET_MIX_SSE_I387"
DF mixed:
"TARGET_80387 && TARGET_64BIT && TARGET_SSE2 && TARGET_MIX_SSE_I387"
I would like to point out that TARGET_SSE_MATH enables TARGET_SSE, and
TARGET_MIX_SSE_I387 enables TARGET_SSE_MATH _and_ TARGET_SSE. By
replacing TARGET_80387 with TARGET_USE_FANCY_MATH_387, fsqrt patterns
can be handled.
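All four groups of conditions listed above have the same shape and
differ only in the extra requirement placed on the SSE form. The sketch
below models that shape in plain C and checks by brute force that
"enabled" is true exactly when one of the i387, sse or mixed variants
matches. It is not GCC code: the struct, the function names and the
sse_ok parameter are illustrative. The check relies on
TARGET_MIX_SSE_I387 implying TARGET_SSE_MATH, as stated above, and
additionally assumes that TARGET_MIX_SSE_I387 implies TARGET_80387.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the target macros; not GCC code. */
struct fp_flags {
    bool x87;       /* TARGET_80387 */
    bool sse_math;  /* TARGET_SSE_MATH */
    bool mix;       /* TARGET_MIX_SSE_I387 */
};

enum variant { VARIANT_NONE, VARIANT_I387, VARIANT_SSE, VARIANT_MIXED };

/* sse_ok is the extra requirement on the SSE form: true for SF,
   TARGET_SSE2 for DF, TARGET_64BIT for the 64-bit SF "float" pattern,
   TARGET_64BIT && TARGET_SSE2 for its DF counterpart. */
static bool pat_enabled(struct fp_flags f, bool sse_ok)
{ return f.x87 || (sse_ok && f.sse_math); }

static bool pat_i387(struct fp_flags f, bool sse_ok)
{ return f.x87 && !(sse_ok && f.sse_math); }

static bool pat_sse(struct fp_flags f, bool sse_ok)
{ return sse_ok && f.sse_math && !f.mix; }

static bool pat_mixed(struct fp_flags f, bool sse_ok)
{ return f.x87 && sse_ok && f.mix; }

/* Assumed invariant: mixed math implies SSE math (from the message)
   and implies the 387 is available (assumption of this sketch). */
static bool flags_consistent(struct fp_flags f)
{ return !f.mix || (f.sse_math && f.x87); }

/* Returns the variant whose condition matches, or VARIANT_NONE. */
enum variant pick_variant(struct fp_flags f, bool sse_ok)
{
    assert(flags_consistent(f));
    if (!pat_enabled(f, sse_ok)) return VARIANT_NONE;
    if (pat_i387(f, sse_ok))     return VARIANT_I387;
    if (pat_sse(f, sse_ok))      return VARIANT_SSE;
    if (pat_mixed(f, sse_ok))    return VARIANT_MIXED;
    return VARIANT_NONE;
}

/* Counts consistent flag combinations where "enabled" disagrees with
   "exactly one variant matches". */
int count_violations(void)
{
    int bad = 0;
    for (int bits = 0; bits < 16; bits++) {
        struct fp_flags f = {
            (bits & 1) != 0, (bits & 2) != 0, (bits & 4) != 0
        };
        bool sse_ok = (bits & 8) != 0;
        if (!flags_consistent(f))
            continue;
        int matches = pat_i387(f, sse_ok) + pat_sse(f, sse_ok)
                      + pat_mixed(f, sse_ok);
        if (matches != (pat_enabled(f, sse_ok) ? 1 : 0))
            bad++;
    }
    return bad;
}
```

For example, with SSE math selected but sse_ok false, pick_variant
returns VARIANT_I387, which corresponds to DF arithmetic on a target
without SSE2 falling back to the i387 form.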
It would also be very nice to unify pattern names in some logical way. I
would like to propose names like: "whatever_i387", "whatever_sse" and
"whatever_mixed".
The attached patch implements the proposed solution for all FP operators
that can be implemented in both sse and i387 form (add, sub, mul, div,
sqrt). The solution is also implemented for the float* patterns. The
patch was bootstrapped on i686-pc-linux-gnu; regression testing is in
progress. As the patch is already big, I will prepare a follow-up patch
that changes the fix* patterns in the same way, since cvttsd2si is faster
than fistp everywhere, perhaps together with the proposed implementation
of the new fisttp insn.
In addition, povray-3.50c was built using combinations of
-march={pentium,pentium3,pentium4} with -mfpmath=387, -mfpmath=sse
and -mfpmath=sse,387. The asm code was checked for correct instructions,
and the standard povray benchmark was run, with the resulting picture
visually checked for correctness.
The patch may look big, but it is only a mechanical change/rename
following the proposed solution.
2004-12-08 Uros Bizjak <uros@kss-loka.si>
* config/i386/i386.md (floathisf2, *floathisf2_1, floatsisf2,
*floatsisf2_i387, *floatsisf2_sse, floatdisf2,
*floatdisf2_i387_only, *floatdisf2_i387, *floatdisf2_sse,
floathidf2, *floathidf2_1, *floatsidf2_i387, *floatsidf2_sse,
floatdidf2, *floatdidf2_i387_only, *floatdidf2_i387,
*floatdidf2_sse, floatunssisf2, floatunsdisf2, floatunsdidf2,
*fop_sf_comm_nosse, *fop_sf_comm, *fop_sf_comm_sse,
*fop_df_comm_nosse, *fop_df_comm, *fop_df_comm_sse,
*fop_sf_1_nosse, *fop_sf_1, *fop_sf_1_sse, *fop_sf_2,
*fop_sf_3, *fop_df_1_nosse, *fop_df_1, *fop_df_1_sse,
*fop_df_2, *fop_df_3, *fop_df_4, *fop_df_5, *fop_df_6,
*fop_xf_1, *fop_xf_2, *fop_xf_3, *fop_xf_4, *fop_xf_5,
*fop_xf_6, sqrtsf2_1, sqrtsf2_1_sse_only, sqrtsf2_i387,
sqrtdf2, sqrtdf2_1, sqrtdf2_1_sse_only, sqrtdf2_i387,
*sqrtextendsfdf2, *mindf): Unify enable constraint with
respect to TARGET_80387, TARGET_64BIT, TARGET_SSE,
TARGET_SSE2, TARGET_SSE_MATH and TARGET_MIX_SSE_I387.
(*float?i?f2_1): Rename to *float?i?f2_i387.
(*float?i?f2_i387): Rename to *float?i?f2_mixed.
(*float?i?f2_i387_only): Rename to *float?i?f2_i387.
(float?ixf2): Penalize integer register constraint.
(*fop_?f_comm_nosse, *fop_?f_1_nosse): Rename to
*fop_?f_comm_i387, *fop_?f_1_i387.
(*fop_?f_comm, *fop_?f_1): Rename to *fop_?f_comm_mixed,
*fop_?f_1_mixed.
(*fop_df_{2,3,4,5,6}): Rename to *fop_df_{2,3,4,5,6}_i387.
(*fop_xf_{2,3,4,5,6}): Rename to *fop_xf_{2,3,4,5,6}_i387.
(sqrtsf2_1, sqrtdf2_1): Rename to *sqrtsf2_mixed and
*sqrtdf2_mixed.
(sqrtsf2_i387, sqrtdf2_i387): Rename to *sqrtsf2_i387 and
*sqrtdf2_i387.
(sqrtsf2_1_sse_only, sqrtdf2_1_sse_only): Rename to
*sqrtsf2_sse and *sqrtdf2_sse.
Uros.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fpmath.diff
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20041208/0d386d59/attachment.ksh>