[PATCH, i386] Unify TARGET_SSE_MATH and TARGET_MIX_SSE_I387 in insn constraints

Wed Dec 8 16:20:00 GMT 2004

Hello!

The handling of TARGET_SSE_MATH and TARGET_MIX_SSE_I387 in the i386.md
file is a big mess. Practically every pattern handles these macros in
its own way, and it is very difficult to figure out the conditions under
which a given pattern is enabled.

I have encountered a problem with the float* patterns, which have a wrong
enable condition defined, so the SSE cvtsi2s{s,d} instruction was
generated unconditionally for -mfpmath=387,sse and -mfpmath=387. This is
an int->float conversion instruction, and the i387 insn is considerably
faster than the SSE one. An additional problem with this insn is that it
converts from integer into an SSE register, so additional sse->mem->fp
moves were needed to get the value into an FP register. The resulting asm
code was horrible, something like this:
        ...
        cvtsi2ss        12(%ebp), %xmm0
        movss   %xmm0, -8(%ebp)
        flds    -8(%ebp)
        ...
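
For reference, a conversion like this can come from source as simple as
the following C function. This is only an illustrative sketch: the
function name is made up, and the exact asm depends on the compiler
version and options.

```c
/* Hypothetical example; int_to_float is not a name from the patch.
   The int->float cast below is the kind of conversion that the
   float* patterns (floatsisf2 here) expand.  */
float
int_to_float (int value)
{
  return (float) value;
}
```

Compiling it with something like "gcc -O2 -S -march=pentium3
-mfpmath=sse,387" and reading the generated asm should show which
conversion instruction was chosen.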

To avoid these problems, I would like to propose unified handling of the
TARGET_SSE_MATH and TARGET_MIX_SSE_I387 macros in the following way:

An SF pattern that can be implemented with either sse or i387 code would
have the constraint:
  "TARGET_80387 || TARGET_SSE_MATH"
When an SF i387 insn should be generated, its constraint would be:
  "TARGET_80387 && !TARGET_SSE_MATH"
SF SSE pattern:
  "TARGET_SSE_MATH && !TARGET_MIX_SSE_I387"
SF mixed pattern:
  "TARGET_80387 && TARGET_MIX_SSE_I387"

DF pattern that can be implemented with either sse or i387 code:
  "TARGET_80387 || (TARGET_SSE2 && TARGET_SSE_MATH)"
DF i387 pattern:
  "TARGET_80387 && !(TARGET_SSE2 && TARGET_SSE_MATH)"
DF SSE pattern:
  "TARGET_SSE2 && TARGET_SSE_MATH && !TARGET_MIX_SSE_I387"
DF mixed pattern:
  "TARGET_80387 && TARGET_SSE2 && TARGET_MIX_SSE_I387"

In this way, TARGET_64BIT "float" patterns could easily be added as:
SF enabled:
  "TARGET_80387 || (TARGET_64BIT && TARGET_SSE_MATH)"
SF i387:
  "TARGET_80387 && !(TARGET_64BIT && TARGET_SSE_MATH)"
SF sse:
  "TARGET_64BIT && TARGET_SSE_MATH && !TARGET_MIX_SSE_I387"
SF mixed:
  "TARGET_80387 && TARGET_64BIT && TARGET_MIX_SSE_I387"

DF 64bit:
DF enabled:
  "TARGET_80387 || (TARGET_64BIT && TARGET_SSE2 && TARGET_SSE_MATH)"
DF i387:
  "TARGET_80387 && !(TARGET_64BIT && TARGET_SSE2 && TARGET_SSE_MATH)"
DF sse:
  "TARGET_64BIT && TARGET_SSE2 && TARGET_SSE_MATH && !TARGET_MIX_SSE_I387"
DF mixed:
  "TARGET_80387 && TARGET_64BIT && TARGET_SSE2 && TARGET_MIX_SSE_I387"

I would like to point out that TARGET_SSE_MATH enables TARGET_SSE, and
TARGET_MIX_SSE_I387 enables TARGET_SSE_MATH _and_ TARGET_SSE. By
replacing TARGET_80387 with TARGET_USE_FANCY_MATH_387, the fsqrt patterns
can be handled in the same way.
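
The condition tables above can be sanity-checked mechanically. The
following C sketch enumerates all flag combinations and verifies that
whenever a pattern family is enabled, exactly one of its i387, sse and
mixed variants is enabled as well. The struct and function names are
hypothetical, not GCC internals. The sketch assumes that
TARGET_MIX_SSE_I387 implies TARGET_SSE_MATH (as noted above) and also
TARGET_80387; the latter is an assumption of this sketch, not something
stated in the tables.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the target macros; not GCC's data structure.  */
struct fp_flags
{
  bool x87;       /* TARGET_80387 */
  bool sse2;      /* TARGET_SSE2 */
  bool is64;      /* TARGET_64BIT */
  bool sse_math;  /* TARGET_SSE_MATH */
  bool mix;       /* TARGET_MIX_SSE_I387 */
};

/* "gate" is the extra requirement for the SSE form of a family:
   true for SF, sse2 for DF, is64 for the 64-bit SF float pattern,
   is64 && sse2 for the 64-bit DF float pattern.  */
static bool
family_enabled (struct fp_flags f, bool gate)
{
  return f.x87 || (gate && f.sse_math);
}

/* Number of variants (i387, sse, mixed) enabled at the same time.  */
static int
variants_enabled (struct fp_flags f, bool gate)
{
  bool i387 = f.x87 && !(gate && f.sse_math);
  bool sse = gate && f.sse_math && !f.mix;
  bool mixed = f.x87 && gate && f.mix;
  return i387 + sse + mixed;
}

/* Return the number of (flags, family) pairs where the variant count
   disagrees with the enable condition.  */
int
count_mismatches (void)
{
  int bad = 0;
  for (int bits = 0; bits < 32; bits++)
    {
      struct fp_flags f = {
        (bits & 1) != 0, (bits & 2) != 0, (bits & 4) != 0,
        (bits & 8) != 0, (bits & 16) != 0
      };
      /* Skip combinations excluded by the assumed implications.  */
      if (f.mix && !(f.sse_math && f.x87))
        continue;

      bool gates[4] = { true, f.sse2, f.is64, f.is64 && f.sse2 };
      for (int i = 0; i < 4; i++)
        {
          int expected = family_enabled (f, gates[i]) ? 1 : 0;
          if (variants_enabled (f, gates[i]) != expected)
            bad++;
        }
    }
  return bad;
}
```

Under these assumptions count_mismatches () should return 0. Without the
assumed TARGET_80387 implication, the combination of TARGET_SSE_MATH and
TARGET_MIX_SSE_I387 with no i387 would be enabled but matched by none of
the three variants.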

It would also be very nice to unify pattern names in some logical way. I 
would like to propose names like: "whatever_i387", "whatever_sse" and 
"whatever_mixed".

The attached patch implements the proposed solution for all FP operators
that can be implemented in both sse and i387 form (add, sub, mul, div,
sqrt). The solution is also implemented for the float* patterns. The patch
was bootstrapped on i686-pc-linux-gnu; regression testing is in
progress. As the patch is already big, I will prepare a follow-up patch
to change the fix* patterns in the same way, where cvttsd2si is faster
than fistp everywhere, perhaps together with the proposed implementation
of the new fisttp insn.

In addition, povray-3.50c was built using combinations of
-march={pentium,pentium3,pentium4} with -mfpmath=387, -mfpmath=sse
and -mfpmath=sse,387. The asm code was checked for correct instructions,
and the standard povray benchmark was run; the resulting picture was
visually checked for correctness.

The patch may look big, but it is only a mechanical change/rename
following the proposed solution.

2004-12-08  Uros Bizjak  <uros@kss-loka.si>

    * config/i386/i386.md (floathisf2, *floathisf2_1, floatsisf2,
    *floatsisf2_i387, *floatsisf2_sse, floatdisf2,
    *floatdisf2_i387_only, *floatdisf2_i387, *floatdisf2_sse,
    floathidf2, *floathidf2_1, *floatsidf2_i387, *floatsidf2_sse,
    floatdidf2, *floatdidf2_i387_only, *floatdidf2_i387,
    *floatdidf2_sse, floatunssisf2, floatunsdisf2, floatunsdidf2,
    *fop_sf_comm_nosse, *fop_sf_comm, *fop_sf_comm_sse,
    *fop_df_comm_nosse, *fop_df_comm, *fop_df_comm_sse,
    *fop_sf_1_nosse, *fop_sf_1, *fop_sf_1_sse, *fop_sf_2,
    *fop_sf_3, *fop_df_1_nosse, *fop_df_1, *fop_df_1_sse,
    *fop_df_2, *fop_df_3, *fop_df_4, *fop_df_5, *fop_df_6,
    *fop_xf_1, *fop_xf_2, *fop_xf_3, *fop_xf_4, *fop_xf_5,
    *fop_xf_6, sqrtsf2_1, sqrtsf2_1_sse_only, sqrtsf2_i387,
    sqrtdf2, sqrtdf2_1, sqrtdf2_1_sse_only, sqrtdf2_i387,
    *sqrtextendsfdf2, *mindf): Unify enable constraint with
    respect to TARGET_80387, TARGET_64BIT, TARGET_SSE,
    TARGET_SSE2, TARGET_SSE_MATH and TARGET_MIX_SSE_I387.
    (*float?i?f2_1): Rename to *float?i?f2_i387.
    (*float?i?f2_i387): Rename to *float?i?f2_mixed.
    (*float?i?f2_i387_only): Rename to *float?i?f2_i387.
    (float?ixf2): Penalize integer register constraint.
    (*fop_?f_comm_nosse, *fop_?f_1_nosse): Rename to
    *fop_?f_comm_i387, *fop_?f_1_i387.
    (*fop_?f_comm, *fop_?f_1): Rename to *fop_?f_comm_mixed,
    *fop_?f_1_mixed.
    (*fop_df_{2,3,4,5,6}): Rename to *fop_df_{2,3,4,5,6}_i387.
    (*fop_xf_{2,3,4,5,6}): Rename to *fop_xf_{2,3,4,5,6}_i387.
    (sqrtsf2_1, sqrtdf2_1): Rename to *sqrtsf2_mixed and
    *sqrtdf2_mixed.
    (sqrtsf2_i387, sqrtdf2_i387): Rename to *sqrtsf2_i387 and
    *sqrtdf2_i387.
    (sqrtsf2_1_sse_only, sqrtdf2_1_sse_only): Rename to
    *sqrtsf2_sse and *sqrtdf2_sse.

Uros.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fpmath.diff
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20041208/0d386d59/attachment.ksh>