This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: x87 float truncation/accuracy (gcc vs. icc/msvc)


> Do any of the x86 backend gurus have any suggestions as to how best
> to implement "truncdfsf2" as a move between x87 registers, but as a
> regular "fst*s" instruction for memory targets?  My initial attempt
> was to simply guard the following splitter with !flag_unsafe_math_...
> 
> (define_split
>   [(set (match_operand:SF 0 "register_operand" "")
>         (float_truncate:SF
>          (match_operand:DF 1 "fp_register_operand" "")))
>    (clobber (match_operand:SF 2 "memory_operand" ""))]
>   "TARGET_80387 && reload_completed"
>   [(set (match_dup 2) (float_truncate:SF (match_dup 1)))
>    (set (match_dup 0) (match_dup 2))]
>   "")
> 
> Alas this failed miserably.

You can just cut and paste the extendsfdf implementation.  Basically it
imitates a move pattern for x87 but does proper conversions for SSE.
I was very tempted to do this for a while (and sent a patch back in '98
or so), but there appeared to be a consensus that the truncations are
very important, although that does not seem to hold in practice.
If you want to get really good at eliminating the truncations, you
will need to play games with combiner patterns containing truncates,
similarly to what we do for extensions, but this is tricky (you will
face pattern explosion).
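
To make concrete what is lost when a DF->SF truncation becomes a plain
x87 register move, here is a minimal C sketch.  The helper names
truncate_to_float and truncation_is_visible are hypothetical and are not
part of GCC; the volatile stores are only there to force the rounding to
float to actually happen, whatever the backend would otherwise do.

```c
#include <stdio.h>

/* Hypothetical helper: round a double to float precision.  The volatile
   store forces the value through a 32-bit memory slot, which is what a
   real "fst*s" store does on x87.  */
static float truncate_to_float(double d)
{
    volatile float f = (float)d;
    return f;
}

/* Hypothetical helper: returns 1 when rounding d to float changes its
   value, i.e. when dropping the truncation would be observable.  */
static int truncation_is_visible(double d)
{
    volatile float f = truncate_to_float(d);
    return (double)f != d;
}
```

For d = 1.0 + 2^-30 the truncation is visible, since the value rounds to
1.0f; for d = 1.5 it is not, since 1.5 is exactly representable as a
float.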

In the past we tried to use a match_operand predicate that accepts the
extensions, but that approach failed since reload handles unary
expressions in operands by passing them to move patterns, and this
behaviour is needed by some other targets.

Honza
> 
> Any advice would be much appreciated.  I've confirmed that GCC performs
> the related "safe" constant folding optimizations, such as converting
> "(float)((double)f1 op (double)f2)" into "f1 op f2" for floating point
> values f1 and f2, and operation op one of add, sub or mul.  For "mul",
> for example, the two 24 bit mantissas of an IEEE "float" can't overflow
> the 53 bit mantissa of an IEEE double, so there's no double rounding and
> so a floating point multiplication returns the same (perfectly rounded)
> result.  These don't help the code above however, which is fundamentally
> unsafe and not normally a win except on Intel.
> 
> Roger
> --
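
The "safe" folding described in the quoted text, for the "mul" case, can
be checked empirically with a small C sketch.  All names below
(mul_via_double, mul_direct, next_float, count_mismatches) are
hypothetical, and the simple LCG is just an assumed source of finite
test values; this is a spot check, not a proof.

```c
#include <stdint.h>

/* Multiply in double, then truncate to float: (float)((double)a * (double)b).
   The volatile stores force each rounding step to really happen.  */
static float mul_via_double(float a, float b)
{
    volatile double wide = (double)a * (double)b;
    volatile float narrow = (float)wide;
    return narrow;
}

/* Multiply directly in float: a * b, rounded once to float.  */
static float mul_direct(float a, float b)
{
    volatile float r = a * b;
    return r;
}

/* Hypothetical test-value generator: a 32-bit LCG mapped to finite
   floats in roughly [-1000, 1000].  */
static uint32_t lcg_state = 12345u;

static float next_float(void)
{
    volatile float v;
    lcg_state = lcg_state * 1664525u + 1013904223u;
    v = ((float)(lcg_state >> 8) / 16777216.0f) * 2000.0f - 1000.0f;
    return v;
}

/* Count how often the two ways of multiplying disagree.  */
static int count_mismatches(int trials)
{
    int mismatches = 0;
    for (int i = 0; i < trials; i++) {
        float a = next_float();
        float b = next_float();
        if (mul_via_double(a, b) != mul_direct(a, b))
            mismatches++;
    }
    return mismatches;
}
```

Since the product of two 24-bit mantissas fits in 48 bits, the double
product is exact and only one rounding to float takes place, so
count_mismatches should return 0 for any number of trials.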

