[PATCH] implement fxtract x87 instruction and logb, ilogb builtins

Thu Apr 15 06:33:00 GMT 2004

Roger Sayle wrote:

>>    * config/i386/i386.md (*fxtractdf3, *fxtractsf3, *fxtractxf3): New
>>    patterns to implement fxtract x87 instruction.
>>    (logbdf2, logbsf2, logbxf2, ilogbsi2): New expanders to implement
>>    logb, logbf, logbl, ilogb, ilogbf and ilogbl built-ins as inline x87
>>    intrinsics.
>>    (UNSPEC_XTRACT_FRACT, UNSPEC_XTRACT_EXP): New unspecs to represent
>>    x87's fxtract insn.
>>    
>>
>The logb and ilogb functions are fairly rare.  It might make sense to
>prioritize the more commonly used math functions.  For example, fmod
>and asin are the two remaining libm functions that aren't implemented
>as inline x87 intrinsics in the "almabench" benchmark program... :>
>
>  
>
  But these functions don't use fancy "new" x87 instructions... ;)

  Regarding asin and acos, I suggest implementing them as inline
functions in mathinline.h, derived from the atan2 and sqrt builtins:

--cut here--
inline double asin(double x)
{
  return atan2(x, sqrt(1.0 - x * x));
}

inline double acos(double x)
{
  return atan2(sqrt(1.0 - x * x), x);
}
--cut here--

Regarding fmod/drem: These functions are implemented with fprem and
fprem1, but a single execution of these instructions can reduce the
exponent by no more than 63. In mathinline.h, fmod is therefore defined
as a loop:
  __asm __volatile__                                                  \
    ("1:    fprem\n\t"                                                \
     "fnstsw    %%ax\n\t"                                             \
     "sahf\n\t"                                                       \
     "jp    1b"                                                       \
     : "=t" (__value) : "0" (__x), "u" (__y) : "ax", "cc");           \

I don't know how to model a loop in an inline intrinsic... FSTCW,
FNSTSW and SAHF intrinsics are already implemented, and FPREM{,1}
should be no problem.

BTW:
I would suggest using builtins as shortcuts to hardware-accelerated
functions (such as fsin, fcos, atan2, fsqrt, ...), where we could gain
something; otherwise we may end up implementing the whole math library
in *.md. For functions that are not directly implemented in hardware,
gcc should provide an optimized system-dependent
__FAST_MATH__mathinline.h header for libc, with derived high-speed
functions. Regarding the speed of inline functions: the resulting code
should be as well optimized (or even better) as it would be if these
functions were implemented as x87 intrinsics. It is true that this way
we can't make optimizations at the builtin level (e.g.
sin(asin(x)) => x), but IMHO these optimizations could be implemented
in other places.

IMHO only a few FP intrinsics remain for x87 hardware:
(1) rint(), lrint() and llrint() with the frndint instruction [the
insn is already implemented, but no builtin is linked to it yet]
(2) fmod(), drem() with the fprem{,1} instructions
(3) ldexp() with the fscale instruction [the insn is already
implemented, but no builtin is linked to it yet]
(4) frexp() with the just-implemented fxtract instruction [there is a
problem with the (int *exp) pointer argument, as with sincos]
(5) expm1(), log1p()

    Uros.