This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH] middle-end: convert builtin finite -> MINUS/ORD


On Tue, 12 Jun 2007, Richard Guenther wrote:

> The fold-const.c and simplify-rtx.c changes are ok.
>
> Andrew is right about x - x trapping if x is a signalling NaN or +Inf or
> -Inf (invalid operation exception), x is a denormal (denormal operand
> exception).  So the transformation is only valid for non-trapping math.

Okay, perhaps I'm confusing traps (i.e. aborts) with raising floating
point exceptions.  I guess neither is acceptable.


>  Btw. - is it actually faster to use the new sequence than fxam in x87
> math?

Can't say about my sequence vs an x87 insn, I only have sparc-solaris ATM.

I do know that my transformation is definitely much faster than the
function call into libc.  If I run a loop around calls to x += finite(foo)
and raise the loop count until the run takes 7 to 8 seconds with the
function call, then switch to the transformed style with the same loop
count, the timing drops to zero.
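
For anyone following along, here is a rough C sketch of what the
"transformed style" amounts to at the source level: subtract x from
itself and check that the result is ordered with itself.  The names
finite_by_sub and classify are hypothetical and not part of the patch,
and the volatile is only there so the subtraction is not folded away.
As noted above, x - x can raise the invalid exception for Inf and
signalling NaN, so treat this as an illustration for non-trapping math
only, not the actual fold-const.c change.

```c
#include <math.h>

/* Hypothetical helper, not from the patch: finite(x) expressed as
   "x - x is ordered with itself".  For finite x, x - x is 0.0, which is
   ordered.  For +Inf, -Inf, or NaN, x - x is NaN, which is unordered. */
static int finite_by_sub(double x)
{
    volatile double d = x - x;  /* volatile keeps the subtraction at run time */
    double v = d;
    return !isunordered(v, v);  /* 1 if v is not a NaN, 0 otherwise */
}

/* Same shape as the function body used for the code size comparison,
   with the hypothetical helper standing in for __builtin_finite. */
static int classify(double x)
{
    return finite_by_sub(x) ? 11 : 17;
}
```

Calling classify(1.0) should give 11, while classify(INFINITY) and
classify(NAN) should give 17.  Building with -ffast-math or similar
flags may let the compiler assume NaNs away, so this sketch assumes
default floating point options.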


>  How does code size compare to a function call? (Just to see if we should
> add a finite optab as well to allow the target to override the general
> expansion and conditionalize on OPTIMIZE_SIZE).

It's about the same.  On sparc, I tried using this as a function body:

  return __builtin_finite(x) ? 11 : 17;

(I added the ?: to avoid a tail call which isn't a realistic example IMO.
I believe finite is probably most often used in some kind of conditional.)
The difference on sparc is:

-       save    %sp, -112, %sp
-       mov     %i0, %o0
-       mov     %i1, %o1
-       call    finite, 0
-        mov    17, %i0
-       cmp     %o0, 0
-       return  %i7+8
-        movne  %icc, 11, %o0
+       add     %sp, -120, %sp
+       std     %o0, [%sp+96]
+       mov     17, %o0
+       ldd     [%sp+96], %f10
+       sub     %sp, -120, %sp
+       fsubd   %f10, %f10, %f8
+       fcmpd   %fcc0, %f8, %f8
+       jmp     %o7+8
+        movo   %fcc0, 11, %o0

The first is 8 insns, the second is 9.  The exact sequence will depend on
your context and target, but I don't think it warrants an optimize_size
check.

We probably do want to allow optabs to override it, especially if they're
shorter and/or don't raise an exception.  So I'll rework the patch to allow
optabs and condition this one on no-traps.

		--Kaveh
--
Kaveh R. Ghazi			ghazi@caip.rutgers.edu

