This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] implement rint{,f,l}, floor{,f,l}. ceil{,f,l}, trunc{,f,l}and nearbyint{,f,l} as x87 built-in functions
- From: Roger Sayle <roger at eyesopen dot com>
- To: Uros Bizjak <uros at kss-loka dot si>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Wed, 25 Aug 2004 07:56:57 -0600 (MDT)
- Subject: Re: [PATCH] implement rint{,f,l}, floor{,f,l}. ceil{,f,l}, trunc{,f,l}and nearbyint{,f,l} as x87 built-in functions
On Wed, 25 Aug 2004, Uros Bizjak wrote:
> 2004-25-08 Uros Bizjak <uros@kss-loka.si>
>
> * builtins.c (expand_builtin_mathfn): Handle BUILT_IN_RINT{,F,L}
> using rint_optab.
> (expand_builtin): Expand BUILT_IN_RINT{,F,L} using
> expand_builtin_mathfn.
> * genopinit.c (optabs): Rename trunc_optab to btrunc_optab. Use
> btrunc?f patterns for btrunc_optab. Implement rint_optab using
> rint?f patterns.
> * optabs.c (init_optabs): Initialize rint_optab.
> * optabs.h (enum optab_index): Rename OTI_trunc to OTI_btrunc.
> Add new OTI_rint.
> (btrunc_optab): Rename macro from trunc_optab.
> (rint_optab): Define corresponding macro.
>
> * reg-stack.c (subst_stack_regs_pat): Handle UNSPEC_FRNDINT_FLOOR,
> UNSPEC_FRNDINT_CEIL, UNSPEC_FRNDINT_TRUNC, UNSPEC_FRNDINT_EXCEPTION.
>
> * testsuite/gcc.dg/builtins-46.c: New.
The middle-end bits and the test case look fine.
> * config/i386/i386-protos.h (emit_i387_cw_initialization):
> Change prototype. Use new enum i387_cw_mode parameter.
> * config/i386/i386.c (emit_i387_cw_initialization):
> Handle new rounding modes.
>
> * config/i386/i386.h (enum fp_cw_mode): Delete.
> (enum i387_cw_mode): New enum.
> (MODE_NEEDED): Handle new rounding modes.
> (EMIT_MODE_SET): Change condition to handle new rounding modes.
>
> * config/i386/i386.md (UNSPEC_FRNDINT_FLOOR, UNSPEC_FRNDINT_CEIL,
> UNSPEC_FRNDINT_TRUNC, UNSPEC_FRNDINT_EXCEPTION): New unspecs to
> represent different rounding modes of frndint insn.
> (i387cw): New attribute definition.
> (*fix_truncdi_1): Add "i387cw" attribute defined to "trunc".
> (fix_truncdi_nomemory): Same.
> (fix_truncdi_memory): Same.
> (*fix_truncsi_1): Same.
> (fix_truncsi_nomemory): Same.
> (fix_truncsi_memory): Same.
> (*fix_trunchi_1): Same.
> (fix_trunchi_nomemory): Same.
> (fix_trunchi_memory): Same.
>
> (x86_fnstcw_1): Remove comment.
>
> (*frndintxf2): Rename insn definition to frndintxf2. Move
> insn definition near rint?f2 expanders.
> (rintdf2, rintsf2, rintxf2): New expanders to implement rint,
> rintf and rintl built-ins as inline x87 intrinsics.
> (frndintxf4_floor): New pattern to implement floor rounding
> mode with frndint x87 instruction.
> (floordf2, floorsf2, floorxf2): New expanders to implement floor,
> floorf and floorl built-ins as inline x87 intrinsics.
> (frndintxf4_ceil): New pattern to implement ceil rounding
> mode with frndint x87 instruction.
> (ceildf2, ceilsf2, ceilxf2): New expanders to implement ceil,
> ceilf and ceill built-ins as inline x87 intrinsics.
> (frndintxf4_trunc): New pattern to implement trunc rounding
> mode with frndint x87 instruction.
> (btruncdf2, btruncsf2, btruncxf2): New expanders to implement trunc,
> truncf and truncl built-ins as inline x87 intrinsics.
> (frndintxf4_ex): New pattern to implement rounding
> mode with exceptions with frndint x87 instruction.
> (nearbyintdf2, nearbyintsf2, nearbyintxf2): New expanders to
> implement nearbyint, nearbyintf and nearbyintl built-ins as
> inline x87 intrinsics.
The i386 backend maintainers may have additional comments.
Ideally the OPTIMIZE_MODE_SWITCHING would be used to control
the placement of the "fldcw" instructions that actually change
the mode. Currently, every single fistp is paired with its
own fldcw save/restore pair. This causes much more work than
necessary for two sequential floors for example.
fld
fldcw // floor
fstip
fldcw // restore
fld
fldcw // floor
fstip
fldcw // restore
which should be the much more efficient as
fldcw // floor
fld
fstip
fld
fstip
fldcw // restore
At present OPTIMIZE_MODE_SWITCHING is used purely to avoid recalculating
the bit pattern that needs to loaded into the x87 control world.
Even in this respect things can be improved. The OPTIMIZE_MODE_SWITCHING
machinery could be improved to pass EMIT_MODE_SET both the new and the
original modes that are being switched. Currently, the i386 backend's
emit_i387_cw_initialization inserts a "fstcw" instruction on every mode
change. This is only required on transitions from FP_CW_UNINITIALIZED
or FP_CW_ANY which may have changed the FP control word. Otherwise the
"fstcw" that's used to determine the FLOOR mask can be reused to calculate
the CEIL mask.
Even preserving the i386's current FP mode switching implementation,
you'll need to add length attributes to your frndintxf4_* define_insns.
These macro patterns are oversized which will confuse GCC's branch
range prediction and other parts of the compiler. The exisiting
fistp patterns don't need this as the default "length" calculation
already includes the fldcw/fistp/fldcw sequence.
;; The (bounding maximum) length of an instruction in bytes.
;; ??? fistp is in fact fldcw/fistp/fldcw sequence. Later we may want
;; to split it and compute proper length as for other insns.
Your new define_insns have no length and only type attribute "fpspc".
Finally, as hinted in the .md file comment above, it would be
preferable if the new frndintxf4_* and the exisiting fix_trunc*
patterns could become expanders themselves that decompose into
their three constituent instructions. This would allow better
scheduling, and perhaps even the use of CSE/GCSE to optimize mode
switching instead of optimize_mode_switching.
The only show stopper, for me, with your patch are the missing
"length" attributes, but clearly we can do much better than the
current scheme, though I appreciate you're mostly following
existing precedent. These are really backend issues/preferences,
so hopefully RTH will comment whether the patch is good as is,
needs the lengths fixed, or needs more significant changes.
I hope this helps.
Roger
--
- Follow-Ups:
- Re: [PATCH] implement rint{,f,l}, floor{,f,l}. ceil{,f,l}, trunc{,f,l} and nearbyint{,f,l} as x87 built-in functions
- Re: [PATCH] implement rint{,f,l}, floor{,f,l}. ceil{,f,l}, trunc{,f,l}and nearbyint{,f,l} as x87 built-in functions
- References:
- [PATCH] implement rint{,f,l}, floor{,f,l}. ceil{,f,l}, trunc{,f,l}and nearbyint{,f,l} as x87 built-in functions