This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
[Committed] Don't hoist FP constants on x87
- From: Roger Sayle <roger at eyesopen dot com>
- To: gcc-patches at gcc dot gnu dot org
- Cc: Steven Bosscher <stevenb dot gcc at gmail dot com>, Uros Bizjak <ubizjak at gmail dot com>, Grigory Zagorodnev <grigory_zagorodnev at linux dot intel dot com>
- Date: Sun, 19 Feb 2006 15:03:48 -0700 (MST)
- Subject: [Committed] Don't hoist FP constants on x87

Many thanks to the various folks who did performance benchmarking
of my RFC/RFT to disable GCSEing and loop iv optimizations of FP
constants on x87.  Thanks also to Steven B for his help updating
last year's patch to also handle the new RTL loop optimizer.

I retested the following patch on i686-pc-linux-gnu with a full
bootstrap, all default languages, and regression tested with a
top-level "make -k check" with no new failures.

Committed to mainline as revision 111283.  I'll be keeping an
eye on the on-line SPEC testers.

2006-02-19  Roger Sayle  <roger@eyesopen.com>
	    Steven Bosscher  <stevenb.gcc@gmail.com>

	* gcse.c (want_to_gcse_p): On STACK_REGS targets, look through
	constant pool references to identify stack mode constants.
	* rtlanal.c (constant_pool_constant_p): New predicate to check
	whether operand is a floating point constant in the pool.
	* rtl.h (constant_pool_constant_p): Prototype here.
	* loop.c (scan_loop): Avoid hoisting constants from the constant
	pool on STACK_REGS targets.
	(load_mems): Likewise.
	* loop-invariant.c (get_inv_cost): Make hoisting constant pool
	loads into x87 registers expensive in terms of register pressure.

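As an illustration of the kind of code these entries target, here is a
minimal sketch of a loop whose body uses floating point constants.  The
function name scale_array and the constants are hypothetical; they do not
come from the patch or from any benchmark.  On x87, values such as 1.5 and
0.25 have no dedicated load instruction, so they are normally read from the
constant pool, and hoisting them would keep two of the eight stack
registers occupied for the whole loop.

```c
#include <stddef.h>

/* Hypothetical example for illustration only.
   Both 1.5 and 0.25 are loop-invariant floating point constants that
   an x87 target would typically load from the constant pool.  */
void
scale_array (double *a, size_t n)
{
  for (size_t i = 0; i < n; i++)
    a[i] = a[i] * 1.5 + 0.25;
}
```

On a STACK_REGS target, the intent of the change is that such loads stay
inside the loop, where they can often be used directly as memory operands
rather than occupying stack registers across iterations.
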
Index: gcse.c
===================================================================
*** gcse.c (revision 111245)
--- gcse.c (working copy)
*************** static basic_block current_bb;
*** 1170,1175 ****
--- 1170,1183 ----
static int
want_to_gcse_p (rtx x)
{
+ #ifdef STACK_REGS
+ /* On register stack architectures, don't GCSE constants from the
+ constant pool, as the benefits are often swamped by the overhead
+ of shuffling the register stack between basic blocks. */
+ if (IS_STACK_MODE (GET_MODE (x)))
+ x = avoid_constant_pool_reference (x);
+ #endif
+
switch (GET_CODE (x))
{
case REG:
Index: rtlanal.c
===================================================================
*** rtlanal.c (revision 111245)
--- rtlanal.c (working copy)
*************** init_rtlanal (void)
*** 4800,4802 ****
--- 4800,4811 ----
non_rtx_starting_operands[i] = first ? first - format : -1;
}
}
+
+ /* Check whether this is a constant pool constant. */
+ bool
+ constant_pool_constant_p (rtx x)
+ {
+ x = avoid_constant_pool_reference (x);
+ return GET_CODE (x) == CONST_DOUBLE;
+ }
+
Index: rtl.h
===================================================================
*** rtl.h (revision 111245)
--- rtl.h (working copy)
*************** extern bool subreg_offset_representable_
*** 980,985 ****
--- 980,986 ----
extern unsigned int subreg_regno (rtx);
extern unsigned HOST_WIDE_INT nonzero_bits (rtx, enum machine_mode);
extern unsigned int num_sign_bit_copies (rtx, enum machine_mode);
+ extern bool constant_pool_constant_p (rtx);
/* 1 if RTX is a subreg containing a reg that is already known to be
Index: loop.c
===================================================================
*** loop.c (revision 111245)
--- loop.c (working copy)
*************** scan_loop (struct loop *loop, int flags)
*** 1222,1227 ****
--- 1222,1233 ----
if (GET_MODE_CLASS (GET_MODE (SET_DEST (set))) == MODE_CC
&& CONSTANT_P (src))
;
+ #ifdef STACK_REGS
+ /* Don't hoist constant pool constants into stack regs. */
+ else if (IS_STACK_MODE (GET_MODE (SET_SRC (set)))
+ && constant_pool_constant_p (SET_SRC (set)))
+ ;
+ #endif
/* Don't try to optimize a register that was made
by loop-optimization for an inner loop.
We don't know its life-span, so we can't compute
*************** load_mems (const struct loop *loop)
*** 10823,10828 ****
--- 10829,10841 ----
&& SCALAR_FLOAT_MODE_P (GET_MODE (mem)))
loop_info->mems[i].optimize = 0;
+ #ifdef STACK_REGS
+ /* Don't hoist constant pool constants into stack registers. */
+ if (IS_STACK_MODE (GET_MODE (mem))
+ && constant_pool_constant_p (mem))
+ loop_info->mems[i].optimize = 0;
+ #endif
+
/* If this MEM is written to, we must be sure that there
are no reads from another MEM that aliases this one. */
if (loop_info->mems[i].optimize && written)
Index: loop-invariant.c
===================================================================
*** loop-invariant.c (revision 111245)
--- loop-invariant.c (working copy)
*************** get_inv_cost (struct invariant *inv, int
*** 932,937 ****
--- 932,963 ----
(*regs_needed)++;
(*comp_cost) += inv->cost;
+ #ifdef STACK_REGS
+ {
+ /* Hoisting constant pool constants into stack regs may cost more than
+ just single register. On x87, the balance is affected both by the
+ small number of FP registers, and by its register stack organisation,
+ that forces us to add compensation code in and around the loop to
+ shuffle the operands to the top of stack before use, and pop them
+ from the stack after the loop finishes.
+
+ To model this effect, we increase the number of registers needed for
+ stack registers by two: one register push, and one register pop.
+ This usually has the effect that FP constant loads from the constant
+ pool are not moved out of the loop.
+
+ Note that this also means that dependent invariants can not be moved.
+ However, the primary purpose of this pass is to move loop invariant
+ address arithmetic out of loops, and address arithmetic that depends
+ on floating point constants is unlikely to ever occur. */
+ rtx set = single_set (inv->insn);
+ if (set
+ && IS_STACK_MODE (GET_MODE (SET_SRC (set)))
+ && constant_pool_constant_p (SET_SRC (set)))
+ (*regs_needed) += 2;
+ }
+ #endif
+
EXECUTE_IF_SET_IN_BITMAP (inv->depends_on, 0, depno, bi)
{
dep = VEC_index (invariant_p, invariants, depno);
Roger
--
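
The register-pressure heuristic in the get_inv_cost hunk can be sketched in
isolation.  This is a toy model under stated assumptions, not GCC code:
struct toy_invariant, its fields and toy_inv_cost are hypothetical names
invented for illustration, and the accumulation of costs from dependent
invariants that the real function performs is omitted.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a loop invariant.  The fields
   are assumptions for illustration and do not mirror GCC's
   struct invariant.  */
struct toy_invariant
{
  int cost;            /* Cost of computing the invariant once.  */
  bool stack_mode;     /* Value lives in an x87-style stack register.  */
  bool pool_constant;  /* Source is an FP constant from the constant pool.  */
};

/* Count the registers and computation cost for hoisting one invariant.
   Every invariant needs one register; a stack-mode constant pool load
   is charged two more, modelling one push and one pop.  */
void
toy_inv_cost (const struct toy_invariant *inv, int *regs_needed,
              int *comp_cost)
{
  *regs_needed = 1;
  *comp_cost = inv->cost;

  if (inv->stack_mode && inv->pool_constant)
    *regs_needed += 2;
}
```

In this model an FP constant pool load is reported as needing three
registers rather than one, which is what makes a pressure-aware pass
reluctant to move it out of the loop.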