[lra] spilling general class pseudos into SSE regs instead of memory (a target hooks driven implementation)

Vladimir Makarov vmakarov@redhat.com
Tue Mar 27 17:08:00 GMT 2012


   The following patch implements a general mechanism in LRA for
spilling pseudos of one class into hard registers of another class
*instead of memory*.

   Currently, the patch implements spilling of general reg pseudos into
SSE regs for the Intel Core architecture, as recommended by the Intel
optimization guide.  This optimization improves both the performance
and the size of the code generated with LRA.  The size improves because
a movd insn (moving general regs to/from SSE regs) is smaller than an
x86 load/store from the stack with an address offset bigger than 128.
There is also a steady improvement in code performance when the
optimization is used for Intel Core processors.
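
   To make the size argument concrete, here is a small stand-alone
sketch (not part of the patch) that counts encoding bytes.  It assumes
the legacy non-VEX encodings, 32-bit operands, and no REX prefix; the
names mov_stack_len and movd_gpr_xmm_len are made up for illustration.

```c
#include <stddef.h>

/* Base register of the stack slot address.  An %esp/%rsp base always
   needs a SIB byte; an %ebp/%rbp base always needs a displacement.  */
enum base_reg { BASE_BP, BASE_SP };

/* Bytes in "mov r32, [base + disp]" or "mov [base + disp], r32"
   (opcodes 8B /r and 89 /r), without any prefix.  */
static size_t
mov_stack_len (enum base_reg base, long disp)
{
  size_t len = 2;               /* opcode + ModRM */

  if (base == BASE_SP)
    len += 1;                   /* SIB byte */

  if (disp == 0 && base == BASE_SP)
    ;                           /* no displacement needed */
  else if (disp >= -128 && disp <= 127)
    len += 1;                   /* disp8 */
  else
    len += 4;                   /* disp32 */

  return len;
}

/* Bytes in "movd xmm, r32" or "movd r32, xmm" (66 0F 6E /r and
   66 0F 7E /r): prefix + two opcode bytes + ModRM.  */
static size_t
movd_gpr_xmm_len (void)
{
  return 4;
}
```

   Under these assumptions a slot with a one-byte displacement costs 3
or 4 bytes, so movd gains nothing there, while a slot that needs a
four-byte displacement costs 6 or 7 bytes against 4 for movd.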

   The optimization worsens code performance for AMD processors (Phenom
and Bulldozer) because a movd insn is less profitable there than st/ld,
which also makes it clear why X86_TUNE_INTER_UNIT_MOVES is off for such
processors.

   The optimization also worsens code performance for Intel Atom,
although one might expect the opposite because
X86_TUNE_INTER_UNIT_MOVES is on for this processor.  Interestingly,
switching X86_TUNE_INTER_UNIT_MOVES off for Atom makes practically no
difference to code performance without the optimization.

   The optimization might be useful for some other processors which
have direct move insns between the two classes involved, in cases
where IRA for some reason did not use the class union.  At least I see
that we could try this for ARM (spilling general regs into VF regs)
and for the extended powerpc architecture (spilling general regs into
fp regs).  All that is necessary is to define two macros.  I am going
to do it for ARM and see whether this optimization is beneficial for
OMAP4, although I think it is not, as the fp units with VF regs in the
ARM implementations I know are too separate from the integer units.
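
   As a rough illustration of what a target has to provide, below is a
stand-alone mock of a spill class hook.  It is not the code from the
patch: the enum, the tune_flags struct, and mock_spill_class are
simplified stand-ins, and the SSE availability check is my assumption
about what a real implementation would also test.

```c
#include <stdbool.h>

/* Simplified stand-in for the target's register classes.  */
enum reg_class { NO_REGS, GENERAL_REGS, FLOAT_REGS, SSE_REGS };

/* Hypothetical tuning state.  In the patch the tuning switch is
   TARGET_GENERAL_REGS_SSE_SPILL; have_sse is an assumed extra check.  */
struct tune_flags
{
  bool have_sse;
  bool general_regs_sse_spill;
};

/* Return the class whose hard registers may hold spilled pseudos of
   RCLASS, or NO_REGS if such pseudos must be spilled to memory.  */
static enum reg_class
mock_spill_class (const struct tune_flags *tune, enum reg_class rclass)
{
  if (tune->have_sse
      && tune->general_regs_sse_spill
      && rclass == GENERAL_REGS)
    return SSE_REGS;
  return NO_REGS;
}
```

   With the tuning flag on, general reg pseudos get SSE_REGS as their
spill class; with it off (as one would want for the AMD and Atom cases
above), or for any other class, the mock returns NO_REGS and the pseudo
goes to a stack slot as before.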

The patch was successfully bootstrapped on x86/x86-64 with additional
options -mtune=corei7 -march=corei7.

Committed as rev. 185884.

2012-03-27  Vladimir Makarov <vmakarov@redhat.com>

     * common.opt (flra-reg-spill): New option.

     * doc/tm.texi (TARGET_SPILL_CLASS, TARGET_SPILL_CLASS_MODE): New
     hooks.

     * target.def (spill_class, spill_class_mode): New hooks.

     * target.h: Include tm.h.

     * lra-int.h (lra_reg_spill_p): New external.

     * lra.c (lra_reg_spill_p): New global var.
     (setup_reg_spill_flag): New function.
     (lra): Call setup_reg_spill_flag.  Use lra_reg_spill_p as an
     argument for lra_create_live_ranges before spill sub-pass.

     * lra-spills.c: Include ira.h.
     (spill_hard_reg): New array.
     (struct slot): Add new member hard_regno.
     (assign_slot): Rename to assign_mem_slot.
     (assign_spill_hard_regs): New function.
     (add_pseudo_to_slot): Ditto.
     (assign_stack_slot_num_and_sort_pseudos): Rewrite using
     add_pseudo_to_slot.
     (remove_pseudos): Use spill_hard_reg.
     (lra_spill): Allocate, initialize, and free spill_hard_reg.
     Sort pseudo_regnos and call assign_spill_hard_regs.

     * lra-assign.c (assign_hard_regno): Use the biggest mode instead
     of the pseudo mode.

     * Makefile.in (lra-spills.c): Add dependence on ira.h.

     * config/i386/i386.h (enum ix86_tune_indices): Add
     X86_TUNE_GENERAL_REGS_SSE_SPILL.
     (TARGET_GENERAL_REGS_SSE_SPILL): New macro.

     * config/i386/i386.c (initial_ix86_tune_features): Add entry for
     X86_TUNE_GENERAL_REGS_SSE_SPILL.
     (ix86_spill_class): New function.
     (ix86_spill_class_mode): Ditto.
     (TARGET_SPILL_CLASS, TARGET_SPILL_CLASS_MODE): Define macros.


