[PATCH] Improve prepare_shrink_wrap to sink more instructions

Jiong Wang jiong.wang@arm.com
Mon Sep 8 13:57:00 GMT 2014


On 05/09/14 20:48, Jeff Law wrote:
> On 09/04/14 08:15, Jiong Wang wrote:
>> this patch relax the restriction on src to accept any one of the following:
>>
>>     + REG
>>     + CONST_OBJ, like SYMBOL_REF
>>     + combination of single REG and any other CONST_OBJs.
>>       (reg def/use calculation will not affected by CONST_OBJs)
>>
>> RISC backend may benefit more from this relax, although there still
>> be minor improvements on x86. for example, there are 17 more functions
>> shrink-wrapped during x86-64 bootstrap, like sort_bucket in ira-color.c.
>>
>> test done
>> =========
>>     no regression on aarch64-none-elf bare-metal.
>>     no regression on x86-64 check-gcc.
>>     both aarch64 and x86-64 bootstrap OK.
>>
>> ok for install?
>>
>> 2014-09-04 Jiong Wang<jiong.wang@arm.com>
>>
>> gcc/
>>     * shrink-wrap.c (rtx_search_arg): New structure type.
>>     (rtx_search_arg_p): New typedef.
>>     (count_reg_const): New callback function.
>>     (move_insn_for_shrink_wrap): Relax the restriction on src operand.
>>
> Conceptually OK.  Some questions/concerns about the implementation.
Hi Jeff,

   thanks very much for your review.
> It seems to me that what you're trying to describe on the RHS is
> REG_P || CONSTANT_P
   yes.

   and actually I am trying to detect all rtx which contains any number
   of RTX_CONST_OBJs and no more than one REG.
>
> Note that CONSTANT_P will catch things like (high (symbol_ref)) and
> (lo_sum (reg) (symbol_ref)) which are often used to build addresses on
> some targets.
>
> With that in mind, rather than using a for_each_rtx callback, test
> if (REG_P (src) || CONSTANT_P (src))
>
> Note that SRC, when it is a CONSTANT_P,  may have a variety of forms
> with embedded registers.  You'll need to verify that all of those
> registers are not assigned after the insn with the CONSTANT_P source
> operand.  Right now you only perform that check when SRC is a REG_P.

   I am using the for_each_rtx because I want to scan "src" so that
any sub-operands are checked,  the number of REG and non-constant
objects are record in "reg_found" and "nonconst_found".  the embedded
register found also record in the "reg" field of the structure 
"rtx_search_arg",
and there is a assignment:

       else if (arg.reg_found == 1)
         src = arg.reg;

   so the logic of my patch is:
      * if REG_P (src), nothing new, just follow the original logic.
      * if !REG_P (src),  scan src, to make sure there is no more than 
one REG,
        and there is no non-constant objects.
        if any embedded REG found, assign it to "src"  to go through all 
code written for REG_P (src).

  And I do found there is one glitch in my patch:

   +  else if (!CONSTANT_P (x))

should be

   +  else if (!CONSTANT_P (x) && OBJECT_P (x)) to skip those operator 
rtx like plus, minus etc.

so, does my explanation make sense?

BTW, there will be 2547 functions shrink-wrapped during x86-64 bootstrap 
compared with 2501
before this improvement,  46 more,  +1.84%.

thanks.

-- Jiong


>
> Jeff
>
>
>




More information about the Gcc-patches mailing list