[PATCH] Disable loop2_invariant for -Os
Zhenqiang Chen
zhenqiang.chen@arm.com
Mon Jul 9 08:40:00 GMT 2012
>>
>> 1) If -fira-loop-pressure is enabled, it reduces invariant motions by ~24% in my
>> tests, but it does not help total code size. There seems to be an issue with
>> updating "regs_needed" after moving an invariant out of the loop (my benchmark
>> logs show that in ~73% of cases more than one invariant is moved).
>>
>> During tracing, I found that moving an integer constant out of the loop does not
>> increase regs_needed. Function "get_pressure_class_and_nregs (rtx insn, int
>> *nregs)" computes "regs_needed".
>>
>> *nregs
>> = ira_reg_class_max_nregs[pressure_class][GET_MODE (SET_SRC
>> (set))];
>>
>> In ARM, the insn to set an integer is like
>> (set (reg:SI 183)
>> (const_int 32 [0x20])) inv1.c:64 182 {*thumb1_movsi_insn}
>> (nil))
>> GET_MODE (SET_SRC (set)) is VOIDmode and
>> ira_reg_class_max_nregs[pressure_class][VOIDmode] is 0. In one of my test
>> cases, it moves 4 integer constants out of the loop, which leads to spilling.
>>
>> According to the algorithm in "calculate_loop_reg_pressure", moving an
>> invariant out of the loop should affect the register pressure. So I
>> tried adding the following code:
>>
>> if (! (*nregs))
>> *nregs = ira_reg_class_max_nregs[pressure_class][GET_MODE (reg)];
>>
>> Logs show it reduces invariant motions by another 32%. But the code size is still
>> far from what disabling the pass gives. Logs show -fira-loop-pressure impacts other
>> passes in addition to loop2_invariant (the result of "-fira-loop-pressure
>> -fno-move-loop-invariants" is different from the result of
>> "-fno-move-loop-invariants").
>>
>> 2) By default -fira-loop-pressure is not enabled for -Os, and the logic to compute
>> "regs_used" does not seem sound. The following code is from function
>> "find_invariants_to_move":
>> {
>>   unsigned int n_regs = DF_REG_SIZE (df);
>>
>>   regs_used = 2;
>>
>>   for (i = 0; i < n_regs; i++)
>>     {
>>       if (!DF_REGNO_FIRST_DEF (i) && DF_REGNO_LAST_USE (i))
>>         {
>>           /* This is a value that is used but not changed inside loop.  */
>>           regs_used++;
>>         }
>>     }
>> }
>> * There is no loop-related information in the code.
>> * Benchmark logs show the condition (!DF_REGNO_FIRST_DEF (i) &&
>> DF_REGNO_LAST_USE (i)) is never true.
>
>Still there is code that tries to deal with -Os. Simply disabling the pass makes
>that logic pointless.
If -fira-loop-pressure is not enabled, the function estimate_reg_pressure_cost (cfgloopanal.c) is used to estimate the cost. At the beginning of the function, it checks:

  /* If we have enough registers, we should use them and not restrict
     the transformations unnecessarily.  */
  if (regs_needed + target_res_regs <= available_regs)
    return 0;

Here are the CSiBE benchmark logs before "if (...)" for ARM/MIPS/PPC/X86.

        available_regs  target_res_regs  regs_needed
  ARM :  9               3                2
  MIPS:  10/26           3                2
  PPC :  18/29           3                2
  X86 :  6/15            3                2

regs_needed is incremented after each invariant motion, so the size_cost of the first (available_regs - target_res_regs(3) - regs_needed(2)) invariant motions is always 0. That is why I prefer to disable the pass if -fira-loop-pressure is not enabled.
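
To make the arithmetic concrete, here is a minimal sketch (not the GCC source) that models only the early-exit test quoted above. It assumes every motion adds exactly one register to regs_needed; the helper names size_cost_is_zero and count_free_motions are made up for illustration.

```c
#include <stdbool.h>

/* Models only the early exit of estimate_reg_pressure_cost: the
   pressure cost is 0 while the registers fit.  */
static bool
size_cost_is_zero (unsigned regs_needed, unsigned target_res_regs,
                   unsigned available_regs)
{
  return regs_needed + target_res_regs <= available_regs;
}

/* Counts how many successive motions pass the test, assuming
   (hypothetically) that each motion needs one more register.  */
static unsigned
count_free_motions (unsigned regs_used, unsigned target_res_regs,
                    unsigned available_regs)
{
  unsigned n_new = 0;

  while (size_cost_is_zero (regs_used + n_new + 1, target_res_regs,
                            available_regs))
    n_new++;
  return n_new;
}
```

With the ARM numbers from the log, count_free_motions (2, 3, 9) returns 4, i.e. 9 - 3 - 2, so under this model the first four motions are never charged a pressure cost.
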
>Thus, please try to fix the code that is there to deal with -Os (a target may opt to
>enable -fira-loop-pressure by default for -Os).
Yes. Targets need tuning before enabling -fira-loop-pressure.
With -fira-loop-pressure, CSiBE logs show that MIPS and PPC improve a little and X86 regresses a little compared with -fira-loop-pressure disabled.
If -fira-loop-pressure is enabled, the cost check is based on

  if ((int) new_regs[pressure_class]
      + (int) regs_needed[pressure_class]
      + LOOP_DATA (curr_loop)->max_reg_pressure[pressure_class]
      + IRA_LOOP_RESERVED_REGS
      > ira_available_class_regs[pressure_class])

But the fact that a register is available does not mean it can be used in any instruction. E.g. on ARM Cortex-M0, only a few instructions can use r8-r15 (r8-r11 and r13-r15 are already excluded from available_regs). Logs show the result is much better if r12 is also excluded.
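
A minimal sketch of the shape of that check, with made-up numbers, to show why the available-register count matters. The function name motion_rejected and all values in the note below are hypothetical; reserved_regs stands in for IRA_LOOP_RESERVED_REGS and available for ira_available_class_regs[pressure_class].

```c
#include <stdbool.h>

/* Mirrors the shape of the -fira-loop-pressure check for a single
   pressure class.  Returns true when the motion would exceed the
   registers counted as available, i.e. when it is rejected.  */
static bool
motion_rejected (int new_regs, int regs_needed, int max_reg_pressure,
                 int reserved_regs, int available)
{
  return new_regs + regs_needed + max_reg_pressure + reserved_regs
         > available;
}
```

For example, with new_regs = 1, regs_needed = 1, max_reg_pressure = 5 and reserved_regs = 2 the sum is 9, so motion_rejected (1, 1, 5, 2, 9) is false while motion_rejected (1, 1, 5, 2, 8) is true. Dropping a single register such as r12 from the available count is enough to flip a borderline decision.
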
Thanks!
-Zhenqiang