[PATCH][IRA] Analysis of register usage of functions for usage by IRA.

Vladimir Makarov vmakarov@redhat.com
Fri Jan 25 15:46:00 GMT 2013


On 01/25/2013 08:05 AM, Tom de Vries wrote:
> Vladimir,
>
> this patch adds analysis of register usage of functions for usage by IRA.
>
> The patch:
> - adds analysis in pass_final to track which hard registers are set or clobbered
>    by the function body, and stores that information in a struct cgraph_node.
> - adds a target hook fn_other_hard_reg_usage to list hard registers that are
>    set or clobbered by a call to a function, but are not listed as such in the
>    function body, for instance registers clobbered by veneers inserted by the
>    linker.
> - adds a reg-note REG_CALL_DECL, to be able to easily link call_insns to their
>    corresponding declaration, even after the calls may have been split into an
>    insn (set register to function address) and a call_insn (call register), which
>    can happen on, for instance, sh, and on mips with -mabicalls.
> - uses the register analysis in IRA.
> - adds an option -fuse-caller-save to control the optimization, on by default
>    at -Os and -O2 and higher.
>
>
> The patch (original version by Radovan Obradovic) is similar to your patch
> ( http://gcc.gnu.org/ml/gcc-patches/2007-01/msg01625.html ) from 2007.
> But this patch doesn't implement save area stack slot sharing.
> ( Btw, I've borrowed the struct cgraph_node field name and comment from the 2007
> patch ).
>
> [ Steven, you mentioned in this discussion
>    ( http://gcc.gnu.org/ml/gcc/2012-10/msg00213.html ) that you are working on
>    porting the 2007 patch to trunk. What is the status of that effort?
> ]
>
>
> As an example of the functionality, consider foo and bar from test-case aru-1.c:
> ...
> static int __attribute__((noinline))
> bar (int x)
> {
>    return x + 3;
> }
>
> int __attribute__((noinline))
> foo (int y)
> {
>    return y + bar (y);
> }
> ...
>
> Compiled at -O2, bar only sets register $2 (the first return register):
> ...
> bar:
>          .frame  $sp,0,$31               # vars= 0, regs= 0/0, args= 0, gp= 0
>          .mask   0x00000000,0
>          .fmask  0x00000000,0
>          .set    noreorder
>          .set    nomacro
>          j       $31
>          addiu   $2,$4,3
> ...
>
> foo then can use register $3 (the second return register) instead of register
> $16 to save the value in register $4 (the first argument register) over the
> call, as demonstrated here in a -fno-use-caller-save vs. -fuse-caller-save diff:
> ...
> foo:                                    foo:
> # vars= 0, regs= 2/0, args= 16, gp= 8 | # vars= 0, regs= 1/0, args= 16, gp= 8
> .frame  $sp,32,$31                      .frame  $sp,32,$31
> .mask   0x80010000,-4                 | .mask   0x80000000,-4
> .fmask  0x00000000,0                    .fmask  0x00000000,0
> .set    noreorder                       .set    noreorder
> .set    nomacro                         .set    nomacro
> addiu   $sp,$sp,-32                     addiu   $sp,$sp,-32
> sw      $31,28($sp)                     sw      $31,28($sp)
> sw      $16,24($sp)                   <
> .option pic0                            .option pic0
> jal     bar                             jal     bar
> .option pic2                            .option pic2
> move    $16,$4                        | move    $3,$4
>
> lw      $31,28($sp)                     lw      $31,28($sp)
> addu    $2,$2,$16                     | addu    $2,$2,$3
> lw      $16,24($sp)                   <
> j       $31                             j       $31
> addiu   $sp,$sp,32                      addiu   $sp,$sp,32
> ...
> That way we skip the save and restore of the callee-saved register $16; $3 is
> call-clobbered, so foo does not need to preserve it. Btw, a further
> improvement could be to reuse $4 after the call, and eliminate the move.
>
>
> A version of this patch on top of 4.6 ran into trouble with the epilogue on arm,
> where a register was clobbered by a stack pop instruction, while that was not
> visible in the rtl representation. This instruction was introduced in
> arm_output_epilogue by code marked with the comment 'pop call clobbered
> registers if it avoids a separate stack adjustment'.
> I cannot reproduce that issue on trunk. Looking at the generated rtl, it seems
> that the epilogue instructions now list all registers set by them, so
> collect_fn_hard_reg_usage is able to analyze all clobbered registers.
>
>
> Bootstrapped and reg-tested on x86_64, including Ada. Built and reg-tested on
> mips, arm, ppc and sh. No issues found. OK for stage1 trunk?
>
>
Thanks for the patch.  I'll look at it during the next week.

Right now I see that the code is based on reload, which uses
caller-save.c.  LRA does not use caller-save.c at all.  Currently we
have LRA support only for x86/x86-64, but the next version will probably
have a few more targets based on LRA.  Fortunately, the LRA modification
will be pretty easy with all this machinery.
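
As a rough illustration of the machinery in the quoted patch description
(this is not code from the patch), the C sketch below models how an
allocator could consult a per-function clobber set when choosing a
register for a value that is live across a call.  All names here
(hard_reg_set, reg_bit, reg_range, call_clobbers,
pick_reg_live_across_call) are hypothetical, and the 32-bit mask is a
stand-in for GCC's real HARD_REG_SET.  Masking the callee's set with the
ABI call-used set is also an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not GCC's HARD_REG_SET: one bit per hard register,
   at most 32 registers.  */
typedef uint32_t hard_reg_set;

/* Bit for a single hard register.  */
static hard_reg_set
reg_bit (int regno)
{
  assert (regno >= 0 && regno < 32);
  return (hard_reg_set) 1 << regno;
}

/* Set containing registers LO..HI inclusive.  */
static hard_reg_set
reg_range (int lo, int hi)
{
  hard_reg_set set = 0;
  for (int r = lo; r <= hi; r++)
    set |= reg_bit (r);
  return set;
}

/* Registers the caller must assume a call clobbers.  If the callee's
   body has not been analyzed, fall back to the full ABI call-used set.
   Otherwise use what the body sets or clobbers plus anything the target
   reports separately (for instance linker veneers), limited to the
   registers the ABI allows a callee to clobber.  */
static hard_reg_set
call_clobbers (int callee_known, hard_reg_set body_clobbers,
               hard_reg_set target_extra, hard_reg_set abi_call_used)
{
  if (!callee_known)
    return abi_call_used;
  return (body_clobbers | target_extra) & abi_call_used;
}

/* Pick the lowest-numbered candidate register that survives the call,
   or -1 if there is none.  */
static int
pick_reg_live_across_call (hard_reg_set candidates, hard_reg_set clobbered)
{
  hard_reg_set usable = candidates & ~clobbered;
  for (int r = 0; r < 32; r++)
    if (usable & reg_bit (r))
      return r;
  return -1;
}
```

Assuming a MIPS-like split in which $2-$15 and $24-$25 are call-clobbered
and $16-$23 are callee-saved, with $2-$23 as allocation candidates, a
callee known to clobber only $2 lets the picker return $3, while an
unanalyzed callee pushes it to $16.  That mirrors the foo/bar diff in the
quoted mail.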

I am going to use the ira-improv branch for some of my future work for
GCC 4.9, and I am going to merge trunk into it regularly (about once per
month).  So if you want, you could use the branch for your work too.  But
this is absolutely up to you.  I don't mind if you commit this patch
directly to trunk in stage1 when the review is finished.


