This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.


On 01/25/2013 08:05 AM, Tom de Vries wrote:
Vladimir,

this patch adds analysis of register usage of functions for usage by IRA.

The patch:
- adds analysis in pass_final to track which hard registers are set or clobbered
   by the function body, and stores that information in a struct cgraph_node.
- adds a target hook fn_other_hard_reg_usage to list hard registers that are
   set or clobbered by a call to a function, but are not listed as such in the
   function body, for instance registers clobbered by veneers inserted by the
   linker.
- adds a reg-note REG_CALL_DECL, to be able to easily link call_insns to their
   corresponding declaration, even after the calls may have been split into an
   insn (set register to function address) and a call_insn (call register), which
   can happen, for instance, on sh, and on mips with -mabicalls.
- uses the register analysis in IRA.
- adds an option -fuse-caller-save to control the optimization, on by default
   at -Os and -O2 and higher.
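
As a rough illustration of the first two items above (not the patch's actual code), the analysis amounts to taking the union of the registers set or clobbered by each insn in the function body, plus whatever the target hook reports. The sketch below models a hard register set as a 32-bit mask. The names reg_set, REG_BIT, struct insn_model and model_collect_usage are hypothetical and invented for this example; real GCC uses HARD_REG_SET and walks RTL instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: one bit per hard register, at most 32 registers.  */
typedef uint32_t reg_set;
#define REG_BIT(n) ((reg_set) 1u << (n))

/* Hypothetical stand-in for an insn: only the registers it writes.  */
struct insn_model
{
  reg_set sets;      /* registers set by the insn */
  reg_set clobbers;  /* registers clobbered by the insn */
};

/* Union of all registers set or clobbered by the body, plus the extra
   registers a target would report (e.g. for linker-inserted veneers).  */
static reg_set
model_collect_usage (const struct insn_model *insns, size_t n,
                     reg_set target_extra)
{
  reg_set used = target_extra;

  assert (n == 0 || insns != NULL);
  for (size_t i = 0; i < n; i++)
    used |= insns[i].sets | insns[i].clobbers;
  return used;
}
```

For a body consisting of a single insn that sets register 2, with no target-reported extras, model_collect_usage returns a mask with only bit 2 set.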


The patch (original version by Radovan Obradovic) is similar to your patch from 2007 ( http://gcc.gnu.org/ml/gcc-patches/2007-01/msg01625.html ). Unlike the 2007 patch, this one does not implement save area stack slot sharing. (Btw, I've borrowed the struct cgraph_node field name and comment from the 2007 patch.)

[ Steven, you mentioned in this discussion
   ( http://gcc.gnu.org/ml/gcc/2012-10/msg00213.html ) that you are working on
   porting the 2007 patch to trunk. What is the status of that effort?
]


As an example of the functionality, consider foo and bar from test-case aru-1.c:
...
static int __attribute__((noinline))
bar (int x)
{
   return x + 3;
}

int __attribute__((noinline))
foo (int y)
{
   return y + bar (y);
}
...

Compiled at -O2, bar only sets register $2 (the first return register):
...
bar:
         .frame  $sp,0,$31               # vars= 0, regs= 0/0, args= 0, gp= 0
         .mask   0x00000000,0
         .fmask  0x00000000,0
         .set    noreorder
         .set    nomacro
         j       $31
         addiu   $2,$4,3
...

foo then can use register $3 (the second return register) instead of register
$16 to save the value in register $4 (the first argument register) over the
call, as demonstrated here in a -fno-use-caller-save vs. -fuse-caller-save diff:
...
foo:                                    foo:
# vars= 0, regs= 2/0, args= 16, gp= 8 | # vars= 0, regs= 1/0, args= 16, gp= 8
.frame  $sp,32,$31                      .frame  $sp,32,$31
.mask   0x80010000,-4                 | .mask   0x80000000,-4
.fmask  0x00000000,0                    .fmask  0x00000000,0
.set    noreorder                       .set    noreorder
.set    nomacro                         .set    nomacro
addiu   $sp,$sp,-32                     addiu   $sp,$sp,-32
sw      $31,28($sp)                     sw      $31,28($sp)
sw      $16,24($sp)                   <
.option pic0                            .option pic0
jal     bar                             jal     bar
.option pic2                            .option pic2
move    $16,$4                        | move    $3,$4

lw      $31,28($sp)                     lw      $31,28($sp)
addu    $2,$2,$16                     | addu    $2,$2,$3
lw      $16,24($sp)                   <
j       $31                             j       $31
addiu   $sp,$sp,32                      addiu   $sp,$sp,32
...
That way we skip the save and restore of the callee-saved register $16; no
save and restore is needed for $3, which is call-clobbered. Btw, a further
improvement could be to reuse $4 after the call, and eliminate the move.
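
The decision in the diff above can be modelled roughly as follows: a call-clobbered register may hold a value across a call if the callee's register usage is known and does not include that register. This is only a sketch under stated assumptions. The names reg_set, REG_BIT, struct callee_info, regs_clobbered_by_call and survives_call are hypothetical, the call-clobbered set is an illustrative subset of the MIPS o32 registers ($2-$7), and intersecting the callee's usage with that set is an assumption about how callee-saved registers would be filtered out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: one bit per hard register.  */
typedef uint32_t reg_set;
#define REG_BIT(n) ((reg_set) 1u << (n))

/* Illustrative subset of the call-clobbered registers: $2-$7.  */
static const reg_set abi_call_clobbered
  = REG_BIT (2) | REG_BIT (3) | REG_BIT (4)
    | REG_BIT (5) | REG_BIT (6) | REG_BIT (7);

/* Hypothetical per-callee record of the analysis result.  */
struct callee_info
{
  bool usage_known;  /* false for unknown or indirect callees */
  reg_set usage;     /* registers set or clobbered by the callee */
};

/* Registers the caller must treat as clobbered by this call.  Falls back
   to the full call-clobbered set when nothing is known about the callee.  */
static reg_set
regs_clobbered_by_call (const struct callee_info *callee)
{
  if (callee != NULL && callee->usage_known)
    return callee->usage & abi_call_clobbered;
  return abi_call_clobbered;
}

/* True if a value kept in register REG is still intact after the call.  */
static bool
survives_call (unsigned reg, const struct callee_info *callee)
{
  return (regs_clobbered_by_call (callee) & REG_BIT (reg)) == 0;
}
```

With bar recorded as using only $2, survives_call reports that both $3 and $4 survive the call, which matches the diff and the remark about reusing $4. Without information about the callee, neither survives and a callee-saved register such as $16 has to be used instead.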


A version of this patch on top of 4.6 ran into trouble with the epilogue on arm, where a register was clobbered by a stack pop instruction, while that was not visible in the rtl representation. This instruction was introduced in arm_output_epilogue by code marked with the comment 'pop call clobbered registers if it avoids a separate stack adjustment'. I cannot reproduce that issue on trunk. Looking at the generated rtl, it seems that the epilogue instructions now list all the registers they set, so collect_fn_hard_reg_usage is able to analyze all clobbered registers.


Bootstrapped and reg-tested on x86_64, Ada inclusive. Built and reg-tested on mips, arm, ppc and sh. No issues found. OK for stage1 trunk?


Thanks for the patch. I'll look at it during the next week.

Right now I see that the code is based on reload which uses caller-saves.c. LRA does not use caller-saves.c at all. Right now we have LRA support only for x86/x86-64 but the next version will probably have a few more targets based on LRA. Fortunately, LRA modification will be pretty easy with all this machinery.

I am going to use the ira-improv branch for some of my future work for gcc4.9, and I am going to merge trunk into it regularly (about once per month). So if you want, you could use the branch for your work too. But this is absolutely up to you. I don't mind if you put this patch directly into the trunk at stage1 when the review is finished.

