This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH, PR60738] More LRA split for regno conflicting with single reg class operand
- From: Vladimir Makarov <vmakarov at redhat dot com>
- To: Wei Mi <wmi at google dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Cc: David Li <davidxl at google dot com>
- Date: Mon, 05 May 2014 11:40:51 -0400
- Subject: Re: [PATCH, PR60738] More LRA split for regno conflicting with single reg class operand
- Authentication-results: sourceware.org; auth=none
- References: <CA+4CFy76kH8XGTHTdd1dCe2GatkAh=wjXCkkCin1_-Q3guH_3Q at mail dot gmail dot com>
On 2014-04-25, 11:35 PM, Wei Mi wrote:
Hi,
This patch is to address the missing optimization reported in
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60738
Now in process_single_reg_class_operands, any allocno A conflicting
with a single reg class operand B is marked to never use the reg class
in IRA. This is non-optimal when A is in a hot region while B is in a
cold region. The patch allows A to use the register in the single reg
class if only the hotness difference between A and B is large enough.
The patch also extends lra_split to make sure A is splitted in the
code region for B instead of being spilled.
bootstrap and regression test are ok for x86_64-linux-gnu. Is it ok for trunk?
Thanks for working on this Wei.
The patch is ok for the trunk with the change you proposed to resolve
Steven's concern and with resolving the changelog entries pitfalls below:
It is better to add the name of the parameter to make easier a future
search when the parameter was introduced (e.g. "params.h
(LRA_SPLIT_FREQ_RATIO): New param."). Adding '#include "params.h"' is
not in the changelog too.
And you should remove unfinished comment (see below).
For the future, it would be nice if such patches are benchmarked too
(e.g. on some SPEC) in order to be sure that it works not worse in most
cases (it is very hard to predict effect of such changes).
ChangeLog:
2014-04-25 Wei Mi <wmi@google.com>
PR rtl-optimization/60738
* params.h: New param.
* params.def: Ditto.
* lra-constraints.c (need_for_split_p): Let more
cases to do lra-split.
* ira-lives.c (process_single_reg_class_operands):
Avoid to add single reg class into conflict hardreg set in
some cases.
...
Index: lra-constraints.c
===================================================================
--- lra-constraints.c (revision 209253)
+++ lra-constraints.c (working copy)
@@ -129,6 +129,7 @@
#include "ira.h"
#include "rtl-error.h"
#include "lra-int.h"
+#include "params.h"
/* Value of LRA_CURR_RELOAD_NUM at the beginning of BB of the current
insn. Remember that LRA_CURR_RELOAD_NUM is the number of emitted
@@ -4632,8 +4633,13 @@ static bitmap_head ebb_global_regs;
static inline bool
need_for_split_p (HARD_REG_SET potential_reload_hard_regs, int regno)
{
+ int freq;
+ rtx last_use_insn;
int hard_regno = regno < FIRST_PSEUDO_REGISTER ? regno : reg_renumber[regno];
+ last_use_insn = skip_usage_debug_insns (usage_insns[regno].insns);
+ freq = REG_FREQ_FROM_BB (BLOCK_FOR_INSN (last_use_insn));
+
lra_assert (hard_regno >= 0);
return ((TEST_HARD_REG_BIT (potential_reload_hard_regs, hard_regno)
/* Don't split eliminable hard registers, otherwise we can
@@ -4653,25 +4659,27 @@ need_for_split_p (HARD_REG_SET potential
&& (regno >= FIRST_PSEUDO_REGISTER
|| ! TEST_HARD_REG_BIT (call_used_reg_set, regno)
|| usage_insns[regno].calls_num == calls_num)
- /* We need at least 2 reloads to make pseudo splitting
- profitable. We should provide hard regno splitting in
- any case to solve 1st insn scheduling problem when
- moving hard register definition up might result in
- impossibility to find hard register for reload pseudo of
- small register class. */
- && (usage_insns[regno].reloads_num
- + (regno < FIRST_PSEUDO_REGISTER ? 0 : 3) < reloads_num)
- && (regno < FIRST_PSEUDO_REGISTER
- /* For short living pseudos, spilling + inheritance can
- be considered a substitution for splitting.
- Therefore we do not splitting for local pseudos. It
- decreases also aggressiveness of splitting. The
- minimal number of references is chosen taking into
- account that for 2 references splitting has no sense
- as we can just spill the pseudo. */
- || (regno >= FIRST_PSEUDO_REGISTER
- && lra_reg_info[regno].nrefs > 3
- && bitmap_bit_p (&ebb_global_regs, regno))))
+ /* If
^
I guess it is a leftover from the final patch editing. It should be
removed too.
+ && ((LRA_SPLIT_FREQ_RATIO * freq < lra_reg_info[regno].freq)
+ /* We need at least 2 reloads to make pseudo splitting
+ profitable. We should provide hard regno splitting in
+ any case to solve 1st insn scheduling problem when
+ moving hard register definition up might result in
+ impossibility to find hard register for reload pseudo of
+ small register class. */
+ || ((usage_insns[regno].reloads_num
+ + (regno < FIRST_PSEUDO_REGISTER ? 0 : 3) < reloads_num)
+ && (regno < FIRST_PSEUDO_REGISTER
+ /* For short living pseudos, spilling + inheritance can
+ be considered a substitution for splitting.
+ Therefore we do not splitting for local pseudos. It
+ decreases also aggressiveness of splitting. The
+ minimal number of references is chosen taking into
+ account that for 2 references splitting has no sense
+ as we can just spill the pseudo. */
+ || (regno >= FIRST_PSEUDO_REGISTER
+ && lra_reg_info[regno].nrefs > 3
+ && bitmap_bit_p (&ebb_global_regs, regno))))))
|| (regno >= FIRST_PSEUDO_REGISTER && need_for_call_save_p (regno)));
}