[Bug tree-optimization/88440] size optimization of memcpy-like code
rguenth at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Wed May 22 11:49:00 GMT 2019
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88440
--- Comment #21 from Richard Biener <rguenth at gcc dot gnu.org> ---
Ick.
static inline void
check_pseudos_live_through_calls (int regno,
                                  HARD_REG_SET last_call_used_reg_set,
                                  rtx_insn *call_insn)
{
...
  for (hr = 0; HARD_REGISTER_NUM_P (hr); hr++)
    if (targetm.hard_regno_call_part_clobbered (call_insn, hr,
                                                PSEUDO_REGNO_MODE (regno)))
      add_to_hard_reg_set (&lra_reg_info[regno].conflict_hard_regs,
                           PSEUDO_REGNO_MODE (regno), hr);
This loop repeatedly computes an implicit hard-reg set of the hard regs
that are partly clobbered by the call, and it does so for the _same_
actual instruction, since check_pseudos_live_through_calls is called
via
  /* Mark each defined value as live. We need to do this for
     unused values because they still conflict with quantities
     that are live at the time of the definition. */
  for (reg = curr_id->regs; reg != NULL; reg = reg->next)
    {
      if (reg->type != OP_IN)
        {
          update_pseudo_point (reg->regno, curr_point, USE_POINT);
          mark_regno_live (reg->regno, reg->biggest_mode);
          check_pseudos_live_through_calls (reg->regno,
                                            last_call_used_reg_set,
                                            call_insn);
...
        }
and
  EXECUTE_IF_SET_IN_SPARSESET (pseudos_live, j)
    {
      IOR_HARD_REG_SET (lra_reg_info[j].actual_call_used_reg_set,
                        this_call_used_reg_set);
      if (flush)
        check_pseudos_live_through_calls (j,
                                          last_call_used_reg_set,
                                          last_call_insn);
    }
and
  /* Mark each used value as live. */
  for (reg = curr_id->regs; reg != NULL; reg = reg->next)
    if (reg->type != OP_OUT)
      {
        if (reg->type == OP_IN)
          update_pseudo_point (reg->regno, curr_point, USE_POINT);
        mark_regno_live (reg->regno, reg->biggest_mode);
        check_pseudos_live_through_calls (reg->regno,
                                          last_call_used_reg_set,
                                          call_insn);
      }
and
  EXECUTE_IF_SET_IN_BITMAP (df_get_live_in (bb), FIRST_PSEUDO_REGISTER, j, bi)
    {
      if (sparseset_cardinality (pseudos_live_through_calls) == 0)
        break;
      if (sparseset_bit_p (pseudos_live_through_calls, j))
        check_pseudos_live_through_calls (j, last_call_used_reg_set,
                                          call_insn);
    }
The pseudo's mode may change, but I guess usually it doesn't. I also wonder
why the target hook doesn't return a hard-reg-set ...
That said, the above code doesn't scale well for functions with a lot of
calls, at least. Also, the passed call_insn isn't the current insn and
might even be NULL. No target except aarch64 even looks at the actual
instruction (all the more an argument for redesigning the hook with its
use in mind).
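To illustrate what a hook that yields a hard-reg-set would buy, here is a
minimal standalone sketch, not GCC code. Everything in it is a toy stand-in
assumed for illustration: hard_reg_set as a 32-bit mask, part_clobbered_p in
place of the per-register target hook, and struct call_clobber_cache, which
holds one lazily computed set per mode and is reset at each new call. Keying
the cache by mode covers the case where the pseudo's mode does change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins; none of these are GCC's real types or hooks.  */
enum { NUM_HARD_REGS = 32, NUM_MODES = 4 };
typedef uint32_t hard_reg_set;   /* one bit per hard register */

static unsigned long hook_calls; /* counts per-register hook queries */

/* Hypothetical per-register hook: registers 16..31 are partly
   clobbered for modes 2 and above.  */
static bool
part_clobbered_p (int hr, int mode)
{
  hook_calls++;
  return hr >= 16 && mode >= 2;
}

/* Per-call cache: one hard-reg set per mode, filled on first use.  */
struct call_clobber_cache
{
  hard_reg_set set[NUM_MODES];
  bool valid[NUM_MODES];
};

/* Invalidate the cache; to be done whenever a new call is seen.  */
static void
cache_reset (struct call_clobber_cache *c)
{
  for (int m = 0; m < NUM_MODES; m++)
    c->valid[m] = false;
}

/* Return the partly clobbered set for MODE, walking all hard
   registers only the first time MODE is asked for.  */
static hard_reg_set
part_clobbered_set (struct call_clobber_cache *c, int mode)
{
  assert (mode >= 0 && mode < NUM_MODES);
  if (!c->valid[mode])
    {
      hard_reg_set s = 0;
      for (int hr = 0; hr < NUM_HARD_REGS; hr++)
        if (part_clobbered_p (hr, mode))
          s |= (hard_reg_set) 1 << hr;
      c->set[mode] = s;
      c->valid[mode] = true;
    }
  return c->set[mode];
}

/* The per-pseudo work shrinks to a single OR into its conflict set.  */
static void
add_call_conflicts (hard_reg_set *conflicts,
                    struct call_clobber_cache *c, int mode)
{
  *conflicts |= part_clobbered_set (c, mode);
}
```

With this shape the number of hook queries per call is bounded by the number
of hard registers times the number of distinct modes, independent of how many
pseudos are live across the call.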
I guess an artificial testcase with a lot of calls and a lot of live
pseudos (even single-BB) should show this issue easily.
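One way such a testcase might be produced is with a small generator. The
sketch below is only an assumption about what a suitable input could look
like: emit_testcase, f, g and the v0, v1, ... locals are hypothetical names,
and whether the generated function actually reproduces the slowdown has not
been checked. It emits a single-BB C function that loads n_live values, makes
n_calls calls to an opaque function, and then uses every value so that all of
them stay live across all of the calls.

```c
#include <assert.h>
#include <stdio.h>

/* Write a C function with N_LIVE values that stay live across
   N_CALLS calls, all in one basic block.  Returns the number of
   source lines written.  */
static int
emit_testcase (FILE *out, int n_live, int n_calls)
{
  int lines = 0;

  assert (out != NULL && n_live >= 0 && n_calls >= 0);

  fprintf (out, "extern void g (void);\nlong f (long *p)\n{\n");
  lines += 3;

  /* Load the values before any call ...  */
  for (int i = 0; i < n_live; i++)
    {
      fprintf (out, "  long v%d = p[%d];\n", i, i);
      lines++;
    }

  /* ... make the calls; g is opaque and might write to *p, so the
     loads above cannot simply be moved past it ...  */
  for (int i = 0; i < n_calls; i++)
    {
      fprintf (out, "  g ();\n");
      lines++;
    }

  /* ... and use every value afterwards.  */
  fprintf (out, "  return 0");
  for (int i = 0; i < n_live; i++)
    fprintf (out, " + v%d", i);
  fprintf (out, ";\n}\n");
  lines += 2;

  return lines;
}
```

Calling emit_testcase (stdout, 1000, 1000) and compiling the output with
optimization enabled would be one way to try it.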
Samples: 579  of event 'cycles:ppp', Event count (approx.): 257134187434191
Overhead  Command  Shared Object  Symbol
  22.26%  f951     f951           [.] process_bb_lives
  15.06%  f951     f951           [.] ix86_hard_regno_call_part_clobbered
   8.55%  f951     f951           [.] concat
   6.88%  f951     f951           [.] find_base_term
   3.60%  f951     f951           [.] get_ref_base_and_extent
   3.27%  f951     f951           [.] find_base_term
   2.95%  f951     f951           [.] make_hard_regno_dead