[PATCH] split, i386: Fix up df uses in i386 splitters [PR99104]

Richard Sandiford richard.sandiford@arm.com
Tue Feb 16 09:16:40 GMT 2021


Jakub Jelinek <jakub@redhat.com> writes:
> On Tue, Feb 16, 2021 at 09:42:22AM +0100, Richard Biener wrote:
>> Just to get an idea whether it's worth doing the extra df_analyze.
>> Since we have possibly 5 split passes it's a lot of churn for things
>> like that WRF ltrans unit that already spends 40% of its time in DF ...
>
> Yeah, df_analyze can be fairly expensive and most of the targets don't
> really need it at all.
>
> If I grep for df_get.*_out, I find:
> config/aarch64/aarch64.c:      bitmap live1 = df_get_live_out (ENTRY_BLOCK_PTR_FOR_FN (cfun));
> config/arc/arc.c:	      && REGNO_REG_SET_P (df_get_live_out (loop->incoming_src),
> config/arm/arm.c:  bitmap prologue_live_out = df_get_live_out (ENTRY_BLOCK_PTR_FOR_FN (cfun));
> config/arm/arm.c:  return REGNO_REG_SET_P (df_get_live_out (ENTRY_BLOCK_PTR_FOR_FN (cfun)), 3);
> config/arm/arm.c:	    = REGNO_REG_SET_P (df_get_live_out (ENTRY_BLOCK_PTR_FOR_FN (cfun)),
> config/arm/arm.c:	= REGNO_REG_SET_P (df_get_live_out (ENTRY_BLOCK_PTR_FOR_FN (cfun)),
> config/bfin/bfin.c:	    && !REGNO_REG_SET_P (df_get_live_out (bb_in), i))
> config/i386/i386.c:  return REGNO_REG_SET_P (df_get_live_out (ENTRY_BLOCK_PTR_FOR_FN (cfun)), 0);
> config/i386/i386.c:  live = df_get_live_out(bb);
> config/i386/i386-features.c:      bitmap_copy (live_regs, df_get_live_out (bb));
> where aarch64, arc and bfin are ok, i386 has this known issue and
> arm uses most of the calls during pro/epilogue expansion (fine), but seems
> to use it also (indirectly) in USE_RETURN_INSN which is used not in
> splitters, but in insn conditions (so matched at any time).  But it seems to
> use a cache so that once computed it remembers it, so probably it is ok too.
>
> So, adding an unconditional df_analyze / TODO_df_finish would slow down all
> the targets but only help a single one, and even on that one it is better to
> do it only when it is really needed.  E.g. it is never needed during split1
> (!reload_completed doesn't call those at all), and even in split2+ one really
> needs to trigger the right patterns in the IL (it can trigger quite
> frequently with the atom/bonnell tunings, but otherwise only very rarely).

But doing it on demand like this seems fragile.  And the targets aren't
a fixed… target.  I think we need to design the interface so that things
are unlikely to go wrong in future rather than design it on the basis
that things will stay the way they are now.

This kind of thing isn't needed for other uses of DF and I don't think
we should expect anyone who adds a new use of DF in splitters to
remember that this is needed.
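
As a rough illustration of the concern above, here is a toy model in C.
It is not GCC code: every name in it (toy_df, toy_analyze and so on) is
hypothetical, and the "analysis" is reduced to remembering an insn count.
It only contrasts a scheme where each consumer has to remember to refresh
with one where the pass refreshes on the consumer's behalf.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for dataflow state.  None of these names exist in GCC.  */
struct toy_df
{
  int insn_count;      /* "The IL": grows whenever an insn is split.  */
  int analyzed_count;  /* Value of insn_count when last analyzed.  */
  bool ever_analyzed;  /* False until the first toy_analyze call.  */
};

/* Recompute the (toy) solution for the current IL.  */
static void
toy_analyze (struct toy_df *df)
{
  assert (df != NULL);
  df->analyzed_count = df->insn_count;
  df->ever_analyzed = true;
}

/* Model a split that replaces one insn with two.  */
static void
toy_split_insn (struct toy_df *df)
{
  df->insn_count += 1;
}

/* Return true if a query would see up-to-date information.  */
static bool
toy_info_is_fresh (const struct toy_df *df)
{
  return df->ever_analyzed && df->analyzed_count == df->insn_count;
}

/* On-demand scheme: each consumer must remember to refresh first.
   REMEMBERED_TO_REFRESH models whether its author did so.  */
static bool
toy_consumer_on_demand (struct toy_df *df, bool remembered_to_refresh)
{
  if (remembered_to_refresh)
    toy_analyze (df);
  return toy_info_is_fresh (df);
}

/* Pass-level scheme: the pass refreshes before consulting any consumer,
   so an individual consumer has nothing to remember.  */
static bool
toy_pass_level (struct toy_df *df)
{
  toy_analyze (df);
  return toy_consumer_on_demand (df, false);
}
```

In this model a consumer that forgets the refresh silently reads stale
information, whereas the pass-level variant cannot be misused that way.
The price is that the analysis runs whether or not anyone asks for it.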

The fact that the postorder is recomputed on each df_analyze is IMO
a separate issue and could/should be fixed separately if it's a
significant overhead.
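
To make the idea concrete, below is a minimal sketch in C of caching a
postorder and invalidating it only when the CFG changes.  It is a toy
under stated assumptions, not how df or the GCC CFG code is structured:
toy_cfg, toy_add_edge and toy_get_postorder are all hypothetical names,
block 0 is assumed to be the entry, and the graph is a small fixed-size
adjacency matrix.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_BLOCKS 8

/* Toy CFG with a cached postorder.  Not a GCC data structure.  */
struct toy_cfg
{
  int n_blocks;
  bool edge[TOY_MAX_BLOCKS][TOY_MAX_BLOCKS];
  int postorder[TOY_MAX_BLOCKS];
  int postorder_len;
  bool postorder_valid;   /* Cleared by every CFG change.  */
  int recompute_count;    /* How often the traversal actually ran.  */
};

/* Depth-first walk from BB, appending blocks in postorder.  */
static void
toy_dfs (struct toy_cfg *cfg, int bb, bool *visited)
{
  visited[bb] = true;
  for (int succ = 0; succ < cfg->n_blocks; succ++)
    if (cfg->edge[bb][succ] && !visited[succ])
      toy_dfs (cfg, succ, visited);
  cfg->postorder[cfg->postorder_len++] = bb;
}

/* Add an edge SRC->DEST and mark the cached postorder as stale.  */
static void
toy_add_edge (struct toy_cfg *cfg, int src, int dest)
{
  assert (src >= 0 && src < cfg->n_blocks);
  assert (dest >= 0 && dest < cfg->n_blocks);
  cfg->edge[src][dest] = true;
  cfg->postorder_valid = false;
}

/* Return the postorder, recomputing it only if the CFG has changed
   since the last call.  Block 0 is assumed to be the entry block.  */
static const int *
toy_get_postorder (struct toy_cfg *cfg)
{
  if (!cfg->postorder_valid)
    {
      bool visited[TOY_MAX_BLOCKS] = { false };
      cfg->postorder_len = 0;
      toy_dfs (cfg, 0, visited);
      cfg->postorder_valid = true;
      cfg->recompute_count++;
    }
  return cfg->postorder;
}
```

With this shape, repeated requests between CFG changes reuse the cached
order.  Whether the equivalent saving would be measurable for df_analyze
is a separate question that the sketch does not answer.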

(There are other potential savings like that too.  E.g. at the moment
we don't try to keep dominance information up-to-date for RTL passes,
so every pass that computes it has to free it too.)

Thanks,
Richard

