This is the mail archive of the
mailing list for the GCC project.
Re: RFA: improvement to if-conversion
- From: Joern Rennecke <joern dot rennecke at superh dot com>
- To: rth at redhat dot com (Richard Henderson)
- Cc: joern dot rennecke at superh dot com (Joern Rennecke), gcc-patches at gcc dot gnu dot org
- Date: Wed, 4 Feb 2004 18:45:26 +0000 (GMT)
- Subject: Re: RFA: improvement to if-conversion
> On Fri, Jan 30, 2004 at 01:20:40PM +0000, Joern Rennecke wrote:
> > Looking for matching outgoing edges instead of linear control flow
> > merge could improve if-conversion in general.
> Matching edges is the if_case_N routines, as opposed to
> the noce_* routines.
With matching edges I mean that both the if and the else block have
the same successor, albeit the successor is not adjacent, and might
have other predecessors. I don't see that being handled by the
find_if_case_N routines.
> *shrug* depends on how much, I guess. Of course, if you find such
> a sequence that does depend on input registers, cross-jump could
> certainly make use -- emit one move before the jump. It'd be a
> true size improvement over what we currently have.
The overhead is basically a linear factor - I think it's mainly just a
bigger I-cache footprint for matches. Because cross-jumping will consider
using just the end of a block, we can't really be undecided for a long
time if the transformation is possible. Of course we could spend some
time finding out that the transformation is possible for N instructions,
but because we are optimizing for speed, and we can't do the entire block,
we decide it's not useful. But I would expect that to be in the noise.
When we are optimizing for speed, we basically only want the case that
is an if conversion.
> > What would be the right home for a basic block structual comparison
> > function used by if-convert and cross-jumping? rtlanal.c ? cfgrtl.c?
> Dunno. It might deserve it's own file.
It doesn't seem quite that large to me. Unless you have objections,
I'll look into putting it into cfgrtl.c.
> > I think full register liveness information would be harder to keep
> > up-to-date as if conversion progresses.
> I think that global_live_at_start/end are kept up-to-date. Leastwise,
> I'd be interested to know how badly they're off. We re-run life info
> at the end, but that's to get death notes inserted properly.
Hmm. merge_if_block calls rtl_merge_blocks, and its last such call merges
join_bb into then_bb, which puts the life_at_end information into then_bb.
Would it be OK to re-order register notes to make them easier to compare?
I.e., if I sort them by type to get a normal form, I can compare the
head of both note lists together without having to search the other list
for a match.
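
A minimal sketch of that normal-form idea, under loud assumptions: struct note
below is a hypothetical stand-in with an integer kind and payload, not GCC's
actual rtx note list, and sort_notes / notes_equal are made-up names. It sorts
by (kind, value) rather than by kind alone, so that two notes of the same kind
also end up in a fixed order.

```c
#include <stddef.h>

/* Hypothetical note record; NOT the real GCC representation. */
struct note {
    int kind;            /* stands in for the reg_note type */
    int value;           /* stands in for the note's datum */
    struct note *next;
};

/* Insertion-sort a singly linked note list by (kind, value).
   Returns the new head; the nodes are relinked in place. */
struct note *sort_notes(struct note *head)
{
    struct note *sorted = NULL;

    while (head) {
        struct note *n = head;
        struct note **p = &sorted;

        head = head->next;
        /* Skip every node that orders before or equal to n. */
        while (*p && ((*p)->kind < n->kind
                      || ((*p)->kind == n->kind && (*p)->value <= n->value)))
            p = &(*p)->next;
        n->next = *p;
        *p = n;
    }
    return sorted;
}

/* With both lists in normal form, compare them head to head:
   no searching of the other list is needed. */
int notes_equal(const struct note *a, const struct note *b)
{
    while (a && b) {
        if (a->kind != b->kind || a->value != b->value)
            return 0;
        a = a->next;
        b = b->next;
    }
    return a == NULL && b == NULL;
}
```

Insertion sort is used only because note lists are short; any stable sort
would serve the same purpose.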
I suppose it is best to make two passes over the insns: a first pass to decide
if the optimization can be safely done (for cross-jumping also: for how many
instructions) and to compute a measure of cost for unifying things that are not
the same according to notes (see below), and, if the optimization goes ahead,
a second pass to actually remove all the notes that need removal.
If I sort the notes in the first pass, I can rely on them being sorted
in the second pass.
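
The two-pass scheme could look roughly like the sketch below. Everything in it
is an assumption made for illustration: struct insn, the bitmask of notes, and
the names scan_blocks and strip_unmatched_notes are hypothetical, and the cost
is simply the number of notes that differ. It also treats every note as
removable, which is not true for all note types.

```c
#include <stddef.h>

/* Hypothetical insn: an opcode plus a bitmask of attached notes. */
struct insn {
    int code;
    unsigned notes;
};

struct scan_result {
    size_t matched;   /* number of matching insns, counted from the end */
    unsigned cost;    /* number of notes that would have to be dropped */
};

static unsigned count_bits(unsigned x)
{
    unsigned n = 0;
    while (x) {
        n += x & 1u;
        x >>= 1;
    }
    return n;
}

/* Pass 1: read-only. Walk both blocks backwards, stop at the first
   insn pair whose codes differ, and accumulate the note cost. */
struct scan_result scan_blocks(const struct insn *a, size_t na,
                               const struct insn *b, size_t nb)
{
    struct scan_result r = {0, 0};

    while (r.matched < na && r.matched < nb) {
        const struct insn *x = &a[na - 1 - r.matched];
        const struct insn *y = &b[nb - 1 - r.matched];

        if (x->code != y->code)
            break;
        r.cost += count_bits(x->notes ^ y->notes);
        r.matched++;
    }
    return r;
}

/* Pass 2: only run once the transformation has been accepted.
   Keep just the notes that both insns of a matched pair carry. */
void strip_unmatched_notes(struct insn *a, size_t na,
                           struct insn *b, size_t nb, size_t matched)
{
    for (size_t i = 0; i < matched; i++) {
        unsigned common = a[na - 1 - i].notes & b[nb - 1 - i].notes;

        a[na - 1 - i].notes = common;
        b[nb - 1 - i].notes = common;
    }
}
```

Keeping the first pass free of side effects means a rejected candidate leaves
both blocks untouched.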
I noticed that the current cross-jumping code only cares about
REG_EQUAL, REG_EQUIV, and some REG_DEAD notes, but I think the way it
ignores all the other notes is generally unsafe. I've gone through
enum reg_note to evaluate the current handling of the various note types
in the cross-jumping code - or lack thereof - for safety:
REG_DEAD, REG_UNUSED: should be recomputed afterwards. insns_match_p
checks the special case of stack regs dying.
REG_INC: If insns match, notes should match.
(Note: have to add POST_INC etc handling in struct_equiv to clear rvalue
REG_LABEL: If insns match, notes should match.
Dubious - existence of the note is unlikely or its presence might not
matter, but the current handling can't be considered safe without
REG_EQUIV, REG_EQUAL: flow_find_cross_jump removes non-matching ones.
If a libcall is around when it does this, and it matches except for the
note, it is guaranteed to be broken.
Removing notes can also be considered to have a cost in potentially
missed subsequent optimizations.
REG_RETVAL, REG_LIBCALL: If they are around, we had better make sure they
match. We could remove non-matching ones, but only if we do it for
both the REG_RETVAL and the REG_LIBCALL note of the same libcall.
REG_CC_SETTER, REG_CC_USER: If insns match, notes should match - except
after reorg, when a cc0 setting insn is allowed in a delay slot of a jump.
According to backends.html, this could affect cris.
REG_BR_PROB: It would be prudent to compute the resultant probability,
rather than chucking away one set of data.
If both blocks to be combined are similarly frequent, but the branch
probability is significantly different, we also inhibit subsequent
probability-based optimizations and static branch prediction. This can
be considered to be an additional cost of the considered transformation.
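
One way to read "resultant probability" is a frequency-weighted average of the
two branch probabilities. The sketch below assumes exactly that; the function
names, the use of plain integers on a common fixed scale, and the absolute
difference as a cost measure are illustrative assumptions, not existing GCC
code.

```c
/* Frequency-weighted combination of two branch probabilities that
   are expressed on the same fixed integer scale.  Hypothetical helper. */
int combine_br_prob(int prob_a, long freq_a, int prob_b, long freq_b)
{
    long total = freq_a + freq_b;

    if (total <= 0)
        return (prob_a + prob_b) / 2;   /* no frequency data: plain average */

    /* Adding total / 2 before dividing rounds to nearest. */
    return (int) ((prob_a * freq_a + prob_b * freq_b + total / 2) / total);
}

/* Assumed cost measure: how far apart the two probabilities are.
   A large value means the merged note says little about either path. */
int br_prob_divergence(int prob_a, int prob_b)
{
    return prob_a > prob_b ? prob_a - prob_b : prob_b - prob_a;
}
```

For two equally frequent blocks with probabilities 9000 and 1000 on a scale of
10000, the combined value is 5000 and the divergence is 8000, which is the
case described above where the merged note loses most of its predictive value.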
REG_NONNEG, REG_NOALIAS, REG_ALWAYS_RETURN: we could remove non-matching
ones; but we must not introduce them on a path that didn't have them.
REG_FRAME_RELATED_EXPR, REG_EH_REGION, REG_SETJMP, REG_VTABLE_REF.
rtl.h / rtl.c garbage???: