This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [patch][rfc] PR tree-optimization/21883


On Sun, 2005-09-04 at 01:34 +0200, Steven Bosscher wrote:
> Hi,
> 
> Bug 21883 is a bug where tree-ssa-dom's jump threading decides to
> thread across a very large basic block in a way that requires
> duplicating that large block.  Jakub has a nice test case for it:
Yup.  The only reason I haven't started to tackle this is because
it hasn't been clear if this is a common problem or not.


> 
> Mainline will duplicate all the foo1 calls just to avoid the second
> conditional jump, to give something equivalent to,
Yup.  Which I think everyone agrees isn't terribly smart.


> This transformation is done without any regard for the increase in
> code size that this causes (we do this transformation even at -Os).
And one possibility if we decide not to tackle this for 4.1 would
to avoid anything except trivial jump threading at -Os.

> The patch below changes this.  With the patch, DOM assumes that if
> the jump threading succeeds, all statements used in the computation
> of the controlling expression of a conditional jump are going to be
> folded (so they do not cause any code size increase).  DOM assume all
> other statements have to be duplicated when the jump threading is
> performed.  It then simply makes an estimate of the number of insns
> required for those statements (using estimate_num_insns) and simply
> does not apply the transformation of this number is higher than some
> number, in this case a number defined in params.def.
These seem like pretty reasonable assumptions.

Those assumptions will only become more accurate as we push more
work to specialized passes -- for example, the amount of copies
lying around the IL is staggering when we reach DOM and as a
result the number of dead copies after DOM is staggering.  And
(of course) we dutifully duplicate all those dead copies....



> 1) Does this patch look reasonable for GCC 4.1 (modulo reviewer's
>    comments of course...)?  I think the patch is not too intrusive
>    but it is quite large for a stage3 patch.
It looks like there's a little refactoring, then the heuristic to
avoid threading in cases where it's likely to cause too much growth.
I'd probably support this patch, or a variant thereof for 4.1.

Another thing to consider -- you might want to mark blocks which are
OK/not-OK for threading so that we don't process them over and over.
You might get a little more speedup with that change.  IIRC the RTL
jump threader does something similar.  I'm not sure if doing a
pre-scan of the blocks at the start of a DOM iteration would work
better than lazily computing the attribute and saving it for a DOM
iteration.

[ The CFG cleanup which occurs before DOM iterates can combine 
  blocks, which would invalidate whether or not a block is
  suitable for threading through. ]


> 2) What would be a reasonable number for this new param I introduce?
>    We discussed this a bit on IRC.  It was suggested that it should
>    probably depend on BRANCH_COST, but in that case perhaps a param
>    is the wrong choice.  Maybe the loop depth should be taken into
>    account (for icache) and perhaps for -Os the duplication should be
>    disabled completely if there are any statements that need to be
>    duplicated.  I've looked at bb-reorder's copy_bb_p, but that did
>    not help very much either. 
>    Ideas?  Suggestions?  Just brute-force testing is obviously also
>    required, but I'd like to hear some idea for heuristics if the 
>    answer for (1) above is positive ;-)
I would expect the thresh hold to be relatively small -- if we're
duplicating 100 statements to eliminate a single conditional jump
then we're probably doing way too much duplication.  It might be
interesting to have a histogram of how many statements are encountered
in these duplicated blocks across a bootstrap or something like
that -- just to get a feel for where the sweet spot might be.


Jeff





Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]