This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [patch][rfc] PR tree-optimization/21883


> This transformation is done without any regard for the increase in
> code size that this causes (we do this transformation even at -Os).
> 
> The patch below changes this.  With the patch, DOM assumes that if
> the jump threading succeeds, all statements used in the computation
> of the controlling expression of a conditional jump are going to be
> folded (so they do not cause any code size increase).  DOM assumes all
> other statements have to be duplicated when the jump threading is
> performed.  It then makes an estimate of the number of insns
> required for those statements (using estimate_num_insns) and simply
> does not apply the transformation if this number is higher than some
> threshold, in this case a number defined in params.def.
> 
> 
> Now, this patch bootstrapped and passed testing just fine on x86_64,
> but otherwise I haven't tested it very thoroughly yet.  I've got a
> few questions for the experts first ;-)
>
> 1) Does this patch look reasonable for GCC 4.1 (modulo reviewer's
>    comments of course...)?  I think the patch is not too intrusive
>    but it is quite large for a stage3 patch.
I like this.

Other compilers limit the code growth of jump threading, so as not to
blow up code size (and icache usage) just to eliminate a single,
predictable branch.

The only problem I see is that we don't classify and rank our
opportunities, so without an ordering we're just going to stop even
though something profitable might still be ahead.  We also still
iterate, which is where we really lose in DOM.

It's pretty easy to determine why we want to thread something, and we
probably shouldn't have to iterate.

The opportunities generally fall into three categories:

1. Redundant conditions (you can just eliminate the second condition as
long as it is control dependent on the first)

if (cond) 
{
    if (cond) 
    {
    }
}
2. Mergeable conditions (identical control dependence, but you may have
to merge common code).

if (cond)
{
}
<maybe code>
if (cond)
{
}

(mergeable conditions are either adjacent or merely identically control
dependent; the adjacent ones require no common code to be duplicated)

3. Partially redundant conditions (control dependent on the first, but
code replication may be required to eliminate the second test)

if (cond1 || cond2)
{
  if (cond1)
  {
  }
}


1 is always a win as long as you aren't screwing up the CFG too badly.
2 and 3 should be ranked according to how large we expect the benefit
to be (i.e. execution frequencies, etc).

BTW, to avoid iterating, XLC (as an example) identifies all the
opportunities top down by tracking live conditions and classifying the
redundancy opportunities, then computes a transitive closure to find
the rest of the "really" redundant ones (i.e. conditions originally
thought to be mergeable or adjacent where the live condition for the
first candidate dominates the condition we believed guarded the second
block, making the second condition really redundant instead of just
mergeable or adjacent).

We currently do this all on the fly, so we have to iterate to discover
that we could have made something else really redundant, etc.

> 
> 2) What would be a reasonable number for this new param I introduce?

Somewhere between 100 and 1000 sounds right :)


