This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [tree-ssa] DCE with control dependence again (with numbers, for a change)

In message <>, "Zack Weinberg" writes:
 >We have a valued contributor, and a valuable contribution.  Perhaps
 >not perfect, but certainly not something which produced "significant
 >compile time performance concerns [and] nearly nothing in terms of
 >generating better code".  (I cannot speak to the complexity of the
 >One person stonewalls the patch, asking for - I reiterate - testing
 >wildly out of proportion to other changes.  As a result the patch is
 >withdrawn and the contributor is frustrated, and perhaps less inclined
 >to make further contributions.
 >This is precisely the sort of outcome that I was hoping could be
I'm sorry it looks like I'm stonewalling the patch.  I think it would
be better characterized as looking at it very closely.

Quite simply if this patch doesn't help, then all it's doing is churning
the sources.  Churning without creating something that was better than
what we started with isn't useful.  That's why I'm so interested in the
hard data about how the patch affects compile time, runtime, etc.


We started off with a control dependent DCE implementation (not the same as
Steven's, but it had the same net effect).

We saw that it was clearly a performance problem given the infrastructure
we had in place 6 months ago.  At that time we also knew Zdenek's lowering
work was going to break how we determined control dependence in our

At that time I made the changes necessary to do a true control dependent
DCE, like Steven I borrowed that code from the old rtl ssa-dce.c.  After
benchmarking that code I quickly tossed it out without checking it in.
Building the control dependencies was killing the compile time performance.  

We then evaluated the utility of killing control structures entirely and
found that in practice the control dependent DCE implementation very
rarely found anything that the simpler and faster DCE implementation

At that time we decided to drop control dependent DCE for the simpler
and faster implementation.

It's probably worth noting that I was originally in favor of a scheme like
Steven's -- run the simplest/fastest DCE for all passes except the last,
then run the more aggressive version last.  But in the end, I couldn't
justify that based on the data I had regarding the behavior of the more 
aggressive version of DCE.

On the size of generated binaries:

  Right now I see very conflicting data from Steven -- in one set of GCC
  tests the size of gcc's test sections was reduced by ~4k.  Not huge, but
  enough to make me seriously consider Steven's patch.

  But then looking at compiling GCC within spec we saved all of 192 bytes.
  Yup, 192 bytes.  That's probably on the order of killing 10-20 control
  structures and the insns that feed them -- across the entire compiler.

  All I asked for was any ideas from Steven which might explain the
  discrepancy.  I can put forth some of my own:

    1. The spec GCC sources are significantly smaller and thus have
       fewer opportunities for the improved DCE code to improve things.

    2. The RTL passes cleaned up most of the turds left by the old dce
       implementation, thus the resulting code wasn't significantly better.
       However, if this were the case, then I would have expected to see
       similar behavior in Steven's original tests.

    3. Alignment behavior.    Maybe we're killing more code, but it's getting
       lost due to loop/function alignment requests.  Depending on how
       precisely we measure the text segments this could be quite significant.

However, I could also argue that this is not a good measurement of the
utility of Steven's implementation.  Instead it might be better to look
at how many times it manages to eliminate a control structure that the
old implementation left lying around.  This could be done by running the
old implementation first, then an instrumented version of Steven's which
logged a message anytime it deleted anything.

Why is this a better measure?  It tells us how often the more complex
algorithm actually does something useful -- every time it does something
useful there is less code for the remaining passes to deal with.


On compile-time performance:

  We have some data which strongly  the compiler is faster with Steven's 
  code  -- all I was looking for was a couple additional compiles
  of spec to verify that behavior across a larger codebase.   Having that
  alone may be enough to swing my opinion about the code.  Particularly
  when I know we may have other uses for control dependency later.

  Two things have changed significantly which I think makes Steven's code
  more tractable these days from a compile-time standpoint.  First, 
  we factor computed gotos which eliminates a certain amount of insanity.

  Second, we have an improved dominator infrastructure.


On the runtime performance on the generated code:

  I agree with Steven that the overall performance impact I think is 
  likely a wash.

  I don't necessarily agree with Steven's conclusions about the individual
  tests simply because the variance in the test results makes it very hard
  to determine if the changes in individual tests are statistically
  significant.  In fact, the amount of variance he's seeing is within the
  amount of variance I typically see for runs of the same executable
  (which is why I suggested he first verify that the executables in fact
   had different code).


I can't control how you choose to interpret my actions.  I can only hope
that you know me well enough to realize that I will do my best to do the
right thing and make good decisions, just like I trust you to do the same.

Will there be judgment calls where we disagree? Probably.  We are after all
two different people.  But hopefully we can see past differences of opinion
and realize that each of us is doing whatever they can to improve things
and that sometimes we will have to agree to disagree.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]