This is the mail archive of the
mailing list for the GCC project.
Re: [tree-ssa] DCE with control dependence again (with numbers, for a change)
- From: law at redhat dot com
- To: "Zack Weinberg" <zack at codesourcery dot com>
- Cc: Steven Bosscher <s dot bosscher at student dot tudelft dot nl>, gcc-patches at gcc dot gnu dot org, aj at suse dot de, pinskia at physics dot uc dot edu
- Date: Fri, 16 Jan 2004 23:47:31 -0700
- Subject: Re: [tree-ssa] DCE with control dependence again (with numbers, for a change)
- Reply-to: law at redhat dot com
In message <email@example.com>, "Zack Weinberg" writes:
>We have a valued contributor, and a valuable contribution. Perhaps
>not perfect, but certainly not something which produced "significant
>compile time performance concerns [and] nearly nothing in terms of
>generating better code". (I cannot speak to the complexity of the
>One person stonewalls the patch, asking for - I reiterate - testing
>wildly out of proportion to other changes. As a result the patch is
>withdrawn and the contributor is frustrated, and perhaps less inclined
>to make further contributions.
>This is precisely the sort of outcome that I was hoping could be
I'm sorry it looks like I'm stonewalling the patch. I think it would
be better characterized as looking at it very closely.
Quite simply if this patch doesn't help, then all it's doing is churning
the sources. Churning without creating something that was better than
what we started with isn't useful. That's why I'm so interested in the
hard data about how the patch affects compile time, runtime, etc.
We started off with a control dependent DCE implementation (not the same as
Steven's, but it had the same net effect).
We saw that it was clearly a performance problem given the infrastructure
we had in place 6 months ago. At that time we also knew Zdenek's lowering
work was going to break how we determined control dependence in our
At that time I made the changes necessary to do a true control dependent
DCE, like Steven I borrowed that code from the old rtl ssa-dce.c. After
benchmarking that code I quickly tossed it out without checking it in.
Building the control dependencies was killing the compile time performance.
We then evaluated the utility of killing control structures entirely and
found that in practice the control dependent DCE implementation very
rarely found anything that the simpler and faster DCE implementation
At that time we decided to drop control dependent DCE for the simpler
and faster implementation.
It's probably worth noting that I was originally in favor of a scheme like
Steven's -- run the simplest/fastest DCE for all passes except the last,
then run the more aggressive version last. But in the end, I couldn't
justify that based on the data I had regarding the behavior of the more
aggressive version of DCE.
On the size of generated binaries:
Right now I see very conflicting data from Steven -- in one set of GCC
tests the size of gcc's test sections was reduced by ~4k. Not huge, but
enough to make me seriously consider Steven's patch.
But then looking at compiling GCC within spec we saved all of 192 bytes.
Yup, 192 bytes. That's probably on the order of killing 10-20 control
structures and the insns that feed them -- across the entire compiler.
All I asked for was any ideas from Steven which might explain the
discrepancy. I can put forth some of my own:
1. The spec GCC sources are significantly smaller and thus have
fewer opportunities for the improved DCE code to improve things.
2. The RTL passes cleaned up most of the turds left by the old dce
implementation, thus the resulting code wasn't significantly better.
However, if this were the case, then I would have expected to see
similar behavior in Steven's original tests.
3. Alignment behavior. Maybe we're killing more code, but it's getting
lost due to loop/function alignment requests. Depending on how
precisely we measure the text segments this could be quite significant.
However, I could also argue that this is not a good measurement of the
utility of Steven's implementation. Instead it might be better to look
at how many times it manages to eliminate a control structure that the
old implementation left lying around. This could be done by running the
old implementation first, then an instrumented version of Steven's which
logged a message anytime it deleted anything.
Why is this a better measure? It tells us how often the more complex
algorithm actually does something useful -- every time it does something
useful there is less code for the remaining passes to deal with.
On compile-time performance:
We have some data which strongly the compiler is faster with Steven's
code -- all I was looking for was a couple additional compiles
of spec to verify that behavior across a larger codebase. Having that
alone may be enough to swing my opinion about the code. Particularly
when I know we may have other uses for control dependency later.
Two things have changed significantly which I think makes Steven's code
more tractable these days from a compile-time standpoint. First,
we factor computed gotos which eliminates a certain amount of insanity.
Second, we have an improved dominator infrastructure.
On the runtime performance on the generated code:
I agree with Steven that the overall performance impact I think is
likely a wash.
I don't necessarily agree with Steven's conclusions about the individual
tests simply because the variance in the test results makes it very hard
to determine if the changes in individual tests are statistically
significant. In fact, the amount of variance he's seeing is within the
amount of variance I typically see for runs of the same executable
(which is why I suggested he first verify that the executables in fact
had different code).
I can't control how you choose to interpret my actions. I can only hope
that you know me well enough to realize that I will do my best to do the
right thing and make good decisions, just like I trust you to do the same.
Will there be judgment calls where we disagree? Probably. We are after all
two different people. But hopefully we can see past differences of opinion
and realize that each of us is doing whatever they can to improve things
and that sometimes we will have to agree to disagree.