This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: [tree-ssa] New regressions as of 2003-11-04
On Wed, 2003-11-05 at 09:58, Jan Hubicka wrote:
> > On Wed, 2003-11-05 at 04:20, Jan Hubicka wrote:
> > > > On Tue, 2003-11-04 at 16:17, law@redhat.com wrote:
> > remove_useless_stmts_and_vars_cond could keep the CFG updated if it
> > wanted to. It didn't before because it was thrown away immediately
> > afterwards, so what was the point? Now perhaps there is a point.
>
> We can do it that way, but the function should be mostly a no-op now.
> cleanup_cfg+dead code removal should've caught all cases previously.
I think it also rips unused VAR_DECLs out of the variable lists. I'm not
sure what else it does now.
> > That's not going to change the number of temporaries. None of the copies
> > inserted create new temporaries unless there is a cycle, and you can't
> > avoid creating one in that case.
>
> The point is that we *want* to create temporary registers, so unless
> there is a good reason to do so, we don't want a single register to be
> set twice. That confuses RTL.
>
Setting a single register more than once confuses RTL? I'm not sure I
follow. That would imply RTL is SSA :-)
The register allocator ought to be smart enough to tell when two
disjoint live ranges use the same register, and rename one of them to
something else to prevent artificial interferences. That's pretty
standard register allocation machinery. Vlad had this code a while ago,
but it made something in SPEC worse so he never followed up on it.
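
To make the renaming concrete, here is a minimal sketch in Python. It is not Vlad's code or anything in GCC; the `(dest, sources)` instruction form and the `split_live_ranges` name are made up for illustration. It only handles straight-line code, where every new definition of a register starts a fresh live range. With control flow you would have to group the definitions that reach a common use into one web first.

```python
def split_live_ranges(instrs):
    """Rename each redefinition of a register in straight-line code.

    instrs is a list of (dest, sources) pairs (a hypothetical form);
    dest may be None for an instruction that only reads registers.
    """
    version = {}   # register -> number of definitions seen so far
    current = {}   # register -> name of its most recent definition
    out = []
    for dest, sources in instrs:
        # Uses refer to whichever definition currently reaches them.
        new_sources = tuple(current.get(s, s) for s in sources)
        new_dest = dest
        if dest is not None:
            n = version.get(dest, 0)
            version[dest] = n + 1
            # Keep the original name for the first range, rename later ones.
            new_dest = dest if n == 0 else f"{dest}.{n}"
            current[dest] = new_dest
        out.append((new_dest, new_sources))
    return out


code = [
    ("r1", ("a", "b")),    # r1 = a op b
    ("r2", ("r1",)),       # r2 = r1       (last use of the first r1)
    ("r1", ("c",)),        # r1 = c        (starts an unrelated live range)
    ("r3", ("r1", "r2")),  # r3 = r1 op r2
]
for dest, srcs in split_live_ranges(code):
    print(dest, srcs)
```

After the rename, the second range of r1 has its own name, so it no longer appears to interfere with anything the first range conflicted with.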
> > I have some ongoing work in SSA->normal which removes all temporaries
> > which are used only once. This reduces the number of temporaries
> > significantly. In many cases, the object footprint generated is even 10%
> > smaller.
>
> You mean that it will re-insert the expression back into the form
> (basically undoing the gimplification?)
> Yes, that would help too.
Yes, the code coming out of SSA will no longer be GIMPLE.
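
Roughly, the idea looks like the following Python sketch. It is an illustration only, not the SSA->normal code: the tuple expression form, the `fold_single_use_temps` name and the `T.` prefix for temporaries are all assumptions. It also assumes every name is assigned exactly once and expressions have no side effects, so moving an expression down to its single use is safe.

```python
def count_uses(expr, counts):
    """Count how often each name is read inside an expression tree."""
    if isinstance(expr, str):
        counts[expr] = counts.get(expr, 0) + 1
    else:
        for operand in expr[1:]:
            count_uses(operand, counts)


def substitute(expr, defs):
    """Replace names that have a deferred definition with that expression."""
    if isinstance(expr, str):
        return defs.get(expr, expr)
    return (expr[0],) + tuple(substitute(op, defs) for op in expr[1:])


def fold_single_use_temps(stmts, is_temp):
    """stmts is a list of (target, expr); expr is a name or (op, operand...)."""
    counts = {}
    for _, expr in stmts:
        count_uses(expr, counts)
    defs = {}
    out = []
    for target, expr in stmts:
        expr = substitute(expr, defs)
        if is_temp(target) and counts.get(target, 0) == 1:
            defs[target] = expr        # dropped here, re-inserted at its use
        else:
            out.append((target, expr))
    return out


stmts = [
    ("T.1", ("+", "a", "b")),
    ("T.2", ("*", "T.1", "c")),
    ("x", ("-", "T.2", "d")),
]
print(fold_single_use_temps(stmts, lambda name: name.startswith("T.")))
```

The three statements collapse into a single assignment to x with a nested expression, which is why the result is no longer GIMPLE.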
> >
> > If live ranges are disjoint for the same variable, we ought to be using
> > Vlad's code which splits disjoint live ranges on registers. He has it,
> > but hasn't submitted it for mainline because it wasn't doing enough good.
>
> webizer splits the disjoint live ranges too. That is what I did.
> It is, however, unnecessary to do so since we can do it for free while
> going out of SSA.
> Of course in case we end up with SSA, I will just propose a patch to run
> a limited webizer when optimizing early in the queue.
I think register allocation is the right place to be splitting disjoint
live ranges. I've worked on other compilers where the backend *tries* to
do exactly what gimplification does, provide a canonical definition for
a register. (In fact, it had a separate pass to re-canonicalize the code
part way through). Then the register allocator sorts it out to make sure
there are no artificial conflicts. I don't see why other optimizations
should be confused by this behaviour... I presume we do get confused
somewhere tho or you wouldn't have brought it up :-). Where do we get
confused?
> > So you would assign every SSA_NAME variable a unique register, and
> > generate rtl directly? You would still have to attempt to coalesce any
>
> Kind of. The out-of-SSA pass works by first deciding how to match
> SSA_NAMEs to variables and then doing the rewriting.
> We can do the second part easily without rewriting during expansion
> time and in addition we can manage things in a way so we don't need to
> invent new DECL_NODEs for temporaries, saving some overhead.
> Not sure.
Well, SSA->normal introduces few new VAR_DECL nodes. Only when 2 SSA
versions overlap do we get a new one, and it's pretty good at producing
a minimal number of them. It will try to coalesce as many as possible
with the original variable. If we need a new variable, it tries to
coalesce as many of the remaining versions as possible with the new
variable. That usually gets them all when there is overlap.
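
As a rough picture of that coalescing order, here is a small Python sketch. It is a simplification under stated assumptions, not the actual pass: live ranges are modelled as half-open `(start, end)` intervals instead of real liveness on the CFG, and `coalesce_versions` and the `a.1` naming for new variables are invented for the example.

```python
def overlaps(a, b):
    """True if two half-open (start, end) intervals intersect."""
    return a[0] < b[1] and b[0] < a[1]


def coalesce_versions(var, ranges):
    """Map each SSA version to the original variable or to a new one.

    ranges maps a version name to its (start, end) live interval.
    """
    partitions = [[]]   # partitions[0] stands for the original variable
    assignment = {}
    for version, rng in ranges.items():
        # Prefer the original variable, then any variable created so far.
        for index, members in enumerate(partitions):
            if not any(overlaps(rng, ranges[m]) for m in members):
                break
        else:
            # Overlaps something in every partition: need a new variable.
            partitions.append([])
            index = len(partitions) - 1
            members = partitions[index]
        members.append(version)
        assignment[version] = var if index == 0 else f"{var}.{index}"
    return assignment


ranges = {"a_1": (0, 4), "a_2": (2, 6), "a_3": (4, 8), "a_4": (6, 9)}
print(coalesce_versions("a", ranges))
```

In this example four versions end up sharing two variables: a_1 and a_3 go back to a, while a_2 and a_4 share the one new variable.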
And yes, mapping SSA_NAMEs to registers wouldn't be difficult in
expansion, as long as SSA->normal makes sure that all the unsafe stuff
has been taken care of already. I need to think about the ramifications
of that more. There are some nice properties to it on the surface at
least.
>
> We can insert the copies in RTL form already, so we save some extra
> trees.
Yeah, but I don't think that's causing our high-water memory marks either.
We probably ought to be doing a garbage collection after we generate RTL
for each function tho. Maybe :-)
> >
> > Anyway, we'll see what kind of results I get with the temporary
> > elimination in SSA->normal that I'm testing now. It's not a lot of code,
> > and is performed during the SSA->normal rewrite phase.
>
> OK, let's wait with this until the code is more stabilized and
> benchmarked. Then we will likely know how much of a burden the de-SSA
> pass is.
I had a run last night I wasn't too pleased with (bzip was much *slower*
for some reason), but my compiler was a little bit broken. I'm doing more
runs today. I just ran crafty on my 1.8 GHz x86, and it drops from 3.52
minutes to 3.34 minutes. That was the testcase that triggered me
looking into this.
Anyway, more results when I get them.
Andrew