[tree-ssa] New regressions as of 2003-11-04
Jan Hubicka
jh@suse.cz
Wed Nov 5 15:38:00 GMT 2003
> On Wed, 2003-11-05 at 09:58, Jan Hubicka wrote:
> > > On Wed, 2003-11-05 at 04:20, Jan Hubicka wrote:
> > > > > On Tue, 2003-11-04 at 16:17, law@redhat.com wrote:
>
> > > remove_useless_stmts_and_vars_cond could keep the cfg updated if it
> > > wanted to. It didn't before because it was thrown away immediately
> > > afterwards, so what was the point. Now perhaps there is a point.
> >
> > We can do it that way, but the function should be mostly NOOP now.
> > cleanup_cfg+dead code removal should've caught all cases previously.
>
> I think it rips out VAR_DECLS which aren't used too from the variable
> lists. Not sure what else it does now.
I think with Zdenek's change it doesn't do that anymore. We can do this
safely as it doesn't modify the CFG at all, so I won't object to this.
>
>
> > > That's not going to change the number of temporaries. None of the copies
> > > inserted create new temporaries unless there is a cycle, and you can't
> > > avoid creating one in that case.
> >
> > The point is that we *want* to create temporary registers, so unless
> > there is a good reason to do so, we don't want a single register to be
> > set twice. That confuses RTL.
> >
>
> Setting a single register more than once confuses RTL? I'm not sure I
> follow. That would imply RTL is SSA :-)
It is not SSA, but many optimizers do expect it.
When a register is set more than once, they simply give up (lacking
DU/UD chains). For local optimizers this works.
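
To make the "give up" behaviour concrete, here is a minimal Python sketch
(not GCC code). It assumes a toy instruction format of (dest, sources)
tuples; the function name single_set_registers is hypothetical. A pass
without DU/UD chains can only reason about registers that have exactly one
definition, so it filters everything else out:

```python
from collections import Counter

def single_set_registers(insns):
    """Return the registers that are set exactly once.

    insns is a list of (dest, sources) tuples, a toy stand-in for RTL.
    A simple optimizer without DU/UD chains would only touch these
    registers and skip any register that is set more than once.
    """
    counts = Counter(dest for dest, _sources in insns)
    return {reg for reg, n in counts.items() if n == 1}

# r1 is reused for two unrelated values, so it is not a candidate.
example = [
    ("r1", ("a", "b")),
    ("r2", ("r1",)),
    ("r1", ("c",)),
    ("r3", ("r1", "r2")),
]
print(sorted(single_set_registers(example)))
```

In this toy example only r2 and r3 survive the filter; the reuse of r1
hides both of its values from such a pass.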
>
> The register allocator ought to be smart enough to tell when two
> disjoint live ranges use the same register, and rename one of the
It is not.
Webizer can do it, but we do it only at -O3 because it messes up debug
info. We can even get worse when doing this in a dumb way, as sometimes
we are smart to re-use a register on 2-address machines. OK, don't ask me
> registers to something else to allow them to prevent artificial
> interferences. That's pretty standard register allocation machinery. Vlad
> had this code a while ago, but it made something in SPEC worse so he
> never followed up on it.
Webizer makes about 0.5% difference on mainline, over 1.5% difference on
tree-SSA, so we obviously made this worse by unnecessary re-use.
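
For readers unfamiliar with webs, the idea can be sketched as follows. This
is a minimal Python illustration, not the webizer itself: it assumes the
reaching definitions for each use are already known, and the name
build_webs and the input format are hypothetical. Definitions that reach a
common use are merged with union-find; each resulting group is a web that
could get its own pseudo register.

```python
def build_webs(defs, uses):
    """Group definitions of a register into webs.

    defs: list of (def_id, reg) pairs.
    uses: list of (reg, reaching_def_ids) pairs, where reaching_def_ids
          lists the definitions that may reach that use.
    Returns a sorted list of webs, each a sorted list of def_ids.
    """
    parent = {def_id: def_id for def_id, _reg in defs}

    def find(x):
        # Find the representative, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # All definitions reaching the same use must share one register.
    for _reg, reaching in uses:
        for def_id in reaching[1:]:
            union(reaching[0], def_id)

    webs = {}
    for def_id, _reg in defs:
        webs.setdefault(find(def_id), []).append(def_id)
    return sorted(sorted(web) for web in webs.values())

# Three definitions of r1: def 0 feeds one use, defs 1 and 2 meet at another.
defs = [(0, "r1"), (1, "r1"), (2, "r1")]
uses = [("r1", [0]), ("r1", [1, 2])]
print(build_webs(defs, uses))
```

Here definition 0 forms its own web and can be renamed to a fresh pseudo,
while definitions 1 and 2 must stay together because they reach the same
use.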
> > You mean that it will re-insert the expression back into the form
> > (basically undoing the gimplification?)
> > Yes, that would help too.
>
> Yes, the code coming out of SSA will no longer be GIMPLE.
That is cool. I guess it should bring RTL expansion mostly back into
shape :)
>
> > >
> > > If live ranges are disjoint for the same variable, we ought to be using
> > > Vlad's code which splits disjoint live range on registers. He has it,
> > > but hasn't submitted it for mainline because it wasn't doing enough good.
> >
> > webizer splits the disjoint live ranges too. That is what I did.
> > It is, however, unnecessary to do so since we can do it for free while
> > going out of SSA.
> > Of course in case we end up with SSA, I will just propose a patch to run
> > limited webizer when optimizing early in the queue.
>
> I think register allocation is the right place to be splitting disjoint
> live ranges. I've worked on other compilers where the backend *tries* to
Not for GCC, as all the stupid local passes benefit from it greatly too.
At the moment I do the splitting (webizer) after loop unrolling (because
it introduces such re-use) and try to maintain the constraint of
not inventing re-use up to reg-alloc.
This is pretty easy as re-use never did RTL any good.
> do exactly what gimplification does, provide a canonical definition for
> a register. (In fact, it had a separate pass to re-canonicalize the code
> part way through). Then the register allocator sorts it out to make sure
> there are no artificial conflicts. I don't see why other optimizations
> should be confused by this behaviour... I presume we do get confused
> somewhere tho or you wouldn't have brought it up :-). Where do we get
> confused?
You expect a sane compiler, not GCC :)
> Well, SSA->normal introduces few new VAR_DECL nodes. Only when 2 SSA
> versions overlap do we get a new one, and it's pretty good at producing
> a minimal number of them. It will try to coalesce as many as possible
This is the problem. We should introduce new VAR_DECLs when the live
ranges do not overlap, to get as many different pseudos as possible
without introducing register copies, to make the RTL backend happy.
One of the reasons why your idea to re-combine gimple nodes works is that
you hide these re-used temporaries and we reinvent them during expansion
in the way RTL expects.
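
As a rough illustration of what I mean, here is a Python sketch (again not
GCC code) restricted to straight-line code, where every redefinition starts
a new, non-overlapping live range. The (dest, sources) format, the function
name rename_redefinitions and the "name.N" naming scheme are all
assumptions made for the example, and the sketch assumes no input name
already has that form. Each redefinition gets a fresh name and later uses
are rewritten, so no copies are needed:

```python
def rename_redefinitions(insns):
    """Give every redefinition of a variable a fresh name.

    insns is a list of (dest, sources) tuples in straight-line order.
    The first definition keeps its name; the n-th redefinition of `t`
    becomes `t.n`. Uses are rewritten to the name that currently holds
    the value, so no copy instructions are introduced.
    """
    current = {}  # original name -> name currently holding its value
    version = {}  # original name -> number of definitions seen so far
    out = []
    for dest, sources in insns:
        # Rewrite sources first: they refer to the value before this def.
        new_sources = tuple(current.get(s, s) for s in sources)
        n = version.get(dest, 0)
        version[dest] = n + 1
        new_dest = dest if n == 0 else f"{dest}.{n}"
        current[dest] = new_dest
        out.append((new_dest, new_sources))
    return out

# The temporary t is reused for two unrelated values.
code = [("t", ("a", "b")), ("x", ("t",)), ("t", ("c",)), ("y", ("t",))]
for insn in rename_redefinitions(code):
    print(insn)
```

After renaming, the second value lives in t.1, so each name is set exactly
once. With control flow the same effect needs the live range information
that SSA->normal already has at hand.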
> with the original variable. If we need a new variable, it tries to
> coalesce as many of the remaining versions as possible with the new
> variable. That usually gets them all when there is overlap.
>
> And yes, mapping SSA_NAMEs to registers wouldn't be difficult in
> expansion, as long as SSA->normal makes sure that all the unsafe stuff
> has been taken care of already. I need to think about the ramifications
> of that more. There are some nice properties to it on the surface at
> least.
>
> I had a run last night I wasn't too pleased with (bzip was much *slower*
> for some reason), but my compiler was a little bit broken. I'm doing more
> runs today. I just ran crafty on my 1.8 GHz x86, and it drops from 3.52
> minutes to 3.34 minutes. That was the testcase that triggered me
> looking into this.
I would say it makes sense to wait a month or two for the whole thing to
settle down somewhat. There are too many changes on the way to expect
benchmark results to be useful right now.
At least from our side, Zdenek has important changes to the tree-SSA
instruction chain representation and I have new expansion code. We
hope to push this out by the end of the week, but there is a lot of weird
breakage around the branch delaying this.
Honza
>
> Anyway, more results when I get them.
>
> Andrew
>