This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: GCSE/PRE problem
Hello,
> There was a discussion on a problem in GCSE/PRE, I should have posted this
> in gcc at gcc dot gnu dot org in the first place.
> I am posting all the previous discussion now along with my response.
>
> PREVIOUS DISCUSSION
> > > I am trying to use this pass to implement an optimization of redundant
> > > loads due to stores.
> > > I gave it several tries and found out that we have a problem due to the
> way
> > > global variables are accessed - in such an access there will be one add
> and
> > > two loads. Thus in order to eliminate the load GCSE must eliminate the
> add
> > > in the first try , then the first load and finally the last load - a
> total
> > > of 3 GCSE passes.
> > > Bottom line , we need at least three iterations of "one_pre_gcse_pass".
> > > I made some tries to see if the GCSE is capable of doing so in general
> and
> > > wrote the below example.
> > > It has a simple addition and multiplication expressions that are
> dependent;
> > > GCC couldn't make the second pass elimination as it should (according
> to my
> > > understanding) , it didn't remove the multiplication.
>
> > the main reason (*) is that gcc's gcse is too stupid to handle
> > multiplication. The reason for this is that multiplication instruction
> > on x86 clobbers flags and we are unable to handle it (see
> > want_to_gcse_p).
>
> >
> > Honza Hubicka has a patch for this on cfg branch (he planned to push it
> > to mainline but probably had too much other work).
>
> We've recently chattend about these patches with Richard. I would like
> to re/continue pushing them to manline but they need trorough testing as
> the concept is pretty different.
> It is top on my TODO, but recently I don=t even have time to look into
> my TODO..
>
> Honza
> >
> > Zdenek
> >
> > (*) There is one more minor issue I've found while examining your
> > problem that I think also might also cause problems. Try the cfg branch
> > first, if it won't help I will have a look at that.
> >
> > > I have also looked at the RTL dump after gcse to confirm this.
> > > Is this is a known problem? Am I doing some things wrong or
> misunderstood
> > > what PRE should do?
> > >
> > > I have compiled with gcc -O3 -S -dG --param max-gcse-passes=3
> > >
> > > The code:
> > > void g (int a , int b , int c);
> > > void f (int x , int y , int z)
> > > {
> > > int a = 0, b = 0 , c= 0,t1,t2,t3;
> > > if (z > 0 )
> > > {
> > > t1 = x + y;
> > > a = t1 * z;
> > > }
> > > else
> > > {
> > > t2 = x + y;
> > > b = t2 *z;
> > >
> > > }
> > >
> > > t3 = x + y; /* Redundant , should be removed by GCSE first try*/
> > > c = t3 * z; /* Redundant , should be removed by GCSE second try*/
> > > g(a,b,c);
> > > }
> > > Mustafa
>
> MY RESPONSE:
>
> I have tested the above example on a PowerPC machine, which doesn't have
> this clobber problem, and I still have the same result.
> Moreover, I have additional example with load and addition (without any
> multiplication), which also suffer from the same problem.
> I have continued looking at the GCSE code, and found out that the problem
> is due to the fact that the "reaching pseudo register" that holds the
> redundant add value has different number from the one that holds the
> addition value in the other two additions.
> Thus, the redundant multiplication is not recognized as the same expression
> as the other two multiplications.
> In order to be sure that this is indeed the problem, I took the early
> coalesce patch (from cgf branch cfg-coalesce.c) and called coalesce()
> after each iteration of one_pre_gcse_pass. And it did removed the redundant
> multiplication!
> What happened is that the coalesce() rearranged the pseudo register usage
> and put the addition result in one pseudo. Now the GCSE recognizes the
> three multiplications as the same expression.
yes, this is the additional problem I have spoken about. In fact the
solution is easier -- in pre_insert_copy_insn we replace
x = something
with
x = something
temp = x.
Instead we should do
temp = something
x = temp
and copy propagation would take care of the rest.
Zdenek