This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Run copy-prop before VRP to increase VRP optimization opportunities


While investigating the feasibility of removing the VRP bits from
DOM I discovered that tree-vrp.c regularly misses optimization
opportunities when copies are present in the IL. 

Here's a trivial little example (greatly simplified from
alias.c::nonlocal_mentioned_p:

SSA form after inserting ASSERT_EXPRs
nonlocal_mentioned_p (x)
{
  unsigned int D.1246;
  int D.1245;
  int D.1244;

<bb 0>:
  D.1244_2 = x_1->code;
  D.1245_3 = D.1244_2 - 6;
  D.1246_4 = (unsigned int) D.1245_3;
  if (D.1246_4 <= 1) goto <L0>; else goto <L3>;

<L0>:;
  D.1244_5 = D.1244_2;
  if (D.1244_5 == 7) goto <L1>; else goto <L3>;

<L1>:;
  D.1244_6 = D.1244_2;
  if (D.1244_6 != 7) goto <L2>; else goto <L3>;

<L2>:;
  abort ();

<L3>:;
  return;

}


The above fragment is taken from the .vrp dump.  Look carefully at the
conditionals after L0 and L1.  It's not terribly difficult to see
that the conditional after <L1> is totally unnecessary.  However, it
will not be removed by VRP.

I can make an educated guess that because the SSA_NAMEs in the
conditionals have no other use we never insert any ASSERT_EXPRs
for those SSA_NAMEs and we never make a real attempt to eliminate
the second conditional.  Basically the copies hide the fact that we
have a set of objects which are equivalent and a use of one member
in the set should be viewed as using the set's representative member.


Unfortunately, the code to build equivalency sets runs after we have
already inserted our ASSERT_EXPRs.  So we can't just use that code to
handle this case for us.

Luckily we can run copy-propagation to just before VRP to clean this
stuff up.

If we take that approach we'll have the following VRP dump after
inserting ASSERT_EXPRs:

nonlocal_mentioned_p (x)
{
  unsigned int D.1246;
  int D.1245;
  int D.1244;

<bb 0>:
  D.1244_2 = x_1->code;
  D.1245_3 = D.1244_2 - 6;
  D.1246_4 = (unsigned int) D.1245_3;
  if (D.1246_4 <= 1) goto <L0>; else goto <L3>;

<L0>:;
  D.1244_5 = D.1244_2;
  if (D.1244_2 == 7) goto <L1>; else goto <L3>;

<L1>:;
  D.1244_8 = ASSERT_EXPR <D.1244_2, D.1244_2 == 7>;
  D.1244_6 = D.1244_8;
  if (D.1244_8 != 7) goto <L2>; else goto <L3>;

<L2>:;
  abort ();

<L3>:;
  return;

}



Now we have the ASSERT_EXPR we want after <L1> and we will
have recorded the proper equivalences to allow us to optimize
away the last conditional.


While DOM would trivially catch this case, it seems to me that it would
be better to catch it in VRP.  After all, this is precisely the kind of
thing we want VRP to optimize for us.

Across my set of old cc1 .i files, we increase the number of
conditionals which VRP can evaluate at compile time from ~6500
to just over 7000 and we get a reduction in the number of 
conditionals that DOM evaluates at compile time.


The simplest approach is to just add another copy-prop pass before VRP;
however, that approach will result in a compile-time hit on the order 
of about 1%.  Not good.

Another approach would be to move the existing copy-prop pass which runs
after VRP to just before VRP.  That ought to be compile-time neutral or
slightly beneficial.  Unfortunately, this approach will leave in copies
introduced by VRP when it eliminates ASSERT_EXPRs.  Those copies get in
the way of later optimizers and ultimately cause regressions (failures
to optimize in particular).

There is a third approach (and the one I think works best).  First we
move the copy-prop pass which currently runs after VRP so that it runs
just before VRP.  We then get more intelligent about eliminating 
ASSERT_EXPRs -- it's pretty trivial to eliminate those copies as they're
created rather than run the full copy-propagation pass.    This approach
gets us the benefit of removing more conditionals in VRP, actually
*improves* compile time slightly and doesn't leave copies in the IL
stream to confuse later optimizers.

Bootstrapped and regression tested on i686-pc-linux-gnu.




















Attachment: PPP
Description: Text document


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]