This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Improve VRP assert creation for loops


On 11/06/13 10:13, Jakub Jelinek wrote:
On Tue, Nov 05, 2013 at 02:00:16PM -0800, Cong Hou wrote:
I'm also curious -- did this code show up in a particular benchmark, if so,
which one?

I didn't find this problem from any benchmark, but from another
concern about loop upper bound estimation. Look at the following code:

int foo(unsigned int n, int r)
{
   int i;
   if (n > 0)
     if (n < 4)
     {
       do {
          --n;
          r *= 2;
       } while (n > 0);
     }
   return r+n;
}


In order to get the upper bound of the loop in this function, GCC
traverses conditions n<4 and n>0 separately and tries to get any
useful information. But as those two conditions cannot be combined
into one due to this issue (note that n>0 will be transformed into
n!=0), when GCC sees n<4, it will consider the possibility that n may
be equal to 0, in which case the upper bound is UINT_MAX. If those two
conditions can be combined into one, which is n-1<=2, then we can get
the correct upper bound of the loop.

I've looked at the above testcase to see why we aren't able to determine
the number of iterations upper bound properly here.

That doesn't mean your patch is useless, though I must say I'm far from
being convinced it is safe ATM and also it looks like quite ugly special
case (trying to undo a VRP optimization but only one single specific case).

The first problem is VRP, we end up with:
   <bb 5>:
   # n_1 = PHI <n_5(D)(4), n_7(6)>
   # r_3 = PHI <r_6(D)(4), r_8(6)>
   # RANGE [0, 4294967295]
   n_7 = n_1 + 4294967295;
   # RANGE [-2147483648, 2147483647] NONZERO 0x000000000fffffffe
   r_8 = r_3 * 2;
   if (n_7 != 0)
     goto <bb 6>;
   else
     goto <bb 7>;

   <bb 6>:
   goto <bb 5>;
- missing range on n_1, extremely conservative range on n_7.  With the
attached patch we get instead:
   <bb 5>:
   # RANGE [1, 3] NONZERO 0x00000000000000003
   # n_1 = PHI <n_5(D)(4), n_7(6)>
   # r_3 = PHI <r_6(D)(4), r_8(6)>
   # RANGE [0, 2] NONZERO 0x00000000000000003
   n_7 = n_1 + 4294967295;
   # RANGE [-2147483648, 2147483647] NONZERO 0x000000000fffffffe
   r_8 = r_3 * 2;
   if (n_7 != 0)
     goto <bb 6>;
   else
     goto <bb 7>;

   <bb 6>:
   goto <bb 5>;

The issue is that we use live_on_edge to determine if it is desirable
to added ASSERT_EXPRs, but as we walk bbs in RPO order, we first enter
the loop through the bb with exit edge and thus live of the latch isn't
computed (and, generally the propagation for live ignores dfs back edges
anyway), and because in the above live_on_edge ((5)->(6), n_7) is false,
we don't add ASSERT_EXPR that n_7 is != 0 in the latch bb, so during
iteration we start with n_1 being assumed [1, 3] (that is range of the
assertion from the preceeding conditions on n_5(D)), but in the next
iteration widen it to [0, 3] because we think n_7 can be [0, 2] in the
PHI and thus end up with uselessly wide range because we think the
subtraction can wrap around.  This patch improves live_on_edge for
empty latches, by just looking at the PHIs on loop->header from the
latch -> header edge and noting which SSA_NAMEs are used there.

I had to add -fno-tree-vrp to 4 unroll_*.c tests, because they disable
various tree optimizations already and want to test unrolling of loops
iterating exactly twice, but with this VRP change VRP is smart enough
to replace the PHI argument on the i IV from
-  # i_13 = PHI <i_8(3), 0(2)>
+  # i_13 = PHI <1(3), 0(2)>
(the loop iterates exactly twice) and RTL unrolling doesn't do the
tested thing in that case.  But, this should affect only loops that roll
exactly twice and those realy should be unrolled already far before, so IMHO
it isn't something to try to optimize better at the RTL level.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2013-11-06  Jakub Jelinek  <jakub@redhat.com>

	* tree-vrp.c (live_on_edge): If e->dest is an empty latch
	of some loop, but live[e->dest->index] is not computed, look
	for SSA_NAMEs used in loop header PHIs from the latch edge.

	* gcc.dg/unroll_1.c: Add -fno-tree-vrp to dg-options.
	* gcc.dg/unroll_2.c: Likewise.
	* gcc.dg/unroll_3.c: Likewise.
	* gcc.dg/unroll_4.c: Likewise.
OK with a testcase verifying we get the tighter ranges.

I wonder a bit how often this situation occurs in practice (unnecessarily wide range due to missing an ASSERT_EXPR on the latch edge.

jeff


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]