[PATCH] Improve VRP assert creation for loops

Jeff Law law@redhat.com
Wed Nov 6 22:10:00 GMT 2013


On 11/06/13 10:13, Jakub Jelinek wrote:
> On Tue, Nov 05, 2013 at 02:00:16PM -0800, Cong Hou wrote:
>>> I'm also curious -- did this code show up in a particular benchmark, if so,
>>> which one?
>>
>> I didn't find this problem from any benchmark, but from another
>> concern about loop upper bound estimation. Look at the following code:
>>
>> int foo(unsigned int n, int r)
>> {
>>    int i;
>>    if (n > 0)
>>      if (n < 4)
>>      {
>>        do {
>>           --n;
>>           r *= 2;
>>        } while (n > 0);
>>      }
>>    return r+n;
>> }
>>
>>
>> In order to get the upper bound of the loop in this function, GCC
>> traverses conditions n<4 and n>0 separately and tries to get any
>> useful information. But as those two conditions cannot be combined
>> into one due to this issue (note that n>0 will be transformed into
>> n!=0), when GCC sees n<4, it will consider the possibility that n may
>> be equal to 0, in which case the upper bound is UINT_MAX. If those two
>> conditions can be combined into one, which is n-1<=2, then we can get
>> the correct upper bound of the loop.
>
> I've looked at the above testcase to see why we aren't able to determine
> the number of iterations upper bound properly here.
>
> That doesn't mean your patch is useless, though I must say I'm far from
> being convinced it is safe ATM and also it looks like quite ugly special
> case (trying to undo a VRP optimization but only one single specific case).
>
> The first problem is VRP, we end up with:
>    <bb 5>:
>    # n_1 = PHI <n_5(D)(4), n_7(6)>
>    # r_3 = PHI <r_6(D)(4), r_8(6)>
>    # RANGE [0, 4294967295]
>    n_7 = n_1 + 4294967295;
>    # RANGE [-2147483648, 2147483647] NONZERO 0x000000000fffffffe
>    r_8 = r_3 * 2;
>    if (n_7 != 0)
>      goto <bb 6>;
>    else
>      goto <bb 7>;
>
>    <bb 6>:
>    goto <bb 5>;
> - missing range on n_1, extremely conservative range on n_7.  With the
> attached patch we get instead:
>    <bb 5>:
>    # RANGE [1, 3] NONZERO 0x00000000000000003
>    # n_1 = PHI <n_5(D)(4), n_7(6)>
>    # r_3 = PHI <r_6(D)(4), r_8(6)>
>    # RANGE [0, 2] NONZERO 0x00000000000000003
>    n_7 = n_1 + 4294967295;
>    # RANGE [-2147483648, 2147483647] NONZERO 0x000000000fffffffe
>    r_8 = r_3 * 2;
>    if (n_7 != 0)
>      goto <bb 6>;
>    else
>      goto <bb 7>;
>
>    <bb 6>:
>    goto <bb 5>;
>
> The issue is that we use live_on_edge to determine if it is desirable
> to added ASSERT_EXPRs, but as we walk bbs in RPO order, we first enter
> the loop through the bb with exit edge and thus live of the latch isn't
> computed (and, generally the propagation for live ignores dfs back edges
> anyway), and because in the above live_on_edge ((5)->(6), n_7) is false,
> we don't add ASSERT_EXPR that n_7 is != 0 in the latch bb, so during
> iteration we start with n_1 being assumed [1, 3] (that is range of the
> assertion from the preceeding conditions on n_5(D)), but in the next
> iteration widen it to [0, 3] because we think n_7 can be [0, 2] in the
> PHI and thus end up with uselessly wide range because we think the
> subtraction can wrap around.  This patch improves live_on_edge for
> empty latches, by just looking at the PHIs on loop->header from the
> latch -> header edge and noting which SSA_NAMEs are used there.
>
> I had to add -fno-tree-vrp to 4 unroll_*.c tests, because they disable
> various tree optimizations already and want to test unrolling of loops
> iterating exactly twice, but with this VRP change VRP is smart enough
> to replace the PHI argument on the i IV from
> -  # i_13 = PHI <i_8(3), 0(2)>
> +  # i_13 = PHI <1(3), 0(2)>
> (the loop iterates exactly twice) and RTL unrolling doesn't do the
> tested thing in that case.  But, this should affect only loops that roll
> exactly twice and those realy should be unrolled already far before, so IMHO
> it isn't something to try to optimize better at the RTL level.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2013-11-06  Jakub Jelinek  <jakub@redhat.com>
>
> 	* tree-vrp.c (live_on_edge): If e->dest is an empty latch
> 	of some loop, but live[e->dest->index] is not computed, look
> 	for SSA_NAMEs used in loop header PHIs from the latch edge.
>
> 	* gcc.dg/unroll_1.c: Add -fno-tree-vrp to dg-options.
> 	* gcc.dg/unroll_2.c: Likewise.
> 	* gcc.dg/unroll_3.c: Likewise.
> 	* gcc.dg/unroll_4.c: Likewise.
OK with a testcase verifying we get the tighter ranges.

I wonder a bit how often this situation occurs in practice 
(unnecessarily wide range due to missing an ASSERT_EXPR on the latch edge.

jeff



More information about the Gcc-patches mailing list