This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [PATCH]Enable elimination of IV use with unsigned type candidate

From: "Bin.Cheng" <amker dot cheng at gmail dot com>
To: Richard Biener <richard dot guenther at gmail dot com>
Cc: Bin Cheng <bin dot cheng at arm dot com>, Zdenek Dvorak <ook at ucw dot cz>, GCC Patches <gcc-patches at gcc dot gnu dot org>
Date: Tue, 1 Jul 2014 10:32:42 +0100
Subject: Re: [PATCH]Enable elimination of IV use with unsigned type candidate
Authentication-results: sourceware.org; auth=none
References: <000001cf8ec8$68c07340$3a4159c0$ at arm dot com> <CAFiYyc18W5R7-CkHBeRi4aiQuu5i=MgenBPL6r64kJ75fQDd0w at mail dot gmail dot com>

Sorry for this late reply, I spent some time in understanding the problem.

On Tue, Jun 24, 2014 at 12:36 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Mon, Jun 23, 2014 at 11:49 AM, Bin Cheng <bin.cheng@arm.com> wrote:

>> expressions.  It's possible to have iv_elimination_compare_lt to do some
>> undo transformation on may_be_zero, but I found it's difficult for cases
>> involving signed/unsigned conversion like case loop-41.c.  Since I think
>> there is no obvious benefit to fold may_be_zero here (somehow because the
>> two operands are already in folded forms), this patch just calls build2_loc
>> instead.
>
> But it may fold to true/false, no?
You are right, it can be folded to false in many cases.  Thus I
managed to check specific folded forms of may_be_zero, as in attached
patch.  So far it works for tests I added, but there are some other
folded forms not handled.

>

>>
>> When GCC trying to eliminate use 0 with cand 0, the miscellaneous trees in
>> iv_elimination_compare_lt are like below with i_1 of signed type:
>> B:             i_1 + 1
>> A:             0
>> niter->niter:  (unsigned int)i_1
>>
>> Apparently, (B-A-1) is "i_1", which doesn't equal to "(unsigned int)i_1".
>> Without this patch, it is considered equal to each other.
>
> just looking at this part.  Do you have a testcase that exhibits a difference
> when just applying patch A?  So I can have a look here?
>
> From the code in iv_elimination_compare_lt I can't figure why we'd
> end up with i_1 instead of (unsigned int)i_1 as we convert to a common
> type.
>
> I suppose the issue may be that at tree_to_aff_combination time
> we strip all nops with STRIP_NOPS but when operating on ->rest
> via convert/scale or add we do not strip them again.
>
> But then 'nit' should be i_1, not (unsigned int)i_1.  So the analysis above
> really doesn't look correct.
>
> Just to make sure we don't paper over an issue in tree-affine.c.
>
> Thus - testcase?  On x86 we don't run into this place in
> iv_elimination_compare_lt (on an unpatched tree).
>
> CCing Zdenek for the meat of patch B.

I am more and more convinced that check of overflow in function
iv_elimination_compare_lt is not sufficient.  Considering example
given by comments just before the function:
/* Tries to replace loop exit by one formulated in terms of a LT_EXPR
   comparison with CAND.  NITER describes the number of iterations of
   the loops.  If successful, the comparison in COMP_P is altered accordingly.

   We aim to handle the following situation:

   sometype *base, *p;
   int a, b, i;

   i = a;
   p = p_0 = base + a;

   do
     {
       bla (*p);
       p++;
       i++;
     }
   while (i < b);

   Here, the number of iterations of the loop is (a + 1 > b) ? 0 : b - a - 1.
   We aim to optimize this to

   p = p_0 = base + a;
   do
     {
       bla (*p);
       p++;
     }
   while (p < p_0 - a + b);

   This preserves the correctness, since the pointer arithmetics does not
   overflow.  More precisely:

   1) if a + 1 <= b, then p_0 - a + b is the final value of p, hence there is no
      overflow in computing it or the values of p.
   2) if a + 1 > b, then we need to verify that the expression p_0 - a does not
      overflow.  To prove this, we use the fact that p_0 = base + a.  */

The example seems wrong to me, because the statement only stands for
unsigned case like ivopts-lt.c:

#include "stdint.h"

void
f1 (char *p, uintptr_t i, uintptr_t n)
{
  p += i;
  do
    {
      *p = '\0';
      p += 1;
      i++;
    }
  while (i < n);
}

If i/n are of signed type, the 2nd point for "a+1>b" scenario in the
comment isn't right, because of below facts:
a) niter of loop are calculated in unsigned type as in function
number_of_iterations_lt.  So loop information for this case is like:

Analyzing # of iterations of loop 1
  exit condition [i_4(D) + 1, + , 1](no_overflow) < n_12(D)
  bounds on difference of bases: -4294967295 ... 4294967294
  result:
    zero if i_4(D) >= n_12(D)
    # of iterations ((unsigned int) n_12(D) - (unsigned int) i_4(D)) -
1, bounded by 4294967294
  number of iterations ((unsigned int) n_12(D) - (unsigned int)
i_4(D)) - 1; zero if i_4(D) >= n_12(D)

b) The (unsigned int)n_12(D) could overflow, the check in
iv_elimination_compare_lt doesn't take this fact into consideration.

Tricky thing is GCC now doesn't eliminate "i < n" with point P for
signed version of case.  The reason behind is again the may_be_zero
information.  The original form of may_be_zero is like "i_4(D) + 1>
n_12(D)", which is folded into "i_4(D) >= n_12(D)" because both
i_4/n_12 are of signed type and don't wrap.  In the end, function
iv_elimination_compare_lt only handles the original form of
may_be_zero.
So below case is the closest one to illustrate the problem:

#include "stdint.h"

void
f1 (char *p, int i, int n)
{
  p += i;
  do
    {
      *p = '\0';
      p += 1;
      i++;
    }
  while (i < n);
}

char arr[100] = "hello\n";

int main(void)
{
  f1 (arr, 1, -1);

  return 0;
}

If we build the original form of may_be_zero with change:
Index: gcc/tree-ssa-loop-niter.c
===================================================================
--- gcc/tree-ssa-loop-niter.c   (revision 211966)
+++ gcc/tree-ssa-loop-niter.c   (working copy)
@@ -1153,8 +1153,13 @@
         condition.  */

       if (mpz_sgn (bnds->below) < 0)
+#if 0
        niter->may_be_zero = fold_build2 (LT_EXPR, boolean_type_node,
                                          iv1->base, iv0->base);
+#else
+       niter->may_be_zero = build2_loc (UNKNOWN_LOCATION, LT_EXPR,
boolean_type_node,
+                                         iv1->base, iv0->base);
+#endif
       niter->niter = delta;
       niter->max = widest_int::from (wi::from_mpz (niter_type,
bnds->up, false),
                                     TYPE_SIGN (niter_type));

The case will be mis-compiled.

One the other handle, iv_elimination_compare_lt for loops with
may_be_zero can be further improved to handle cases other than pointer
candidate.  My original patch is intended to handle unsigned type
candidates, but there are other cases.

The problem here is a little complicated, especially if we want to
improve iv elimination, I may mis-read the code somehow.  So any idea
about this?

Thanks,
bin

Attachment: condition-iv_use-elimination-20140701-b.txt
Description: Text document

Follow-Ups:
- Re: [PATCH]Enable elimination of IV use with unsigned type candidate
  - From: Bin.Cheng
- Re: [PATCH]Enable elimination of IV use with unsigned type candidate
  - From: Bin.Cheng

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]