This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH 0/3] Fix PR47654 and PR49649


On 07/07/2011 08:07 PM, Sebastian Pop wrote:
Hi,

Hi Sebastian,


Sorry for taking so long. Here are some comments from me:

First there are two cleanup patches independent of the fix:

   Start counting nesting level from 0.
   Do not compute twice type, lb, and ub.

Then the patch that fixes PR47654:

Fix PR47654: Compute LB and UB of a CLAST expression.

One of the reasons we cannot determine the IV type only from the
polyhedral representation is that as in the testcase of PR47654, we
are asked to generate an induction variable going from 0 to 127.  That
could be represented with a "char".  However the upper bound
expression of the loop generated by CLOOG is min (127, 51*scat_1 + 50)
and that would overflow if we use a "char" type.  To evaluate a type
in which the expression 51*scat_1 + 50 does not overflow, we have to
compute an upper and lower bound for the expression.

To fix the problem exposed by Tobias:

for (i = 0; i < 2; i++)
  for (j = i; j < i + 1; j++)
    for (k = j; k < j + 1; k++)
      for (m = k; m < k + 1; m++)
        for (n = m; n < m + 1; n++)
          A[0] += A[n];

I am a little bit afraid that we will increase the type size by an
order of magnitude (or at least one bit) for each nesting level.

So instead of computing the lb and ub of scat_1 in "51*scat_1 + 50" based on the type of scat_1 (for which we already generated code when building the outer loop), we use the polyhedral representation to get an accurate lb and ub for scat_1.

I think this part is OK.
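Just to double-check the arithmetic on my side (the concrete value of scat_1 below is made up, it only serves as an illustration): even small values of scat_1 push 51*scat_1 + 50 out of the range of a signed char, so evaluating the bound in "char" really cannot work:

#include <limits.h>
#include <stdio.h>

int main (void)
{
  /* The induction variable itself only takes the values 0..127, which
     would fit into a signed char ...  */
  int iv_min = 0, iv_max = 127;

  /* ... but the upper bound expression min (127, 51*scat_1 + 50) does
     not: already for scat_1 = 2 (an arbitrary example value) the
     intermediate value exceeds SCHAR_MAX.  */
  int scat_1 = 2;
  int bound = 51 * scat_1 + 50;

  printf ("%d..%d fits a char, but %d > %d does not\n",
          iv_min, iv_max, bound, (int) SCHAR_MAX);
  return 0;
}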


With the new cloog-isl you may get even more information (and consequently smaller types) if you do not only use domain + scattering, but intersect it with the subset of the scattering space that the loop enumerates (this information is available in cloog-isl).

Another way to get smaller types is to only look at the maximal/minimal value each constraint can take. Take an upper bound ub = min(5+3*n-8*m+33*i, 5*i), which has two constraints: 5+3*n-8*m+33*i and 5*i. Here we do not need to ensure that every single intermediate result during the calculation of 5+3*n-8*m+33*i fits into its type. Overflows during addition and multiplication do not affect the final result, as long as the overflow follows modulo semantics and the type is large enough to store the final value. Only for calculations like min/max/div/mod do we need to ensure that every intermediate value fits in the range of the used type.
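As a small illustration of the modulo-semantics argument (the values of n, m and i below are made up): when every addition and multiplication wraps around in a narrow unsigned type, the intermediate overflows cancel out and the final value is still correct, as long as that final value itself fits:

#include <stdio.h>

int main (void)
{
  /* Evaluate 5 + 3*n - 8*m + 33*i step by step in an 8-bit wrapping
     type.  The parameter values are made up; the only point is that
     the final result (18) fits into the narrow type.  */
  int n = 100, m = 40, i = 1;

  unsigned char t = 5;
  t = (unsigned char) (t + 3 * n);    /* 305 wraps to 49   */
  t = (unsigned char) (t - 8 * m);    /* -271 wraps to 241 */
  t = (unsigned char) (t + 33 * i);   /* 274 wraps to 18   */

  /* The wide computation gives the same final value, because addition
     and multiplication are computed modulo 2^8 in the narrow type.  */
  int wide = 5 + 3 * n - 8 * m + 33 * i;   /* 18 */

  printf ("%d == %d\n", (int) t, wide);

  /* This does not hold for min/max/div/mod: min (300, 200) is 200, but
     in the 8-bit type 300 wraps to 44 and min (44, 200) is 44, which is
     simply wrong.  There every intermediate value must fit.  */
  return 0;
}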


When translating the substitutions of a user statement using this
precise method, like for example S5 in:

   for (scat_1=0;scat_1<=min(T_3-1,T_4-1);scat_1++) {
     S5(scat_1);

we get a type that is too precise: based on the interval [0,99] we get
the type "unsigned char" when the type of scat_1 is "int", misleading
the vectorizer due to the insertion of spurious casts:

Here I am a little lost. You write that using the precise method you get an interval [0,99] for the example in vect-pr43423.c. This is the example:


int a[100], b[100], c[100];

void foo(int n, int mid)
{
  int i;
  for(i=0; i<n; i++)
    {
      if (i < mid)
        a[i] = a[i] + b[i];
      else
        a[i] = a[i] + c[i];
    }
}

I do not see how you would derive this small type, as the parameters of function foo() allow a much larger range.

So let's assume the value of n is known to be limited to 100 in the context. In that case you may be able to derive such a precise type.
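For illustration, a (purely hypothetical) variant of the testcase in which such a bound would be visible; the guard is not in vect-pr43423.c, it only shows where a [0,99] interval for i could come from:

int a[100], b[100], c[100];

void foo (int n, int mid)
{
  int i;

  /* Hypothetical guard: with it, the context knows n <= 100, so the
     loop counter i stays in [0, 99] and would fit an unsigned char.  */
  if (n > 100)
    return;

  for (i = 0; i < n; i++)
    {
      if (i < mid)
        a[i] = a[i] + b[i];
      else
        a[i] = a[i] + c[i];
    }
}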

#  Access function 0: (int) {(<unnamed-unsigned:8>) graphite_IV.7_56, +, 1}_3;
#)
affine dependence test not usable: access function not affine or constant.

So we have to keep around the previous code gcc_type_for_clast_* that
computes the type of an expression as the max precision of the
components of that expression, and use that when computing the types
of substitution expressions.

Here I am lost again. I understand that very small types may introduce more casts (and casts are always difficult to analyze). However, I do not understand your solution.


First of all, it seems the gcc_type_for_clast_* functions are always needed to calculate the types, as just using polyhedral methods would not give the larger types that may be needed for intermediate results. Which functions in your patch (or in the current graphite code) are only needed to ensure that no problematic casts are inserted? Knowing that would probably help me understand how you ensure no problematic casts are generated.
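To make sure we are talking about the same thing, this is how I picture the "max precision of the components" rule (my own sketch, not the code from your patch or from graphite): each operation may need more bits than its operands, so the precision grows with the depth of the expression:

#include <stdio.h>

/* Sketch only: the precision needed so that no intermediate result of
   the operation can overflow, given the precisions of the operands.  */

static int
precision_for_add (int prec_a, int prec_b)
{
  /* a + b can need one bit more than the wider operand.  */
  return (prec_a > prec_b ? prec_a : prec_b) + 1;
}

static int
precision_for_mul (int prec_a, int prec_b)
{
  /* a * b can need the sum of both precisions.  */
  return prec_a + prec_b;
}

int main (void)
{
  /* 51 * scat_1 + 50 with an 8-bit scat_1: the multiplication may need
     6 + 8 = 14 bits and the addition 15 bits, so an 8-bit type is not
     enough even though the loop bounds themselves would fit into it.  */
  int prec_51 = 6;                    /* 51 < 2^6 */
  int prec_scat_1 = 8;
  int prec_mul = precision_for_mul (prec_51, prec_scat_1);
  int prec_add = precision_for_add (prec_mul, 6 /* 50 < 2^6 */);

  printf ("51*scat_1 + 50 needs up to %d bits\n", prec_add);
  return 0;
}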

Cheers
Tobi

P.S.: I would love to see this upper/lower bound calculation being part of CLooG.

