Bug 31178 - VRP can infer a range for b in a >> b and a << b
Summary: VRP can infer a range for b in a >> b and a << b
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 4.3.0
: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: easyhack, missed-optimization
Depends on: 30317
Blocks: VRP
  Show dependency treegraph
 
Reported: 2007-03-14 21:12 UTC by Richard Biener
Modified: 2022-04-06 13:16 UTC (History)
6 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2021-09-06 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Richard Biener 2007-03-14 21:12:52 UTC
namely [0, n) where n is the width of type a.  (Or better type a's mode to be safe).  Look at infer_value_range().
Comment 1 Richard Biener 2007-03-15 14:43:36 UTC
This is blocked by the same fact as PR30317 in that ASSERT_EXPRs can only
assert half-ranges or single valued ranges.  This makes the following prototype
not assert [0, prec) but [-INF, prec) instead :(

Index: tree-vrp.c
===================================================================
*** tree-vrp.c  (revision 122953)
--- tree-vrp.c  (working copy)
*************** infer_value_range (tree stmt, tree op, e
*** 3113,3118 ****
--- 3113,3134 ----
        }
      }
  
+   /* We can assume that the shift argument of a left or right shift
+      is within zero and the type precision of the shifted operand.  */
+   if (TREE_CODE (stmt) == GIMPLE_MODIFY_STMT
+       && (TREE_CODE (GIMPLE_STMT_OPERAND (stmt, 1)) == RSHIFT_EXPR
+         || TREE_CODE (GIMPLE_STMT_OPERAND (stmt, 1)) == LSHIFT_EXPR)
+       && TREE_OPERAND (GIMPLE_STMT_OPERAND (stmt, 1), 1) == op)
+     {
+       tree lop = TREE_OPERAND (GIMPLE_STMT_OPERAND (stmt, 1), 0);
+       tree lop_type = TREE_TYPE (lop);
+ 
+       /* Unfortunately we only can infer half-ranges here.  */
+       *val_p = build_int_cst (lop_type, TYPE_PRECISION (lop_type) - 1);
+       *comp_code_p = LE_EXPR;
+       return true;
+     }
+ 
    return false;
  }
  
Comment 2 Steven Bosscher 2009-02-08 14:20:55 UTC
No news since almost two years ago.  Is this still a problem?
Comment 3 Richard Biener 2009-02-08 14:47:07 UTC
No, it's now possible to implement this optimization (but yes, nobody has
done so sofar).  It's on my TODO (with tons of other stuff, of course).

As this is an easy task for beginners ... whoever feels like doing it, I'll
review the result.
Comment 4 Richard Biener 2013-02-11 07:48:21 UTC
*** Bug 56281 has been marked as a duplicate of this bug. ***
Comment 6 Eric Gallager 2018-06-28 02:24:32 UTC
(In reply to Richard Biener from comment #3)
> No, it's now possible to implement this optimization (but yes, nobody has
> done so sofar).  It's on my TODO (with tons of other stuff, of course).
> 

Is that still the case?

> As this is an easy task for beginners ... whoever feels like doing it, I'll
> review the result.

Adding "easyhack" keyword then
Comment 7 Eric Gallager 2018-09-30 01:51:33 UTC
(In reply to Eric Gallager from comment #6)
> (In reply to Richard Biener from comment #3)
> > No, it's now possible to implement this optimization (but yes, nobody has
> > done so sofar).  It's on my TODO (with tons of other stuff, of course).
> > 
> 
> Is that still the case?
> 

Guess not.
Comment 8 Andrew Macleod 2022-03-22 14:21:22 UTC
Is this always true?

I implemented this for GCC13 in the new side-effect code, and its causing problems in fortran.

In particular  gfortran.dg/check_bits_1.f90 goes into an infinite loop. I am seeing things like

nb = bit_size (i)
do shift = 0, nb
     k = shiftl (i, shift) ! Fortran 2008
     i = shiftr (k, shift)

So it appears to do shifts on [0, 32] rather than [0, 31], and when I go look for fortran 2008 info, i find:

"Its value must be non-negative, and less than or equal to BIT_SIZE(I)"

so it seem [0, N] for fortran rather than [0, N)?   Are there other language issues as well?
Comment 9 Andrew Macleod 2022-03-22 17:03:05 UTC
Seems like it may expose a problem in gcc.target/i386/sse2-v1ti-shift-3.c as well:

 for (i=0; i<128; i++) {
<...>
    if ((ti)rotr_v1ti(ut,i) != (ti)rotr_ti(x,i))
      __builtin_abort();
    if ((ti)rotl_v1ti(ut,i) != (ti)rotl_ti(x,i))

And those are defined:

uv1ti rotr_v1ti(uv1ti x, unsigned int i) { return (x >> i) | (x << (128-i)); }
uv1ti rotl_v1ti(uv1ti x, unsigned int i) { return (x << i) | (x >> (128-i)); }

so when i is 0, they can perform a shift of 128 on a 128 bit object.
Comment 10 rguenther@suse.de 2022-03-23 08:08:48 UTC
On Tue, 22 Mar 2022, amacleod at redhat dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31178
> 
> Andrew Macleod <amacleod at redhat dot com> changed:
> 
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>                  CC|                            |amacleod at redhat dot com
> 
> --- Comment #8 from Andrew Macleod <amacleod at redhat dot com> ---
> Is this always true?
> 
> I implemented this for GCC13 in the new side-effect code, and its causing
> problems in fortran.
> 
> In particular  gfortran.dg/check_bits_1.f90 goes into an infinite loop. I am
> seeing things like
> 
> nb = bit_size (i)
> do shift = 0, nb
>      k = shiftl (i, shift) ! Fortran 2008
>      i = shiftr (k, shift)
> 
> So it appears to do shifts on [0, 32] rather than [0, 31], and when I go look
> for fortran 2008 info, i find:
> 
> "Its value must be non-negative, and less than or equal to BIT_SIZE(I)"
> 
> so it seem [0, N] for fortran rather than [0, N)?   Are there other language
> issues as well?

Interesting.  For GIMPLE we of course have to find common ground which
then means to allow N as well.
Comment 11 Jakub Jelinek 2022-03-23 08:26:26 UTC
I think we rely on [0, N) in middle-end and in backends in so many places that it probably would be better to change the Fortran FE to emit for shift{l,a,r} (a, b)
something like __builtin_expect (b == N, 0) ? something : a {<<,>>} b.
Comment 12 Jakub Jelinek 2022-03-23 08:31:38 UTC
(In reply to Andrew Macleod from comment #9)
> Seems like it may expose a problem in gcc.target/i386/sse2-v1ti-shift-3.c as
> well:
> 
>  for (i=0; i<128; i++) {
> <...>
>     if ((ti)rotr_v1ti(ut,i) != (ti)rotr_ti(x,i))
>       __builtin_abort();
>     if ((ti)rotl_v1ti(ut,i) != (ti)rotl_ti(x,i))
> 
> And those are defined:
> 
> uv1ti rotr_v1ti(uv1ti x, unsigned int i) { return (x >> i) | (x << (128-i));
> }
> uv1ti rotl_v1ti(uv1ti x, unsigned int i) { return (x << i) | (x >> (128-i));
> }
> 
> so when i is 0, they can perform a shift of 128 on a 128 bit object.

That just means they should be fixed.
As documented above simplify_rotate in tree-ssa-forwprop.cc, we pattern match a lot of forms and many of those are safe for any rotate count.
If only 0..127 is supposed to be valid for i, then e.g.
   (x << i) | (x >> ((-i) & 127))
will do.
Comment 13 Andrew Macleod 2022-04-05 17:58:07 UTC
huh, so even after fixing the testcase, ranger is still tripping over this test case.

uv1ti ashl_v1ti (uv1ti x, unsigned int i)
{
  uv1ti _3;

  <bb 2> :
  _3 = x_1(D) << i_2(D);
  return _3;


we have an ssa_name for x_1 with a type of uvlti.

This passes the INTEGRAL_TYPE_P test and an ssa_name is created, but is it actually a vector_type.  This seems to be OK, but being unaware of this, and having a type which passes the INTEGRAL_TYPE_P (type),  I was using TYPE_PRECISION (type) to find the upper bounds for i_2.

It appears that if VECTOR_TYPE_P (type) is true, then TYPE_PRECISION (type) is not a valid request? but it silently returns 0 and happily moves on.

As near as I can tell, I am suppose to ask for:
TYPE_PRECISION (TREE_TYPE (type)) when its a VECTOR_TYPE?

IS there a good reason we don't fill in the TYPE_PRECISION field?   Or if its not suppose to be used, then can we trap on it if its passed a vector type?  It seems like the sort of thing that is easy to trip over.  Are their other bits which make VECTOR_TYPE incompatible with scalar INTGERAL_TYPEs that should not be queried?
Comment 14 Jakub Jelinek 2022-04-05 18:06:00 UTC
INTEGRAL_TYPE_P is only true for scalar integral types (integer, enum and boolean).
Do you mean ANY_INTEGRAL_TYPE_P instead which is true also for integral vector and complex types?
Anyway, one can use element_precision (type) to query the precision for scalar types and precision of each element (for vector and complex types).
Comment 15 Andrew Macleod 2022-04-05 18:24:47 UTC
no... we won't process ranges for anything unless it passes supports_type_p ():

 (INTEGRAL_TYPE_P (type) || POINTER_TYPE_P (type)))

oh oh oh.  
Never mind. I see.  we are generating a range for i_2, and I'm now using the type of op1...  we probably are NOT generating ranges for x_1...  but I am looking at its type in this case.
Doh!  my bad.
I shall use element_precision.  problem solved. Thanks
Comment 16 Richard Biener 2022-04-06 06:00:14 UTC
(In reply to Andrew Macleod from comment #15)
> no... we won't process ranges for anything unless it passes supports_type_p
> ():
> 
>  (INTEGRAL_TYPE_P (type) || POINTER_TYPE_P (type)))
> 
> oh oh oh.  
> Never mind. I see.  we are generating a range for i_2, and I'm now using the
> type of op1...  we probably are NOT generating ranges for x_1...  but I am
> looking at its type in this case.
> Doh!  my bad.
> I shall use element_precision.  problem solved. Thanks

Note for shifts the precision of the shift operand does not have to match that of the shifted operand.  In your case you have vector << scalar, so you definitely want to look at scalars precision when deriving a range for scalar, _not_ at
element_precision of vector (or the LHS)!
Comment 17 Vincent Lefèvre 2022-04-06 08:16:00 UTC
(In reply to Richard Biener from comment #16)
> Note for shifts the precision of the shift operand does not have to match
> that of the shifted operand.  In your case you have vector << scalar, so you
> definitely want to look at scalars precision when deriving a range for
> scalar, _not_ at element_precision of vector (or the LHS)!

I'm not sure I understand what you mean, but the C standard says:

  The integer promotions are performed on each of the operands. The type
  of the result is that of the promoted left operand. If the value of the
  right operand is negative or is greater than or equal to the width of
  the promoted left operand, the behavior is undefined.

So, what matters is the type of the promoted *left* operand (corresponding to vector above).
Comment 18 rguenther@suse.de 2022-04-06 08:22:14 UTC
On Wed, 6 Apr 2022, vincent-gcc at vinc17 dot net wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31178
> 
> --- Comment #17 from Vincent Lefèvre <vincent-gcc at vinc17 dot net> ---
> (In reply to Richard Biener from comment #16)
> > Note for shifts the precision of the shift operand does not have to match
> > that of the shifted operand.  In your case you have vector << scalar, so you
> > definitely want to look at scalars precision when deriving a range for
> > scalar, _not_ at element_precision of vector (or the LHS)!
> 
> I'm not sure I understand what you mean, but the C standard says:
> 
>   The integer promotions are performed on each of the operands. The type
>   of the result is that of the promoted left operand. If the value of the
>   right operand is negative or is greater than or equal to the width of
>   the promoted left operand, the behavior is undefined.
> 
> So, what matters is the type of the promoted *left* operand (corresponding to
> vector above).

Sure, if that's what the precision is used for.  The message from Andrew
sounded like 'I want the precision for the shift operand but let me
just use that of the shifted anyway'
Comment 19 Vincent Lefèvre 2022-04-06 08:45:19 UTC
(In reply to rguenther@suse.de from comment #18)
> Sure, if that's what the precision is used for.  The message from Andrew
> sounded like 'I want the precision for the shift operand but let me
> just use that of the shifted anyway'

Andrew should clarify. From what I understand, he does not want the precision for the shift operand (right operand), but an upper bound (this is what this bug is about). And this upper bound is deduced from the precision of the shifted element (promoted left operand).
Comment 20 Andrew Macleod 2022-04-06 13:10:10 UTC
That is correct.

              tree op1_type = TREE_TYPE (gimple_assign_rhs1 (s));
              tree op2_type = TREE_TYPE (gimple_assign_rhs2 (s));
              tree l = build_int_cst (op2_type, 0);
              // C is [0, N), but fortran is [0, N], so default to [0, N].
              tree u = build_int_cst (op2_type, element_precision (op1_type));
              int_range_max shift (l, u);
              add_range (gimple_assign_rhs2 (s), shift);

I build a range for the RHS shift-by operand (op2_type) from 0 to the precision of the left operand (op1_type)...    THats all we were using op1 for. I was just under the misimpression that op1 would have a TYPE_PRECISION, not realizing vectors could be there and they don't set that field.

I believe that to be correct?
Comment 21 Jakub Jelinek 2022-04-06 13:13:38 UTC
Yes, though I think we should fix the Fortran FE so that it only relies on [0, N) .  If the shift count is constant, it can deal with it at compile time, if it is variable, can emit a COND_EXPR for the shift_count == N case.
Comment 22 rguenther@suse.de 2022-04-06 13:16:27 UTC
On Wed, 6 Apr 2022, amacleod at redhat dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31178
> 
> --- Comment #20 from Andrew Macleod <amacleod at redhat dot com> ---
> That is correct.
> 
>               tree op1_type = TREE_TYPE (gimple_assign_rhs1 (s));
>               tree op2_type = TREE_TYPE (gimple_assign_rhs2 (s));
>               tree l = build_int_cst (op2_type, 0);
>               // C is [0, N), but fortran is [0, N], so default to [0, N].
>               tree u = build_int_cst (op2_type, element_precision (op1_type));
>               int_range_max shift (l, u);
>               add_range (gimple_assign_rhs2 (s), shift);
> 
> I build a range for the RHS shift-by operand (op2_type) from 0 to the precision
> of the left operand (op1_type)...    THats all we were using op1 for. I was
> just under the misimpression that op1 would have a TYPE_PRECISION, not
> realizing vectors could be there and they don't set that field.
> 
> I believe that to be correct?

Yes.  For vector types TYPE_PRECISION is overloaded and specifies
log2 of the number of vector elements instead.