Bug 93781 - Optimizer produces suboptimal code related to -ftree-vrp
Summary: Optimizer produces suboptimal code related to -ftree-vrp
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 9.2.0
: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks: VRP
  Show dependency treegraph
 
Reported: 2020-02-17 09:41 UTC by vfdff
Modified: 2021-07-13 13:45 UTC (History)
5 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2020-02-17 00:00:00


Attachments
patch base on gcc 7.3, additional for 1st testcases (1.10 KB, patch)
2020-03-10 11:13 UTC, vfdff
Details | Diff
patch to discard invalid ranges (1.85 KB, patch)
2020-11-19 22:45 UTC, Andrew Macleod
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description vfdff 2020-02-17 09:41:54 UTC
For the following code, we can known the value C000003FE is always less then 5, so the return value should be true.
test base on the x86-64 gcc 9.2 on https://gcc.godbolt.org/, so get complicated assemble.

unsigned int foo (unsigned int arg)
{
  int C00000400 = arg - 3;
  unsigned int C000003FE = 4;
  int C000003FF = 0x1 << arg;

  if (C00000400 < 0)
     C000003FE = C000003FF;

  return C000003FE < 5;
}
Comment 1 Andrew Pinski 2020-02-17 09:59:14 UTC
Adding assert for C00000400_5 from C00000400_5 < 0
Adding assert for _1 from _1 + 2147483648 <= 2147483647

But _1 is defined as:
  _1 = arg_4(D) + 4294967293;

and arg_4 is used in the BB were we would have added the above asserts.

The second assert comes from:
      /* Add asserts for NAME cmp CST and NAME being defined
         as NAME = (int) NAME2.  */

We need one more for NAME cmp CST, and NAME = (int)NAME2 and NAME2 = NAME3 + CST.

The question becomes how recusive/adding extra cases to do we want to add to register_edge_assert_for_2.
Comment 2 vfdff 2020-03-09 03:57:02 UTC
I test a more simple testcase, and find the arg_5(D) already get the expected range, but the _2 = 1 << arg_9 is unexpected.

unsigned int foo (unsigned int arg)
{
  unsigned int C000003FE = 4;

  if (arg + 1 < 4)  // work of for if (arg < 3)
     C000003FE = 0x1 << arg;

  return C000003FE < 5;
}

Adding assert for arg_5(D) from arg_5(D) + 1

Visiting statement:
arg_9 = ASSERT_EXPR <arg_5(D), arg_5(D) + 1 <= 3>;
Intersecting
  ~[3, 4294967294]  EQUIVALENCES: { arg_5(D) } (1 elements)
and
  VARYING
to
  ~[3, 4294967294]  EQUIVALENCES: { arg_5(D) } (1 elements)
Found new range for arg_9: ~[3, 4294967294]
marking stmt to be not simulated again

Visiting statement:
_2 = 1 << arg_9;
Meeting  ------ should be Intersecting ?
  [1, 4]
and
  VARYING
to
  VARYING
Found new range for _2: VARYING  ---------> so, _2 excepted range [1, 4] ?
Comment 3 Jakub Jelinek 2020-03-09 11:20:05 UTC
For the second testcase, I think we could in undefined_shift_range_check or its 2 callers intersect the value range of the last [lr]shift operand with the range of [0, prec-1], and if that yields empty range, do what we use right now, but if it narrows the op2 range to something smaller, use that in the range_operator::fold_range call instead of op2.
Comment 4 vfdff 2020-03-10 07:51:34 UTC
according your prompt, I test it base on gcc 7.3, and the second testcase works.

--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -3301,6 +3301,18 @@ extract_range_from_binary_expr (value_range *vr,
   else
     set_value_range_to_varying (&vr1);
 
+  /* the [lr]shift operand with the range of [0, prec-1]  */
+  if (code == LSHIFT_EXPR || code == RSHIFT_EXPR)
+    {
+      value_range shiftp = VR_INITIALIZER;
+      int prec = TYPE_PRECISION (expr_type);
+      shiftp.type = VR_RANGE;
+      shiftp.min = integer_zero_node;
+      shiftp.max = build_int_cst (expr_type, prec - 1);
+
+      vrp_intersect_ranges (&vr1, &shiftp);
+     }
+

========================================
Visiting statement:
_2 = 1 << arg_9;
Intersecting
  ~[3, 4294967294]  EQUIVALENCES: { arg_5(D) } (1 elements)
and
  [0, 31]
to
  [0, 2]  EQUIVALENCES: { arg_5(D) } (1 elements)
Found new range for _2: [1, 4]
marking stmt to be not simulated again
Comment 5 vfdff 2020-03-10 11:13:52 UTC
Created attachment 48008 [details]
patch base on gcc 7.3, additional for 1st testcases
Comment 6 Jakub Jelinek 2020-03-10 11:20:24 UTC
Note, 7.x is not supported anymore.  If this is going to be changed, it would be for GCC 11, different source file, different routine...
Comment 7 GCC Commits 2020-11-19 22:44:02 UTC
The master branch has been updated by Andrew Macleod <amacleod@gcc.gnu.org>:

https://gcc.gnu.org/g:d0d8b5d83614d8f0d0e40c0520d4f40ffa01f8d9

commit r11-5185-gd0d8b5d83614d8f0d0e40c0520d4f40ffa01f8d9
Author: Andrew MacLeod <amacleod@redhat.com>
Date:   Thu Nov 19 17:41:30 2020 -0500

    Process only valid shift ranges.
    
    When shifting outside the valid range of [0, precision-1], we can
    choose to process just the valid ones since the rest is undefined.
    this allows us to produce results for x << [0,2][+INF, +INF] by discarding
    the invalid ranges and processing just [0,2].
    
            gcc/
            PR tree-optimization/93781
            * range-op.cc (get_shift_range): Rename from
            undefined_shift_range_check and now return valid shift ranges.
            (operator_lshift::fold_range): Use result from get_shift_range.
            (operator_rshift::fold_range): Ditto.
            gcc/testsuite/
            * gcc.dg/tree-ssa/pr93781-1.c: New.
            * gcc.dg/tree-ssa/pr93781-2.c: New.
            * gcc.dg/tree-ssa/pr93781-3.c: New.
Comment 8 Andrew Macleod 2020-11-19 22:45:12 UTC
Created attachment 49601 [details]
patch to discard invalid ranges

We can do as Jakub suggests and mask out invalid ranges, processing only whats left.

We can get the first test case if we tweak it into the third one I have added.

The first one will require the next release of ranger:

    <bb 2> :
    _1 = arg_4(D) + 4294967293;
    a_5 = (int) _1;
    x_7 = 1 << arg_4(D);
    if (a_5 < 0)
      goto <bb 3>; [INV]
    else
      goto <bb 4>; [INV]

2->3  (T) _1 :  unsigned int [2147483648, +INF]
2->3  (T) arg_4(D) :    unsigned int [0, 2][2147483651, +INF]
2->3  (T) a_5 :         int [-INF, -1]
2->4  (F) _1 :  unsigned int [0, 2147483647]
2->4  (F) arg_4(D) :    unsigned int [3, 2147483650]
2->4  (F) a_5 :         int [0, +INF]

=========== BB 3 ============
x_7     int VARYING
    <bb 3> :
    b_8 = (unsigned int) x_7;

Ranger currently doesn't mark ssa-names which are not in the calculation chain of the outgoing edge..   x_7 has not direct effect on the edge, but rather it happens to use a value which later changes on the edge.

These secondary effect re-calculations are slated for the next release of ranger.  I moved the x_7 into the following block in the third testcase, which demonstrates that the value can be recalculated properly with the available information.
Comment 9 GCC Commits 2021-07-13 13:43:44 UTC
The master branch has been updated by Andrew Macleod <amacleod@gcc.gnu.org>:

https://gcc.gnu.org/g:f75560398af6f1f696c820016f437af4e8b4265c

commit r12-2284-gf75560398af6f1f696c820016f437af4e8b4265c
Author: Andrew MacLeod <amacleod@redhat.com>
Date:   Tue Jul 13 09:41:30 2021 -0400

    Adjust testcase to test the call is removed.
    
    Ranger now handles the test.
    
            gcc/testsuite
            PR tree-optimization/93781
            * gcc.dg/tree-ssa/pr93781-1.c: Check that call is removed.
Comment 10 Andrew Macleod 2021-07-13 13:45:58 UTC
Recomputation now resolves all the tests in the series.