Bug 97515 - [11 Regression] ICE: verify_gimple failed (error: type mismatch in 'rshift_expr') since r11-3685-gfcae5121154d1c33
Summary: [11 Regression] ICE: verify_gimple failed (error: type mismatch in 'rshift_ex...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 11.0
: P1 normal
Target Milestone: 11.0
Assignee: Not yet assigned to anyone
URL:
Keywords: ice-on-valid-code
Depends on:
Blocks:
 
Reported: 2020-10-21 11:20 UTC by Arseny Solokha
Modified: 2020-11-20 16:09 UTC (History)
4 users (show)

See Also:
Host:
Target:
Build:
Known to work: 10.2.0
Known to fail: 11.0
Last reconfirmed: 2020-10-21 00:00:00


Attachments
check for undefined before not returning a constant value (496 bytes, patch)
2020-10-21 19:40 UTC, Andrew Macleod
Details | Diff
tweak testcase (538 bytes, patch)
2020-11-20 16:09 UTC, Andrew Macleod
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Arseny Solokha 2020-10-21 11:20:17 UTC
g++-11.0.0-alpha20201018 snapshot (g:1e70b1a358b6ce3b894f284d88fbb90518d45cc0) ICEs when compiling the following testcase w/ -O1 -ftree-vrp:

int
e7 (int gg)
{
  int xe = 0;

  while (xe < 1)
    {
      int ui;

      ui = ~xe;
      if (ui == 0)
        ui = xe >> gg;

      xe %= !ui;
    }

  return xe;
}

% gcc-11.0.0 -O1 -ftree-vrp -c kg5m6kjq.c
kg5m6kjq.c: In function 'e7':
kg5m6kjq.c:18:1: error: type mismatch in 'rshift_expr'
   18 | }
      | ^
int
<<< error >>>
int
ui_10 = _3 >> gg_9(D);
during GIMPLE pass: evrp
kg5m6kjq.c:18:1: internal compiler error: verify_gimple failed
0xe04f1a verify_gimple_in_cfg(function*, bool)
	/var/tmp/portage/sys-devel/gcc-11.0.0_alpha20201018/work/gcc-11-20201018/gcc/tree-cfg.c:5482
0xcdb72f execute_function_todo
	/var/tmp/portage/sys-devel/gcc-11.0.0_alpha20201018/work/gcc-11-20201018/gcc/passes.c:1992
0xcdc56c do_per_function
	/var/tmp/portage/sys-devel/gcc-11.0.0_alpha20201018/work/gcc-11-20201018/gcc/passes.c:1640
0xcdc56c execute_todo
	/var/tmp/portage/sys-devel/gcc-11.0.0_alpha20201018/work/gcc-11-20201018/gcc/passes.c:2046
Comment 1 Martin Liška 2020-10-21 12:58:10 UTC
Confirmed, since r11-3685-gfcae5121154d1c33.
Comment 2 Andrew Macleod 2020-10-21 19:40:15 UTC
Created attachment 49417 [details]
check for undefined before not returning a constant value

The ranger deals with UNDEFINED slightly differently than traditional
EVRP, which is exposed in this PR.

An UNDEFINED value can be calculated in unreachable code. ie
bb5:
a_3 = 5
if (a_3 < 0)
  goto bb6
else
  goto bb9

bb6:
  z_8 = a_3

the edge 5->6 cannot be taken, Calculating the range for a_3 on this edge
is calculated by taking the outgoing edge range of [-INF, -1] and intersecting it with the known range of  [5, 5].  Which is [] or undefined.  The value has no meaning on that edge since the edge can't be taken.

The substitute_and_fold engine is driven by replacing values with their 
constant equivalent.  It does so only as a statement is visited.

When it gets to z_8 = a_3, the ranger was returning a range of [] AKA undefined instead of [5,5] so the use of a_3 was not being replaced by subst_and_fold in this block.

This results in a reference to a_3 being left in the IL.. and chaos ensues.

The simple solution to this is to adjust the way value_of() is calculated on ranges from the ranger.  If the resulting range is UNDEFINED from a query, do a quick check of the global value and if its a constant, return that.

This has been a lagging issue that pops up from time to time which is nicely demonstrated in this PR.


This patch is undergoing testing currently
Comment 3 GCC Commits 2020-10-22 00:03:39 UTC
The master branch has been updated by Andrew Macleod <amacleod@gcc.gnu.org>:

https://gcc.gnu.org/g:ca5f4666f7a9404cdb04832324de3dd7d71e35c3

commit r11-4198-gca5f4666f7a9404cdb04832324de3dd7d71e35c3
Author: Andrew MacLeod <amacleod@redhat.com>
Date:   Wed Oct 21 19:55:28 2020 -0400

    Check for undefined before not returning a constant value
    
    Don't return UNDEFINED for a range in an unreachable block if the global
    value evaluates to a constant.  Return that constant instead.
    
            PR tree-optimization/97515
            * value-query.cc (range_query::value_of_expr): If the result is
            UNDEFINED, check to see if the global value is a constant.
            (range_query::value_on_edge): Ditto.
Comment 4 Andrew Macleod 2020-10-22 00:26:06 UTC
Furthermore, this PR is also a good example of a case where we want to inject updated values into the Ranger's iterator.

<bb 2> :
  goto <bb 6>; [INV]

  <bb 3> :
  ui_8 = ~xe_3;
  if (ui_8 == 0)
    goto <bb 4>; [INV]
  else
    goto <bb 5>; [INV]

  <bb 4> :
  ui_10 = xe_3 >> gg_9(D);

  <bb 5> :
  # ui_4 = PHI <ui_8(3), ui_10(4)>
  _1 = ui_4 == 0;
  _2 = (int) _1;
  xe_11 = xe_3 % _2;

  <bb 6> :
  # xe_3 = PHI <0(2), xe_11(5)>
  if (xe_3 <= 0)
    goto <bb 3>; [INV]
  else
    goto <bb 7>; [INV]

The initial evaluation of
  # xe_3 = PHI <0(2), xe_11(5)>
causes The backedge 5->6  to evaluate xe_11, which is dependant on the value of xe_3 (its a cycle).  This requires evaluating values thru the cycle and Ranger calculates that the edge 6->3 causes xe_3 = [-INF, 0]
which feeds into
  ui_8 = ~xe_3;
resulting in a range of [0, +INF] for ui_8

A bunch of other things are figured out, and ultimately it decides that xe_3 evaluates to [0,0].  Very cool, this is good.

lots of things fall out from this, and when we get to folding
  ui_8 = ~xe_3     and xe_3 is now known to be [0,0]
we can fold it into
  ui_8 = -1


THe problem is during the initial backedge calculations, we decided ui_8 was initially [-1, +INF],  and without the iterative injection to indicate we have a new value for ui_8, the ranger continues to use the cached value of [-1, +INF].  Even though we know it is the constant -1 now.

So we're doing better than before because we didn't get any of those values, but it looks funny in the listing:

 63        range_of_stmt (ui_8) at stmt ui_8 = -1;
             TRUE : (63) range_of_stmt (ui_8) int [-1, +INF]

duh.

I originally planned to add the new value injection early in the next stage 1.  It will recognize that the previously calculated value of ui_8 is stale when the thing it is dependent on (xe_3) has a value different than the one which was used to calculate it.

When this is done, we will basically be able to fold most of this function away in EVRP.

The basic mechanism is already in place, its just a matter of figuring out how to efficiently decide when to trigger the re-evaluation.  There may yet be something we can do about this in this stage 1.
Comment 5 GCC Commits 2020-11-04 18:12:20 UTC
The master branch has been updated by Andrew Macleod <amacleod@gcc.gnu.org>:

https://gcc.gnu.org/g:e86fd6a17cdb26710d1f13c9a47a3878c76028f9

commit r11-4724-ge86fd6a17cdb26710d1f13c9a47a3878c76028f9
Author: Andrew MacLeod <amacleod@redhat.com>
Date:   Wed Nov 4 12:59:15 2020 -0500

    Add Ranger temporal cache
    
    Add a timestamp to supplement the global range cache to detect when a value
    may become stale.
    
            gcc/
            PR tree-optimization/97515
            * gimple-range-cache.h (class ranger_cache): New prototypes plus
            temporal cache pointer.
            * gimple-range-cache.cc (struct range_timestamp): New.
            (class temporal_cache): New.
            (temporal_cache::temporal_cache): New.
            (temporal_cache::~temporal_cache): New.
            (temporal_cache::get_timestamp): New.
            (temporal_cache::set_dependency): New.
            (temporal_cache::temporal_value): New.
            (temporal_cache::current_p): New.
            (temporal_cache::set_timestamp): New.
            (temporal_cache::set_always_current): New.
            (ranger_cache::ranger_cache): Allocate the temporal cache.
            (ranger_cache::~ranger_cache): Free temporal cache.
            (ranger_cache::get_non_stale_global_range): New.
            (ranger_cache::set_global_range): Add a timestamp.
            (ranger_cache::register_dependency): New.  Add timestamp dependency.
            * gimple-range.cc (gimple_ranger::range_of_range_op): Add operand
            dependencies.
            (gimple_ranger::range_of_phi): Ditto.
            (gimple_ranger::range_of_stmt): Check if global range is stale, and
            recalculate if so.
            gcc/testsuite/
            * gcc.dg/pr97515.c: Check listing for folding of entire function.
Comment 6 Andrew Macleod 2020-11-04 18:14:18 UTC
With this final patch, we now optimize this function down to 

e7 (int gg)
{
  int ui;
  int xe;
  _Bool _1;
  int _2;

  <bb 2> :

  <bb 3> :
  _1 = 0;
  _2 = (int) _1;
  goto <bb 3>; [INV]

}

That is as good as we're going to do :-)
Comment 7 Andrew Macleod 2020-11-20 16:09:18 UTC
Created attachment 49607 [details]
tweak testcase

The fix for PR 93781 changed the order of some evaluations in the testcase.

the problem is that we have
  [-1,-1] >> VARYING
This use to overflow in the cross product code and just return VARYING.
now that its being masked to
 [-1,-1] >> [0,31]

so in the end, the rshift cross product code decides the result is [-1, -1]  instead of varying.

THis ends up changing the calculations, and we realized certain other parts of the program are unreachable sooner..
we eventually figure out all the same things, but the problem now is that the first statement evaluated:
xe_3 = PHI <0(2), xe_11(5)>
which we now dont determine xe_11 is 0 right off the bat like we use to.


THe final IL does look like:
  <bb 2> :
  goto <bb 6>; [INV]

  <bb 3> :
  ui_8 = ~xe_3;
  if (ui_8 == 0)
    goto <bb 4>; [INV]
  else
    goto <bb 5>; [INV]

  <bb 4> :

  <bb 5> :
  # ui_4 = PHI <ui_8(3), -1(4)>

  <bb 6> :
  # xe_3 = PHI <0(2), 0(5)>
  if (xe_3 <= 0)
    goto <bb 3>; [INV]
  else
    goto <bb 7>; [INV]

  <bb 7> :
  # xe_5 = PHI <xe_3(6)>
  return xe_5;

The problem is we now don't fold away the PHI:
# ui_4 = PHI <ui_8(3), -1(4)>
because although the code looks obvious that ui_8 must be 0 on the edge 3->5, ranger has determined that xe_3 must be 0 feeding into this block, which means ui_8 must be -1, which means the TRUE edge can never be taken, and therefore ui_8 is actually undefined  on this edge.
THe stmt fold routine only looks for constant values in phi arguements, and thus cannot fold away the PHI anymore since there isnt a constant value on the edge.

All the same things will fold away here, but the undefined bit is preventing it from happening within EVRP.
THe next pass of CCP does turn this into:
  <bb 2> [local count: 10631108]:

  <bb 3> [local count: 1073741824]:
  goto <bb 3>; [100.00%]


SO for now at least,  I'll just change the testcase to confirm its down to a single goto by this pass.

Ideally we'd be able to detect the undefined during statement folding and allow it to  match with any other constants.. much like we do with undefined local values.