g++-11.0.0-alpha20201018 snapshot (g:1e70b1a358b6ce3b894f284d88fbb90518d45cc0) ICEs when compiling the following testcase w/ -O1 -ftree-vrp: int e7 (int gg) { int xe = 0; while (xe < 1) { int ui; ui = ~xe; if (ui == 0) ui = xe >> gg; xe %= !ui; } return xe; } % gcc-11.0.0 -O1 -ftree-vrp -c kg5m6kjq.c kg5m6kjq.c: In function 'e7': kg5m6kjq.c:18:1: error: type mismatch in 'rshift_expr' 18 | } | ^ int <<< error >>> int ui_10 = _3 >> gg_9(D); during GIMPLE pass: evrp kg5m6kjq.c:18:1: internal compiler error: verify_gimple failed 0xe04f1a verify_gimple_in_cfg(function*, bool) /var/tmp/portage/sys-devel/gcc-11.0.0_alpha20201018/work/gcc-11-20201018/gcc/tree-cfg.c:5482 0xcdb72f execute_function_todo /var/tmp/portage/sys-devel/gcc-11.0.0_alpha20201018/work/gcc-11-20201018/gcc/passes.c:1992 0xcdc56c do_per_function /var/tmp/portage/sys-devel/gcc-11.0.0_alpha20201018/work/gcc-11-20201018/gcc/passes.c:1640 0xcdc56c execute_todo /var/tmp/portage/sys-devel/gcc-11.0.0_alpha20201018/work/gcc-11-20201018/gcc/passes.c:2046
Confirmed, since r11-3685-gfcae5121154d1c33.
Created attachment 49417 [details] check for undefined before not returning a constant value The ranger deals with UNDEFINED slightly differently than traditional EVRP, which is exposed in this PR. An UNDEFINED value can be calculated in unreachable code. ie bb5: a_3 = 5 if (a_3 < 0) goto bb6 else goto bb9 bb6: z_8 = a_3 the edge 5->6 cannot be taken, Calculating the range for a_3 on this edge is calculated by taking the outgoing edge range of [-INF, -1] and intersecting it with the known range of [5, 5]. Which is [] or undefined. The value has no meaning on that edge since the edge can't be taken. The substitute_and_fold engine is driven by replacing values with their constant equivalent. It does so only as a statement is visited. When it gets to z_8 = a_3, the ranger was returning a range of [] AKA undefined instead of [5,5] so the use of a_3 was not being replaced by subst_and_fold in this block. This results in a reference to a_3 being left in the IL.. and chaos ensues. The simple solution to this is to adjust the way value_of() is calculated on ranges from the ranger. If the resulting range is UNDEFINED from a query, do a quick check of the global value and if its a constant, return that. This has been a lagging issue that pops up from time to time which is nicely demonstrated in this PR. This patch is undergoing testing currently
The master branch has been updated by Andrew Macleod <amacleod@gcc.gnu.org>: https://gcc.gnu.org/g:ca5f4666f7a9404cdb04832324de3dd7d71e35c3 commit r11-4198-gca5f4666f7a9404cdb04832324de3dd7d71e35c3 Author: Andrew MacLeod <amacleod@redhat.com> Date: Wed Oct 21 19:55:28 2020 -0400 Check for undefined before not returning a constant value Don't return UNDEFINED for a range in an unreachable block if the global value evaluates to a constant. Return that constant instead. PR tree-optimization/97515 * value-query.cc (range_query::value_of_expr): If the result is UNDEFINED, check to see if the global value is a constant. (range_query::value_on_edge): Ditto.
Furthermore, this PR is also a good example of a case where we want to inject updated values into the Ranger's iterator. <bb 2> : goto <bb 6>; [INV] <bb 3> : ui_8 = ~xe_3; if (ui_8 == 0) goto <bb 4>; [INV] else goto <bb 5>; [INV] <bb 4> : ui_10 = xe_3 >> gg_9(D); <bb 5> : # ui_4 = PHI <ui_8(3), ui_10(4)> _1 = ui_4 == 0; _2 = (int) _1; xe_11 = xe_3 % _2; <bb 6> : # xe_3 = PHI <0(2), xe_11(5)> if (xe_3 <= 0) goto <bb 3>; [INV] else goto <bb 7>; [INV] The initial evaluation of # xe_3 = PHI <0(2), xe_11(5)> causes The backedge 5->6 to evaluate xe_11, which is dependant on the value of xe_3 (its a cycle). This requires evaluating values thru the cycle and Ranger calculates that the edge 6->3 causes xe_3 = [-INF, 0] which feeds into ui_8 = ~xe_3; resulting in a range of [0, +INF] for ui_8 A bunch of other things are figured out, and ultimately it decides that xe_3 evaluates to [0,0]. Very cool, this is good. lots of things fall out from this, and when we get to folding ui_8 = ~xe_3 and xe_3 is now known to be [0,0] we can fold it into ui_8 = -1 THe problem is during the initial backedge calculations, we decided ui_8 was initially [-1, +INF], and without the iterative injection to indicate we have a new value for ui_8, the ranger continues to use the cached value of [-1, +INF]. Even though we know it is the constant -1 now. So we're doing better than before because we didn't get any of those values, but it looks funny in the listing: 63 range_of_stmt (ui_8) at stmt ui_8 = -1; TRUE : (63) range_of_stmt (ui_8) int [-1, +INF] duh. I originally planned to add the new value injection early in the next stage 1. It will recognize that the previously calculated value of ui_8 is stale when the thing it is dependent on (xe_3) has a value different than the one which was used to calculate it. When this is done, we will basically be able to fold most of this function away in EVRP. The basic mechanism is already in place, its just a matter of figuring out how to efficiently decide when to trigger the re-evaluation. There may yet be something we can do about this in this stage 1.
The master branch has been updated by Andrew Macleod <amacleod@gcc.gnu.org>: https://gcc.gnu.org/g:e86fd6a17cdb26710d1f13c9a47a3878c76028f9 commit r11-4724-ge86fd6a17cdb26710d1f13c9a47a3878c76028f9 Author: Andrew MacLeod <amacleod@redhat.com> Date: Wed Nov 4 12:59:15 2020 -0500 Add Ranger temporal cache Add a timestamp to supplement the global range cache to detect when a value may become stale. gcc/ PR tree-optimization/97515 * gimple-range-cache.h (class ranger_cache): New prototypes plus temporal cache pointer. * gimple-range-cache.cc (struct range_timestamp): New. (class temporal_cache): New. (temporal_cache::temporal_cache): New. (temporal_cache::~temporal_cache): New. (temporal_cache::get_timestamp): New. (temporal_cache::set_dependency): New. (temporal_cache::temporal_value): New. (temporal_cache::current_p): New. (temporal_cache::set_timestamp): New. (temporal_cache::set_always_current): New. (ranger_cache::ranger_cache): Allocate the temporal cache. (ranger_cache::~ranger_cache): Free temporal cache. (ranger_cache::get_non_stale_global_range): New. (ranger_cache::set_global_range): Add a timestamp. (ranger_cache::register_dependency): New. Add timestamp dependency. * gimple-range.cc (gimple_ranger::range_of_range_op): Add operand dependencies. (gimple_ranger::range_of_phi): Ditto. (gimple_ranger::range_of_stmt): Check if global range is stale, and recalculate if so. gcc/testsuite/ * gcc.dg/pr97515.c: Check listing for folding of entire function.
With this final patch, we now optimize this function down to e7 (int gg) { int ui; int xe; _Bool _1; int _2; <bb 2> : <bb 3> : _1 = 0; _2 = (int) _1; goto <bb 3>; [INV] } That is as good as we're going to do :-)
Created attachment 49607 [details] tweak testcase The fix for PR 93781 changed the order of some evaluations in the testcase. the problem is that we have [-1,-1] >> VARYING This use to overflow in the cross product code and just return VARYING. now that its being masked to [-1,-1] >> [0,31] so in the end, the rshift cross product code decides the result is [-1, -1] instead of varying. THis ends up changing the calculations, and we realized certain other parts of the program are unreachable sooner.. we eventually figure out all the same things, but the problem now is that the first statement evaluated: xe_3 = PHI <0(2), xe_11(5)> which we now dont determine xe_11 is 0 right off the bat like we use to. THe final IL does look like: <bb 2> : goto <bb 6>; [INV] <bb 3> : ui_8 = ~xe_3; if (ui_8 == 0) goto <bb 4>; [INV] else goto <bb 5>; [INV] <bb 4> : <bb 5> : # ui_4 = PHI <ui_8(3), -1(4)> <bb 6> : # xe_3 = PHI <0(2), 0(5)> if (xe_3 <= 0) goto <bb 3>; [INV] else goto <bb 7>; [INV] <bb 7> : # xe_5 = PHI <xe_3(6)> return xe_5; The problem is we now don't fold away the PHI: # ui_4 = PHI <ui_8(3), -1(4)> because although the code looks obvious that ui_8 must be 0 on the edge 3->5, ranger has determined that xe_3 must be 0 feeding into this block, which means ui_8 must be -1, which means the TRUE edge can never be taken, and therefore ui_8 is actually undefined on this edge. THe stmt fold routine only looks for constant values in phi arguements, and thus cannot fold away the PHI anymore since there isnt a constant value on the edge. All the same things will fold away here, but the undefined bit is preventing it from happening within EVRP. THe next pass of CCP does turn this into: <bb 2> [local count: 10631108]: <bb 3> [local count: 1073741824]: goto <bb 3>; [100.00%] SO for now at least, I'll just change the testcase to confirm its down to a single goto by this pass. Ideally we'd be able to detect the undefined during statement folding and allow it to match with any other constants.. much like we do with undefined local values.