Link to the Compiler Explorer: https://godbolt.org/z/1hs7ssMrd Reproducer: short d, e; int f; extern short g[][24]; char c; void h() { char a = 6; c = a; for (unsigned long a = (d || e) - 1; a < c; a += f) for (signed b = 0; b < 24; b++) g[a][b] = 4; } Error: >$ g++ -O2 -c func.cpp g++: internal compiler error: Segmentation fault signal terminated program cc1plus gcc version 13.0.0 20220815 (6624ad73064de241e937e97a28b65c50fe14c78e)
Started with r13-1268-g8c99e307b20c50.
Works with -fno-thread-jumps or with -fdisable-tree-dom3. I haven't investigated whether the threading done in DOM2 is generating invalid IL, but it looks like this match.pd pattern is going around in circles: /* (A * C) +- (B * C) -> (A+-B) * C and (A * C) +- A -> A * (C+-1). Modeled after fold_plusminus_mult_expr. */ (if (!TYPE_SATURATING (type) && (!FLOAT_TYPE_P (type) || flag_associative_math)) (for plusminus (plus minus) ... ...
It looks like DOM2, as a side-effect of using the ranger to do cprop, is exporting a global range for a_9 that is causing a match.pd pattern to go into an endless loop. I don't understand match.pd patterns very well. Could someone see if there is something obvious going on in the pattern? This is .ivopts: Applying pattern match.pd:3164, generic-match.cc:22877 Matching expression match.pd:1894, generic-match.cc:676 Applying pattern match.pd:3164, generic-match.cc:22877 Matching expression match.pd:1894, generic-match.cc:676 Applying pattern match.pd:3164, generic-match.cc:22877 Matching expression match.pd:1894, generic-match.cc:676 Applying pattern match.pd:3164, generic-match.cc:22877 Matching expression match.pd:1894, generic-match.cc:676 Applying pattern match.pd:3164, generic-match.cc:22877 Matching expression match.pd:1894, generic-match.cc:676 Matching expression match.pd:1894, generic-match.cc:676 Applying pattern match.pd:1921, generic-match.cc:27593 Matching expression match.pd:146, generic-match.cc:23 Matching expression match.pd:1894, generic-match.cc:676 Applying pattern match.pd:3164, generic-match.cc:22877 Matching expression match.pd:1894, generic-match.cc:676 Applying pattern match.pd:3164, generic-match.cc:22877 Matching expression match.pd:1894, generic-match.cc:676 Applying pattern match.pd:3164, generic-match.cc:22877 Matching expression match.pd:1894, generic-match.cc:676 Matching expression match.pd:1894, generic-match.cc:676 Applying pattern match.pd:1921, generic-match.cc:27593 etc etc etc generic_simplify() is being called with: (gdb) p code $20 = POINTER_PLUS_EXPR (gdb) p debug(_p0) (vector(8) short int *) &g + a_9 * 48 $21 = void (gdb) p debug(_p1) 48 $22 = void (gdb) Where a_9 has a global range of [0,0].
> It looks like DOM2, as a side-effect of using the ranger to do cprop, is > exporting a global range for a_9 > Where a_9 has a global range of [0,0]. Why didn't DOM do a constant prop here since the only value for a_9 is 0 ...
(In reply to Andrew Pinski from comment #4) > > It looks like DOM2, as a side-effect of using the ranger to do cprop, is > > exporting a global range for a_9 > > Where a_9 has a global range of [0,0]. > > Why didn't DOM do a constant prop here since the only value for a_9 is 0 ... The only uses of a_9 are in PHIs: <bb 5> [local count: 4724464]: # a_9 = PHI <a_13(4), 0(3)> goto <bb 8>; [100.00%] ... ... <bb 8> [local count: 42949672]: # a_23 = PHI <a_14(7), a_9(5)> goto <bb 6>; [100.00%] Perhaps DOM doesn't cprop into PHIs? ....actually: static void cprop_into_successor_phis (basic_block bb, class const_and_copies *const_and_copies) But that only works with DOM's internal const_and_copies tables, not with ranger (or originally with the evrp engine).
We are folding a POINTER_PLUS_EXPR of (vector(8) short int *) &g + a_9 * 48 and 48 which first gets to (vector(8) short int *) &g + (a_9 * 48 + 48) which gets /* (A * C) +- (B * C) -> (A+-B) * C and (A * C) +- A -> A * (C+-1). Modeled after fold_plusminus_mult_expr. */ applied producing (a_9 + 1) * 48 the fold code produces a (sizetype) cast around (unsigned long) which is useless (but not in GENERIC) and then /* Narrow integer multiplication by a zero_one_valued_p operand. Multiplication by [0,1] is guaranteed not to overflow. */ (simplify (convert (mult@0 zero_one_valued_p@1 INTEGER_CST@2)) (if (INTEGRAL_TYPE_P (type) && INTEGRAL_TYPE_P (TREE_TYPE (@0)) && TYPE_PRECISION (type) <= TYPE_PRECISION (TREE_TYPE (@0))) (mult (convert @1) (convert @2)))) triggers but folding this multiplication via fold-const.cc extract_muldiv hoists the conversion again: if (TREE_CODE (arg1) == INTEGER_CST && (tem = extract_muldiv (op0, arg1, code, NULL_TREE, &strict_overflow_p)) != 0) { if (strict_overflow_p) fold_overflow_warning (("assuming signed overflow does not " "occur when simplifying " "multiplication"), WARN_STRICT_OVERFLOW_MISC); return fold_convert_loc (loc, type, tem); which triggers the endless loop. The fix might be to change the above pattern guard to TYPE_PRECISION (type) < TYPE_PRECISION (TREE_TYPE (@0)) which would match what the comment says. The comment about DOM + missing cprop remains of course. I'm testing the above.
(In reply to Richard Biener from comment #6) > The comment about DOM + missing cprop remains of course. This seems like a longstanding problem with DOM (cpropping into PHI args with range information), as it's never been properly integrated with evrp (or ranger nowadays). We could put this on a TODO list, I suppose, though I wonder how much time we should be sinking into DOM since we've all agreed it's best to ultimately replace it with a PRE instance.
On Tue, 16 Aug 2022, aldyh at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106630 > > --- Comment #7 from Aldy Hernandez <aldyh at gcc dot gnu.org> --- > (In reply to Richard Biener from comment #6) > > > The comment about DOM + missing cprop remains of course. > > This seems like a longstanding problem with DOM (cpropping into PHI args with > range information), as it's never been properly integrated with evrp (or ranger > nowadays). We could put this on a TODO list, I suppose, though I wonder how > much time we should be sinking into DOM since we've all agreed it's best to > ultimately replace it with a PRE instance. Indeed, I wouldn't worry about this for now.
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>: https://gcc.gnu.org/g:5e88fccf4be7e8ab22734d87f8e520b25d92d845 commit r13-2069-g5e88fccf4be7e8ab22734d87f8e520b25d92d845 Author: Richard Biener <rguenther@suse.de> Date: Tue Aug 16 09:43:24 2022 +0200 middle-end/106630 - avoid ping-pong between extract_muldiv and match.pd The following avoids ping-pong between the match.pd pattern changing (sizetype) ((a_9 + 1) * 48) to (sizetype)(a_9 + 1) * 48 and extract_muldiv performing the reverse transform by restricting the match.pd pattern to narrowing conversions as the comment indicates. PR middle-end/106630 * match.pd ((T)(x * CST) -> (T)x * CST): Restrict to narrowing conversions. * gcc.dg/torture/pr106630.c: New testcase.
Fixed.