This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug tree-optimization/85475] [8 Regression] Compile time hog w/ -O1 -fpeel-loops
- From: "rguenth at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Fri, 20 Apr 2018 07:16:20 +0000
- Subject: [Bug tree-optimization/85475] [8 Regression] Compile time hog w/ -O1 -fpeel-loops
- Auto-submitted: auto-generated
- References: <bug-85475-4@http.gcc.gnu.org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85475
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
In dom3 I see
Applying pattern match.pd:2585, gimple-match.c:10461
Applying pattern match.pd:2585, gimple-match.c:10461
Applying pattern match.pd:2585, gimple-match.c:10461
Applying pattern match.pd:2585, gimple-match.c:10461
Applying pattern match.pd:2585, gimple-match.c:10461
Applying pattern match.pd:2585, gimple-match.c:10461
Applying pattern match.pd:2585, gimple-match.c:10461
Applying pattern match.pd:2585, gimple-match.c:10461
Applying pattern match.pd:2585, gimple-match.c:10461
Applying pattern match.pd:2585, gimple-match.c:10461
...
it is for example matching on (le_10 * 2) * (le_10 * 2), transforming
it to (le_10 * le_10) * 4 which fails because of the :s. But then DOM
faces the situation where the loop is fully unrolled and we have
nj (int le)
{
int zb;
unsigned int ivtmp_1;
int _6;
unsigned int ivtmp_16;
unsigned int ivtmp_100;
<bb 2> [local count: 63136020]:
le_14 = le_3(D) * 2;
zb_15 = 1;
ivtmp_16 = 15;
le_20 = le_14 * 2;
zb_21 = 2;
le_26 = le_20 * 2;
zb_27 = 3;
le_32 = le_26 * 2;
zb_33 = 4;
le_38 = le_32 * 2;
zb_39 = 5;
le_44 = le_38 * 2;
zb_45 = 6;
le_50 = le_44 * 2;
zb_51 = 7;
le_56 = le_50 * 2;
zb_57 = 8;
le_62 = le_56 * 2;
zb_63 = 9;
le_68 = le_62 * 2;
zb_69 = 10;
le_74 = le_68 * 2;
zb_75 = 11;
le_80 = le_74 * 2;
zb_81 = 12;
le_86 = le_80 * 2;
zb_87 = 13;
le_92 = le_86 * 2;
zb_93 = 14;
le_98 = le_92 * 2;
zb_99 = 15;
ivtmp_100 = 1;
le_4 = le_98 * 2;
zb_5 = 16;
ivtmp_1 = 0;
_6 = le_4 * le_4;
return _6;
}
with DOM folding le_4 * le_4 via gimple_fold_stmt_to_constant_1. You can
trivially see the exponential work we do when applying the pattern if you
consider that we re-simplify each simplification result.
/* Reassociate (X * CST) * Y to (X * Y) * CST. This does not introduce
signed overflow for CST != 0 && CST != -1. */
(simplify
(mult:c (mult:s @0 INTEGER_CST@1) @2)
(if (TREE_CODE (@2) != INTEGER_CST
&& !integer_zerop (@1) && !integer_minus_onep (@1))
(mult (mult @0 @2) @1)))
The issue is that we get into non-linear complexity once there's more
than a single use of the inner multiplication, so an explicit single_use
fixes that.