[PATCH] simplify-rtx: Punt on simplify_associative_operation with large operands [PR102356]

Jakub Jelinek jakub@redhat.com
Tue Nov 30 09:26:13 GMT 2021


Hi!

Seems simplify_associate_operation is quadratic, which isn't a big deal
for use during combine and other similar RTL passes, because those never
try to combine expressions from more than a few instructions and because
those instructions need to be recognized the machine description also bounds
how many expressions can appear in there.
var-tracking has depth limits only for some cases and unlimited depth
for the vt_expand_loc though:
/* This is the value used during expansion of locations.  We want it
   to be unbounded, so that variables expanded deep in a recursion
   nest are fully evaluated, so that their values are cached
   correctly.  We avoid recursion cycles through other means, and we
   don't unshare RTL, so excess complexity is not a problem.  */
#define EXPR_DEPTH (INT_MAX)
/* We use this to keep too-complex expressions from being emitted as
   location notes, and then to debug information.  Users can trade
   compile time for ridiculously complex expressions, although they're
   seldom useful, and they may often have to be discarded as not
   representable anyway.  */
#define EXPR_USE_DEPTH (param_max_vartrack_expr_depth)

IMO for very large expressions it isn't worth trying to reassociate though,
in fact e.g. for the new testcase below keeping it as is has bigger chance
of generating smaller debug info which the dwarf2out.c part of the change
tries to achieve - if a binary operation has the same operands, we can
use DW_OP_dup and not bother computing the possibly large operand again.

This patch punts if the associate operands contain together more than
64 same operations, which can happen only during var-tracking.
During bootstrap/regtest on x86_64-linux and i686-linux, this triggers
only on the new testcase and on gcc.dg/torture/pr88597.c.
I think given the 16 element static buffer in subrtx_iterator::array_type
it shouldn't slow down the common case of small expressions, but have
been wondering whether we shouldn't have some in_vartrack global flag
or guard it with
(current_pass && strcmp (current_pass->name, "vartrack") == 0).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Another possibility to deal with the power expressions in debug info
would be to introduce some new RTL operation for the pow{,i} (x, n)
case, allow that solely in debug insns and expand those into DWARF
using a loop.  But that seems like quite a lot of work for something rarely
used (especially when powi for larger n is only useful for 0 and 1 inputs
because anything else overflows).

2021-11-30  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/102356
	* simplify-rtx.c: Include rtl-iter.h.
	(simplify_associative_operation): Don't reassociate very large
	expressions with 64 or more CODE subrtxes in the operands.
	* dwarf2out.c (mem_loc_descriptor): Optimize binary operation
	with both operands the same using DW_OP_dup.

	* gcc.dg/pr102356.c: New test.

--- gcc/simplify-rtx.c.jj	2021-11-05 00:43:22.576624649 +0100
+++ gcc/simplify-rtx.c	2021-11-29 19:46:29.674750656 +0100
@@ -37,6 +37,7 @@ along with GCC; see the file COPYING3.
 #include "selftest-rtl.h"
 #include "rtx-vector-builder.h"
 #include "rtlanal.h"
+#include "rtl-iter.h"
 
 /* Simplification and canonicalization of RTL.  */
 
@@ -2263,9 +2264,40 @@ simplify_context::simplify_associative_o
 {
   rtx tem;
 
+  if (GET_CODE (op0) == code || GET_CODE (op1) == code)
+    {
+      /* During vartrack, the expressions can grow arbitrarily large.
+	 Reassociation isn't really useful for larger expressions
+	 and can be very compile time expensive.  */
+      unsigned count = 0;
+      subrtx_iterator::array_type array;
+      FOR_EACH_SUBRTX (iter, array, op0, NONCONST)
+	{
+	  const_rtx x = *iter;
+	  if (GET_CODE (x) == code)
+	    {
+	      if (count++ >= 64)
+		return NULL_RTX;
+	    }
+	  else
+	    iter.skip_subrtxes ();
+	}
+      FOR_EACH_SUBRTX (iter, array, op1, NONCONST)
+	{
+	  const_rtx x = *iter;
+	  if (GET_CODE (x) == code)
+	    {
+	      if (count++ >= 64)
+		return NULL_RTX;
+	    }
+	  else
+	    iter.skip_subrtxes ();
+	}
+    }
+
   /* Linearize the operator to the left.  */
   if (GET_CODE (op1) == code)
     {
       /* "(a op b) op (c op d)" becomes "((a op b) op c) op d)".  */
       if (GET_CODE (op0) == code)
 	{
--- gcc/dwarf2out.c.jj	2021-11-29 14:24:14.053634713 +0100
+++ gcc/dwarf2out.c	2021-11-29 19:44:54.192113401 +0100
@@ -16363,6 +16363,15 @@ mem_loc_descriptor (rtx rtl, machine_mod
     do_binop:
       op0 = mem_loc_descriptor (XEXP (rtl, 0), mode, mem_mode,
 				VAR_INIT_STATUS_INITIALIZED);
+      if (XEXP (rtl, 0) == XEXP (rtl, 1))
+	{
+	  if (op0 == 0)
+	    break;
+	  mem_loc_result = op0;
+	  add_loc_descr (&mem_loc_result, new_loc_descr (DW_OP_dup, 0, 0));
+	  add_loc_descr (&mem_loc_result, new_loc_descr (op, 0, 0));
+	  break;
+	}
       op1 = mem_loc_descriptor (XEXP (rtl, 1), mode, mem_mode,
 				VAR_INIT_STATUS_INITIALIZED);
 
--- gcc/testsuite/gcc.dg/pr102356.c.jj	2021-11-29 19:57:47.061075856 +0100
+++ gcc/testsuite/gcc.dg/pr102356.c	2021-11-29 19:57:26.852364489 +0100
@@ -0,0 +1,33 @@
+/* PR rtl-optimization/102356 */
+/* { dg-do compile { target int32plus } } */
+/* { dg-options "-O3 -g" } */
+
+signed char a = 0;
+unsigned char b = 9;
+unsigned long long c = 0xF1FBFC17225F7A57ULL;
+int d = 0x3A6667C6;
+
+unsigned char
+foo (unsigned int x)
+{
+  unsigned int *e = &x;
+  if ((c /= ((0 * (*e *= b)) <= 0)))
+    ;
+  for (d = 9; d > 2; d -= 2)
+    {
+      c = -2;
+      do
+	if ((*e *= *e))
+	  {
+	    a = 4;
+	    do
+	      {
+		a -= 3;
+		if ((*e *= *e))
+		  b = 9;
+	      }
+	    while (a > 2);
+	  }
+      while (c++);
+    }
+}

	Jakub



More information about the Gcc-patches mailing list