[PATCH] sra: Fix sra_modify_expr handling of partial writes (PR 94482)

Richard Biener rguenther@suse.de
Wed Apr 8 18:56:41 GMT 2020


On April 8, 2020 8:34:08 PM GMT+02:00, Martin Jambor <mjambor@suse.cz> wrote:
>Hi,
>
>On Wed, Apr 08 2020, Richard Biener wrote:
>> On Tue, 7 Apr 2020, Richard Biener wrote:
>>
>>> On April 7, 2020 6:25:26 PM GMT+02:00, Martin Jambor
><mjambor@suse.cz> wrote:
>>> >Hi,
>>> >
>>> >On Tue, Apr 07 2020, Richard Biener wrote:
>>> >> On Tue, 7 Apr 2020, Martin Jambor wrote:
>>> >>
>>> >>> Hi,
>>> >>> 
>>> >>> when sra_modify_expr is invoked on an expression that modifies
>only
>>> >>> part of the underlying replacement, such as a BIT_FIELD_REF on a
>LHS
>>> >>> of an assignment and the SRA replacement's type is not
>compatible
>>> >with
>>> >>> what is being replaced (0th operand of the B_F_R in the above
>>> >>> example), it does not work properly, basically throwing away the
>>> >partd
>>> >>> of the expr that should have stayed intact.
>>> >>> 
>>> >>> This is fixed in two ways.  For BIT_FIELD_REFs, which operate on
>the
>>> >>> binary image of the replacement (and so in a way serve as a
>>> >>> VIEW_CONVERT_EXPR) we just do not bother with convertsing.  For
>>> >>> REALPART_EXPRs and IMAGPART_EXPRs, we insert a VIEW_CONVERT_EXPR
>>> >under
>>> >>> the complex partial access expression, which is OK even in a LHS
>of
>>> >an
>>> >>> assignment (and other write contexts) because in those cases the
>>> >>> replacement is not going to be a giple register anyway.
>>> >>> 
>>> >>> The testcase for handling REALPART_EXPR and IMAGPART_EXPR is a
>bit
>>> >>> fragile because SRA prefers complex and vector types over
>anything
>>> >else
>>> >>> (and in between the two it decides based on TYPE_UID which in my
>>> >testing
>>> >>> today always preferred complex types) and even when GIMPLE is
>wrong
>>> >I
>>> >>> often still did not get failing runs, so I only run it at -O1
>(which
>>> >is
>>> >>> the only level where the the test fails for me).
>>> >>> 
>>> >>> Bootstrapped and tested on x86_64-linux, bootstrap and testing
>on
>>> >>> i686-linux and aarch64-linux underway.
>>> >>> 
>>> >>> OK for trunk (and subsequently for release branches) if it
>passes?
>>> >>
>>> >> OK.
>>> >
>>> >I must have been already half asleep when writing that it passed
>>> >testing.  It did not, testsuite/gcc.c-torture/compile/pr42196-3.c
>fails
>>> >even on x86_64, because fwprop happily combines
>>> >
>>> >  u$ci_10 = MEM[(union U *)&u];
>>> >
>>> >and
>>> >
>>> >  f1_6 = IMAGPART_EXPR <VIEW_CONVERT_EXPR<complex float>(u$ci_10)>;
>>> >
>>> >into
>>> >
>>> >  f1_6 = IMAGPART_EXPR <MEM[(union U *)&u]>;
>>> >
>>> >and the gimple verifier of course does not like that.  I'll have a
>look
>>> >at that tomorrow.
>>> 
>>> Ah, that might be a genuine fwprop bug. 
>>
>> On a second thought
>>
>>   f1_6 = IMAGPART_EXPR <VIEW_CONVERT_EXPR<complex float>(u$ci_10)>;
>>
>> shouldn't be valid GIMPLE since 'complex float' is a register type
>> and thus it should have been
>>
>>   _11 = VIEW_CONVERT_EXPR<complex float>(u$ci_10);
>>   f1_6 = IMAGPART_EXPR <_11>;
>>
>
>OK, the newest version of the patch below is careful to do that if the
>replacement is going to be a gimple_register,
>
>> now it's a bit of a grey area since a load like
>>
>>   f1_6 = IMAGPART_EXPR <VIEW_CONVERT_EXPR<complex float>(u);
>>
>> is valid (we don't want to force a whole _Complex load here).
>
>and the above for non-gimple-registers.
>
>>
>> For example in VN
>>
>>   f1_6 = IMAGPART_EXPR <VIEW_CONVERT_EXPR<complex float>(u$ci_10)>;
>>
>> goes through the load machinery.
>>
>> So let's say the IL is undesirable at least.
>>
>> The following fixes the forwprop laziness, please include it in the
>> patch so it gets cherry-picked for backports.
>
>I added it to the modified patch, it was still necessary.  The version
>below has passed bootstrap and testing on x86-64-linux and
>aarch64-linux
>(and I have double checked!) and bootstrap on i686-linux, testing on
>the
>32-bit platform is still ongoing.  OK if it passes there too?

OK. 

Richard. 

>Thanks,
>
>Martin
>
>
>sra: Fix sra_modify_expr handling of partial writes (PR 94482)
>
>when sra_modify_expr is invoked on an expression that modifies only
>part of the underlying replacement, such as a BIT_FIELD_REF on a LHS
>of an assignment and the SRA replacement's type is not compatible with
>what is being replaced (0th operand of the B_F_R in the above
>example), it does not work properly, basically throwing away the partd
>of the expr that should have stayed intact.
>
>This is fixed in two ways.  For BIT_FIELD_REFs, which operate on the
>binary image of the replacement (and so in a way serve as a
>VIEW_CONVERT_EXPR) we just do not bother with convertsing.  For
>REALPART_EXPRs and IMAGPART_EXPRs, if the replacement is not a
>register, we insert a VIEW_CONVERT_EXPR under
>the complex partial access expression, which is always OK, for loads
>from registers we take the extra step of converting it to a temporary.
>
>This revealed a bug in fwprop which is fixed with the hunk from Richi.
>
>The testcase for handling REALPART_EXPR and IMAGPART_EXPR is a bit
>fragile because SRA prefers complex and vector types over anything
>else (and in between the two it decides based on TYPE_UID which in my
>testing today always preferred complex types) and so I only run it at
>-O1 (which is the only level where the the test fails for me).
>
>
>2020-04-08  Martin Jambor  <mjambor@suse.cz>
>	    Richard Biener  <rguenther@suse.de>
>
>	PR tree-optimization/94482
>	* tree-sra.c (create_access_replacement): Dump new replacement with
>	TDF_UID.
>	(sra_modify_expr): Fix handling of cases when the original EXPR writes
>	to only part of the replacement.
>	* tree-ssa-forwprop.c (pass_forwprop::execute): Properly verify
>	the first operand of combinations into REAL/IMAGPART_EXPR and
>	BIT_FIELD_REF.
>
>	testsuite/
>	* gcc.dg/torture/pr94482.c: New test.
>	* gcc.dg/tree-ssa/pr94482-2.c: Likewise.
>---
> gcc/ChangeLog                             | 12 ++++++
> gcc/testsuite/ChangeLog                   |  6 +++
> gcc/testsuite/gcc.dg/torture/pr94482.c    | 34 +++++++++++++++
> gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c | 50 +++++++++++++++++++++++
> gcc/tree-sra.c                            | 31 ++++++++++++--
> gcc/tree-ssa-forwprop.c                   |  6 ++-
> 6 files changed, 133 insertions(+), 6 deletions(-)
> create mode 100644 gcc/testsuite/gcc.dg/torture/pr94482.c
> create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c
>
>diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>index e10fb251c16..675ac2a4b6a 100644
>--- a/gcc/ChangeLog
>+++ b/gcc/ChangeLog
>@@ -1,3 +1,15 @@
>+2020-04-08  Martin Jambor  <mjambor@suse.cz>
>+	    Richard Biener  <rguenther@suse.de>
>+
>+	PR tree-optimization/94482
>+	* tree-sra.c (create_access_replacement): Dump new replacement with
>+	TDF_UID.
>+	(sra_modify_expr): Fix handling of cases when the original EXPR
>writes
>+	to only part of the replacement.
>+	* tree-ssa-forwprop.c (pass_forwprop::execute): Properly verify
>+	the first operand of combinations into REAL/IMAGPART_EXPR and
>+	BIT_FIELD_REF.
>+
> 2020-04-05 Zachary Spytz  <zspytz@gmail.com>
> 
> 	* extend.texi: Add free to list of ISO C90 functions that
>diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
>index 88bab5d3d19..b0d94b5394e 100644
>--- a/gcc/testsuite/ChangeLog
>+++ b/gcc/testsuite/ChangeLog
>@@ -1,3 +1,9 @@
>+2020-04-08  Martin Jambor  <mjambor@suse.cz>
>+
>+	PR tree-optimization/94482
>+	* gcc.dg/torture/pr94482.c: New test.
>+	* gcc.dg/tree-ssa/pr94482-2.c: Likewise.
>+
> 2020-04-05  Iain Sandoe  <iain@sandoe.co.uk>
> 
> 	* g++.dg/coroutines/torture/co-await-14-template-traits.C: Rename...
>diff --git a/gcc/testsuite/gcc.dg/torture/pr94482.c
>b/gcc/testsuite/gcc.dg/torture/pr94482.c
>new file mode 100644
>index 00000000000..5bb19e745c2
>--- /dev/null
>+++ b/gcc/testsuite/gcc.dg/torture/pr94482.c
>@@ -0,0 +1,34 @@
>+/* { dg-do run } */
>+
>+typedef unsigned V __attribute__ ((__vector_size__ (16)));
>+union U
>+{
>+  V j;
>+  unsigned long long i __attribute__ ((__vector_size__ (16)));
>+};
>+
>+static inline __attribute__((always_inline)) V
>+foo (unsigned long long a)
>+{
>+  union U z = { .j = (V) {} };
>+  for (unsigned long i = 0; i < 1; i++)
>+    z.i[i] = a;
>+  return z.j;
>+}
>+
>+static inline __attribute__((always_inline)) V
>+bar (V a, unsigned long long i, int q)
>+{
>+  union U z = { .j = a };
>+  z.i[q] = i;
>+  return z.j;
>+}
>+
>+int
>+main ()
>+{
>+  union U z = { .j = bar (foo (1729), 2, 1) };
>+  if (z.i[0] != 1729)
>+    __builtin_abort ();
>+  return 0;
>+}
>diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c
>b/gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c
>new file mode 100644
>index 00000000000..fcac9d5e439
>--- /dev/null
>+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c
>@@ -0,0 +1,50 @@
>+/* { dg-do run } */
>+/* { dg-options "-O1" } */
>+
>+typedef unsigned long V __attribute__ ((__vector_size__ (8)));
>+typedef _Complex int Ci;
>+typedef _Complex float Cf;
>+
>+union U
>+{
>+  Ci ci;
>+  Cf cf;
>+};
>+
>+volatile Ci vgi;
>+
>+Cf foo (Cf c)
>+{
>+  __real c = 0x1ffp10;
>+  return c;
>+}
>+
>+Ci ioo (Ci c)
>+{
>+  __real c = 50;
>+  return c;
>+}
>+
>+
>+int main (int argc, char *argv[])
>+{
>+  union U u;
>+
>+  __real u.ci = 500;
>+  __imag u.ci = 1000;
>+  vgi = u.ci;
>+
>+  u.ci = ioo (u.ci);
>+  __imag u.ci = 100;
>+
>+  if (__real u.ci != 50 || __imag u.ci != 100)
>+    __builtin_abort();
>+
>+  u.cf = foo (u.cf);
>+  __imag u.cf = 0x1p3;
>+
>+  if (__real u.cf != 0x1ffp10 || __imag u.cf != 0x1p3)
>+    __builtin_abort();
>+
>+  return 0;
>+}
>diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
>index b2056b58750..84c113c403c 100644
>--- a/gcc/tree-sra.c
>+++ b/gcc/tree-sra.c
>@@ -2257,7 +2257,7 @@ create_access_replacement (struct access *access,
>tree reg_type = NULL_TREE)
> 	  print_generic_expr (dump_file, access->base);
> 	  fprintf (dump_file, " offset: %u, size: %u: ",
> 		   (unsigned) access->offset, (unsigned) access->size);
>-	  print_generic_expr (dump_file, repl);
>+	  print_generic_expr (dump_file, repl, TDF_UID);
> 	  fprintf (dump_file, "\n");
> 	}
>     }
>@@ -3698,6 +3698,7 @@ sra_modify_expr (tree *expr, gimple_stmt_iterator
>*gsi, bool write)
>   location_t loc;
>   struct access *access;
>   tree type, bfr, orig_expr;
>+  bool partial_cplx_access = false;
> 
>   if (TREE_CODE (*expr) == BIT_FIELD_REF)
>     {
>@@ -3708,7 +3709,10 @@ sra_modify_expr (tree *expr,
>gimple_stmt_iterator *gsi, bool write)
>     bfr = NULL_TREE;
> 
>if (TREE_CODE (*expr) == REALPART_EXPR || TREE_CODE (*expr) ==
>IMAGPART_EXPR)
>-    expr = &TREE_OPERAND (*expr, 0);
>+    {
>+      expr = &TREE_OPERAND (*expr, 0);
>+      partial_cplx_access = true;
>+    }
>   access = get_access_for_expr (*expr);
>   if (!access)
>     return false;
>@@ -3736,13 +3740,32 @@ sra_modify_expr (tree *expr,
>gimple_stmt_iterator *gsi, bool write)
>   be accessed as a different type too, potentially creating a need for
>  type conversion (see PR42196) and when scalarized unions are involved
>          in assembler statements (see PR42398).  */
>-      if (!useless_type_conversion_p (type, access->type))
>+      if (!bfr && !useless_type_conversion_p (type, access->type))
> 	{
> 	  tree ref;
> 
> 	  ref = build_ref_for_model (loc, orig_expr, 0, access, gsi, false);
> 
>-	  if (write)
>+	  if (partial_cplx_access)
>+	    {
>+	    /* VIEW_CONVERT_EXPRs in partial complex access are always fine
>in
>+	       the case of a write because in such case the replacement
>cannot
>+	       be a gimple register.  In the case of a load, we have to
>+	       differentiate in between a register an non-register
>+	       replacement.  */
>+	      tree t = build1 (VIEW_CONVERT_EXPR, type, repl);
>+	      gcc_checking_assert (!write || access->grp_partial_lhs);
>+	      if (!access->grp_partial_lhs)
>+		{
>+		  tree tmp = make_ssa_name (type);
>+		  gassign *stmt = gimple_build_assign (tmp, t);
>+		  /* This is always a read. */
>+		  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
>+		  t = tmp;
>+		}
>+	      *expr = t;
>+	    }
>+	  else if (write)
> 	    {
> 	      gassign *stmt;
> 
>diff --git a/gcc/tree-ssa-forwprop.c b/gcc/tree-ssa-forwprop.c
>index e7eaa18ccad..3d8acf7eb03 100644
>--- a/gcc/tree-ssa-forwprop.c
>+++ b/gcc/tree-ssa-forwprop.c
>@@ -2815,7 +2815,8 @@ pass_forwprop::execute (function *fun)
> 		    continue;
> 		  if (!is_gimple_assign (use_stmt)
> 		      || (gimple_assign_rhs_code (use_stmt) != REALPART_EXPR
>-			  && gimple_assign_rhs_code (use_stmt) != IMAGPART_EXPR))
>+			  && gimple_assign_rhs_code (use_stmt) != IMAGPART_EXPR)
>+		      || TREE_OPERAND (gimple_assign_rhs1 (use_stmt), 0) != lhs)
> 		    {
> 		      rewrite = false;
> 		      break;
>@@ -2877,7 +2878,8 @@ pass_forwprop::execute (function *fun)
> 		  if (is_gimple_debug (use_stmt))
> 		    continue;
> 		  if (!is_gimple_assign (use_stmt)
>-		      || gimple_assign_rhs_code (use_stmt) != BIT_FIELD_REF)
>+		      || gimple_assign_rhs_code (use_stmt) != BIT_FIELD_REF
>+		      || TREE_OPERAND (gimple_assign_rhs1 (use_stmt), 0) != lhs)
> 		    {
> 		      rewrite = false;
> 		      break;



More information about the Gcc-patches mailing list