[PATCH] sra: Fix sra_modify_expr handling of partial writes (PR 94482)
Richard Biener
rguenther@suse.de
Wed Apr 8 18:56:41 GMT 2020
On April 8, 2020 8:34:08 PM GMT+02:00, Martin Jambor <mjambor@suse.cz> wrote:
>Hi,
>
>On Wed, Apr 08 2020, Richard Biener wrote:
>> On Tue, 7 Apr 2020, Richard Biener wrote:
>>
>>> On April 7, 2020 6:25:26 PM GMT+02:00, Martin Jambor
><mjambor@suse.cz> wrote:
>>> >Hi,
>>> >
>>> >On Tue, Apr 07 2020, Richard Biener wrote:
>>> >> On Tue, 7 Apr 2020, Martin Jambor wrote:
>>> >>
>>> >>> Hi,
>>> >>>
>>> >>> when sra_modify_expr is invoked on an expression that modifies
>only
>>> >>> part of the underlying replacement, such as a BIT_FIELD_REF on a
>LHS
>>> >>> of an assignment and the SRA replacement's type is not
>compatible
>>> >with
>>> >>> what is being replaced (0th operand of the B_F_R in the above
>>> >>> example), it does not work properly, basically throwing away the
>>> >partd
>>> >>> of the expr that should have stayed intact.
>>> >>>
>>> >>> This is fixed in two ways. For BIT_FIELD_REFs, which operate on
>the
>>> >>> binary image of the replacement (and so in a way serve as a
>>> >>> VIEW_CONVERT_EXPR) we just do not bother with convertsing. For
>>> >>> REALPART_EXPRs and IMAGPART_EXPRs, we insert a VIEW_CONVERT_EXPR
>>> >under
>>> >>> the complex partial access expression, which is OK even in a LHS
>of
>>> >an
>>> >>> assignment (and other write contexts) because in those cases the
>>> >>> replacement is not going to be a giple register anyway.
>>> >>>
>>> >>> The testcase for handling REALPART_EXPR and IMAGPART_EXPR is a
>bit
>>> >>> fragile because SRA prefers complex and vector types over
>anything
>>> >else
>>> >>> (and in between the two it decides based on TYPE_UID which in my
>>> >testing
>>> >>> today always preferred complex types) and even when GIMPLE is
>wrong
>>> >I
>>> >>> often still did not get failing runs, so I only run it at -O1
>(which
>>> >is
>>> >>> the only level where the the test fails for me).
>>> >>>
>>> >>> Bootstrapped and tested on x86_64-linux, bootstrap and testing
>on
>>> >>> i686-linux and aarch64-linux underway.
>>> >>>
>>> >>> OK for trunk (and subsequently for release branches) if it
>passes?
>>> >>
>>> >> OK.
>>> >
>>> >I must have been already half asleep when writing that it passed
>>> >testing. It did not, testsuite/gcc.c-torture/compile/pr42196-3.c
>fails
>>> >even on x86_64, because fwprop happily combines
>>> >
>>> > u$ci_10 = MEM[(union U *)&u];
>>> >
>>> >and
>>> >
>>> > f1_6 = IMAGPART_EXPR <VIEW_CONVERT_EXPR<complex float>(u$ci_10)>;
>>> >
>>> >into
>>> >
>>> > f1_6 = IMAGPART_EXPR <MEM[(union U *)&u]>;
>>> >
>>> >and the gimple verifier of course does not like that. I'll have a
>look
>>> >at that tomorrow.
>>>
>>> Ah, that might be a genuine fwprop bug.
>>
>> On a second thought
>>
>> f1_6 = IMAGPART_EXPR <VIEW_CONVERT_EXPR<complex float>(u$ci_10)>;
>>
>> shouldn't be valid GIMPLE since 'complex float' is a register type
>> and thus it should have been
>>
>> _11 = VIEW_CONVERT_EXPR<complex float>(u$ci_10);
>> f1_6 = IMAGPART_EXPR <_11>;
>>
>
>OK, the newest version of the patch below is careful to do that if the
>replacement is going to be a gimple_register,
>
>> now it's a bit of a grey area since a load like
>>
>> f1_6 = IMAGPART_EXPR <VIEW_CONVERT_EXPR<complex float>(u);
>>
>> is valid (we don't want to force a whole _Complex load here).
>
>and the above for non-gimple-registers.
>
>>
>> For example in VN
>>
>> f1_6 = IMAGPART_EXPR <VIEW_CONVERT_EXPR<complex float>(u$ci_10)>;
>>
>> goes through the load machinery.
>>
>> So let's say the IL is undesirable at least.
>>
>> The following fixes the forwprop laziness, please include it in the
>> patch so it gets cherry-picked for backports.
>
>I added it to the modified patch, it was still necessary. The version
>below has passed bootstrap and testing on x86-64-linux and
>aarch64-linux
>(and I have double checked!) and bootstrap on i686-linux, testing on
>the
>32-bit platform is still ongoing. OK if it passes there too?
OK.
Richard.
>Thanks,
>
>Martin
>
>
>sra: Fix sra_modify_expr handling of partial writes (PR 94482)
>
>when sra_modify_expr is invoked on an expression that modifies only
>part of the underlying replacement, such as a BIT_FIELD_REF on a LHS
>of an assignment and the SRA replacement's type is not compatible with
>what is being replaced (0th operand of the B_F_R in the above
>example), it does not work properly, basically throwing away the partd
>of the expr that should have stayed intact.
>
>This is fixed in two ways. For BIT_FIELD_REFs, which operate on the
>binary image of the replacement (and so in a way serve as a
>VIEW_CONVERT_EXPR) we just do not bother with convertsing. For
>REALPART_EXPRs and IMAGPART_EXPRs, if the replacement is not a
>register, we insert a VIEW_CONVERT_EXPR under
>the complex partial access expression, which is always OK, for loads
>from registers we take the extra step of converting it to a temporary.
>
>This revealed a bug in fwprop which is fixed with the hunk from Richi.
>
>The testcase for handling REALPART_EXPR and IMAGPART_EXPR is a bit
>fragile because SRA prefers complex and vector types over anything
>else (and in between the two it decides based on TYPE_UID which in my
>testing today always preferred complex types) and so I only run it at
>-O1 (which is the only level where the the test fails for me).
>
>
>2020-04-08 Martin Jambor <mjambor@suse.cz>
> Richard Biener <rguenther@suse.de>
>
> PR tree-optimization/94482
> * tree-sra.c (create_access_replacement): Dump new replacement with
> TDF_UID.
> (sra_modify_expr): Fix handling of cases when the original EXPR writes
> to only part of the replacement.
> * tree-ssa-forwprop.c (pass_forwprop::execute): Properly verify
> the first operand of combinations into REAL/IMAGPART_EXPR and
> BIT_FIELD_REF.
>
> testsuite/
> * gcc.dg/torture/pr94482.c: New test.
> * gcc.dg/tree-ssa/pr94482-2.c: Likewise.
>---
> gcc/ChangeLog | 12 ++++++
> gcc/testsuite/ChangeLog | 6 +++
> gcc/testsuite/gcc.dg/torture/pr94482.c | 34 +++++++++++++++
> gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c | 50 +++++++++++++++++++++++
> gcc/tree-sra.c | 31 ++++++++++++--
> gcc/tree-ssa-forwprop.c | 6 ++-
> 6 files changed, 133 insertions(+), 6 deletions(-)
> create mode 100644 gcc/testsuite/gcc.dg/torture/pr94482.c
> create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c
>
>diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>index e10fb251c16..675ac2a4b6a 100644
>--- a/gcc/ChangeLog
>+++ b/gcc/ChangeLog
>@@ -1,3 +1,15 @@
>+2020-04-08 Martin Jambor <mjambor@suse.cz>
>+ Richard Biener <rguenther@suse.de>
>+
>+ PR tree-optimization/94482
>+ * tree-sra.c (create_access_replacement): Dump new replacement with
>+ TDF_UID.
>+ (sra_modify_expr): Fix handling of cases when the original EXPR
>writes
>+ to only part of the replacement.
>+ * tree-ssa-forwprop.c (pass_forwprop::execute): Properly verify
>+ the first operand of combinations into REAL/IMAGPART_EXPR and
>+ BIT_FIELD_REF.
>+
> 2020-04-05 Zachary Spytz <zspytz@gmail.com>
>
> * extend.texi: Add free to list of ISO C90 functions that
>diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
>index 88bab5d3d19..b0d94b5394e 100644
>--- a/gcc/testsuite/ChangeLog
>+++ b/gcc/testsuite/ChangeLog
>@@ -1,3 +1,9 @@
>+2020-04-08 Martin Jambor <mjambor@suse.cz>
>+
>+ PR tree-optimization/94482
>+ * gcc.dg/torture/pr94482.c: New test.
>+ * gcc.dg/tree-ssa/pr94482-2.c: Likewise.
>+
> 2020-04-05 Iain Sandoe <iain@sandoe.co.uk>
>
> * g++.dg/coroutines/torture/co-await-14-template-traits.C: Rename...
>diff --git a/gcc/testsuite/gcc.dg/torture/pr94482.c
>b/gcc/testsuite/gcc.dg/torture/pr94482.c
>new file mode 100644
>index 00000000000..5bb19e745c2
>--- /dev/null
>+++ b/gcc/testsuite/gcc.dg/torture/pr94482.c
>@@ -0,0 +1,34 @@
>+/* { dg-do run } */
>+
>+typedef unsigned V __attribute__ ((__vector_size__ (16)));
>+union U
>+{
>+ V j;
>+ unsigned long long i __attribute__ ((__vector_size__ (16)));
>+};
>+
>+static inline __attribute__((always_inline)) V
>+foo (unsigned long long a)
>+{
>+ union U z = { .j = (V) {} };
>+ for (unsigned long i = 0; i < 1; i++)
>+ z.i[i] = a;
>+ return z.j;
>+}
>+
>+static inline __attribute__((always_inline)) V
>+bar (V a, unsigned long long i, int q)
>+{
>+ union U z = { .j = a };
>+ z.i[q] = i;
>+ return z.j;
>+}
>+
>+int
>+main ()
>+{
>+ union U z = { .j = bar (foo (1729), 2, 1) };
>+ if (z.i[0] != 1729)
>+ __builtin_abort ();
>+ return 0;
>+}
>diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c
>b/gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c
>new file mode 100644
>index 00000000000..fcac9d5e439
>--- /dev/null
>+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c
>@@ -0,0 +1,50 @@
>+/* { dg-do run } */
>+/* { dg-options "-O1" } */
>+
>+typedef unsigned long V __attribute__ ((__vector_size__ (8)));
>+typedef _Complex int Ci;
>+typedef _Complex float Cf;
>+
>+union U
>+{
>+ Ci ci;
>+ Cf cf;
>+};
>+
>+volatile Ci vgi;
>+
>+Cf foo (Cf c)
>+{
>+ __real c = 0x1ffp10;
>+ return c;
>+}
>+
>+Ci ioo (Ci c)
>+{
>+ __real c = 50;
>+ return c;
>+}
>+
>+
>+int main (int argc, char *argv[])
>+{
>+ union U u;
>+
>+ __real u.ci = 500;
>+ __imag u.ci = 1000;
>+ vgi = u.ci;
>+
>+ u.ci = ioo (u.ci);
>+ __imag u.ci = 100;
>+
>+ if (__real u.ci != 50 || __imag u.ci != 100)
>+ __builtin_abort();
>+
>+ u.cf = foo (u.cf);
>+ __imag u.cf = 0x1p3;
>+
>+ if (__real u.cf != 0x1ffp10 || __imag u.cf != 0x1p3)
>+ __builtin_abort();
>+
>+ return 0;
>+}
>diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
>index b2056b58750..84c113c403c 100644
>--- a/gcc/tree-sra.c
>+++ b/gcc/tree-sra.c
>@@ -2257,7 +2257,7 @@ create_access_replacement (struct access *access,
>tree reg_type = NULL_TREE)
> print_generic_expr (dump_file, access->base);
> fprintf (dump_file, " offset: %u, size: %u: ",
> (unsigned) access->offset, (unsigned) access->size);
>- print_generic_expr (dump_file, repl);
>+ print_generic_expr (dump_file, repl, TDF_UID);
> fprintf (dump_file, "\n");
> }
> }
>@@ -3698,6 +3698,7 @@ sra_modify_expr (tree *expr, gimple_stmt_iterator
>*gsi, bool write)
> location_t loc;
> struct access *access;
> tree type, bfr, orig_expr;
>+ bool partial_cplx_access = false;
>
> if (TREE_CODE (*expr) == BIT_FIELD_REF)
> {
>@@ -3708,7 +3709,10 @@ sra_modify_expr (tree *expr,
>gimple_stmt_iterator *gsi, bool write)
> bfr = NULL_TREE;
>
>if (TREE_CODE (*expr) == REALPART_EXPR || TREE_CODE (*expr) ==
>IMAGPART_EXPR)
>- expr = &TREE_OPERAND (*expr, 0);
>+ {
>+ expr = &TREE_OPERAND (*expr, 0);
>+ partial_cplx_access = true;
>+ }
> access = get_access_for_expr (*expr);
> if (!access)
> return false;
>@@ -3736,13 +3740,32 @@ sra_modify_expr (tree *expr,
>gimple_stmt_iterator *gsi, bool write)
> be accessed as a different type too, potentially creating a need for
> type conversion (see PR42196) and when scalarized unions are involved
> in assembler statements (see PR42398). */
>- if (!useless_type_conversion_p (type, access->type))
>+ if (!bfr && !useless_type_conversion_p (type, access->type))
> {
> tree ref;
>
> ref = build_ref_for_model (loc, orig_expr, 0, access, gsi, false);
>
>- if (write)
>+ if (partial_cplx_access)
>+ {
>+ /* VIEW_CONVERT_EXPRs in partial complex access are always fine
>in
>+ the case of a write because in such case the replacement
>cannot
>+ be a gimple register. In the case of a load, we have to
>+ differentiate in between a register an non-register
>+ replacement. */
>+ tree t = build1 (VIEW_CONVERT_EXPR, type, repl);
>+ gcc_checking_assert (!write || access->grp_partial_lhs);
>+ if (!access->grp_partial_lhs)
>+ {
>+ tree tmp = make_ssa_name (type);
>+ gassign *stmt = gimple_build_assign (tmp, t);
>+ /* This is always a read. */
>+ gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
>+ t = tmp;
>+ }
>+ *expr = t;
>+ }
>+ else if (write)
> {
> gassign *stmt;
>
>diff --git a/gcc/tree-ssa-forwprop.c b/gcc/tree-ssa-forwprop.c
>index e7eaa18ccad..3d8acf7eb03 100644
>--- a/gcc/tree-ssa-forwprop.c
>+++ b/gcc/tree-ssa-forwprop.c
>@@ -2815,7 +2815,8 @@ pass_forwprop::execute (function *fun)
> continue;
> if (!is_gimple_assign (use_stmt)
> || (gimple_assign_rhs_code (use_stmt) != REALPART_EXPR
>- && gimple_assign_rhs_code (use_stmt) != IMAGPART_EXPR))
>+ && gimple_assign_rhs_code (use_stmt) != IMAGPART_EXPR)
>+ || TREE_OPERAND (gimple_assign_rhs1 (use_stmt), 0) != lhs)
> {
> rewrite = false;
> break;
>@@ -2877,7 +2878,8 @@ pass_forwprop::execute (function *fun)
> if (is_gimple_debug (use_stmt))
> continue;
> if (!is_gimple_assign (use_stmt)
>- || gimple_assign_rhs_code (use_stmt) != BIT_FIELD_REF)
>+ || gimple_assign_rhs_code (use_stmt) != BIT_FIELD_REF
>+ || TREE_OPERAND (gimple_assign_rhs1 (use_stmt), 0) != lhs)
> {
> rewrite = false;
> break;
More information about the Gcc-patches
mailing list