This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [patch tree-ssa-forwprop]: Add type raising in shift-operations
- From: Kai Tietz <ktietz at redhat dot com>
- To: Jeff Law <law at redhat dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, Richard Guenther <richard dot guenther at gmail dot com>
- Date: Fri, 22 Nov 2013 20:43:24 -0500 (EST)
- Subject: Re: [patch tree-ssa-forwprop]: Add type raising in shift-operations
- Authentication-results: sourceware.org; auth=none
- References: <1875508960 dot 38568682 dot 1385046927867 dot JavaMail dot root at redhat dot com> <528FE1D4 dot 9030501 at redhat dot com>
----- Original Message -----
> On 11/21/13 08:15, Kai Tietz wrote:
> > Hi,
> >
> > this is the required forward-propagation part for the type-demotion pass as
> > separate patch.
> > This patch is sharing some adjustments in testsuite due new optimization of
> > type-casts on shift-operations.
> >
> > This patch tries to generate the "normal" form (TYPE) (X shift-op Y) out of
> > the "denormal" form "((TYPE) X) shift-op Y".
> >
> > ChangeLog
> >
> > 2013-11-21 Kai Tietz <ktietz@redhat.com>
> >
> > * tree-ssa-forwprop.c (simplify_shift): New function.
> > (ssa_forward_propagate_and_combine): Use it.
> >
> > ChangeLog testsuite
> >
> > * gcc.dg/tree-ssa-ts-shift-2.c: New test.
> >
> >
> > Shared testsuite part between type-demotion and forward-propagation
> > patches.
> >
> > Changelog gcc/testsuite:
> >
> > * gcc.dg/vect/vect-over-widen-1-big-array.c: Likewise.
> > * gcc.dg/vect/vect-over-widen-1.c: Likewise.
> > * gcc.dg/vect/vect-over-widen-3-big-array.c: Likewise.
> > * gcc.dg/vect/vect-over-widen-3.c: Likewise.
> > * gcc.dg/vect/vect-over-widen-4-big-array.c: Likewise.
> > * gcc.dg/vect/vect-over-widen-4.c: Likewise.
> >
> > Bootstrapped for x86_64-unknown-linux-gnu. Ok for apply?
> >
> > Index: gcc-org/gcc/testsuite/gcc.dg/tree-ssa/ts-shift-2.c
> > ===================================================================
> > --- /dev/null
> > +++ gcc-org/gcc/testsuite/gcc.dg/tree-ssa/ts-shift-2.c
> > @@ -0,0 +1,33 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -fdump-tree-optimized" } */
> > +
> > +unsigned char f1 (char x)
> > +{
> > + unsigned char x1 = (unsigned char) x;
> > + return (x1 << 5);
> > +}
> > +
> > +unsigned short f2 (unsigned int x)
> > +{
> > + unsigned short x1 = (unsigned short) x;
> > + return (x1 << 15);
> > +}
> > +
> > +unsigned int f3 (unsigned short x)
> > +{
> > + unsigned long long x1 = (unsigned long long) x;
> > + return (unsigned int) (x1 >> 15);
> > +}
> > +
> > +unsigned long long f4 (unsigned short x)
> > +{
> > + unsigned int x1 = (unsigned int) x;
> > + return (unsigned long long) (x1 >> 7);
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "= \\\(long long unsigned int\\\)" 1
> > "optimized" } } */
> > +/* { dg-final { scan-tree-dump-times "= \\\(short unsigned int\\\)" 1
> > "optimized" } } */
> > +/* { dg-final { scan-tree-dump-times "= \\\(unsigned int\\\)" 1
> > "optimized" } } */
> > +/* { dg-final { scan-tree-dump-times "= \\\(unsigned char\\\)" 1
> > "optimized" } } */
> So at least for f1 and f2, these regexps match regardless of whether or
> not the code was transformed.
>
> Without your patch (f1)
> x1_2 = (unsigned char) x_1(D);
> _3 = x1_2 << 5;
> return _3;
>
> With your patch (f1):
> _4 = x_1(D) << 5;
> _2 = (unsigned char) _4;
> return _2;
>
> So all that's happened is we changed which object got casted -- we have
> the same number casts of an object to (unsigned char).
That is actual wished. We shouldn't come to patterns, which have more type-casts by this patch.
What we see here is the normalization of shift-left/right operations. This patch takes care that we prefer in general (Type) (X shift-left/right y) over ((Type) X) shift-left/right y notation.
> ISTM you might be better off scanning the details of the forwprop dump.
> For example, in f1 we get:
>
> Replaced _3 = x1_2 << 5;
> x1_2 = (unsigned char) x_1(D);
> by by _3 = (unsigned char) _5;
> _5 = x_1(D) << 5;
>
>
> We basically see the same thing happen in f2.
>
> For f1 & f2 I don't see that we've done anything other than change the
> casts. They don't show any benefit from this patch.
>
> f3 & f4 on the other hand do show improvements.
I adjusted test so, that f1, and f2 are showing an effect, too. I had kept here simple transformation to see normalized pattern-form.
In new variant the checks won't match without that patch anymore.
>
> WHen I looked over a bunch of .i files, the results were mixed.
> Sometimes better, sometimes worse without any clear bias.
Hmm, this sample here is questionable.
> So here's an example that gets worse:
>
>
> *************** rshift_double (long unsigned int l1, lon
> *** 651,661 ****
> _29 = (long int) _28;
> *hv_13(D) = _29;
> _31 = l1_9(D) >> _27;
> ! _32 = (unsigned int) count_4(D);
> ! _33 = 63 - _32;
> ! _34 = (int) _33;
> ! _35 = h1.18_26 << _34;
> ! _36 = _35 << 1;
> _37 = _36 | _31;
> *lv_12(D) = _37;
> ;; succ: 11
> --- 652,663 ----
> _29 = (long int) _28;
> *hv_13(D) = _29;
> _31 = l1_9(D) >> _27;
> ! _33 = (unsigned int) count_4(D);
> ! _34 = 63 - _33;
> ! _35 = (int) _34;
> ! _71 = h1_10(D) << _35;
> ! _32 = _71 << 1;
> ! _36 = (long unsigned int) _32;
> _37 = _36 | _31;
> *lv_12(D) = _37;
>
>
> > +/* { dg-final { cleanup-tree-dump "optimized" } } */
> > +
> > Index: gcc-org/gcc/tree-ssa-forwprop.c
> > ===================================================================
> > --- gcc-org.orig/gcc/tree-ssa-forwprop.c
> > +++ gcc-org/gcc/tree-ssa-forwprop.c
> > @@ -551,6 +551,107 @@ forward_propagate_into_gimple_cond (gimp
> > return 0;
> > }
> >
> > +/* Try to simplify shift-statement ((TYPE1) X) CODE Y for
> > + integral-kind types.
> > + Returns none-zero if the stmt was changed. */
> Simplify it to what? I know from reading the patch explanation, but
> it'd be better to include that in a comment. Even better would be to
> show a little sequence of psuedo-gimple and how it gets changed.
Agreed. Simplify might be the wrong term here. Due we are doing pattern-transformation - if possible - into its normalized form.
I modified comment here, and adjusted the print to display by-string only once.
>
> > +
> > +static int
> > +simplify_shift (enum tree_code code, gimple_stmt_iterator *gsi_p)
> > +{
> > + gimple stmt = gsi_stmt (*gsi_p);
> > + gimple def, newop;
> > + tree op1, opx, op2, t_opx, tem;
> > + int ret;
> > +
> > + if (!INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_lhs (stmt))))
> > + return 0;
> > +
> > + op1 = gimple_assign_rhs1 (stmt);
> > + if (TREE_CODE (op1) != SSA_NAME
> > + || !(def = SSA_NAME_DEF_STMT (op1))
> > + || !is_gimple_assign (def)
> > + || !gimple_assign_cast_p (def)
> > + || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (op1)
> > + || !has_single_use (op1))
> > + return 0;
> > +
> > + opx = gimple_assign_rhs1 (def);
> > + if (TREE_CODE (opx) != SSA_NAME)
> > + return 0;
> > +
> > + t_opx = TREE_TYPE (opx);
> > +
> > + if (!INTEGRAL_TYPE_P (t_opx))
> > + return 0;
> > +
> > + op2 = gimple_assign_rhs2 (stmt);
> > +
> > + if (code == LSHIFT_EXPR)
> > + {
> > + if (TYPE_PRECISION (TREE_TYPE (op1))
> > + > TYPE_PRECISION (t_opx))
> > + return 0;
> > + }
> > + else if (code == RSHIFT_EXPR)
> > + {
> > + /* If type of OPX and OP1 are compatible, we don't need to check OP2
> > + for validity as we don't change range of operation.
> > + Otherwise we need to check that right-hand operand of shift-right
> > + doesn't exceed type-precision of inner operand. */
> There's some kind of whitespace problem here.
>
> > +
> > + /* Do transformation ((T1) X) shift-code N -> (T1) (X shift-code N). */
> > + if (dump_file)
> > + {
> > + fprintf (dump_file, " Replaced ");
> > + print_gimple_stmt (dump_file, stmt, 0, 0);
> > + fprintf (dump_file, " ");
> > + print_gimple_stmt (dump_file, def, 0, 0);
> > + fprintf (dump_file, " by ");
> > + }
> > +
> > + tem = make_ssa_name (t_opx, NULL);
> > + newop = gimple_build_assign_with_ops (code, tem, opx, op2);
> > + gimple_set_location (newop, gimple_location (stmt));
> > + gsi_insert_before (gsi_p, newop, GSI_SAME_STMT);
> > + gimple_assign_set_rhs_with_ops_1 (gsi_p, NOP_EXPR, tem, NULL_TREE,
> > NULL_TREE);
> > +
> > + if (gimple_location (def) != UNKNOWN_LOCATION)
> > + gimple_set_location (stmt, gimple_location (def));
> > +
> > + stmt = gsi_stmt (*gsi_p);
> > + update_stmt (stmt);
> > +
> > + if (TREE_CODE (op1) == SSA_NAME)
> > + ret = remove_prop_source_from_use (op1) ? 2 : 1;
> > + else
> > + ret = 1;
> > + if (dump_file)
> > + {
> > + fprintf (dump_file, " by ");
> > + print_gimple_stmt (dump_file, stmt, 0, 0);
> > + fprintf (dump_file, " ");
> > + print_gimple_stmt (dump_file, newop, 0, 0);
> As someone has already noted, the dump files are sub-optimal with the
> "by by" output:
Yeah, adjusted it in attached patch.
> Replaced _3 = x1_2 << 5;
> x1_2 = (unsigned char) x_1(D);
> by by _3 = (unsigned char) _5;
> _5 = x_1(D) << 5;
> >
> > /* Propagate from the ssa name definition statements of COND_EXPR
> > in the rhs of statement STMT into the conditional if that simplifies
> > it.
> > @@ -3432,6 +3533,15 @@ ssa_forward_propagate_and_combine (void)
> > || code == NEGATE_EXPR)
> > && TREE_CODE (rhs1) == SSA_NAME)
> > changed = simplify_not_neg_expr (&gsi);
> > + else if (code == LSHIFT_EXPR
> > + || code == RSHIFT_EXPR)
> > + {
> > + int did_something;
> > + did_something = simplify_shift (code, &gsi);
> > + if (did_something == 2)
> > + cfg_changed = true;
> > + changed = did_something != 0;
> > + }
> > else if (code == COND_EXPR
> > || code == VEC_COND_EXPR)
> > {
> Does this apply to rotates as well?
For rotate this transformation doesn't apply in the same way. Just in case that type-cast and X have same precision we could do this transformation. So this case might be some future enhancement for this function.
>
>
> >
> > Index: gcc-trunk/gcc/testsuite/gcc.dg/vect/vect-over-widen-1-big-array.c
> > ===================================================================
> > --- gcc-trunk.orig/gcc/testsuite/gcc.dg/vect/vect-over-widen-1-big-array.c
> > +++ gcc-trunk/gcc/testsuite/gcc.dg/vect/vect-over-widen-1-big-array.c
> > @@ -58,9 +58,8 @@ int main (void)
> > return 0;
> > }
> >
> > -/* { dg-final { scan-tree-dump-times "vect_recog_widen_shift_pattern:
> > detected" 2 "vect" { target vect_widen_shift } } } */
> > -/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern:
> > detected" 2 "vect" { target vect_widen_shift } } } */
> > -/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern:
> > detected" 4 "vect" { target { ! vect_widen_shift } } } } */
> > +/* { dg-final { scan-tree-dump-times "vect_recog_widen_shift_pattern:
> > detected" 4 "vect" { target vect_widen_shift } } } */
> > +/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern:
> > detected" 0 "vect" } } */
> > /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
> > /* { dg-final { cleanup-tree-dump "vect" } } */
> I'm guessing this was one of the cases you mentioned to me privately via
> email. Are we still vectorizing as well, just differently without the
> need for as many conversions?
Yes. The vectorization still happens in the same way as before. Just the shift-widening-patterns getting superfluous by this patch.
> Simliarly for the other vect testcases. At the least you need to
> indicate we're not actually regressing on those tests -- when you change
> an epected output in a way that makes it look like the test might be
> compromised, you need to explain yourself more thoroughly. My obvious
> concern is we're not vectorizing as much and the testsuite changes are
> merely a way of hiding that fact.
The output-test isn't changed. All tests still result in vectorized loop and if you take a closer look to generated output, you will notice that generated vectorization is identical to that one without this patch.
> I think there's some questions that need to be answered before this can
> go forward.
>
> jeff
>
>
>
Regards,
Kai
ChangeLog
2013-11-22 Kai Tietz <ktietz@redhat.com>
* tree-ssa-forwprop.c (simplify_shift): New function.
(ssa_forward_propagate_and_combine): Use it.
ChangeLog testsuite
* gcc.dg/tree-ssa-ts-shift-2.c: New test.
Bootstrapped for x86_64-unknown-linux-gnu. Ok for apply?
Regards,
Kai
Index: gcc-org/gcc/testsuite/gcc.dg/tree-ssa/ts-shift-2.c
===================================================================
--- /dev/null
+++ gcc-org/gcc/testsuite/gcc.dg/tree-ssa/ts-shift-2.c
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+char f1 (char x)
+{
+ unsigned char x1 = (unsigned char) x;
+ return (x1 << 5);
+}
+
+short f2 (unsigned int x)
+{
+ unsigned short x1 = (unsigned short) x;
+ return (x1 << 15);
+}
+
+unsigned int f3 (unsigned short x)
+{
+ unsigned long long x1 = (unsigned long long) x;
+ return (unsigned int) (x1 >> 15);
+}
+
+unsigned long long f4 (unsigned short x)
+{
+ unsigned int x1 = (unsigned int) x;
+ return (unsigned long long) (x1 >> 7);
+}
+
+/* { dg-final { scan-tree-dump-times "= \\\(long long unsigned int\\\)" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "= \\\(short int\\\)" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "= \\\(unsigned int\\\)" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "= \\\(unsigned char\\\)" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
+
Index: gcc-org/gcc/tree-ssa-forwprop.c
===================================================================
--- gcc-org.orig/gcc/tree-ssa-forwprop.c
+++ gcc-org/gcc/tree-ssa-forwprop.c
@@ -551,6 +551,107 @@ forward_propagate_into_gimple_cond (gimp
return 0;
}
+/* Try to normalize shift-left/right statement from form
+ ((TYPE1) X) CODE Y to (TYPE) (X CODE Y). TYPE, and X have to be
+ integral-kind types.
+ Returns none-zero if the stmt was changed. */
+
+static int
+simplify_shift (enum tree_code code, gimple_stmt_iterator *gsi_p)
+{
+ gimple stmt = gsi_stmt (*gsi_p);
+ gimple def, newop;
+ tree op1, opx, op2, t_opx, tem;
+ int ret;
+
+ if (!INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_lhs (stmt))))
+ return 0;
+
+ op1 = gimple_assign_rhs1 (stmt);
+ if (TREE_CODE (op1) != SSA_NAME
+ || !(def = SSA_NAME_DEF_STMT (op1))
+ || !is_gimple_assign (def)
+ || !gimple_assign_cast_p (def)
+ || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (op1)
+ || !has_single_use (op1))
+ return 0;
+
+ opx = gimple_assign_rhs1 (def);
+ if (TREE_CODE (opx) != SSA_NAME)
+ return 0;
+
+ t_opx = TREE_TYPE (opx);
+
+ if (!INTEGRAL_TYPE_P (t_opx))
+ return 0;
+
+ op2 = gimple_assign_rhs2 (stmt);
+
+ if (code == LSHIFT_EXPR)
+ {
+ if (TYPE_PRECISION (TREE_TYPE (op1))
+ > TYPE_PRECISION (t_opx))
+ return 0;
+ }
+ else if (code == RSHIFT_EXPR)
+ {
+ /* If type of OPX and OP1 are compatible, we don't need to check OP2
+ for validity as we don't change range of operation.
+ Otherwise we need to check that right-hand operand of shift-right
+ doesn't exceed type-precision of inner operand. */
+ if (!useless_type_conversion_p (TREE_TYPE (op1), t_opx))
+ {
+ if (TREE_CODE (op2) != INTEGER_CST)
+ return 0;
+ if (integer_zerop (fold_binary (LT_EXPR, boolean_type_node, op2,
+ build_int_cst (TREE_TYPE (op2),
+ TYPE_PRECISION (t_opx)))))
+ return 0;
+
+ /* See if cast can be moved out of the shift-right operation. */
+ if (TYPE_PRECISION (TREE_TYPE (op1)) <= TYPE_PRECISION (t_opx)
+ || (!TYPE_UNSIGNED (t_opx) && TYPE_UNSIGNED (TREE_TYPE (op1))))
+ return 0;
+ }
+ }
+ else
+ return 0;
+
+ /* Do transformation ((T1) X) shift-code N -> (T1) (X shift-code N). */
+ if (dump_file)
+ {
+ fprintf (dump_file, " Replaced ");
+ print_gimple_stmt (dump_file, stmt, 0, 0);
+ fprintf (dump_file, " ");
+ print_gimple_stmt (dump_file, def, 0, 0);
+ }
+
+ tem = make_ssa_name (t_opx, NULL);
+ newop = gimple_build_assign_with_ops (code, tem, opx, op2);
+ gimple_set_location (newop, gimple_location (stmt));
+ gsi_insert_before (gsi_p, newop, GSI_SAME_STMT);
+ gimple_assign_set_rhs_with_ops_1 (gsi_p, NOP_EXPR, tem, NULL_TREE, NULL_TREE);
+
+ if (gimple_location (def) != UNKNOWN_LOCATION)
+ gimple_set_location (stmt, gimple_location (def));
+
+ stmt = gsi_stmt (*gsi_p);
+ update_stmt (stmt);
+
+ if (TREE_CODE (op1) == SSA_NAME)
+ ret = remove_prop_source_from_use (op1) ? 2 : 1;
+ else
+ ret = 1;
+ if (dump_file)
+ {
+ fprintf (dump_file, " by ");
+ print_gimple_stmt (dump_file, stmt, 0, 0);
+ fprintf (dump_file, " ");
+ print_gimple_stmt (dump_file, newop, 0, 0);
+ }
+
+ return ret;
+}
/* Propagate from the ssa name definition statements of COND_EXPR
in the rhs of statement STMT into the conditional if that simplifies it.
@@ -3432,6 +3533,15 @@ ssa_forward_propagate_and_combine (void)
|| code == NEGATE_EXPR)
&& TREE_CODE (rhs1) == SSA_NAME)
changed = simplify_not_neg_expr (&gsi);
+ else if (code == LSHIFT_EXPR
+ || code == RSHIFT_EXPR)
+ {
+ int did_something;
+ did_something = simplify_shift (code, &gsi);
+ if (did_something == 2)
+ cfg_changed = true;
+ changed = did_something != 0;
+ }
else if (code == COND_EXPR
|| code == VEC_COND_EXPR)
{