This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [RFA][PATCH][PR rtl-optimization/47477] Type narrowing in match.pd
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Jeff Law <law at redhat dot com>
- Cc: gcc-patches <gcc-patches at gcc dot gnu dot org>
- Date: Wed, 11 Feb 2015 11:56:18 +0100
- Subject: Re: [RFA][PATCH][PR rtl-optimization/47477] Type narrowing in match.pd
- Authentication-results: sourceware.org; auth=none
- References: <54DA7029 dot 4070001 at redhat dot com>
On Tue, Feb 10, 2015 at 9:55 PM, Jeff Law <law@redhat.com> wrote:
>
> This PR was originally minor issue where we regressed on this kind of
> sequence:
>
> typedef struct toto_s *toto_t;
> toto_t add (toto_t a, toto_t b) {
> int64_t tmp = (int64_t)(intptr_t)a + ((int64_t)(intptr_t)b&~1L);
> return (toto_t)(intptr_t) tmp;
> }
>
>
> There was talk of trying to peephole this in the x86 backend. But later
> Jakub speculated that if we had good type narrowing this could be done in
> the tree optimizers...
>
> Soooo, here we go. I didn't do anything with logicals are those are already
> handled elsewhere in match.pd. I didn't try to handle MULT as in the early
> experiments I did, it was a lose because of the existing mechanisms for
> widening multiplications.
>
> Interestingly enough, this patch seems to help out libjava more than
> anything else in a GCC build and it really only helps a few routines. There
> weren't any routines I could see where the code regressed after this patch.
> This is probably an indicator that these things aren't *that* common, or the
> existing shortening code better than we thought, or some important
> shortening case is missing.
>
>
> I think we should pull the other tests from 47477 which are not regressions
> out into their own bug for future work. Or alternately, when this fix is
> checked in remove the regression marker in 47477.
>
>
> Bootstrapped and regression tested on x86_64-unknown-linux-gnu. OK for the
> trunk?
>
>
>
>
>
>
>
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 7f3816c..7a95029 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,8 @@
> +2015-02-10 Jeff Law <law@redhat.com>
> +
> + * match.pd (convert (plus/minus (convert @0) (convert @1): New
> + simplifier to narrow arithmetic.
> +
Please reference PR47477 from here
> 2015-02-10 Richard Biener <rguenther@suse.de>
>
> PR tree-optimization/64909
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 81c4ee6..abc703e 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -1018,3 +1018,21 @@ along with GCC; see the file COPYING3. If not see
> (logs (pows @0 @1))
> (mult @1 (logs @0)))))
>
> +/* If we have a narrowing conversion of an arithmetic operation where
> + both operands are widening conversions from the same type as the outer
> + narrowing conversion. Then convert the innermost operands to a suitable
> + unsigned type (to avoid introducing undefined behaviour), perform the
> + operation and convert the result to the desired type.
> +
> + This narrows the arithmetic operation. */
This is also a part of what shorten_binary_op does, so please refer to
that in the comment so we can eventually complete this from there
and remove shorten_binary_op.
> +(for op (plus minus)
> + (simplify
> + (convert (op (convert@2 @0) (convert @1)))
> + (if (TREE_TYPE (@0) == TREE_TYPE (@1)
> + && TREE_TYPE (@0) == type
Otherwhere in match.pd we use
(GIMPLE && types_compatible_p (TREE_TYPE (@0), TREE_TYPE (@1)))
|| (GENERIC && TREE_TYPE (@0) == TREE_TYPE (@1))
for type equality checks. And I think even for GENERIC you want
to ignore qualifiers and thus use TYPE_MAIN_VARIANT (TREE_TYPE (@0))
== TYPE_MAIN_VARAINT (TREE_TYPE (@1)).
> + && INTEGRAL_TYPE_P (type)
So instead of testing TREE_TYPE (@0) == type we probably want
to just assert that TYPE_PRECISON (TREE_TYPE (@0)) == TYPE_PRECISION
(type) to allow sign-changes. Say for
short i, j;
(unsigned short) (i + j)
we still want to narrow the widened i and j.
> + && TYPE_PRECISION (TREE_TYPE (@2)) > TYPE_PRECISION (TREE_TYPE
> (@0))
> + /* This prevents infinite recursion. */
> + && unsigned_type_for (TREE_TYPE (@0)) != TREE_TYPE (@2))
How can it recurse with the type precision constraint right above?
If you think it's necessary then nesting that condition below as
> + (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }
(if (utype != TREE_TYPE (@2))
avoids evaluating unsigned_type_for twice.
Note that technically we don't need to perform the operation in an unsigned
type iff TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)). Thus
(if (TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))
(convert (op @0 @1)))
(with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }
(convert (op (convert:utype @0) (convert:utype @1))))))
You can see that GCC exploits that with -fwrapv. Maybe this
simplification should be split out into a separate pattern though,
tacking sign-changes for binary ops only.
Thanks,
Richard.
> + (convert (op (convert:utype @0) (convert:utype @1)))))))
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index 15d5e2d..76e5254 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,8 @@
> +2015-02-10 Jeff Law <law@redhat.com>
> +
> + PR rtl-optimization/47477
> + * gcc.dg/tree-ssa/narrow-arith-1.c: New test.
> +
> 2015-02-10 Richard Biener <rguenther@suse.de>
>
> PR tree-optimization/64909
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/narrow-arith-1.c
> b/gcc/testsuite/gcc.dg/tree-ssa/narrow-arith-1.c
> new file mode 100644
> index 0000000..104cb6f5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/narrow-arith-1.c
> @@ -0,0 +1,22 @@
> +/* PR tree-optimization/47477 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized -w" } */
> +/* { dg-require-effective-target ilp32 } */
> +
> +typedef int int64_t __attribute__ ((__mode__ (__DI__)));
> +typedef int * intptr_t;
> +
> +typedef struct toto_s *toto_t;
> +toto_t add (toto_t a, toto_t b) {
> + int64_t tmp = (int64_t)(intptr_t)a + ((int64_t)(intptr_t)b&~1L);
> + return (toto_t)(intptr_t) tmp;
> +}
> +
> +/* For an ILP32 target there'll be 6 casts when we start, but just 4
> + if the match.pd pattern is successfully matched. */
> +/* { dg-final { scan-tree-dump-times "= \\(int\\)" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "= \\(unsigned int\\)" 2 "optimized" }
> } */
> +/* { dg-final { scan-tree-dump-times "= \\(struct toto_s \\*\\)" 1
> "optimized" } } */
> +/* { dg-final { cleanup-tree-dump "optimized" } } */
> +
> +
>