[PATCH] tree-ssa-math-opts: Pattern recognize hand written __builtin_mul_overflow_p with same unsigned types even when target just has highpart umul [PR101856]
Richard Biener
rguenther@suse.de
Fri May 19 10:43:19 GMT 2023
> On 19.05.2023 at 10:00, Jakub Jelinek <jakub@redhat.com> wrote:
>
> Hi!
>
> As the following testcase shows, on i?86/x86_64 we pattern recognize it
> as return __builtin_mul_overflow_p (x, y, 0UL) and thereby avoid the
> extra division, but we don't do so on e.g. aarch64 or ppc64le, even
> though return __builtin_mul_overflow_p (x, y, 0UL); actually produces
> better code there. The presence of the optab handler is tested to make
> sure the generated code for it is short, so that we don't pessimize
> code instead of optimizing it.
> But there is one case that the internal-fn.cc .MUL_OVERFLOW expansion
> handles nicely: the arguments and result are of the same mode
> TYPE_UNSIGNED type, only the IMAGPART_EXPR of the result is used (i.e.
> __builtin_mul_overflow_p rather than __builtin_mul_overflow), and
> umul_highpart_optab supports the particular mode. In that case
> we emit a comparison of the highpart umul result against zero.
>
> So, the following patch matches what we do in internal-fn.cc and
> also pattern matches __builtin_mul_overflow_p if
> 1) we only need the flag whether it overflowed (i.e. !use_seen)
> 2) it is unsigned (i.e. !cast_stmt)
> 3) umul_highpart is supported for the mode
>
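For illustration only (this is not part of the patch): a minimal C sketch of why comparing the high part of the product against zero answers the same question as the hand-written division check. The function names are hypothetical, and the sketch assumes a 64-bit target where the compiler provides the unsigned __int128 extension. The division form is undefined for y == 0, so it is only meaningful for nonzero y.

```c
#include <stdint.h>

/* Hand-written overflow check in the same shape as foo () in the
   testcase: multiply, divide back, compare.  Undefined for y == 0.  */
int
overflow_by_division (uint64_t x, uint64_t y)
{
  uint64_t z = x * y;	/* Wraps modulo 2^64 on overflow.  */
  return z / y != x;
}

/* High-part check: the 64x64 -> 128 bit product fits into 64 bits
   exactly when its upper 64 bits are all zero.  This mirrors the
   "compare highpart umul result against zero" idea described above.  */
int
overflow_by_highpart (uint64_t x, uint64_t y)
{
  unsigned __int128 wide = (unsigned __int128) x * y;
  return (uint64_t) (wide >> 64) != 0;
}
```

Both functions agree for every nonzero y, which is why only the overflow flag, and not the wrapped product itself, has to be computed when the result of the multiplication is otherwise unused.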
> Bootstrapped/regtested on x86_64-linux, i686-linux, aarch64-linux and
> powerpc64le-linux, ok for trunk?
Ok.
Richard
> 2023-05-19 Jakub Jelinek <jakub@redhat.com>
>
> PR tree-optimization/101856
> * tree-ssa-math-opts.cc (match_arith_overflow): Pattern detect
> unsigned __builtin_mul_overflow_p even when umulv4_optab doesn't
> support it but umul_highpart_optab does.
>
> * gcc.dg/tree-ssa/pr101856.c: New test.
>
> --- gcc/tree-ssa-math-opts.cc.jj 2023-05-17 20:57:59.537914382 +0200
> +++ gcc/tree-ssa-math-opts.cc 2023-05-18 12:04:09.332336899 +0200
> @@ -4074,7 +4074,10 @@ match_arith_overflow (gimple_stmt_iterat
> TYPE_MODE (type)) == CODE_FOR_nothing)
> || (code == MULT_EXPR
> && optab_handler (cast_stmt ? mulv4_optab : umulv4_optab,
> - TYPE_MODE (type)) == CODE_FOR_nothing))
> + TYPE_MODE (type)) == CODE_FOR_nothing
> + && (use_seen
> + || cast_stmt
> + || !can_mult_highpart_p (TYPE_MODE (type), true))))
> {
> if (code != PLUS_EXPR)
> return false;
> --- gcc/testsuite/gcc.dg/tree-ssa/pr101856.c.jj 2023-05-18 11:57:17.681206745 +0200
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr101856.c 2023-05-18 11:56:51.662577752 +0200
> @@ -0,0 +1,11 @@
> +/* PR tree-optimization/101856 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump " .MUL_OVERFLOW " "optimized" { target i?86-*-* x86_64-*-* aarch64*-*-* powerpc64le-*-* } } } */
> +
> +int
> +foo (unsigned long x, unsigned long y)
> +{
> + unsigned long z = x * y;
> + return z / y != x;
> +}
>
> Jakub
>