[PATCH] tree-ssa-math-opts: Pattern recognize hand written __builtin_mul_overflow_p with same unsigned types even when target just has highpart umul [PR101856]

Richard Biener rguenther@suse.de
Fri May 19 10:43:19 GMT 2023



> On 19.05.2023 at 10:00, Jakub Jelinek <jakub@redhat.com> wrote:
> 
> Hi!
> 
> As the following testcase shows, on i?86/x86_64 we pattern recognize it
> as return __builtin_mul_overflow_p (x, y, 0UL) and so avoid the extra
> division, but we don't do that e.g. on aarch64 or ppc64le, even though
> return __builtin_mul_overflow_p (x, y, 0UL); actually produces better
> code there.  The presence of the optab handler is tested to make sure
> the code generated for it is short, so that we don't pessimize code
> instead of optimizing it.
> But there is one case that the internal-fn.cc .MUL_OVERFLOW expansion
> handles nicely: when the arguments and result have the same mode and
> TYPE_UNSIGNED types, only the IMAGPART_EXPR of the result is used (i.e.
> __builtin_mul_overflow_p rather than __builtin_mul_overflow) and
> umul_highpart_optab supports the particular mode.  In that case
> we emit a comparison of the highpart umul result against zero.
> 
> So, the following patch matches what we do in internal-fn.cc and
> also pattern matches __builtin_mul_overflow_p if
> 1) we only need the flag whether it overflowed (i.e. !use_seen)
> 2) it is unsigned (i.e. !cast_stmt)
> 3) umul_highpart is supported for the mode
> 
> Bootstrapped/regtested on x86_64-linux, i686-linux, aarch64-linux and
> powerpc64le-linux, ok for trunk?

Ok.

Richard 

> 2023-05-19  Jakub Jelinek  <jakub@redhat.com>
> 
>    PR tree-optimization/101856
>    * tree-ssa-math-opts.cc (match_arith_overflow): Pattern detect
>    unsigned __builtin_mul_overflow_p even when umulv4_optab doesn't
>    support it but umul_highpart_optab does.
> 
>    * gcc.dg/tree-ssa/pr101856.c: New test.
> 
> --- gcc/tree-ssa-math-opts.cc.jj    2023-05-17 20:57:59.537914382 +0200
> +++ gcc/tree-ssa-math-opts.cc    2023-05-18 12:04:09.332336899 +0200
> @@ -4074,7 +4074,10 @@ match_arith_overflow (gimple_stmt_iterat
>                TYPE_MODE (type)) == CODE_FOR_nothing)
>       || (code == MULT_EXPR
>      && optab_handler (cast_stmt ? mulv4_optab : umulv4_optab,
> -                TYPE_MODE (type)) == CODE_FOR_nothing))
> +                TYPE_MODE (type)) == CODE_FOR_nothing
> +      && (use_seen
> +          || cast_stmt
> +          || !can_mult_highpart_p (TYPE_MODE (type), true))))
>     {
>       if (code != PLUS_EXPR)
>    return false;
> --- gcc/testsuite/gcc.dg/tree-ssa/pr101856.c.jj    2023-05-18 11:57:17.681206745 +0200
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr101856.c    2023-05-18 11:56:51.662577752 +0200
> @@ -0,0 +1,11 @@
> +/* PR tree-optimization/101856 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump " .MUL_OVERFLOW " "optimized" { target i?86-*-* x86_64-*-* aarch64*-*-* powerpc64le-*-* } } } */
> +
> +int
> +foo (unsigned long x, unsigned long y)
> +{
> +  unsigned long z = x * y;
> +  return z / y != x;
> +}
> 
>    Jakub
> 
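
For readers following the argument, here is a minimal sketch (not part of
the patch) of the three formulations the description relates: the
hand-written division check from the testcase, the builtin, and a
comparison of the high half of the product against zero.  The helper names
and the portable mul_high routine are hypothetical illustrations; the
compiler would use the target's highpart multiply instruction instead.
__builtin_mul_overflow with an unused result stands in for
__builtin_mul_overflow_p, and the division form adds a y != 0 guard that
the testcase does not have.

```c
#include <assert.h>
#include <limits.h>

/* Hand-written check as in the testcase: multiply, divide back, compare.
   The y != 0 guard is an addition for this sketch only.  */
static int overflow_by_division(unsigned long x, unsigned long y)
{
    unsigned long z = x * y;
    return y != 0 && z / y != x;
}

/* What the pattern is recognized as; the product itself is discarded.  */
static int overflow_by_builtin(unsigned long x, unsigned long y)
{
    unsigned long unused;
    return __builtin_mul_overflow(x, y, &unused);
}

/* Hypothetical portable helper: upper half of the double-width product,
   computed from half-width partial products.  */
static unsigned long mul_high(unsigned long a, unsigned long b)
{
    const unsigned half = sizeof(unsigned long) * CHAR_BIT / 2;
    const unsigned long mask = (1UL << half) - 1;
    unsigned long a_lo = a & mask, a_hi = a >> half;
    unsigned long b_lo = b & mask, b_hi = b >> half;
    unsigned long lo_lo = a_lo * b_lo;
    unsigned long hi_lo = a_hi * b_lo;
    unsigned long lo_hi = a_lo * b_hi;
    unsigned long hi_hi = a_hi * b_hi;
    /* Middle column of the schoolbook product; this sum cannot wrap.  */
    unsigned long cross = (lo_lo >> half) + (hi_lo & mask) + lo_hi;
    return hi_hi + (hi_lo >> half) + (cross >> half);
}

/* Unsigned multiplication overflows exactly when the high half is nonzero.  */
static int overflow_by_highpart(unsigned long x, unsigned long y)
{
    return mul_high(x, y) != 0;
}

/* Asserts that all three formulations agree for one input pair.  */
static void check_pair(unsigned long x, unsigned long y)
{
    int expected = overflow_by_builtin(x, y);
    assert(overflow_by_division(x, y) == expected);
    assert(overflow_by_highpart(x, y) == expected);
}
```

Calling check_pair (ULONG_MAX, 2UL) or check_pair (3UL, 5UL) exercises an
overflowing and a non-overflowing product respectively.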

