LLVM optimizes bar into tgt here but not foo. https://godbolt.org/z/nhEjaoanx int foo(int a, int b) { return b * (1 + a / b) - a; } int bar(int a, int b) { return b * (a / b) + b - a; } int tgt(int a, int b) { return b - a % b; } LLVM appears to miss this too.
Correction on first line: *GCC optimizes bar into tgt here but not foo. Pardon my sloppy copy-paste from my bug report over on LLVM's bugzilla!
So on aarch64, the code for all three functions are almost the same: foo(int, int): sdiv w2, w0, w1 madd w1, w2, w1, w1 sub w0, w1, w0 ret bar(int, int): sdiv w2, w0, w1 madd w1, w2, w1, w1 sub w0, w1, w0 ret tgt(int, int): sdiv w2, w0, w1 msub w0, w2, w1, w0 sub w0, w1, w0 ret MSVC can do the transformation.
Created attachment 51752 [details] [PATCH] PR tree-optimization/102232
This patch I attached will fix this problem and does include the test [1]. You can follow the discussion in GCC-Patches here [1]. Although it seems I still have problem to fix with MIME type of the patch in mailing list. 1) https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583737.html
Comment on attachment 51752 [details] [PATCH] PR tree-optimization/102232 >From 7c2abb0eab05766ab879066b000c13de827e3b3d Mon Sep 17 00:00:00 2001 >From: Navid Rahimi <navidrahimi@microsoft.com> >Date: Mon, 8 Nov 2021 13:57:19 -0800 >Subject: [PATCH] PR tree-optimization/102232 > > * match.pd (x * (1 + y / x) - y) -> (x - y % x): New optimization. > * gcc.dg/tree-ssa/pr102232.c: testcase for this optimization. >--- > gcc/match.pd | 7 ++++ > gcc/testsuite/gcc.dg/tree-ssa/pr102232.c | 52 ++++++++++++++++++++++++ > 2 files changed, 59 insertions(+) > create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr102232.c > >diff --git a/gcc/match.pd b/gcc/match.pd >index 71cf6f9df0a..37c01e79d97 100644 >--- a/gcc/match.pd >+++ b/gcc/match.pd >@@ -1295,6 +1295,13 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > (bit_xor (bit_ior:c (bit_not @0) @1) (bit_ior:c @0 (bit_not @1))) > (bit_xor @0 @1)) > >+/* x * (1 + y / x) - y -> x - y % x */ >+(simplify >+ (minus (mult:cs @0 (plus:cs integer_onep (trunc_div:s @1 @0))) @1) >+ (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) >+ && types_match (@0, @1)) >+ (minus @0 (trunc_mod @1 @0)))) >+ > /* ((x & y) - (x | y)) - 1 -> ~(x ^ y) */ > (simplify > (plus (nop_convert1? (minus@2 (nop_convert2? (bit_and:c @0 @1)) >diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr102232.c b/gcc/testsuite/gcc.dg/tree-ssa/pr102232.c >new file mode 100644 >index 00000000000..e7485cf24e9 >--- /dev/null >+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr102232.c >@@ -0,0 +1,52 @@ >+/* PR tree-optimization/102232 */ >+/* { dg-do run } */ >+/* { dg-options "-O3 -fdump-tree-optimized" } */ >+ >+int __attribute__ ((noipa)) foo (int a, int b) >+{ >+ return b * (1 + a / b) - a; >+} >+ >+int >+main (void) >+{ >+ // few randomly generated test cases >+ if (foo (71856034, 238) != 212) >+ { >+ return 1; >+ } >+ if (foo (71856034, 10909) != 1549) >+ { >+ return 1; >+ } >+ if (foo (20350, 1744) != 578) >+ { >+ return 1; >+ } >+ if (foo (444813, 88563) != 86565) >+ { >+ return 1; >+ } >+ if (foo (112237, 63004) != 13771) >+ { >+ return 1; >+ } >+ if (foo (68268386, 787116) != 210706) >+ { >+ return 1; >+ } >+ if (foo (-444813, 88563) != 90561) >+ { >+ return 1; >+ } >+ if (foo (-68268386, 787116) != 1363526) >+ { >+ return 1; >+ } >+ >+ return 0; >+} >+ >+/* Verify that multiplication and division has been removed. */ >+/* { dg-final { scan-tree-dump-not " \\* " "optimized" } } */ >+/* { dg-final { scan-tree-dump-not " / " "optimized" } } */ >\ No newline at end of file >-- >2.25.1 >
Created attachment 51760 [details] [PATCH] PR tree-optimization/102232 Adding a missing pattern to match.pd
The new version of the patch I attached to this bug has been approved by Richard Biener in this thread [1]. 1) https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583935.html
The master branch has been updated by Jeff Law <law@gcc.gnu.org>: https://gcc.gnu.org/g:df1a0d526e2e4c75311345c0b73ce8483e243899 commit r12-5460-gdf1a0d526e2e4c75311345c0b73ce8483e243899 Author: Navid Rahimi <navidrahimi@microsoft.com> Date: Mon Nov 22 22:07:35 2021 -0500 Re: [PATCH] PR tree-optimization/102232 Adding a missing pattern to match.pd PR tree-optimization/102232 gcc/ * match.pd (x * (1 + y / x) - y) -> (x - y % x): New optimization. gcc/testsuite/ * gcc.dg/tree-ssa/pr102232.c: Testcase for this optimization.
This is fixed on mainline; the godbolt link in comment #1 shows that GCC now generates the same code for all three functions.