Bug 105991 - [12 Regression] rldicl+sldi+add generated instead of rldimi
Summary: [12 Regression] rldicl+sldi+add generated instead of rldimi
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 12.0
Importance: P3 normal
Target Milestone: 13.2
Assignee: Not yet assigned to anyone
URL:
Keywords: patch
Depends on:
Blocks:
 
Reported: 2022-06-15 17:09 UTC by Marek Polacek
Modified: 2023-04-26 08:01 UTC
CC List: 5 users

See Also:
Host: powerpc64le-unknown-linux-gnu
Target: powerpc64le-unknown-linux-gnu
Build:
Known to work:
Known to fail:
Last reconfirmed: 2022-06-16 00:00:00


Description Marek Polacek 2022-06-15 17:09:34 UTC
Starting with r12-2731-g96146e61cd7aee, this code (on ppc64le)

unsigned long long
foo (unsigned long long value)
{
  value &= 0xffffffff;
  value |= value << 32;
  return value;
}

compiled with -O2 generates

	rldicl 9,3,0,32
	sldi 3,3,32
	add 3,3,9
	blr

while previously it was just

	rldimi 3,3,32,0
	blr


It doesn't look like a wrong code problem, but it seems more optimal to use rldimi (rotate left, mask insert) rather than rotate left by 0 bits, AND with a mask, shift left, and add.
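
The claim that both sequences compute the same value can be checked with a small model of the instructions involved. The following is a minimal sketch in C, not taken from the report: the helper names (rotl64, mask64, foo_three_insns, foo_one_insn) are hypothetical, the mask helper only models the non-wrapping case, and the instruction semantics follow my reading of the Power ISA (bit 0 is the most significant bit).

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 64-bit value left by sh bits, 0 <= sh <= 63. */
uint64_t rotl64(uint64_t x, unsigned sh)
{
    return sh == 0 ? x : (x << sh) | (x >> (64 - sh));
}

/* Mask with ones in IBM-numbered bits mb..me (bit 0 is the MSB).
   Assumption: only the non-wrapping case mb <= me is modelled. */
uint64_t mask64(unsigned mb, unsigned me)
{
    return (~UINT64_C(0) >> mb) & (~UINT64_C(0) << (63 - me));
}

/* rldicl RA,RS,SH,MB: rotate left, then keep only bits MB..63. */
uint64_t rldicl(uint64_t rs, unsigned sh, unsigned mb)
{
    return rotl64(rs, sh) & mask64(mb, 63);
}

/* rldimi RA,RS,SH,MB: rotate left, then insert bits MB..63-SH into RA. */
uint64_t rldimi(uint64_t ra, uint64_t rs, unsigned sh, unsigned mb)
{
    uint64_t m = mask64(mb, 63 - sh);
    return (rotl64(rs, sh) & m) | (ra & ~m);
}

/* Models: rldicl 9,3,0,32 ; sldi 3,3,32 ; add 3,3,9 */
uint64_t foo_three_insns(uint64_t r3)
{
    uint64_t r9 = rldicl(r3, 0, 32); /* low 32 bits of the input */
    r3 <<= 32;                       /* sldi: shift left by 32 */
    return r3 + r9;                  /* add: the operands share no set bits */
}

/* Models: rldimi 3,3,32,0 */
uint64_t foo_one_insn(uint64_t r3)
{
    return rldimi(r3, r3, 32, 0);
}

/* Spot-check that both models agree with the C source of foo. */
void check_sequences_agree(uint64_t value)
{
    uint64_t low = value & 0xffffffff;
    uint64_t expected = low | (low << 32);
    assert(foo_three_insns(value) == expected);
    assert(foo_one_insn(value) == expected);
}
```

Calling check_sequences_agree on a handful of inputs exercises both paths. It is only a spot check of the model, not a proof about the generated code.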
Comment 1 Roger Sayle 2022-06-16 07:57:06 UTC
The following patch appears to correct this for me on a cross-compiler to powerpc64le, but it's tricky for me to do a full bootstrap/regression test.

2022-06-16  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
        PR target/105991
        * config/rs6000/rs6000.md (plus_xor): New code iterator.
        (*rotl<mode>3_insert_3_<code>): New define_insn_and_split.

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index c55ee7e..695ec33 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -4188,6 +4188,23 @@
 }
   [(set_attr "type" "insert")])

+; Canonicalize the PLUS and XOR forms to IOR for rotl<mode>3_insert_3
+(define_code_iterator plus_xor [plus xor])
+
+(define_insn_and_split "*rotl<mode>3_insert_3_<code>"
+  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
+       (plus_xor:GPR
+         (and:GPR (match_operand:GPR 3 "gpc_reg_operand" "0")
+                  (match_operand:GPR 4 "const_int_operand" "n"))
+         (ashift:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")
+                     (match_operand:SI 2 "const_int_operand" "n"))))]
+  "INTVAL (operands[2]) == exact_log2 (UINTVAL (operands[4]) + 1)"
+  "#"
+  "&& 1"
+  [(set (match_dup 0)
+       (ior:GPR (and:GPR (match_dup 3) (match_dup 4))
+                (ashift:GPR (match_dup 1) (match_dup 2))))])
+
 (define_code_iterator plus_ior_xor [plus ior xor])

 (define_split
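
The insn condition in the patch above requires the AND mask to be a run of low-order ones whose length equals the shift count. The following is a rough C model of that check, under stated assumptions: exact_log2_sketch is a hypothetical stand-in for GCC's exact_log2 (log2 of a power of two, otherwise -1), and insert_3_condition is an illustrative name, not something in rs6000.md.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for GCC's exact_log2: returns log2(x) when x is
   a power of two, and -1 otherwise (including x == 0). */
int exact_log2_sketch(uint64_t x)
{
    if (x == 0 || (x & (x - 1)) != 0)
        return -1;
    int n = 0;
    while ((x >>= 1) != 0)
        n++;
    return n;
}

/* Models the condition
     INTVAL (operands[2]) == exact_log2 (UINTVAL (operands[4]) + 1)
   where shift plays the role of operands[2] and mask of operands[4].
   It holds when mask == 2^shift - 1, so the masked value and the
   shifted value occupy disjoint bit ranges. */
int insert_3_condition(int64_t shift, uint64_t mask)
{
    return shift == exact_log2_sketch(mask + 1);
}
```

For the test case in this report, insert_3_condition(32, 0xffffffff) holds, while a mismatched pair such as a shift of 4 with mask 0xff does not. In this model an all-ones mask wraps to zero and yields -1, and a zero mask yields 0, which are the two results that the committed ChangeLog entry for rotl<mode>3_insert_3 says it checks for.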
Comment 2 Marek Polacek 2022-06-16 14:49:36 UTC
(In reply to Roger Sayle from comment #1)
> The following patch appears to correct this for me on a cross-compiler to
> powerpc64le, but it's tricky for me to do a full bootstrap/regression test.

Thanks for the patch.  I'm testing it and will report back the results.
Comment 3 Marek Polacek 2022-06-16 17:33:47 UTC
Regtest/bootstrap passed on powerpc64le-unknown-linux-gnu.  I did not test Ada.
Comment 4 Segher Boessenkool 2022-06-18 01:55:29 UTC
(In reply to Marek Polacek from comment #0)
> It doesn't look like a wrong code problem, but it seems more optimal to use
> rldimi (rotate left, mask insert) rather than rotate left by 0 bits, AND
> with a mask, shift left, and add.

Confirmed.  The original code is much better (and yes, the current is correct
as well).
Comment 5 Roger Sayle 2022-06-18 06:30:38 UTC
Patch proposed: https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596778.html
Comment 6 GCC Commits 2022-06-21 23:10:37 UTC
The master branch has been updated by Roger Sayle <sayle@gcc.gnu.org>:

https://gcc.gnu.org/g:4306339798b6843937c628c5ece8c234b309b13d

commit r13-1191-g4306339798b6843937c628c5ece8c234b309b13d
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Wed Jun 22 00:08:56 2022 +0100

    PR target/105991: Recognize PLUS and XOR forms of rldimi in rs6000.md.
    
    This patch addresses PR target/105991 where a change to prefer representing
    shifts and adds at the tree level as multiplications causes problems for
    the rldimi patterns in the powerpc backend.  The issue is that rs6000.md
    models this pattern using IOR, and some variants that have the equivalent
    PLUS or XOR in the RTL fail to match some *rotl<mode>4_insert patterns.
    This is fixed in this patch by adding a define_insn_and_split to locally
    canonicalize the PLUS and XOR forms to the backend's preferred IOR form.
    
    An alternative fix might be for the RTL optimizers to define a canonical
    form for these plus_xor_ior equivalent expressions, but the logical
    choice might be plus (which may appear in an addressing mode), and such
    a change may require a number of tweaks to update various backends
    (i.e.  a more intrusive change than the one proposed here).
    
    Many thanks to Marek Polacek for bootstrapping and regression testing
    this change without problems.
    
    2022-06-22  Roger Sayle  <roger@nextmovesoftware.com>
                Marek Polacek  <polacek@redhat.com>
                Segher Boessenkool  <segher@kernel.crashing.org>
                Kewen Lin  <linkw@linux.ibm.com>
    
    gcc/ChangeLog
            PR target/105991
            * config/rs6000/rs6000.md (rotl<mode>3_insert_3): Check that
            exact_log2 doesn't return -1 (or zero).
            (plus_xor): New code iterator.
            (*rotl<mode>3_insert_3_<code>): New define_insn_and_split.
    
    gcc/testsuite/ChangeLog
            PR target/105991
            * gcc.target/powerpc/pr105991.c: New test case.
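
The commit message relies on PLUS, XOR and IOR being interchangeable when the two operands have no set bits in common. The following is a small C sketch of that property, with hypothetical function names chosen for illustration. It also shows that the three forms diverge once the mask overlaps the shifted value.

```c
#include <assert.h>
#include <stdint.h>

/* The backend's preferred form: (a & mask) | (b << shift). */
uint64_t form_ior(uint64_t a, uint64_t mask, uint64_t b, unsigned shift)
{
    return (a & mask) | (b << shift);
}

/* The PLUS variant of the same expression. */
uint64_t form_plus(uint64_t a, uint64_t mask, uint64_t b, unsigned shift)
{
    return (a & mask) + (b << shift);
}

/* The XOR variant of the same expression. */
uint64_t form_xor(uint64_t a, uint64_t mask, uint64_t b, unsigned shift)
{
    return (a & mask) ^ (b << shift);
}

/* Returns 1 when all three forms agree for mask == 2^shift - 1.
   Assumption: 1 <= shift <= 63, so the C shifts are well defined. */
int forms_agree(uint64_t a, uint64_t b, unsigned shift)
{
    uint64_t mask = (UINT64_C(1) << shift) - 1;
    uint64_t r = form_ior(a, mask, b, shift);
    return r == form_plus(a, mask, b, shift)
        && r == form_xor(a, mask, b, shift);
}
```

With a mask of exactly the low shift bits there are no carries and no cancelling bits, so forms_agree returns 1. With an overlapping mask, for example a = 0xff, mask = 0xff, b = 1, shift = 4, the three forms give 0xff, 0x10f and 0xef respectively, which is why the new pattern only rewrites PLUS and XOR to IOR under the disjointness condition.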
Comment 7 Roger Sayle 2022-06-24 09:49:22 UTC
This should now be fixed on mainline.  If anyone feels strongly that the fix should be backported to the GCC 12 branch, please feel free to reopen this PR.
Thanks again to Marek.
Comment 8 Segher Boessenkool 2022-06-24 15:33:44 UTC
Yes, this needs a backport.
Comment 9 GCC Commits 2022-07-04 13:03:12 UTC
The releases/gcc-12 branch has been updated by Roger Sayle <sayle@gcc.gnu.org>:

https://gcc.gnu.org/g:6c175b3d170de2bb02b7bd45b3348eec05d28451

commit r12-8547-g6c175b3d170de2bb02b7bd45b3348eec05d28451
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Mon Jul 4 13:58:37 2022 +0100

    PR target/105991: Recognize PLUS and XOR forms of rldimi in rs6000.md.
    
    This patch addresses PR target/105991 where a change to prefer representing
    shifts and adds at the tree level as multiplications causes problems for
    the rldimi patterns in the powerpc backend.  The issue is that rs6000.md
    models this pattern using IOR, and some variants that have the equivalent
    PLUS or XOR in the RTL fail to match some *rotl<mode>4_insert patterns.
    This is fixed in this patch by adding a define_insn_and_split to locally
    canonicalize the PLUS and XOR forms to the backend's preferred IOR form.
    
    Backported from master.
    
    2022-07-04  Roger Sayle  <roger@nextmovesoftware.com>
                Marek Polacek  <polacek@redhat.com>
                Segher Boessenkool  <segher@kernel.crashing.org>
                Kewen Lin  <linkw@linux.ibm.com>
    
    gcc/ChangeLog
            PR target/105991
            * config/rs6000/rs6000.md (rotl<mode>3_insert_3): Check that
            exact_log2 doesn't return -1 (or zero).
            (plus_xor): New code iterator.
            (*rotl<mode>3_insert_3_<code>): New define_insn_and_split.
    
    gcc/testsuite/ChangeLog
            PR target/105991
            * gcc.target/powerpc/pr105991.c: New test case.
Comment 10 Richard Biener 2023-04-26 06:56:09 UTC
GCC 13.1 is being released, retargeting bugs to GCC 13.2.
Comment 11 Roger Sayle 2023-04-26 08:01:25 UTC
Doh!  This has been fixed on both the GCC 13 and GCC 12 branches. The target milestone was when it was fixed, not when it will be fixed.