Starting with r12-2731-g96146e61cd7aee, this code (on ppc64le)

unsigned long long
foo (unsigned long long value)
{
  value &= 0xffffffff;
  value |= value << 32;
  return value;
}

compiled with -O2 generates

        rldicl 9,3,0,32
        sldi 3,3,32
        add 3,3,9
        blr

while previously it was just

        rldimi 3,3,32,0
        blr

It doesn't look like a wrong code problem, but it seems more optimal to use rldimi (rotate left, mask insert) rather than rotate left by 0 bits, AND with a mask, shift left, and add.
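As a minimal sketch of why the add-based sequence is not wrong code: once the value is masked to its low 32 bits, the shifted copy occupies only the high 32 bits, so OR, ADD and XOR of the two halves all give the same result. The function names below (foo_ior, foo_plus, foo_xor, check_forms_agree) are made up for this illustration and do not appear in GCC or its testsuite.

```c
#include <assert.h>
#include <stdint.h>

/* The form from the report: OR the low half into the high half. */
uint64_t foo_ior(uint64_t value)
{
    value &= 0xffffffff;
    return value | (value << 32);
}

/* Same computation written with ADD. */
uint64_t foo_plus(uint64_t value)
{
    value &= 0xffffffff;
    return value + (value << 32);
}

/* Same computation written with XOR. */
uint64_t foo_xor(uint64_t value)
{
    value &= 0xffffffff;
    return value ^ (value << 32);
}

/* The two operands share no set bits, so no carries occur and no bits
   cancel: all three forms must agree for every input.  */
void check_forms_agree(uint64_t value)
{
    assert(foo_ior(value) == foo_plus(value));
    assert(foo_ior(value) == foo_xor(value));
}
```

Calling check_forms_agree on any 64-bit input should pass; the three functions differ only in which operator the compiler sees, which is what matters for instruction selection here.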
The following patch appears to correct this for me on a cross-compiler to powerpcle64, but it's tricky for me to do a full bootstrap/regression test.

2022-06-16  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
        PR target/105991
        * config/rs6000/rs6000.md (plus_xor): New code iterator.
        (*rotl<mode>3_insert_3_<code>): New define_insn_and_split.

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index c55ee7e..695ec33 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -4188,6 +4188,23 @@
 }
   [(set_attr "type" "insert")])
 
+; Canonicalize the PLUS and XOR forms to IOR for rotl<mode>3_insert_3
+(define_code_iterator plus_xor [plus xor])
+
+(define_insn_and_split "*rotl<mode>3_insert_3_<code>"
+  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
+        (plus_xor:GPR
+          (and:GPR (match_operand:GPR 3 "gpc_reg_operand" "0")
+                   (match_operand:GPR 4 "const_int_operand" "n"))
+          (ashift:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")
+                      (match_operand:SI 2 "const_int_operand" "n"))))]
+  "INTVAL (operands[2]) == exact_log2 (UINTVAL (operands[4]) + 1)"
+  "#"
+  "&& 1"
+  [(set (match_dup 0)
+        (ior:GPR (and:GPR (match_dup 3) (match_dup 4))
+                 (ashift:GPR (match_dup 1) (match_dup 2))))])
+
 (define_code_iterator plus_ior_xor [plus ior xor])
 
 (define_split
(In reply to Roger Sayle from comment #1)
> The following patch appears to correct this for me on a cross-compiler to
> powerpcle64, but it's tricky for me to do a full bootstrap/regression test.

Thanks for the patch. I'm testing it and will report back the results.
Regtest/bootstrap passed on powerpc64le-unknown-linux-gnu. I did not test Ada.
(In reply to Marek Polacek from comment #0)
> It doesn't look like a wrong code problem, but it seems more optimal to use
> rldimi (rotate left, mask insert) rather than rotate left by 0 bits, AND
> with a mask, shift left, and add.

Confirmed. The original code is much better (and yes, the current is correct as well).
Patch proposed: https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596778.html
The master branch has been updated by Roger Sayle <sayle@gcc.gnu.org>:

https://gcc.gnu.org/g:4306339798b6843937c628c5ece8c234b309b13d

commit r13-1191-g4306339798b6843937c628c5ece8c234b309b13d
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Wed Jun 22 00:08:56 2022 +0100

    PR target/105991: Recognize PLUS and XOR forms of rldimi in rs6000.md.

    This patch addresses PR target/105991 where a change to prefer representing
    shifts and adds at the tree-level as multiplications, causes problems for
    the rldimi patterns in the powerpc backend. The issue is that rs6000.md
    models this pattern using IOR, and some variants that have the equivalent
    PLUS or XOR in the RTL fail to match some *rotl<mode>4_insert patterns.
    This is fixed in this patch by adding a define_insn_and_split to locally
    canonicalize the PLUS and XOR forms to the backend's preferred IOR form.

    An alternative fix might be for the RTL optimizers to define a canonical
    form for these plus_xor_ior equivalent expressions, but the logical
    choice might be plus (which may appear in an addressing mode), and such
    a change may require a number of tweaks to update various backends
    (i.e. a more intrusive change than the one proposed here).

    Many thanks for Marek Polacek for bootstrapping and regression testing
    this change without problems.

    2022-06-22  Roger Sayle  <roger@nextmovesoftware.com>
                Marek Polacek  <polacek@redhat.com>
                Segher Boessenkool  <segher@kernel.crashing.org>
                Kewen Lin  <linkw@linux.ibm.com>

    gcc/ChangeLog
            PR target/105991
            * config/rs6000/rs6000.md (rotl<mode>3_insert_3): Check that
            exact_log2 doesn't return -1 (or zero).
            (plus_xor): New code iterator.
            (*rotl<mode>3_insert_3_<code>): New define_insn_and_split.

    gcc/testsuite/ChangeLog
            PR target/105991
            * gcc.target/powerpc/pr105991.c: New test case.
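For readers who don't work in RTL, here is a rough C sketch of the constraint behind the pattern's condition: the AND mask must equal 2^shift - 1, so the masked operand and the shifted operand never overlap, and PLUS, XOR and IOR are then interchangeable. This only models the condition quoted in comment #1 plus the "-1 (or zero)" check named in the ChangeLog; it is not the committed rs6000.md code. The helpers exact_log2_u64, mask_matches_shift and forms_agree are hypothetical names for this illustration (exact_log2_u64 imitates GCC's exact_log2, which returns -1 when its argument is not a power of two).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for GCC's exact_log2: returns log2 (x) when x
   is a power of two, and -1 otherwise (including for x == 0).  */
int exact_log2_u64(uint64_t x)
{
    if (x == 0 || (x & (x - 1)) != 0)
        return -1;
    int n = 0;
    while ((x >>= 1) != 0)
        n++;
    return n;
}

/* True when mask == 2^shift - 1 with shift > 0, i.e. the mask covers
   exactly the low bits that the left shift vacates.  An all-ones mask
   wraps mask + 1 to 0 and is rejected via the -1 result.  */
bool mask_matches_shift(uint64_t mask, int shift)
{
    int log = exact_log2_u64(mask + 1);
    return log > 0 && log == shift;
}

/* Under that condition (a & mask) and (b << shift) have no set bits in
   common, so OR, ADD and XOR of the two produce the same value.  */
bool forms_agree(uint64_t a, uint64_t b, uint64_t mask, int shift)
{
    assert(shift > 0 && shift < 64);
    uint64_t lo = a & mask;
    uint64_t hi = b << shift;
    return (lo | hi) == (lo + hi) && (lo | hi) == (lo ^ hi);
}
```

For the testcase in this PR the mask is 0xffffffff and the shift is 32, which satisfies mask_matches_shift; a mask of 0 or of all ones does not.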
This should now be fixed on mainline. If anyone feels strongly that the fix should be backported to the GCC 12 branch, please feel free to reopen this PR. Thanks again to Marek.
Yes, this needs a backport.
The releases/gcc-12 branch has been updated by Roger Sayle <sayle@gcc.gnu.org>:

https://gcc.gnu.org/g:6c175b3d170de2bb02b7bd45b3348eec05d28451

commit r12-8547-g6c175b3d170de2bb02b7bd45b3348eec05d28451
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Mon Jul 4 13:58:37 2022 +0100

    PR target/105991: Recognize PLUS and XOR forms of rldimi in rs6000.md.

    This patch addresses PR target/105991 where a change to prefer representing
    shifts and adds at the tree-level as multiplications, causes problems for
    the rldimi patterns in the powerpc backend. The issue is that rs6000.md
    models this pattern using IOR, and some variants that have the equivalent
    PLUS or XOR in the RTL fail to match some *rotl<mode>4_insert patterns.
    This is fixed in this patch by adding a define_insn_and_split to locally
    canonicalize the PLUS and XOR forms to the backend's preferred IOR form.

    Backported from master.

    2022-07-04  Roger Sayle  <roger@nextmovesoftware.com>
                Marek Polacek  <polacek@redhat.com>
                Segher Boessenkool  <segher@kernel.crashing.org>
                Kewen Lin  <linkw@linux.ibm.com>

    gcc/ChangeLog
            PR target/105991
            * config/rs6000/rs6000.md (rotl<mode>3_insert_3): Check that
            exact_log2 doesn't return -1 (or zero).
            (plus_xor): New code iterator.
            (*rotl<mode>3_insert_3_<code>): New define_insn_and_split.

    gcc/testsuite/ChangeLog
            PR target/105991
            * gcc.target/powerpc/pr105991.c: New test case.
GCC 13.1 is being released, retargeting bugs to GCC 13.2.
Doh! This has been fixed on both the GCC 13 and GCC 12 branches. The target milestone was when it was fixed, not when it will be fixed.