Bug 101129 - [11/12 Regression] wrong code at -O1 since r11-5839
Summary: [11/12 Regression] wrong code at -O1 since r11-5839
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 12.0
: P2 normal
Target Milestone: 11.2
Assignee: Bill Schmidt
URL:
Keywords: wrong-code
Depends on:
Blocks: 101345
  Show dependency treegraph
 
Reported: 2021-06-18 18:47 UTC by Zdenek Sojka
Modified: 2024-05-02 02:52 UTC (History)
8 users (show)

See Also:
Host: x86_64-pc-linux-gnu
Target: powerpc64le-unknown-linux-gnu
Build:
Known to work: 12.0
Known to fail:
Last reconfirmed: 2021-06-18 00:00:00


Attachments
reduced testcase (287 bytes, text/plain)
2021-06-18 18:47 UTC, Zdenek Sojka
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Zdenek Sojka 2021-06-18 18:47:20 UTC
Created attachment 51037 [details]
reduced testcase

Output:
$ powerpc64le-unknown-linux-gnu-gcc -O testcase.c -static
$ qemu-ppc64le -- ./a.out 
qemu: uncaught target signal 6 (Aborted) - core dumped
Aborted


$ powerpc64le-unknown-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-powerpc64le/bin/powerpc64le-unknown-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r12-1607-20210617224441-g2786064d91f-checking-yes-rtl-df-extra-powerpc64le/bin/../libexec/gcc/powerpc64le-unknown-linux-gnu/12.0.0/lto-wrapper
Target: powerpc64le-unknown-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra --with-cloog --with-ppl --with-isl --with-sysroot=/usr/powerpc64le-unknown-linux-gnu --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=powerpc64le-unknown-linux-gnu --with-ld=/usr/bin/powerpc64le-unknown-linux-gnu-ld --with-as=/usr/bin/powerpc64le-unknown-linux-gnu-as --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-r12-1607-20210617224441-g2786064d91f-checking-yes-rtl-df-extra-powerpc64le
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.0.0 20210618 (experimental) (GCC)
Comment 1 Jakub Jelinek 2021-06-18 20:39:55 UTC
Started with r11-5839-g3a6e3ad38a17a03ee0139b49a0946e7b9ded1eb1
Comment 2 Jakub Jelinek 2021-06-18 20:46:14 UTC
Optimized difference:
--- pr101129.c.244t.optimized.r11-5838	2021-06-18 22:40:20.915971652 +0200
+++ pr101129.c.244t.optimized.r11-5839	2021-06-18 22:40:34.739789553 +0200
@@ -8,6 +8,7 @@ void foo0 (u32 u32_0, U * ret)
   __int128 _1;
   vector(16) unsigned char _2;
   unsigned char _3;
+  vector(16) <signed-boolean:8> _4;
   vector(16) signed char _5;
   __int128 _6;
   __int128 _7;
@@ -31,7 +32,8 @@ void foo0 (u32 u32_0, U * ret)
   _14 = u32_0_13(D) & 4;
   _3 = (unsigned char) _14;
   _2 = {_3, _3, _3, _3, _3, _3, _3, _3, _3, _3, _3, _3, _3, _3, _3, _3};
-  _5 = .VCONDU (_2, { 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }, { -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 }, { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }, 108);
+  _4 = _2 < { 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
+  _5 = VIEW_CONVERT_EXPR<vector(16) signed char>(_4);
   _6 = VIEW_CONVERT_EXPR<__int128>(_5);
   u128_2_15 = _1 * _6;
   _7 = u128_2_15 >> 64;
Assembly difference:
--- pr101129.s.r11-5838	2021-06-18 22:40:20.919971599 +0200
+++ pr101129.s.r11-5839	2021-06-18 22:40:34.743789500 +0200
@@ -14,19 +14,16 @@ foo0:
 	.localentry	foo0,.-foo0
 	rlwinm 9,3,0,29,29
 	mtvsrd 32,9
-	vspltb 0,0,7
+	vspltb 1,0,7
 	addis 9,2,.LC0@toc@ha
 	addi 9,9,.LC0@toc@l
-	lxvd2x 33,0,9
-	vcmpgtub 0,1,0
-	addi 9,1,-16
-	xxpermdi 32,32,32,2
-	stxvd2x 32,0,9
-	ld 9,-8(1)
-	ld 10,-16(1)
+	lxvd2x 32,0,9
+	vcmpgtub 0,0,1
+	mfvsrd 11,32
+	xxpermdi 0,32,32,3
+	mfvsrd 10,0
+	add 10,11,10
 	mulld 10,10,3
-	mulhdu 3,3,9
-	add 10,10,3
 	addis 9,2,.LANCHOR0@toc@ha
 	lbz 9,.LANCHOR0@toc@l(9)
 	add 10,10,9
Comment 3 Richard Biener 2021-06-21 06:18:23 UTC
The GIMPLE doesn't look wrong but eventually

+  _4 = _2 < { 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
+  _5 = VIEW_CONVERT_EXPR<vector(16) signed char>(_4);

goes wrong somewhere in RTL expansion?
Comment 4 Jakub Jelinek 2021-07-13 19:39:57 UTC
I think this is a bug in the swaps rs6000 specific pass.
It in particular changes
 (insn 20 19 21 2 (set (reg:DI 155)
         (mult:DI (reg:DI 224)
-            (subreg:DI (reg:V16QI 148) 0))) "pr101129.c":14:8 154 {muldi3}
+            (subreg:DI (reg:V16QI 148) 8))) "pr101129.c":14:8 154 {muldi3}
      (nil))
 (insn 21 20 22 2 (set (reg:DI 156)
         (subreg:DI (mult:TI (zero_extend:TI (reg:DI 224))
-                (zero_extend:TI (subreg:DI (reg:V16QI 148) 0))) 8)) "pr101129.c":14:8 166 {umuldi3_highpart_le}
+                (zero_extend:TI (subreg:DI (reg:V16QI 148) 8))) 0)) "pr101129.c":14:8 166 {umuldi3_highpart_le}
      (nil))

The SUBREG_BYTE changes in subregs of pseudo 148 look reasonable as the pass
removed a xxswapd_v16qi that was swapping the value set to that pseudo.
But the 8 to 0 change of SUBREG_BYTE with SUBREG_REG of MULT looks wrong, that wouldn't even match unless simplified and turns a highpart multiply into lowpart multiply.
So either the swaps pass needs to be fixed not to do that, or the highpart multiply instructions should be changed to e.g. use lshiftrt by 64 instead of highpart subreg of the result.
Comment 5 Bill Schmidt 2021-07-13 21:19:49 UTC
Uh, yeah, that is completely unexpected behavior for swaps.  I'll try to look at this soon.
Comment 6 Bill Schmidt 2021-07-14 20:22:50 UTC
Testing this patch.

diff --git a/gcc/config/rs6000/rs6000-p8swap.c b/gcc/config/rs6000/rs6000-p8swap.c
index 21cbcb2e28a..00693e6dc60 100644
--- a/gcc/config/rs6000/rs6000-p8swap.c
+++ b/gcc/config/rs6000/rs6000-p8swap.c
@@ -1523,6 +1523,20 @@ replace_swap_with_copy (swap_web_entry *insn_entry, unsigned i)
   insn->set_deleted ();
 }
 
+/* INSN is known to contain a SUBREG, which we can normally handle,
+   but if the SUBREG itself contains a MULT then we need to leave it alone
+   to avoid turning a mult_hipart into a mult_lopart, for example.  */
+static bool
+has_part_mult (rtx_insn *insn)
+{
+  rtx body = PATTERN (insn);
+  rtx src = SET_SRC (body);
+  if (GET_CODE (src) != SUBREG)
+    return false;
+  rtx inner = XEXP (src, 0);
+  return (GET_CODE (inner) == MULT);
+}
+
 /* Make NEW_MEM_EXP's attributes and flags resemble those of
    ORIGINAL_MEM_EXP.  */
 static void
@@ -2501,6 +2515,9 @@ rs6000_analyze_swaps (function *fun)
                    insn_entry[uid].is_swappable = 0;
                  else if (special != SH_NONE)
                    insn_entry[uid].special_handling = special;
+                 else if (insn_entry[uid].contains_subreg
+                          && has_part_mult (insn))
+                   insn_entry[uid].is_swappable = 0;
                  else if (insn_entry[uid].contains_subreg)
                    insn_entry[uid].special_handling = SH_SUBREG;
                }
Comment 7 Segher Boessenkool 2021-07-14 21:38:00 UTC
That patch is pre-approved (if it works ;-) ) Bill.  Thanks!
Comment 8 Bill Schmidt 2021-07-14 21:54:09 UTC
Small change required to actually check that it's a SET insn.  (oops)  Otherwise looks like it passed regstrap.  Testing the revised patch now.  Thanks for the pre-review!
Comment 9 GCC Commits 2021-07-15 15:17:20 UTC
The master branch has been updated by William Schmidt <wschmidt@gcc.gnu.org>:

https://gcc.gnu.org/g:ad5f8ac1d2f2dc92d43663243b52f9e9eb3cf7c0

commit r12-2325-gad5f8ac1d2f2dc92d43663243b52f9e9eb3cf7c0
Author: Bill Schmidt <wschmidt@linux.ibm.com>
Date:   Thu Jul 15 10:16:17 2021 -0500

    rs6000: Don't let swaps pass break multiply low-part (PR101129)
    
    2021-07-15  Bill Schmidt  <wschmidt@linux.ibm.com>
    
    gcc/
            PR target/101129
            * config/rs6000/rs6000-p8swap.c (has_part_mult): New.
            (rs6000_analyze_swaps): Insns containing a subreg of a mult are
            not swappable.
    
    gcc/testsuite/
            PR target/101129
            * gcc.target/powerpc/pr101129.c: New.
Comment 10 Bill Schmidt 2021-07-15 15:18:40 UTC
Fixed on trunk, but will require backports.
Comment 11 GCC Commits 2021-07-19 17:52:44 UTC
The releases/gcc-11 branch has been updated by William Schmidt <wschmidt@gcc.gnu.org>:

https://gcc.gnu.org/g:ac0efe3c6fc6231b20ffd684956a4a5c3c54a96b

commit r11-8780-gac0efe3c6fc6231b20ffd684956a4a5c3c54a96b
Author: Bill Schmidt <wschmidt@linux.ibm.com>
Date:   Mon Jul 19 12:49:17 2021 -0500

    rs6000: Don't let swaps pass break multiply low-part (PR101129)
    
    Backport from mainline.
    
    2021-07-15  Bill Schmidt  <wschmidt@linux.ibm.com>
    
    gcc/
            PR target/101129
            * config/rs6000/rs6000-p8swap.c (has_part_mult): New.
            (rs6000_analyze_swaps): Insns containing a subreg of a mult are
            not swappable.
    
    gcc/testsuite/
            PR target/101129
            * gcc.target/powerpc/pr101129.c: New.
Comment 12 GCC Commits 2021-07-19 19:42:00 UTC
The releases/gcc-10 branch has been updated by William Schmidt <wschmidt@gcc.gnu.org>:

https://gcc.gnu.org/g:ecad28561c47f5b2652a03e9b67cf37c13d88f53

commit r10-9990-gecad28561c47f5b2652a03e9b67cf37c13d88f53
Author: Bill Schmidt <wschmidt@linux.ibm.com>
Date:   Mon Jul 19 14:38:23 2021 -0500

    rs6000: Don't let swaps pass break multiply low-part (PR101129)
    
    Backport from mainline.
    
    2021-07-15  Bill Schmidt  <wschmidt@linux.ibm.com>
    
    gcc/
            PR target/101129
            * config/rs6000/rs6000-p8swap.c (has_part_mult): New.
            (rs6000_analyze_swaps): Insns containing a subreg of a mult are
            not swappable.
    
    gcc/testsuite/
            PR target/101129
            * gcc.target/powerpc/pr101129.c: New.
Comment 13 GCC Commits 2021-07-19 21:16:40 UTC
The releases/gcc-9 branch has been updated by William Schmidt <wschmidt@gcc.gnu.org>:

https://gcc.gnu.org/g:3e4b43ec9a937866ead88e996bd200b116a1766c

commit r9-9632-g3e4b43ec9a937866ead88e996bd200b116a1766c
Author: Bill Schmidt <wschmidt@linux.ibm.com>
Date:   Mon Jul 19 16:14:36 2021 -0500

    rs6000: Don't let swaps pass break multiply low-part (PR101129)
    
    Backport from mainline.
    
    2021-07-15  Bill Schmidt  <wschmidt@linux.ibm.com>
    
    gcc/
            PR target/101129
            * config/rs6000/rs6000-p8swap.c (has_part_mult): New.
            (rs6000_analyze_swaps): Insns containing a subreg of a mult are
            not swappable.
    
    gcc/testsuite/
            PR target/101129
            * gcc.target/powerpc/pr101129.c: New.
Comment 14 Bill Schmidt 2021-07-19 21:18:16 UTC
Now fixed everywhere.