Bug 93533 - [10 Regression] ICE due to popcounthi2 expansion with -march=z196 since r10-3720
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 10.0
Importance: P1 normal
Target Milestone: 10.0
Assignee: Jakub Jelinek
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-02-01 12:42 UTC by Jakub Jelinek
Modified: 2020-02-03 08:37 UTC
CC List: 1 user

See Also:
Host:
Target: s390x-linux
Build:
Known to work:
Known to fail:
Last reconfirmed: 2020-02-01 00:00:00


Attachments
gcc10-pr93533.patch (930 bytes, patch)
2020-02-01 12:48 UTC, Jakub Jelinek

Description Jakub Jelinek 2020-02-01 12:42:11 UTC
The following testcase ICEs on s390x with -march=z196 -O2 (or any -O+) since
r10-3720-gac87f0f3459a57f03503e51aeffc54bb6ef36b90:
unsigned
foo (unsigned short a)
{
  a = a - (a >> 1 & 21845);
  a = (a & 13107) + (a >> 2 & 13107);
  return (unsigned short) ((a + (a >> 4) & 3855) * 257) >> 8;
}

The problem is that the popcounthi2_z196 expander emits invalid RTL.
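
For context, the testcase reads as the usual bit-parallel population count written for a 16-bit value: the decimal constants 21845, 13107, 3855 and 257 are 0x5555, 0x3333, 0x0F0F and 0x0101. That is presumably why the compiler ends up in the HImode popcount expander at all. A minimal sketch that checks this reading exhaustively follows; `foo` is copied from the report, while `naive_popcount16` and `count_mismatches` are hypothetical helpers added only for illustration.

```c
#include <assert.h>

/* Testcase from the report, unchanged. */
unsigned
foo (unsigned short a)
{
  a = a - (a >> 1 & 21845);
  a = (a & 13107) + (a >> 2 & 13107);
  return (unsigned short) ((a + (a >> 4) & 3855) * 257) >> 8;
}

/* Hypothetical reference: count the set bits one at a time.  */
unsigned
naive_popcount16 (unsigned short v)
{
  unsigned n = 0;
  for (; v != 0; v >>= 1)
    n += v & 1u;
  return n;
}

/* Hypothetical helper: number of 16-bit inputs for which foo
   disagrees with the bit-by-bit count.  */
unsigned
count_mismatches (void)
{
  unsigned bad = 0;
  for (unsigned v = 0; v <= 0xffffu; v++)
    if (foo ((unsigned short) v) != naive_popcount16 ((unsigned short) v))
      bad++;
  return bad;
}
```

Calling `count_mismatches ()` from a small driver should return 0 if the reading above is right. This only confirms the arithmetic of the testcase; it says nothing about the RTL the expander emits.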
Comment 1 Jakub Jelinek 2020-02-01 12:42:54 UTC
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 2959d8c0f17..e37ba49444a 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -11670,21 +11670,28 @@ (define_expand "popcountsi2"
 })
 
 (define_expand "popcounthi2_z196"
-  [; popcnt op0, op1
-   (parallel [(set (match_operand:HI 0 "register_operand" "")
+  [; popcnt op2, op1
+   (parallel [(set (match_dup 2)
 		   (unspec:HI [(match_operand:HI 1 "register_operand")]
 			      UNSPEC_POPCNT))
 	      (clobber (reg:CC CC_REGNUM))])
-   ; sllk op2, op0, 8
-   (set (match_dup 2)
-	(ashift:SI (match_dup 0) (const_int 8)))
-   ; ar op0, op2
-   (parallel [(set (match_dup 0) (plus:SI (match_dup 0) (match_dup 2)))
+   ; lr op3, op2
+   (set (match_dup 3) (subreg:SI (match_dup 2) 0))
+   ; srl op4, op3, 8
+   (set (match_dup 4) (lshiftrt:SI (match_dup 3) (const_int 8)))
+   ; ar op3, op4
+   (parallel [(set (match_dup 3) (plus:SI (match_dup 3) (match_dup 4)))
 	      (clobber (reg:CC CC_REGNUM))])
-   ; srl op0, op0, 8
-   (set (match_dup 0) (lshiftrt:HI (match_dup 0) (const_int 8)))]
+   ; llgc op0, op3
+   (parallel [(set (match_operand:HI 0 "register_operand" "")
+		   (and:HI (subreg:HI (match_dup 3) 2) (const_int 255)))
+	      (clobber (reg:CC CC_REGNUM))])]
   "TARGET_Z196"
-  "operands[2] = gen_reg_rtx (SImode);")
+{
+  operands[2] = gen_reg_rtx (HImode);
+  operands[3] = gen_reg_rtx (SImode);
+  operands[4] = gen_reg_rtx (SImode);
+})
 
 (define_expand "popcounthi2"
   [(set (match_dup 2)

seems to work for me.
Comment 2 Jakub Jelinek 2020-02-01 12:48:52 UTC
Created attachment 47761
gcc10-pr93533.patch

Full untested patch.
Comment 3 CVS Commits 2020-02-03 08:04:18 UTC
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:f626ae5478887b0cec886160dcfc4d59bf6fda07

commit r10-6400-gf626ae5478887b0cec886160dcfc4d59bf6fda07
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Mon Feb 3 09:00:19 2020 +0100

    s390x: Fix popcounthi2_z196 expander [PR93533]
    
    The following testcase started to ICE when .POPCOUNT matching has been added
    to match.pd; we had __builtin_popcount*, but nothing would use the
    popcounthi2 expander before.
    
    The problem is that the popcounthi2_z196 expander doesn't emit valid RTL:
    error: unrecognizable insn:
    (insn 138 137 139 27 (set (reg:SI 190)
            (ashift:SI (reg:HI 95 [ _105 ])
                (const_int 8 [0x8]))) -1
         (nil))
    during RTL pass: vregs
    The following patch is an attempt to fix that, furthermore I've tried to
    slightly simplify it as well, it makes no sense to me to perform
    (x + (x << 8)) >> 8 when we need to either zero extend or mask the result
    at the end in order to avoid bits from above HImode to affect it, when we
    can do
    (x + (x >> 8)) & 0xff (or zero extension).
    
    2020-02-03  Jakub Jelinek  <jakub@redhat.com>
    
    	PR target/93533
    	* config/s390/s390.md (popcounthi2_z196): Fix up expander to emit
    	valid RTL to sum up the lowest and second lowest bytes of the popcnt
    	result.
    
    	* gcc.c-torture/compile/pr93533.c: New test.
    	* gcc.target/s390/pr93533.c: New test.
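
The simplification described in the commit message can be checked with plain integer arithmetic. The sketch below assumes, following the ChangeLog wording about summing the lowest and second lowest bytes, that the popcnt result holds one bit count (0 to 8) per byte. All function names are hypothetical, and the code models only the arithmetic of the two shapes, not the RTL modes that caused the ICE.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed model of the per-byte result: each of the two low bytes of V
   is replaced by the number of bits set in that byte.  */
uint16_t
bytewise_popcount16 (uint16_t v)
{
  uint16_t r = 0;
  for (int byte = 0; byte < 2; byte++)
    {
      unsigned b = (v >> (8 * byte)) & 0xffu;
      unsigned n = 0;
      for (; b != 0; b >>= 1)
        n += b & 1u;
      r |= (uint16_t) (n << (8 * byte));
    }
  return r;
}

/* Old shape: (x + (x << 8)) >> 8, with the sum truncated to 16 bits
   so that bits above HImode cannot affect the result.  */
unsigned
sum_old (uint16_t x)
{
  uint16_t t = (uint16_t) (x + ((unsigned) x << 8));
  return (unsigned) t >> 8;
}

/* New shape: (x + (x >> 8)) & 0xff.  */
unsigned
sum_new (uint16_t x)
{
  return ((unsigned) x + ((unsigned) x >> 8)) & 0xffu;
}

/* Number of 16-bit inputs whose per-byte counts are summed
   differently by the two shapes.  */
unsigned
count_sum_mismatches (void)
{
  unsigned bad = 0;
  for (unsigned v = 0; v <= 0xffffu; v++)
    {
      uint16_t x = bytewise_popcount16 ((uint16_t) v);
      if (sum_old (x) != sum_new (x))
        bad++;
    }
  return bad;
}
```

Both shapes agree because each byte count is at most 8, so the sum of the two bytes is at most 16 and fits in the low byte without a carry. The new shape needs only a right shift, an add, and a mask or zero extension.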
Comment 4 Jakub Jelinek 2020-02-03 08:37:52 UTC
Fixed.