| Summary: | [8/9/10 Regression] git miscompilation on s390x-linux with -O2 -march=zEC12 -mtune=z13 starting with r8-1288 | ||
|---|---|---|---|
| Product: | gcc | Reporter: | Jakub Jelinek <jakub> |
| Component: | rtl-optimization | Assignee: | Jakub Jelinek <jakub> |
| Status: | RESOLVED FIXED | ||
| Severity: | normal | CC: | rsandifo |
| Priority: | P2 | Keywords: | wrong-code |
| Version: | 9.2.1 | ||
| Target Milestone: | 8.4 | ||
| Host: | Target: | ||
| Build: | Known to work: | ||
| Known to fail: | Last reconfirmed: | 2020-02-24 00:00:00 | |
| Attachments: | gcc10-pr93908.patch | ||
|
Description
Jakub Jelinek
2020-02-24 16:46:41 UTC
I'd say the bug is in the way how we try to lower the zero_extract store:
Trying 19, 13, 21 -> 22:
Failed to match this instruction:
(set (zero_extract:DI (mem/j:HI (plus:DI (reg/v/f:DI 60 [ s ])
(const_int 10 [0xa])) [0 +0 S2 A16])
(const_int 2 [0x2])
(const_int 7 [0x7]))
(const_int -1 [0xffffffffffffffff]))
Successfully matched this instruction:
(set (reg:HI 67)
(const_int -128 [0xffffffffffffff80]))
Successfully matched this instruction:
(set (mem/j:HI (plus:DI (reg/v/f:DI 60 [ s ])
(const_int 10 [0xa])) [0 +0 S2 A16])
(ior:HI (mem/j:HI (plus:DI (reg/v/f:DI 60 [ s ])
(const_int 10 [0xa])) [0 +0 S2 A16])
(reg:HI 67)))
I'd say storing a -1 into a zero_extract is fine, at least IMHO we shouldn't rely on it being zero extended.
Created attachment 47900 [details] gcc10-pr93908.patch Untested fix. The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>: https://gcc.gnu.org/g:73dc4ae47418aef2eb470b8f71cef57dce37349e commit r10-6844-g73dc4ae47418aef2eb470b8f71cef57dce37349e Author: Jakub Jelinek <jakub@redhat.com> Date: Tue Feb 25 13:56:47 2020 +0100 combine: Fix find_split_point handling of constant store into ZERO_EXTRACT [PR93908] git is miscompiled on s390x-linux with -O2 -march=zEC12 -mtune=z13. I've managed to reduce it into the following testcase. The problem is that during combine we see the s->k = -1; bitfield store and change the SET_SRC from a pseudo into a constant: (set (zero_extract:DI (mem/j:HI (plus:DI (reg/v/f:DI 60 [ s ]) (const_int 10 [0xa])) [0 +0 S2 A16]) (const_int 2 [0x2]) (const_int 7 [0x7])) (const_int -1 [0xffffffffffffffff])) This on s390x with the above option isn't recognized as valid instruction, so find_split_point decides to handle it as IOR or IOR/AND. src is -1, mask is 3 and pos is 7. src != mask (this is also incorrect, we want to set all (both) bits in the bitfield), so we go for IOR/AND, but instead of trying mem = (mem & ~0x180) | ((-1 << 7) & 0x180) we actually try mem = (mem & ~0x180) | (-1 << 7) and that is further simplified into: mem = mem | (-1 << 7) aka mem = mem | 0xff80 which doesn't set just the 2-bit bitfield, but also many other bitfields that shouldn't be touched. We really should do: mem = mem | 0x180 instead. The problem is that we assume that no bits but those low len (2 here) will be set in the SET_SRC, but there is nothing that can prevent that, we just should ignore the other bits. The following patch fixes it by masking src with mask, this way already the src == mask test will DTRT, and as the code for or_mask uses gen_int_mode, if the most significant bit is set after shifting it left by pos, it will be properly sign-extended. 2020-02-25 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/93908 * combine.c (find_split_point): For store into ZERO_EXTRACT, and src with mask. * gcc.c-torture/execute/pr93908.c: New test. The releases/gcc-9 branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>: https://gcc.gnu.org/g:eeb31391b7f223e6ca8cbd4452b99b55f4afdb1c commit r9-8276-geeb31391b7f223e6ca8cbd4452b99b55f4afdb1c Author: Jakub Jelinek <jakub@redhat.com> Date: Tue Feb 25 13:56:47 2020 +0100 combine: Fix find_split_point handling of constant store into ZERO_EXTRACT [PR93908] git is miscompiled on s390x-linux with -O2 -march=zEC12 -mtune=z13. I've managed to reduce it into the following testcase. The problem is that during combine we see the s->k = -1; bitfield store and change the SET_SRC from a pseudo into a constant: (set (zero_extract:DI (mem/j:HI (plus:DI (reg/v/f:DI 60 [ s ]) (const_int 10 [0xa])) [0 +0 S2 A16]) (const_int 2 [0x2]) (const_int 7 [0x7])) (const_int -1 [0xffffffffffffffff])) This on s390x with the above option isn't recognized as valid instruction, so find_split_point decides to handle it as IOR or IOR/AND. src is -1, mask is 3 and pos is 7. src != mask (this is also incorrect, we want to set all (both) bits in the bitfield), so we go for IOR/AND, but instead of trying mem = (mem & ~0x180) | ((-1 << 7) & 0x180) we actually try mem = (mem & ~0x180) | (-1 << 7) and that is further simplified into: mem = mem | (-1 << 7) aka mem = mem | 0xff80 which doesn't set just the 2-bit bitfield, but also many other bitfields that shouldn't be touched. We really should do: mem = mem | 0x180 instead. The problem is that we assume that no bits but those low len (2 here) will be set in the SET_SRC, but there is nothing that can prevent that, we just should ignore the other bits. The following patch fixes it by masking src with mask, this way already the src == mask test will DTRT, and as the code for or_mask uses gen_int_mode, if the most significant bit is set after shifting it left by pos, it will be properly sign-extended. 2020-02-25 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/93908 * combine.c (find_split_point): For store into ZERO_EXTRACT, and src with mask. * gcc.c-torture/execute/pr93908.c: New test. The releases/gcc-8 branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>: https://gcc.gnu.org/g:d9814ee5972b48eb67a7af171957d1a285453f46 commit r8-10063-gd9814ee5972b48eb67a7af171957d1a285453f46 Author: Jakub Jelinek <jakub@redhat.com> Date: Tue Feb 25 13:56:47 2020 +0100 combine: Fix find_split_point handling of constant store into ZERO_EXTRACT [PR93908] git is miscompiled on s390x-linux with -O2 -march=zEC12 -mtune=z13. I've managed to reduce it into the following testcase. The problem is that during combine we see the s->k = -1; bitfield store and change the SET_SRC from a pseudo into a constant: (set (zero_extract:DI (mem/j:HI (plus:DI (reg/v/f:DI 60 [ s ]) (const_int 10 [0xa])) [0 +0 S2 A16]) (const_int 2 [0x2]) (const_int 7 [0x7])) (const_int -1 [0xffffffffffffffff])) This on s390x with the above option isn't recognized as valid instruction, so find_split_point decides to handle it as IOR or IOR/AND. src is -1, mask is 3 and pos is 7. src != mask (this is also incorrect, we want to set all (both) bits in the bitfield), so we go for IOR/AND, but instead of trying mem = (mem & ~0x180) | ((-1 << 7) & 0x180) we actually try mem = (mem & ~0x180) | (-1 << 7) and that is further simplified into: mem = mem | (-1 << 7) aka mem = mem | 0xff80 which doesn't set just the 2-bit bitfield, but also many other bitfields that shouldn't be touched. We really should do: mem = mem | 0x180 instead. The problem is that we assume that no bits but those low len (2 here) will be set in the SET_SRC, but there is nothing that can prevent that, we just should ignore the other bits. The following patch fixes it by masking src with mask, this way already the src == mask test will DTRT, and as the code for or_mask uses gen_int_mode, if the most significant bit is set after shifting it left by pos, it will be properly sign-extended. 2020-02-25 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/93908 * combine.c (find_split_point): For store into ZERO_EXTRACT, and src with mask. * gcc.c-torture/execute/pr93908.c: New test. Fixed. |