Summary: | [12/13/14/15 Regression] suboptimal code for bool bitfield tests | ||
---|---|---|---|
Product: | gcc | Reporter: | Martin Sebor <msebor> |
Component: | tree-optimization | Assignee: | Not yet assigned to anyone <unassigned> |
Status: | NEW --- | ||
Severity: | normal | CC: | fkorta, pinskia |
Priority: | P2 | Keywords: | missed-optimization |
Version: | 11.0 | ||
Target Milestone: | 12.5 | ||
See Also: | https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97588 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99919 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110637 | ||
Host: | Target: | ||
Build: | Known to work: | ||
Known to fail: | 10.2.0, 11.0, 6.3.0, 7.0.1, 8.3.0, 9.3.0 | Last reconfirmed: | 2021-04-06 00:00:00 |
Bug Depends on: | 99919 | ||
Bug Blocks: | 19466, 85316 |
Description
Martin Sebor
2021-04-05 20:15:29 UTC
This comes down to lowering bitfields too soon. My bet is that even integer bitfields will have the problem.

Bisection points to r225825 as the revision where GCC started to fail to fold the code in g(). This only seems to affect C _Bool bit-fields and not C++ bool.

(In reply to Andrew Pinski from comment #1)
> This comes down to lowering bitfields too soon.
> My bet is that even integer bitfields will have the problem.

Yes, unsigned bit-fields suffer the same problem but, unlike for _Bool, GCC never emitted optimal code for those for this test case.

The main issue is optimize_bit_field_compare in fold-const.c, which during GENERIC folding produces the following in .005t.original:

    if ((BIT_FIELD_REF <b, 8, 0> & 1) != 0)
      {
        b.j = 0;
      }
    else
      {
        b.j = b.i;
      }
    return b.j;

That is premature at this point. For f() it also takes until DOM3 to do the folding unless you disable SRA, which then makes EVRP recognize the second store as a.j = 0. With SRA we fail to derive ranges for a_10 in

    a_10 = MEM <unsigned char> [(struct A *)&a];
    a$1_11 = MEM <unsigned char> [(struct A *)&a + 1B];
    _1 = VIEW_CONVERT_EXPR<_Bool>(a_10);
    if (_1 != 0)
      goto <bb 4>; [INV]
    else
      goto <bb 3>; [INV]

    <bb 3> :

    <bb 4> :
    # a$1_9 = PHI <0(2), a_10(3)>
    _7 = VIEW_CONVERT_EXPR<_Bool>(a$1_9);

thus we're missing looking through VIEW_CONVERT_EXPR in register_assert_for. Amending that would eventually also allow optimizing the prematurely folded variant.

GCC 9.4 is being released, retargeting bugs to GCC 9.5.

GCC 9 branch is being closed

GCC 10.4 is being released, retargeting bugs to GCC 10.5.
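The original test case is not reproduced in the text above, so the following is only a guess at its shape, pieced together from the dumps quoted in the comments. The struct layouts are assumptions: `struct A` is given plain `_Bool` members because the SRA dump reads bytes at offsets 0 and 1, and `struct B` is given 1-bit `_Bool` bit-fields because the folded GENERIC tests `BIT_FIELD_REF <b, 8, 0> & 1`. Whether `b` was a parameter or a global is also unknown; a parameter is assumed here. In both functions the result is always 0, which is the fold the report says GCC misses or reaches late.

```c
/* Hypothetical reconstruction; only the names f, g, a, b, i, j and
   struct A appear in the report. Everything else is assumed. */

/* ASSUMPTION: plain _Bool members (byte offsets 0 and 1 in the SRA dump). */
struct A { _Bool i, j; };

/* ASSUMPTION: 1-bit _Bool bit-fields sharing one byte. */
struct B { _Bool i : 1; _Bool j : 1; };

_Bool f(struct A a)
{
    if (a.i)
        a.j = 0;
    else
        a.j = a.i;   /* a.i is 0 on this path, so this also stores 0 */
    return a.j;      /* 0 on both paths */
}

_Bool g(struct B b)
{
    if (b.i)
        b.j = 0;
    else
        b.j = b.i;   /* b.i is 0 on this path, so this also stores 0 */
    return b.j;      /* 0 on both paths */
}
```

Compiling a file like this with `-O2 -fdump-tree-optimized` is one way to see whether either function is reduced to a plain `return 0`.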
Another simple example:

    #include <cstdint>

    struct SomeClass
    {
        bool cfg1 : 1;
        bool cfg2 : 1;
        bool cfg3 : 1;

        bool check() const noexcept
        {
            return cfg1 || cfg2 || cfg3;
        }
    };

    bool check(const SomeClass& rt)
    {
        return rt.check();
    }

Emits:

    check(SomeClass const&):
            movzx   edx, BYTE PTR [rdi]
            mov     eax, edx
            and     eax, 1
            jne     .L1
            mov     eax, edx
            shr     al
            and     eax, 1
            je      .L4
    .L1:
            ret
    .L4:
            mov     eax, edx
            shr     al, 2
            and     eax, 1
            ret

While it should emit:

    check(SomeClass const&):
            test    byte ptr [rdi], 7
            setne   al
            ret

GCC 10 branch is being closed.

(In reply to Martin Sebor from comment #2)
> Bisection points to r225825 as the revision where GCC started to fail to
> fold the code in g().

The fold-const code didn't check `types_match (type, TREE_TYPE (@0))` but rather just did the equivalent of:

    (simplify
     (ne @0 integer_zerop@1)
     (if (TREE_CODE (TREE_TYPE (@0)) == BOOLEAN_TYPE)
      (non_lvalue (convert @0))))

whereas match now does not do the convert and checks types_match instead.

GCC 11 branch is being closed.
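To make the expected output for the SomeClass example concrete, here is a minimal C sketch comparing the field-by-field test with the single-mask test that `test byte ptr [rdi], 7` corresponds to. The names `Flags`, `check_fields` and `check_mask` are hypothetical. The mask version assumes the three bit-fields occupy bits 0 to 2 of the first byte, as the assembly in the report implies for x86-64; bit-field layout is implementation-defined, so this is not portable.

```c
#include <stdbool.h>
#include <string.h>

/* C analogue of SomeClass from the report; the name Flags is hypothetical. */
struct Flags {
    bool cfg1 : 1;
    bool cfg2 : 1;
    bool cfg3 : 1;
};

/* Field-by-field form, matching the check() member in the report. */
bool check_fields(const struct Flags *f)
{
    return f->cfg1 || f->cfg2 || f->cfg3;
}

/* Single-mask form.
   ASSUMPTION: cfg1..cfg3 live in bits 0..2 of the first byte
   (the layout implied by the x86-64 assembly in the report). */
bool check_mask(const struct Flags *f)
{
    unsigned char first_byte;
    memcpy(&first_byte, f, 1);      /* read the storage unit without aliasing issues */
    return (first_byte & 7u) != 0;  /* any of the low three bits set */
}
```

Under that layout assumption the two functions agree for all eight combinations of the flags, which is why a single `test`/`setne` pair would be a valid lowering of the three-way `||`.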