Bug 99918 - [11/12/13/14/15 Regression] suboptimal code for bool bitfield tests
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 11.0
Importance: P2 normal
Target Milestone: 11.5
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on: 99919
Blocks: bitfield VRP
Reported: 2021-04-05 20:15 UTC by Martin Sebor
Modified: 2024-04-26 10:39 UTC (History)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail: 10.2.0, 11.0, 6.3.0, 7.0.1, 8.3.0, 9.3.0
Last reconfirmed: 2021-04-06 00:00:00


Description Martin Sebor 2021-04-05 20:15:29 UTC
GCC does a better job folding operations involving plain Booleans than it does with bool bit-fields.  The example below shows that in f() the return statement is folded to zero while in g() it's not.  This is behind a class of -Wmaybe-uninitialized warnings.

$ cat z.c && gcc -O2 -S -Wall -fdump-tree-optimized=/dev/stdout z.c
struct A { _Bool i, j; };

_Bool f (struct A a)
{
  if (a.i)
    a.j = 0;
  else
    a.j = a.i;

  return a.j;    // folded to 0
}

struct B { _Bool i: 1, j: 1; };
  
_Bool g (struct B b)
{
  if (b.i)
    b.j = 0;
  else
    b.j = b.i;

  return b.j;    // not folded
}


;; Function f (f, funcdef_no=0, decl_uid=1946, cgraph_uid=1, symbol_order=0)

_Bool f (struct A a)
{
  <bb 2> [local count: 1073741824]:
  return 0;

}



;; Function g (g, funcdef_no=1, decl_uid=1953, cgraph_uid=2, symbol_order=1)

Removing basic block 5
_Bool g (struct B b)
{
  _Bool b$j;
  unsigned char _1;
  unsigned char _2;
  _Bool _3;

  <bb 2> [local count: 1073741824]:
  _1 = VIEW_CONVERT_EXPR<unsigned char>(b);
  _2 = _1 & 1;
  if (_2 != 0)
    goto <bb 4>; [50.00%]
  else
    goto <bb 3>; [50.00%]

  <bb 3> [local count: 536870913]:
  _3 = b.i;

  <bb 4> [local count: 1073741824]:
  # b$j_5 = PHI <0(2), _3(3)>
  return b$j_5;

}
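As an illustration of why folding g() to zero would be valid, here is a minimal sketch that checks all four combinations of the two bit-fields. The struct and g() are copied from the test case above; verify_g() is a hypothetical helper added only for this check and is not part of the original report.

```c
#include <assert.h>

struct B { _Bool i: 1, j: 1; };

/* Same body as g() in the test case above.  */
_Bool g (struct B b)
{
  if (b.i)
    b.j = 0;
  else
    b.j = b.i;   /* b.i is known to be 0 on this path */

  return b.j;
}

/* Hypothetical helper: asserts that g() returns 0 for every
   combination of the two bit-field values.  */
void verify_g (void)
{
  for (int i = 0; i < 2; i++)
    for (int j = 0; j < 2; j++)
      {
        struct B b = { .i = i, .j = j };
        assert (g (b) == 0);
      }
}
```

Calling verify_g() from a test driver exercises every possible input, so the result of g() does not depend on its argument, which is what the optimizer already works out for f().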
Comment 1 Andrew Pinski 2021-04-05 20:22:47 UTC
This comes down to lowering bitfields too soon.
My bet is that even integer bitfields will have this problem.
Comment 2 Martin Sebor 2021-04-05 20:23:00 UTC
Bisection points to r225825 as the revision where GCC started to fail to fold the code in g().
Comment 3 Martin Sebor 2021-04-05 20:27:57 UTC
This only seems to affect C _Bool bit-fields and not C++ bool.
Comment 4 Martin Sebor 2021-04-05 20:40:38 UTC
(In reply to Andrew Pinski from comment #1)
> This comes down to lowering bitfields too soon.
> My bet is that even integer bitfields will have this problem.

Yes, unsigned bit-fields suffer from the same problem, but unlike with _Bool, GCC has never emitted optimal code for them for this test case.
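For reference, the unsigned variant presumably looks like the sketch below. The names struct C, h() and verify_h() are hypothetical and chosen only for illustration; the comment above does not show the exact code that was tested.

```c
#include <assert.h>

/* Hypothetical unsigned counterpart of struct B from the description.  */
struct C { unsigned i: 1, j: 1; };

unsigned h (struct C c)
{
  if (c.i)
    c.j = 0;
  else
    c.j = c.i;   /* c.i is known to be 0 on this path */

  return c.j;    /* always 0, by the same argument as for g() */
}

/* Hypothetical helper: asserts that h() returns 0 for every
   combination of the two bit-field values.  */
void verify_h (void)
{
  for (unsigned i = 0; i < 2; i++)
    for (unsigned j = 0; j < 2; j++)
      {
        struct C c = { .i = i, .j = j };
        assert (h (c) == 0);
      }
}
```

As with g(), the return value is 0 for all inputs, so in principle the whole body could be folded to return 0.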
Comment 5 Richard Biener 2021-04-06 08:39:22 UTC
The main issue is optimize_bit_field_compare in fold-const.c which produces
during GENERIC folding in .005t.original:

  if ((BIT_FIELD_REF <b, 8, 0> & 1) != 0)
    {
      b.j = 0;
    }
  else
    {
      b.j = b.i;
    }
  return b.j;

that's premature in this place.  For f() it also takes until DOM3 to do
the folding unless you disable SRA which then makes EVRP recognize the
second store as a.j = 0.  With SRA we fail to derive ranges for a_10 in

  a_10 = MEM <unsigned char> [(struct A *)&a];
  a$1_11 = MEM <unsigned char> [(struct A *)&a + 1B];
  _1 = VIEW_CONVERT_EXPR<_Bool>(a_10);
  if (_1 != 0)
    goto <bb 4>; [INV]
  else
    goto <bb 3>; [INV]

  <bb 3> :

  <bb 4> :
  # a$1_9 = PHI <0(2), a_10(3)>
  _7 = VIEW_CONVERT_EXPR<_Bool>(a$1_9);

thus we're missing looking through VIEW_CONVERT_EXPR in register_assert_for.
Amending that would eventually also allow optimizing the prematurely folded variant.
Comment 6 Richard Biener 2021-06-01 08:20:08 UTC
GCC 9.4 is being released, retargeting bugs to GCC 9.5.
Comment 7 Richard Biener 2022-05-27 09:44:51 UTC
GCC 9 branch is being closed
Comment 8 Jakub Jelinek 2022-06-28 10:44:05 UTC
GCC 10.4 is being released, retargeting bugs to GCC 10.5.
Comment 9 Franek Korta 2022-07-01 12:49:41 UTC
Another simple example:
#include <cstdint>
    
struct SomeClass {
    bool         cfg1 : 1;
    bool         cfg2 : 1;
    bool         cfg3 : 1;
    bool check() const noexcept { return cfg1 || cfg2 || cfg3; }
};

bool check(const SomeClass& rt) {
    return rt.check();
}

Emits:
check(SomeClass const&):
        movzx   edx, BYTE PTR [rdi]
        mov     eax, edx
        and     eax, 1
        jne     .L1
        mov     eax, edx
        shr     al
        and     eax, 1
        je      .L4
.L1:
        ret
.L4:
        mov     eax, edx
        shr     al, 2
        and     eax, 1
        ret

While it should emit:
check(SomeClass const&):
        test    byte ptr [rdi], 7
        setne   al
        ret
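The desired single mask test can be written out by hand. The sketch below is in C rather than C++, to match the original test case. It assumes that all three bit-fields are allocated in the first byte of the struct, as the assembly above suggests for this target. The names struct cfg, first_byte(), cfg_mask(), check_fields(), check_mask() and verify_check() are hypothetical. The mask is computed at run time instead of hard-coding 7, so the sketch does not depend on the bit order within that byte.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

struct cfg {
  bool cfg1 : 1;
  bool cfg2 : 1;
  bool cfg3 : 1;
};

/* Field-by-field test, equivalent to SomeClass::check() above.  */
bool check_fields (const struct cfg *c)
{
  return c->cfg1 || c->cfg2 || c->cfg3;
}

/* Reads the first byte of the object representation.
   Assumption: all three bit-fields live in that byte.  */
static unsigned char first_byte (const struct cfg *c)
{
  unsigned char byte;
  memcpy (&byte, c, 1);
  return byte;
}

/* Mask with exactly the bits of the three fields set.  */
unsigned char cfg_mask (void)
{
  struct cfg all;
  memset (&all, 0, sizeof all);
  all.cfg1 = all.cfg2 = all.cfg3 = true;
  return first_byte (&all);
}

/* Single mask test: the form the optimizer could ideally produce.  */
bool check_mask (const struct cfg *c)
{
  return (first_byte (c) & cfg_mask ()) != 0;
}

/* Asserts that both forms agree for all eight field combinations.  */
void verify_check (void)
{
  for (int bits = 0; bits < 8; bits++)
    {
      struct cfg c;
      memset (&c, 0, sizeof c);
      c.cfg1 = bits & 1;
      c.cfg2 = (bits >> 1) & 1;
      c.cfg3 = (bits >> 2) & 1;
      assert (check_fields (&c) == check_mask (&c));
    }
}
```

Since both forms agree on every input under the stated layout assumption, merging the three single-bit tests into one masked test preserves behavior.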
Comment 10 Richard Biener 2023-07-07 10:39:22 UTC
GCC 10 branch is being closed.
Comment 11 Andrew Pinski 2023-07-12 22:05:04 UTC
(In reply to Martin Sebor from comment #2)
> Bisection points to r225825 as the revision where GCC started to fail to
> fold the code in g().

The fold-const code didn't check `types_match (type, TREE_TYPE (@0))` but rather just did the equivalent of:
(simplify
 (ne @0 integer_zerop@1)
 (if (TREE_CODE (TREE_TYPE (@0)) == BOOLEAN_TYPE)
  (non_lvalue (convert @0))))

Whereas match.pd now does not do the convert and instead requires the types_match check to succeed.