This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug tree-optimization/83171] std::bitset::count not inlining __popcountdi2


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83171

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|x86_64-linux-gnu            |
             Status|UNCONFIRMED                 |NEW
           Keywords|                            |missed-optimization
   Last reconfirmed|                            |2017-11-26
          Component|c++                         |tree-optimization
               Host|x86_64-linux-gnu            |
     Ever confirmed|0                           |1
              Build|x86_64-linux-gnu            |

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Works for me with aarch64:
_Z3fooj:
.LFB1136:
        .cfi_startproc
        and     x0, x0, 255
        fmov    d0, x0
        cnt     v0.8b, v0.8b
        addv    b0, v0.8b
        umov    w0, v0.b[0]
        and     x0, x0, 255
        ret


And works for me with -march=native:
_Z3fooj:
.LFB1162:
        .cfi_startproc
        movzbl  %dil, %eax
        popcntq %rax, %rax
        ret

Basically, the following is not being optimized:
  int _3;
  long unsigned int _4;
  long long unsigned int _5;
  unsigned int _6;

  <bb 2> [100.00%]:
  _6 = value_1(D) & 255;
  _5 = (long long unsigned int) _6;
  _3 = __builtin_popcountl (_5);
  _4 = (long unsigned int) _3;

to just:

  _6 = value_1(D) & 255;
  _3 = __builtin_popcount (_6);
  _4 = (long unsigned int) _3;
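
The narrowing above is safe because zero-extending a value adds only zero bits, so the population count cannot change. A minimal C++ sketch of that equivalence follows. The body of `foo` is an assumption: the original test case is not quoted in this comment, and it is reconstructed here from the mangled name `_Z3fooj` and the `& 255` mask. `popcount_wide`, `popcount_narrow`, and `check_forms_agree` are hypothetical helper names introduced for illustration only.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Assumed shape of the reported function (not shown in this comment):
// count the set bits in the low 8 bits of `value`.
std::size_t foo(unsigned int value) {
    return std::bitset<8>(value).count();
}

// Mirrors the GIMPLE that is currently produced:
// mask, widen to a long, then call the long popcount builtin.
std::size_t popcount_wide(unsigned int value) {
    unsigned int masked = value & 255;
    unsigned long widened = masked;  // zero-extension, upper bits are all 0
    return static_cast<std::size_t>(__builtin_popcountl(widened));
}

// Mirrors the desired GIMPLE:
// mask, then call the int popcount builtin directly on the narrow value.
std::size_t popcount_narrow(unsigned int value) {
    unsigned int masked = value & 255;
    return static_cast<std::size_t>(__builtin_popcount(masked));
}

// Checks that all three forms agree for a range of inputs,
// including inputs with bits set above the low byte.
void check_forms_agree() {
    for (unsigned int v = 0; v < 4096; ++v) {
        assert(foo(v) == popcount_wide(v));
        assert(popcount_wide(v) == popcount_narrow(v));
    }
}
```

Calling `check_forms_agree()` exercises every low-byte pattern. `__builtin_popcount` and `__builtin_popcountl` are GCC builtins, so this sketch assumes GCC or a compatible compiler.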
