This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug target/82498] Missed optimization for x86 rotate instruction


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82498

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Two further cases:
unsigned
f10 (unsigned x, unsigned char y)
{
  y %= __CHAR_BIT__ * __SIZEOF_INT__;
  return (x << y) | (x >> (-y & ((__CHAR_BIT__ * __SIZEOF_INT__) - 1)));
}

unsigned
f11 (unsigned x, unsigned short y)
{
  y %= __CHAR_BIT__ * __SIZEOF_INT__;
  return (x << y) | (x >> (-y & ((__CHAR_BIT__ * __SIZEOF_INT__) - 1)));
}

For f11 GCC also generates efficient code; for f10 it emits a useless `and`.
I guess the f10 case would be improved by adding a
*<rotate_insn><mode>3_mask_1 define_insn_and_split (and the
inefficient/nonportable f1 code would likewise be slightly improved).

Looking at LLVM, f1/f3/f5 are worse there than in GCC, and in all those cases
it uses branching instead of cmov; f7/f8/f9/f10/f11 all generate efficient
code, matching GCC in the f8 and f11 cases.
