This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug target/82498] Missed optimization for x86 rotate instruction
- From: "jakub at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Wed, 11 Oct 2017 17:17:35 +0000
- Subject: [Bug target/82498] Missed optimization for x86 rotate instruction
- Auto-submitted: auto-generated
- References: <bug-82498-4@http.gcc.gnu.org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82498
--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Two further cases:
unsigned
f10 (unsigned x, unsigned char y)
{
  y %= __CHAR_BIT__ * __SIZEOF_INT__;
  return (x << y) | (x >> (-y & ((__CHAR_BIT__ * __SIZEOF_INT__) - 1)));
}
unsigned
f11 (unsigned x, unsigned short y)
{
  y %= __CHAR_BIT__ * __SIZEOF_INT__;
  return (x << y) | (x >> (-y & ((__CHAR_BIT__ * __SIZEOF_INT__) - 1)));
}
For f11 GCC also generates efficient code; for f10 it emits a useless & instruction.
I guess the f10 case would be improved by adding a
*<rotate_insn><mode>3_mask_1 define_insn_and_split (and similarly the
inefficient/non-portable f1 code would be slightly improved).
Looking at LLVM: f1/f3/f5 are worse in LLVM than in GCC, and in all of those cases
it uses branching instead of cmov; f7/f8/f9/f10/f11 all generate efficient code,
though, matching GCC in the f8 and f11 cases.