This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug middle-end/17886] variable rotate and long long rotate should be better optimized
- From: "mmitchel at gcc dot gnu dot org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 29 Sep 2005 03:52:38 -0000
- Subject: [Bug middle-end/17886] variable rotate and long long rotate should be better optimized
- References: <20041008000425.17886.ak@muc.de>
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
------- Additional Comments From mmitchel at gcc dot gnu dot org 2005-09-29 03:52 -------
Here is the current status for the four functions in Andi's testcase, with "f2"
changed to use "32 - y" so that it is a proper rotation:
* f still generates a complex code sequence, but I'm not sure how much better we
can do. Our code sequence doesn't look a lot worse than the sequence generated
by icc 9.0, at first glance. We could try something like:
  if %ecx > 31:
    mov   %eax, %ebx
    shldl $31, %edx, %eax
    shldl $31, %ebx, %edx
    %ecx -= 31
  if %ecx > 31:
    mov   %eax, %ebx
    shldl $31, %edx, %eax
    shldl $31, %ebx, %edx
    %ecx -= 31
  if %ecx != 0:
    mov   %eax, %ebx
    shldl %cl, %edx, %eax
    shldl %cl, %ebx, %edx
but that doesn't seem clearly better than what we presently generate.
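As a rough illustration of what that sequence computes, here is a C model of
it. This is only a sketch: the helper names (shld32, rotl64_step,
rotl64_stepwise, rotl64_ref) are made up for this example and are not part of
Andi's testcase, and the mapping of %eax/%edx to the low/high halves is an
assumption (the rotation is symmetric in the two halves, so the result is the
same either way).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of shldl: shift dst left by n bits, filling the
   vacated low bits from the top of src.  Only valid for 0 < n < 32. */
static uint32_t shld32(uint32_t dst, uint32_t src, unsigned n)
{
    assert(n > 0 && n < 32);
    return (dst << n) | (src >> (32 - n));
}

/* One mov + shldl + shldl group: rotate hi:lo left by n, 0 < n < 32. */
static void rotl64_step(uint32_t *hi, uint32_t *lo, unsigned n)
{
    uint32_t saved = *lo;          /* mov   %eax, %ebx       */
    *lo = shld32(*lo, *hi, n);     /* shldl n, %edx, %eax    */
    *hi = shld32(*hi, saved, n);   /* shldl n, %ebx, %edx    */
}

/* 64-bit rotate left built from at most three steps of at most 31 bits. */
uint64_t rotl64_stepwise(uint64_t x, unsigned count)
{
    uint32_t lo = (uint32_t)x;
    uint32_t hi = (uint32_t)(x >> 32);

    count &= 63;                   /* assumption: count is taken modulo 64 */
    if (count > 31) { rotl64_step(&hi, &lo, 31); count -= 31; }
    if (count > 31) { rotl64_step(&hi, &lo, 31); count -= 31; }
    if (count != 0) { rotl64_step(&hi, &lo, count); }

    return ((uint64_t)hi << 32) | lo;
}

/* Straightforward reference rotate, used only for comparison. */
uint64_t rotl64_ref(uint64_t x, unsigned count)
{
    count &= 63;
    return (x << count) | (x >> ((64 - count) & 63));
}
```

The two "> 31" tests are needed because a count of 63 only decomposes as
31 + 31 + 1; each individual step has to stay in the 1..31 range.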
* f2 uses the roll instruction, which appears optimal.
* f3 uses two shldl instructions, which appears optimal.
* f4 uses the rorl instruction, which appears optimal.
For both f2 and f3, it looks like we generate better code than you get with
icc 9.0.
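For readers without the testcase at hand, the kind of source that maps onto a
single roll or rorl looks roughly like the following. This is a sketch only:
the function names and signatures are hypothetical stand-ins, not the actual
f2 and f4 from the PR. It uses the "32 - y" form mentioned above, which is
only well-defined in C for 0 < y < 32.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical rotate-left in the "32 - y" form.  A count of 0 (or 32 and
   up) would shift by the full width, which is undefined in C. */
uint32_t rotl32(uint32_t x, unsigned y)
{
    assert(y > 0 && y < 32);
    return (x << y) | (x >> (32 - y));
}

/* Hypothetical rotate-right counterpart, same restriction on y. */
uint32_t rotr32(uint32_t x, unsigned y)
{
    assert(y > 0 && y < 32);
    return (x >> y) | (x << (32 - y));
}
```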
I have no plans to work on this further for the time being, but I won't close
out the PR; someone else might want to try to attack the code generated for the
variable rotation case. Or, if people are satisfied, we can close the PR.
--
      What|Removed                     |Added
----------------------------------------------------------------------------
AssignedTo|mark at codesourcery dot com|unassigned at gcc dot gnu
          |                            |dot org
    Status|ASSIGNED                    |NEW
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17886