(CVS sources, ~6AM this morning US/eastern)

    typedef unsigned long long uint64_t;
    uint64_t foo (uint64_t n)
    {
      return (n >> 32) | (n << 32);
    }

compiled with -O9 -fomit-frame-pointer:

    foo:
            pushl   %ebx
            movl    12(%esp), %ebx
            movl    8(%esp), %ecx
            movl    %ebx, %eax
            popl    %ebx
            movl    %ecx, %edx
            ret

It should've been able to load 4(%esp) and 8(%esp) into %edx and %eax respectively, without using the extra stack slot to save %ebx.
This is the usual subreg problem with the current register allocator (RA). There are a couple of other bugs already open about this.
Confirmed, basically the same issue as PR 15792.
Note, since this is a rotate, the patches I proposed in 17886 will generate much better code for this one case (basically mov/mov/xchgl -- it could be improved by a peephole to do the moves directly instead of xchgl). However, the more general subreg problem needs to be looked at.
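To illustrate why this counts as a rotate, here is a minimal sketch showing that the test case is equivalent to swapping the two 32-bit halves of the argument, which is why two plain word moves should be enough on a 32-bit target. The names rotate32, swap_halves and check_equivalence are hypothetical and used only for this illustration; they are not part of GCC or of the original test case.

```c
#include <assert.h>
#include <stdint.h>

/* The function from the report: a 64-bit rotate by 32 bits. */
uint64_t rotate32(uint64_t n)
{
    return (n >> 32) | (n << 32);
}

/* The same result computed word by word: split the value into its
   low and high 32-bit halves and put them back in swapped order.
   (Hypothetical helper, for illustration only.) */
uint64_t swap_halves(uint64_t n)
{
    uint32_t lo = (uint32_t)n;          /* low word of the input  */
    uint32_t hi = (uint32_t)(n >> 32);  /* high word of the input */
    return ((uint64_t)lo << 32) | hi;   /* low word becomes high  */
}

/* Check that both forms agree on a few sample values. */
void check_equivalence(void)
{
    const uint64_t samples[] = {
        0x0ULL, 0x1ULL, 0x0123456789abcdefULL, 0xffffffff00000000ULL
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(rotate32(samples[i]) == swap_halves(samples[i]));
}
```

Calling check_equivalence() from a small main is enough to exercise it. No arithmetic is needed on either word, only a change of position.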
With my current set of subreg patches, for this test case with -O2 -momit-leaf-frame-pointer, I get this:

    foo:
            movl    4(%esp), %edx
            movl    8(%esp), %eax
            ret

which I suspect is optimal.
*** Bug 27202 has been marked as a duplicate of this bug. ***
Fixed by:

2007-01-31  Richard Henderson  <rth@redhat.com>
            Ian Lance Taylor  <iant@google.com>

        * lower-subreg.c: New file.