This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH] Some middle-end improvements for bitfield handling
On Tue, Jun 29, 2004 at 02:46:46PM -0600, Roger Sayle wrote:
> > * simplify-rtx.c (simplify_binary_operation): Simplify
> > ((A & N) + B) & M -> (A + B) & M if M is pow2 minus 1 constant and
> > N has at least all bits in M set as well.
> >
> > * expr.c (expand_assignment): Optimize += or -= on a bit field in
> > most significant bits.
> >
> > * gcc.c-torture/execute/20040629-1.c: New test.
>
> This is OK for mainline. Could you try these changes on the testcases
> given in PR tree-optimization/15310, and see if that PR can be closed?
Didn't know we have PRs for just about everything ;)
Both testcases are optimized into addl $2, sdata(%rip); ret on x86-64
with this patch.
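The simplify-rtx.c entry quoted above relies on a simple fact: masking with M = 2^k - 1 keeps only the low k bits, and the low k bits of a sum depend only on the low k bits of its operands, so an inner mask N that preserves all bits of M is redundant. A small brute-force check, purely as an illustration (the function names are made up for this sketch and are not GCC code):

```c
#include <stdint.h>

/* Form before the simplification: ((A & N) + B) & M. */
uint32_t masked_sum_before(uint32_t a, uint32_t n, uint32_t b, uint32_t m)
{
    return ((a & n) + b) & m;
}

/* Form after the simplification: (A + B) & M. */
uint32_t masked_sum_after(uint32_t a, uint32_t b, uint32_t m)
{
    return (a + b) & m;
}

/* Counts the (A, B) pairs in [0, 1024) x [0, 1024) for which the two
   forms disagree for the given masks M and N. */
unsigned count_mismatches(uint32_t m, uint32_t n)
{
    unsigned bad = 0;
    for (uint32_t a = 0; a < 1024; a++)
        for (uint32_t b = 0; b < 1024; b++)
            if (masked_sum_before(a, n, b, m) != masked_sum_after(a, b, m))
                bad++;
    return bad;
}
```

With M = 0x1f and N = 0x31f (all bits of M set in N) the two forms should agree everywhere; with N = 0x0f, which is missing a bit of M, they should not.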
> Your new test case is a bit more cryptic than it needs to be.
> You'll need to change the '#include "n.c"' to be "20040629-1.c"
Sorry about that. I noticed it too shortly after mailing the patch,
but did not think it needed a repost.
> to recursively include itself, and you seem to be using "T" both as
> a macro and as the name of a structure. All good stuff to stress
Fixed, bootstrapped/regtested on i686 {c,c++,fortran,java,objc} and
{x86_64,ppc,ppc64,s390,s390x,ia64} {c,c++,fortran}, committed.
Looking through the mails which led to PR 15310, your one-bit
bitfield handling could be doable with a smallish change to
expand_assignment too. Just handle bitsize == 1 with constant
TREE_OPERAND (from, 1) in addition to count + bitsize == GET_MODE_BITSIZE ().
Shall I prepare a patch for that?
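One reason the bitsize == 1 case with a constant operand looks easy: a one-bit field is arithmetic modulo 2, so adding (or subtracting) a constant either flips the bit or leaves it alone, depending on the constant's low bit. The sketch below models this on a raw 32-bit word; the function names and the explicit bit position `pos` are assumptions made for the illustration, and the mail does not say which instruction sequence such a patch would actually emit.

```c
#include <stdint.h>

/* Reference version: extract the bit at position pos (0..31), add c,
   truncate to one bit, and merge the result back into the word. */
uint32_t add_to_bit_generic(uint32_t word, unsigned pos, uint32_t c)
{
    uint32_t field = (word >> pos) & 1u;
    field = (field + c) & 1u;
    return (word & ~(UINT32_C(1) << pos)) | (field << pos);
}

/* Shortcut: adding c to a one-bit field is an XOR with c's low bit,
   so no extract/merge sequence is needed. */
uint32_t add_to_bit_xor(uint32_t word, unsigned pos, uint32_t c)
{
    return word ^ ((c & 1u) << pos);
}
```

For an even constant the XOR operand is zero and the store can be dropped entirely; for an odd constant a single XOR with a constant mask is enough.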
Another optimization would be to use ROTATE instructions if available:
struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus2 (unsigned int x)
{
  b.j += x;
}
GCC after my commit compiles this into:
movl b(%rip), %edx
movl %edx, %eax
andl $-131009, %edx
shrl $6, %eax
addl %edi, %eax
andl $2047, %eax
sall $6, %eax
orl %eax, %edx
movl %edx, b(%rip)
ret
on x86_64, while:
movl b(%rip), %eax
sall $21, %edi
roll $15, %eax
addl %edi, %eax
rorl $15, %eax
movl %eax, b(%rip)
ret
is 12 bytes shorter. On my Hammer it takes the same number of ticks,
and on a P4 similar 32-bit code is 4 ticks faster.
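The rotate version works because rotating the word left by 15 moves the 11-bit field j into the most significant bits, where any carry out of the field simply falls off the top of the 32-bit word, so no masking is needed. A C model of both sequences, as a sketch only: it assumes the x86-64 layout implied by the assembly (i in bits 0..5, j in bits 6..16, k in bits 17..31), and the helper names are invented for this example.

```c
#include <stdint.h>

enum { J_POS = 6, J_BITS = 11, WORD_BITS = 32 };

/* Portable 32-bit rotates; n is expected to be in 0..31. */
uint32_t rotl32(uint32_t v, unsigned n)
{
    n &= 31;
    return n ? (v << n) | (v >> (32 - n)) : v;
}

uint32_t rotr32(uint32_t v, unsigned n)
{
    n &= 31;
    return n ? (v >> n) | (v << (32 - n)) : v;
}

/* Models the first sequence: extract, add, mask, merge. */
uint32_t add_j_mask(uint32_t word, uint32_t x)
{
    uint32_t field_mask = (UINT32_C(1) << J_BITS) - 1;      /* 2047 */
    uint32_t rest = word & ~(field_mask << J_POS);          /* andl $-131009 */
    uint32_t field = ((word >> J_POS) + x) & field_mask;    /* shrl, addl, andl */
    return rest | (field << J_POS);                         /* sall, orl */
}

/* Models the second sequence: rotate j to the top, add, rotate back. */
uint32_t add_j_rotate(uint32_t word, uint32_t x)
{
    unsigned up = WORD_BITS - J_POS - J_BITS;               /* 15 */
    uint32_t r = rotl32(word, up);                          /* roll $15 */
    r += x << (WORD_BITS - J_BITS);                         /* sall $21, addl */
    return rotr32(r, up);                                   /* rorl $15 */
}
```

Both functions should return the same word for any input, including values of x that overflow the 11-bit field.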
I am not sure, though, whether it would be better to expose bitfield
operations already during gimplification or somewhere in the middle of
the tree-ssa passes, i.e. the RTL expanders would only see
shifting/masking etc. and the tree-ssa passes could already optimize them.
Jakub