This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: Constrain valid arguments to BIT_FIELD_REF
- From: Jakub Jelinek <jakub at redhat dot com>
- To: Diego Novillo <dnovillo at google dot com>
- Cc: Richard Guenther <rguenther at suse dot de>, gcc at gcc dot gnu dot org
- Date: Tue, 4 Mar 2008 17:47:36 +0100
- Subject: Re: Constrain valid arguments to BIT_FIELD_REF
- References: <Pine.LNX.4.64.0803041650000.4133@zhemvz.fhfr.qr> <47CD7584.2010004@google.com>
- Reply-to: Jakub Jelinek <jakub at redhat dot com>
On Tue, Mar 04, 2008 at 11:15:00AM -0500, Diego Novillo wrote:
> >fold currently optimizes a.b.c == 0 to BIT_FIELD_REF <a, 8, big-num> & 1
> >for bit field field-decls c. IMHO this is bad because it pessimizes
> >TBAA (needs to use a's alias set, not the underlying integral type
> >alias set) and it "breaks" type correctness as arbitrary structure
> >types appear as operand zero.
>
> Agreed. Unless this was done to fix some target-specific problem, I
> think it should disappear.
Perhaps not in early GIMPLE passes, but we certainly want to lower
bitfield accesses to BIT_FIELD_REFs or something similar before expansion;
otherwise the expander and the RTL optimization passes can't optimize
anything but the most trivial cases. GCC currently generates terrible code
for bitfields. Try, say:
struct S
{
  unsigned int a : 3;
  unsigned int b : 3;
  unsigned int c : 3;
  unsigned int d : 3;
  unsigned int e : 3;
  unsigned int f : 3;
  unsigned int g : 3;
  unsigned int h : 11;
} a, b, c;

void foo (void)
{
  a.a = b.a | c.a;
  a.b = b.b | c.b;
  a.c = b.c | c.c;
  a.d = b.d | c.d;
  a.e = b.e | c.e;
  a.f = b.f | c.f;
  a.g = b.g | c.g;
  a.h = b.h | c.h;
}
which could be optimized into

  BIT_FIELD_REF <a, 32, 0> = BIT_FIELD_REF <b, 32, 0> | BIT_FIELD_REF <c, 32, 0>;

so something like 3 or 4 instructions, yet we generate 51.
Operating on adjacent bitfield fields is fairly common.
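
As a rough illustration of the whole-word form, the sketch below writes the same OR twice in plain C: once field by field, as in foo (), and once on the full 32-bit word. The helper names (or_fields, or_word, same_bits) are made up for this example, and it assumes that struct S occupies exactly 32 bits with no padding, which holds on common ABIs with a 32-bit unsigned int but is not guaranteed by the C standard.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct S
{
  unsigned int a : 3;
  unsigned int b : 3;
  unsigned int c : 3;
  unsigned int d : 3;
  unsigned int e : 3;
  unsigned int f : 3;
  unsigned int g : 3;
  unsigned int h : 11;
};

/* Field-by-field OR, mirroring foo () from the mail.  */
static struct S
or_fields (struct S x, struct S y)
{
  struct S r = { 0 };
  r.a = x.a | y.a;
  r.b = x.b | y.b;
  r.c = x.c | y.c;
  r.d = x.d | y.d;
  r.e = x.e | y.e;
  r.f = x.f | y.f;
  r.g = x.g | y.g;
  r.h = x.h | y.h;
  return r;
}

/* Whole-word OR: a hand-written stand-in for the BIT_FIELD_REF <.., 32, 0>
   form.  ASSUMPTION: struct S is exactly one 32-bit word.  */
static struct S
or_word (struct S x, struct S y)
{
  uint32_t wx, wy, wr;
  struct S r;

  assert (sizeof (struct S) == sizeof (uint32_t));
  memcpy (&wx, &x, sizeof wx);
  memcpy (&wy, &y, sizeof wy);
  wr = wx | wy;			/* One OR covers all eight fields.  */
  memcpy (&r, &wr, sizeof r);
  return r;
}

/* Return nonzero if both objects have identical bit patterns.  */
static int
same_bits (struct S x, struct S y)
{
  return memcmp (&x, &y, sizeof x) == 0;
}
```

Because a bitwise OR never carries between bit positions, the two functions produce the same result for any field values; the same would not hold for, say, addition, where a carry out of one field would spill into its neighbour.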
Similar, and perhaps far more common in the wild, is e.g.:
void bar (void)
{
  a.a = 1;
  a.b = 2;
  a.c = 3;
  a.d = 4;
  a.e = 5;
  a.f = 6;
  a.g = 7;
  a.h = 8;
}
On x86_64 this takes 24 instructions on the trunk, where 1 is enough.
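
To make the single-store form concrete, here is a hand-written sketch that folds the eight constants from bar () into one 32-bit word. The function names (store_fields, store_word) are invented for this example. It assumes the layout GCC uses on little-endian x86_64, where bitfields are allocated starting from the least significant bit, so a sits at bits 0-2, b at bits 3-5, and so on up to h at bits 21-31; on a big-endian target the shifts would differ.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct S
{
  unsigned int a : 3;
  unsigned int b : 3;
  unsigned int c : 3;
  unsigned int d : 3;
  unsigned int e : 3;
  unsigned int f : 3;
  unsigned int g : 3;
  unsigned int h : 11;
};

/* Field-by-field stores, mirroring bar () from the mail.  */
static void
store_fields (struct S *p)
{
  p->a = 1;
  p->b = 2;
  p->c = 3;
  p->d = 4;
  p->e = 5;
  p->f = 6;
  p->g = 7;
  p->h = 8;
}

/* The same stores folded into one constant.
   ASSUMPTION: little-endian target with fields allocated from bit 0
   upwards and struct S exactly 32 bits wide (as on x86_64).  */
static void
store_word (struct S *p)
{
  const uint32_t w = (1u << 0) | (2u << 3) | (3u << 6) | (4u << 9)
		     | (5u << 12) | (6u << 15) | (7u << 18) | (8u << 21);

  assert (sizeof *p == sizeof w);
  memcpy (p, &w, sizeof w);	/* A single 32-bit store.  */
}
```

Under the stated layout assumption the folded constant works out to 0x011F58D1, and both functions leave the object with the same bit pattern.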
RTL is too late to try to optimize this; I've tried that once.
The combiner only tries to combine up to 3 instructions at once,
and these cases need more than that. So this is something that needs to
be optimized at the tree level, either by having a separate pass
that takes care of it, or by lowering it early enough into something
that the optimizers will handle.
Jakub