This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug rtl-optimization/48696] Horrible bitfield code generation on x86
- From: "torvalds at linux-foundation dot org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Wed, 20 Apr 2011 16:17:36 +0000
- Subject: [Bug rtl-optimization/48696] Horrible bitfield code generation on x86
- Auto-submitted: auto-generated
- References: <bug-48696-4@http.gcc.gnu.org/bugzilla/>
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48696
--- Comment #11 from Linus Torvalds <torvalds@linux-foundation.org> 2011-04-20 16:16:52 UTC ---
(In reply to comment #8)
>
> Unfortunately the underlying type isn't easily available (at least I didn't
> yet find it ...). But I suppose we have to guess anyway considering
> targets that don't handle unaligned accesses well or packed bitfields.
> Thus, an idea was to use aligned word-size loads/stores and only at the
> start/end of a structure fall back to smaller accesses (for strict align
> targets).
That sounds fine.

The only reason to bother with the "underlying type" is that I suspect it could
be possible for educated programmers to use it as a code generation hint. IOW,
if all the individual fields end up fitting nicely in "char", using that as a
base type (even if the _total_ fields don't fit in a single byte) might be a
good hint for the compiler that it can/should use byte accesses and small
constants.
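
A minimal sketch of what such a hint could look like in source, assuming a compiler that accepts "unsigned char" as a bit-field base type (GCC does, though the C standard leaves this implementation-defined). The struct and function names are hypothetical and do not come from the bug report. Every field fits in a byte, but the fields total 15 bits:

```c
#include <assert.h>

/* Hypothetical layout: byte-sized base type, fields spanning two bytes. */
struct flags_byte {
    unsigned char ready : 1;
    unsigned char mode  : 3;
    unsigned char level : 4;
    unsigned char count : 6;
    unsigned char dirty : 1;
};

/* The same fields declared on a word-sized base type. */
struct flags_word {
    unsigned int ready : 1;
    unsigned int mode  : 3;
    unsigned int level : 4;
    unsigned int count : 6;
    unsigned int dirty : 1;
};

/* Update one field; the neighbouring fields must be left untouched. */
unsigned set_mode_byte(struct flags_byte *f, unsigned mode)
{
    assert(mode <= 7u);   /* mode is a 3-bit field */
    f->mode = mode;
    return f->mode;
}

unsigned set_mode_word(struct flags_word *f, unsigned mode)
{
    assert(mode <= 7u);
    f->mode = mode;
    return f->mode;
}
```

Both functions have the same observable behaviour. Whether the compiler actually picks byte-sized accesses for the first one is up to its code generator, which is the point under discussion.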
But using the biggest aligned word-size is probably equally good in practice.

And if you end up narrowing the types on _reads_, I think that's fine on x86. I
forget the exact store buffer forwarding rules (and they probably vary a bit
between different microarchitectures anyway), but I think almost all of them
support forwarding a larger store into a smaller (aligned) load.

It's just the writes that should normally not be narrowed.
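
The benign direction, a wide store followed by a narrower load of the same location, can be sketched in C as follows. The function name is hypothetical, and "volatile" is used only so that the compiler keeps both memory accesses instead of forwarding the value in a register:

```c
#include <assert.h>
#include <stdint.h>

/* Word-sized store followed by a byte-sized load from the same address.
   Returns whichever byte sits at the lowest address of *p. */
unsigned store_word_load_byte(volatile uint32_t *p, uint32_t value)
{
    assert(p != 0);
    *p = value;                                /* 32-bit store */
    return *(volatile unsigned char *)p;       /* 8-bit load, same address */
}
```

On a little-endian target such as x86 the returned byte is the low 8 bits of the stored value. Whether the load is satisfied from the store buffer depends on the microarchitecture, as noted above.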
(Of course, sometimes you may really want to narrow it. Replacing an
    andl $0xffffff00,(%rax)
with a simple
    movb $0,(%rax)
is certainly a very tempting optimization, but it really only works if there
are no subsequent word-sized loads that would get fouled by the write buffer
entry.)
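
The two forms, and the load pattern that makes the narrow one risky, can be written out in C roughly as below. This is a sketch with hypothetical function names; the compiler is free to emit different instructions than the ones named in the comments, and "volatile" is there only to keep the memory accesses in place:

```c
#include <assert.h>
#include <stdint.h>

/* Word-sized read-modify-write, the shape of "andl $0xffffff00,(%rax)". */
void clear_low_byte_wide(volatile uint32_t *p)
{
    assert(p != 0);
    *p &= 0xffffff00u;
}

/* Byte-sized store, the shape of "movb $0,(%rax)". It clears the byte at
   the lowest address, which is the low byte only on a little-endian
   target such as x86. */
void clear_low_byte_narrow(volatile uint32_t *p)
{
    assert(p != 0);
    *(volatile unsigned char *)p = 0;
}

/* The problematic sequence: a byte store followed right away by a
   word-sized load that covers the byte just written. */
uint32_t clear_narrow_then_load_word(volatile uint32_t *p)
{
    clear_low_byte_narrow(p);
    return *p;
}
```

The result is the same either way; the difference is only in how the word-sized load interacts with the pending byte-sized store.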