This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Re: Memory corruption due to word sharing

From: Richard Guenther <rguenther at suse dot de>
To: Linus Torvalds <torvalds at linux-foundation dot org>
Cc: Michael Matz <matz at suse dot de>, Jiri Kosina <jkosina at suse dot cz>, Colin Walters <walters at verbum dot org>, Jan Kara <jack at suse dot cz>, LKML <linux-kernel at vger dot kernel dot org>, linux-ia64 at vger dot kernel dot org, dsterba at suse dot cz, ptesarik at suse dot cz, gcc at gcc dot gnu dot org
Date: Thu, 2 Feb 2012 10:35:34 +0100 (CET)
Subject: Re: Memory corruption due to word sharing
References: <20120201151918.GC16714@quack.suse.cz> <1328114266.5355.44.camel@lenny> <CA+55aFxmdskUXX1iBaPx7rUGD95UqNJxhe1BLeVtZjqHjBpsPA@mail.gmail.com> <alpine.LRH.2.00.1202011808240.22725@twin.jikos.cz> <Pine.LNX.4.64.1202011818570.25409@wotan.suse.de> <CA+55aFxt3rG8_HyoRqiMOKVDqaGTkRk=Kh+bAJTAxxdHAGzuVw@mail.gmail.com>

On Wed, 1 Feb 2012, Linus Torvalds wrote:

> On Wed, Feb 1, 2012 at 9:41 AM, Michael Matz <matz@suse.de> wrote:
> >
> > One problem is that it's not a new problem, GCC emitted similar code since
> > about forever, and still they turned up only now (well, probably because
> > ia64 is dead, but sparc64 should have similar problems).  The bitfield
> > handling code is _terribly_ complex and fixing it is quite involved.  So
> > don't expect any quick fixes.
> 
> I agree that bitfields are nasty, I've had to handle them myself in
> sparse. And we have actually had problems with bitfields before, to
> the point where we've replaced use of bitfields with masking and
> setting bits by hand.
> 
> But I also think that gcc is simply *buggy*, and has made them much
> nastier than they should be. What gcc *should* have done is to turn
> bitfield accesses into shift-and-masking of the underlying field as
> early as possible, and then do all optimizations at that level.
> 
> In fact, there is another gcc bug outstanding (48696) where I complain
> about absolutely horrible code generation, and that one was actually
> the exact same issue except in reverse: gcc wouldn't take the
> underlying size of the bitfield into account, and use the wrong
> (smaller) size for the access, causing absolutely horrendous code
> generation that mixes byte and word accesses, and causes slowdowns by
> orders of magnitude.

Yeah, sorry for dropping the ball on this one (and missing my original
GCC 4.7 target ...)
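
For readers following the thread, the hand-written "masking and setting
bits" that Linus mentions above might look roughly like the sketch below.
The field position, width, and all names (MODE_SHIFT, MODE_WIDTH, mode_get,
mode_set) are hypothetical and chosen only for illustration; the point is
that every access is to the full, explicitly declared 32-bit word.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: a 3-bit "mode" field stored in bits 4..6 of a
 * 32-bit flags word.  The access width is always the declared 32 bits. */
#define MODE_SHIFT 4u
#define MODE_WIDTH 3u
#define MODE_MASK  (((UINT32_C(1) << MODE_WIDTH) - 1u) << MODE_SHIFT)

/* Extract the field: mask first, then shift down. */
static inline uint32_t mode_get(uint32_t flags)
{
    return (flags & MODE_MASK) >> MODE_SHIFT;
}

/* Return flags with the field replaced; all other bits are preserved. */
static inline uint32_t mode_set(uint32_t flags, uint32_t mode)
{
    assert(mode < (UINT32_C(1) << MODE_WIDTH));
    return (flags & ~MODE_MASK) | (mode << MODE_SHIFT);
}
```

Because the containing word is an ordinary uint32_t, the compiler has no
freedom to widen the read-modify-write beyond those four bytes.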

> And it really is the same issue: gcc has forgotten what the base type
> is, and tries to "compute" some type using the actual bits. Which is
> simply *wrong*. Always has been.
> 
> It's stupid, it generates bad code (both from performance *and* from a
> correctness angle), and it has actually resulted in gcc having *extra*
> complexity because it keeps the bitfield data around much too late.
> 
> > The other problem is specification.  While you think that the code you
> > wrote is totally obvious it might not actually be so.  For instance, what
> > about this struct:
> >
> > {long l:32; int i1:16; short s; int i2:1; char c:7; short s2:8; short s3;}
> >
> > What are the obviously correct accesses for various writes into this
> > struct?
> 
> I really don't think that example matters. You should instead turn the
> question around, and look at the real code examples, make *those*
> generate good and obviously correct code, and then *after* you've done
> that, you start looking at the mixed case where people do odd things.

Well, it matters because it shows that simply using the declared type
isn't going to work in the end.  What we instead need to do is
remember the underlying objects the bitfield packing algorithm uses.
It then simply becomes obvious how to implement the bitfield accesses.
I hope to get back to this for GCC 4.8.

> Quite frankly, the layout that makes *sense* for something like the
> above is not to pack them. You'd just do

Heh, but of course the ABI specifies they are supposed to be packed ...

In the end we all agree GCC does something nasty (and I would even call
it a bug), but any solution we find in GCC won't be backportable
to earlier releases, so you will have to deal with the GCC bug for quite
some time and devise workarounds in the kernel.  You'll hit the
bug for all structure fields that share the largest aligned
machine word with a bitfield (so the access size depends on the alignment
of the full object, not on that of the struct containing the bitfield,
in case that struct is nested inside another, more aligned one).
This situation should be easily(?) detectable with sparse.
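
As an illustration of the condition described above (this is not a sparse
check, just a small runtime probe), the sketch below tests whether a plain
member lands in the same aligned word as a bitfield.  The struct, the
8-byte WORD_BYTES, and the helper names are assumptions made for the
example; it also assumes the object itself is at least word-aligned and
that int is 4 bytes.  Since offsetof cannot be applied to a bitfield, the
probe sets the bitfield in a zeroed object and looks for the byte that
changed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define WORD_BYTES 8u  /* assumed machine word size (64-bit target) */

/* Hypothetical struct: 'lock_protected' sits right next to a bitfield. */
struct example {
    unsigned int  lock_protected;  /* plain member, e.g. guarded by a lock */
    unsigned int  flag : 1;        /* bitfield updated without that lock */
    unsigned long far_away;        /* plain member in a different word */
};

/* Byte offset of the first byte that changes when 'flag' is set. */
static size_t flag_byte_offset(void)
{
    struct example e;
    unsigned char bytes[sizeof e];

    memset(&e, 0, sizeof e);
    e.flag = 1;
    memcpy(bytes, &e, sizeof e);
    for (size_t i = 0; i < sizeof e; i++)
        if (bytes[i] != 0)
            return i;
    return sizeof e;  /* not expected: no byte changed */
}

/* True if two byte offsets fall into the same aligned machine word. */
static int shares_word(size_t off_a, size_t off_b)
{
    return off_a / WORD_BYTES == off_b / WORD_BYTES;
}

/* Usage: these checks hold on common ABIs where int is 4 bytes. */
void check_example(void)
{
    size_t flag_off = flag_byte_offset();

    assert(shares_word(offsetof(struct example, lock_protected), flag_off));
    assert(!shares_word(offsetof(struct example, far_away), flag_off));
}
```

Under these assumptions 'lock_protected' is exposed to a word-sized
read-modify-write of 'flag', while 'far_away' is not.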

Thanks,
Richard.
