This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



Re: g77 alignment bug (egcs-1.1.1)


After an offline discussion, I'm returning to the egcs list as
suggested.

craig@jcb-sc.com wrote:
> 
> >
> >Alignment is *not* a mere performance issue.  Some codes can be totally
> >*broken* by incorrect alignment because indexing gets messed up.  Since
> >REAL*8 arrays are indexed in 8-byte increments, there is no way to get
> >to another chunk of memory unless it shares the first array's
> >alignment.  The way things are, one gets core dumps with 50%
> >probability, depending on how the compiler aligned REAL*8 arrays.
> 
> Perhaps you should have continued this discussion on the egcs mailing
> list?
> 
> REAL*8 arrays are *not* indexed in 8-byte increments on the Intel x86
> architecture.  They are indexed as are every other data type: by
> byte offset.
> 
> If you are linking with software that makes some kind of incorrect
> assumptions about alignments, then get that software fixed.  To my
> knowledge, there is *no* problem with aligning REAL*8 items off of
> 8-byte boundaries on x86 *other* than performance.  If you disagree,
> please submit some clear evidence (e.g. a pointer to a specification
> saying so), ideally to the egcs mailing list, so we (including people
> more knowledgeable than I) can all comment.

Certainly.  Here it goes...

REAL*8 arrays are indexed as any other array: by integers 1,2,3,...
These indices are compiled into memory references, i.e. byte offsets.
The byte offsets from the base of a REAL*8 array are 0,8,16,...
Fortran users see integer indices, not actual byte offsets.
This should fix the terms, so that we know what the difference is.
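
To make the terminology concrete, here is a minimal C sketch of the mapping from a Fortran index to a byte offset.  The name real8_byte_offset and the constant REAL8_SIZE are mine, chosen for illustration; the only assumption is that a REAL*8 element occupies 8 bytes.

```c
#include <assert.h>
#include <stddef.h>

#define REAL8_SIZE 8   /* assumed size of one REAL*8 element, in bytes */

/* Byte offset of WORK(index) from the base of a REAL*8 array,
   using Fortran's 1-based indexing: 1 -> 0, 2 -> 8, 3 -> 16, ... */
size_t real8_byte_offset(size_t index)
{
    assert(index >= 1);               /* Fortran indices start at 1 here */
    return (index - 1) * REAL8_SIZE;
}
```

Note that every offset produced this way is a multiple of 8, whatever the index.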

Now let us consider a common F77 programming practice.  One would like
to have work arrays, say 3D arrays, dynamically allocated at runtime. 
F77 does not do that, but C does, so here is what F77 programmers
typically do:

(1) Declare REAL*8 WORK(1) in the main program
(2) Use a C function ptr=malloc(...) to get a chunk of memory
(3) Compute the offset from WORK(1) to ptr, measured in bytes
(4) Compute the INDEX=(offset/8)+1
(5) Pass WORK(INDEX) to the rest of the code, seen as WORK3(N1,N2,N3)

...and things generally work just fine.  However, note step (4),
where the offset (in bytes) is divided by 8 to get the INDEX.  The
implicit assumption is that WORK and the actual workspace at ptr are
separated by an integer multiple of 8 bytes.
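
Steps (3) and (4) might look roughly like this on the C side.  This is only a sketch: work_index and offset_is_reachable are hypothetical names, and addresses are passed as integers so that the arithmetic is explicit.

```c
#include <assert.h>
#include <stdint.h>

#define REAL8_SIZE 8   /* assumed size of one REAL*8 element, in bytes */

/* Steps (3) and (4): byte offset from WORK(1) to ptr, then the
   Fortran index.  Integer division silently drops any remainder. */
intptr_t work_index(uintptr_t work_addr, uintptr_t ptr_addr)
{
    assert(sizeof(double) == REAL8_SIZE);   /* sanity check on the assumption */
    intptr_t offset = (intptr_t)(ptr_addr - work_addr);
    return offset / REAL8_SIZE + 1;
}

/* The implicit assumption: the two addresses differ by a multiple of 8.
   Unsigned wraparound is harmless because 2^N is itself a multiple of 8. */
int offset_is_reachable(uintptr_t work_addr, uintptr_t ptr_addr)
{
    return (ptr_addr - work_addr) % REAL8_SIZE == 0;
}
```

With WORK at address 16 and ptr at 48, the offset is 32 and INDEX is 5, so WORK(5) lands exactly on ptr.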

This assumption is generally correct.  Properly aligned REAL*8 variables
are located at 64-bit boundaries; moreover, so is the malloc()'d memory
chunk (if you are not sure and your system offers memalign(), use it
instead).  Therefore, the offset will be divisible by 8 and everything
is fine.

Or at least it was fine.  The code in question used this approach on
lots of different computers and ran reliably for over a decade, until we
tried it on our x86 Linux box using g77.  As expected, malloc() returned
a 64-bit aligned pointer.  However, about half of the time g77 aligned
WORK *only* on a 32-bit boundary, so that the offset was not divisible
by 8.

This of course broke the code, because it became impossible to reach the
allocated block via the WORK(INDEX) mechanism.  When the compiler puts
WORK at byte 4 and malloc() gives a chunk of memory at byte 8, INDEX
evaluates to 1 (because it cannot be 1.5).  The code runs, but two bad
things happen:

(1) four bytes at 4,5,6,7 just below ptr=8 are clobbered
(2) when WORK(INDEX) at address 4 is passed to the C function free(),
the program crashes since this location was never allocated (it is four
bytes below the allocated chunk)
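
The arithmetic of this failure can be checked with a small C sketch.  The function names are hypothetical, and for simplicity the sketch assumes ptr lies at or above WORK, as in the example with addresses 4 and 8.

```c
#include <assert.h>
#include <stdint.h>

#define REAL8_SIZE 8   /* assumed size of one REAL*8 element, in bytes */

/* Address that WORK(index) refers to, given the address of WORK(1). */
uintptr_t work_element_addr(uintptr_t work_addr, intptr_t index)
{
    return work_addr + (uintptr_t)(index - 1) * REAL8_SIZE;
}

/* Number of bytes by which WORK(INDEX) falls below ptr once the
   integer division has dropped the remainder.  Zero means exact. */
uintptr_t index_shortfall(uintptr_t work_addr, uintptr_t ptr_addr)
{
    assert(ptr_addr >= work_addr);   /* simplifying assumption of this sketch */
    intptr_t index = (intptr_t)(ptr_addr - work_addr) / REAL8_SIZE + 1;
    return ptr_addr - work_element_addr(work_addr, index);
}
```

For WORK at 4 and ptr at 8 the shortfall is 4 bytes: WORK(1) starts at address 4, so writes through it touch bytes 4 through 7 and free() receives an address that malloc() never returned.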

Of course, one could code around this g77 compiler bug, but it is not
trivial.  A *large* collection of Fortran routines uses this mechanism
to get memory numerous times during a run.  Fixing the Fortran code is
too much work; fixing the C code is easier but still requires subtle
logic of the following type:

(1) examine WORK address, check alignment
(2) if not on 64-bit boundary, malloc() a chunk 4 bytes larger
(3) compute INDEX so that WORK(INDEX) fits within allocated memory
(4) to free WORK(INDEX), free() the first lower 64-bit aligned address

...etc.  
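
A rough C sketch of this workaround follows.  It is not the code we actually use: work_alloc and work_free are hypothetical names, the sketch pads by up to 7 bytes rather than exactly 4 so that it does not depend on WORK being 32-bit aligned, and it assumes that malloc() returns 64-bit aligned memory and that addresses convert to and from integers in the usual way.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define REAL8_SIZE 8   /* assumed size of one REAL*8 element, in bytes */

/* Allocate nbytes reachable as WORK(*index).  Returns 0 on success,
   -1 if malloc() fails. */
int work_alloc(const void *work, size_t nbytes, intptr_t *index)
{
    uintptr_t work_addr = (uintptr_t)work;
    uintptr_t raw = (uintptr_t)malloc(nbytes + REAL8_SIZE - 1);
    if (raw == 0)
        return -1;
    assert(raw % REAL8_SIZE == 0);   /* assumption: malloc() is 64-bit aligned */

    /* Skip 0..7 bytes so that the usable block shares WORK's alignment. */
    uintptr_t pad = (work_addr - raw) % REAL8_SIZE;
    uintptr_t usable = raw + pad;

    *index = (intptr_t)(usable - work_addr) / REAL8_SIZE + 1;
    return 0;
}

/* Free the block behind WORK(index): round its address down to the
   64-bit boundary, which is the address malloc() originally returned. */
void work_free(const void *work, intptr_t index)
{
    uintptr_t usable = (uintptr_t)work + (uintptr_t)(index - 1) * REAL8_SIZE;
    free((void *)(usable & ~(uintptr_t)(REAL8_SIZE - 1)));
}
```

Rounding down in work_free recovers the original pointer only because the padding is always less than 8 bytes and the raw block is assumed to start on a 64-bit boundary.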

Although this bug can be bypassed, it was quite tricky to find.  WORK
was sometimes aligned correctly by g77, sometimes not, depending on
optimization switches.  The code sometimes ran perfectly; other times it
would seem to run correctly and then crash.  Since this code has been
thoroughly tested on numerous platforms for over a decade, it is not
surprising that the problem turned out to be a (documented) compiler
bug.

Sincerely,
Josip






-- 
Dr. Josip Loncaric, Senior Staff Scientist        mailto:josip@icase.edu
ICASE, Mail Stop 403                        http://www.icase.edu/~josip/
NASA Langley Research Center             mailto:j.loncaric@larc.nasa.gov
Hampton, VA 23681-2199, USA    Tel. +1 757 864-2192  Fax +1 757 864-6134

