This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[PATCH] GCC bootstrap failure on cygwin is gawk 3.1.2 bug!


Dear all,

I've just spent a few hours tracking down the source of the mysterious
bootstrap failures of mainline GCC from CVS on cygwin.  It transpires
that the root cause of the problem is a bug in the latest full
release of GNU awk, gawk v3.1.2.  The error is present on all
platforms, but I suspect these issues have shown up first on cygwin
because it's the only OS currently shipping 3.1.2 as its system gawk,
which was only released 23rd March.

The bug itself is in the function rsnull_get_a_record of io.c, at the
point where the end of a record is reached while searching for a blank
line, i.e. it only triggers in RS="" mode.  The problem is that after
the remaining unread data is shuffled to the beginning of the buffer,
the "bp" pointer is set to the end of the unread data, rather than
reset to the beginning.

This fouls up the later code, on line 2826 of 3.1.2's io.c, that skips
blank lines before the record starts, because that code assumes "bp"
points to the very beginning of the record.  Hence, in the rare
occurrence that the character immediately following the remaining
"unread" partial record, i.e. the first character returned by "read",
is a newline, we completely skip/lose this partial line/record.
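
The failure mode described above can be illustrated with a small,
self-contained model.  This is only a sketch, not gawk's code: the
function name record_start_after_refill, the fixed 256-byte buffer and
the use of offsets instead of pointers are all assumptions made for
illustration.  It mimics moving the unread partial record to the front
of the buffer, appending freshly "read" data, and then skipping leading
newlines from wherever bp was left.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Simplified, hypothetical model of the refill path (NOT gawk's code).
 * Returns the offset at which the next record is considered to start,
 * or (size_t)-1 if the inputs do not fit in the toy buffer.
 */
size_t record_start_after_refill(const char *partial, const char *fresh,
                                 int reset_bp_to_start)
{
    char buf[256];
    size_t partial_len, fresh_len, off, dataend, bp;

    assert(partial != NULL && fresh != NULL);
    partial_len = strlen(partial);
    fresh_len = strlen(fresh);
    if (partial_len + fresh_len >= sizeof buf)
        return (size_t)-1;

    /* Models the memmove: unread data now sits at the start of the buffer. */
    memcpy(buf, partial, partial_len);
    off = 0;
    dataend = partial_len;

    /* 3.1.2 behaviour: bp = dataend.  Patched behaviour: bp = off. */
    bp = reset_bp_to_start ? off : dataend;

    /* Models read() appending new data after the unread data. */
    memcpy(buf + dataend, fresh, fresh_len);
    dataend += fresh_len;

    /* Models the blank-line skip, which assumes bp is at the record start. */
    if (bp < dataend && buf[bp] == '\n') {
        while (bp < dataend && buf[bp] == '\n')
            bp++;
        off = bp;   /* everything before bp is dropped from the record */
    }
    return off;
}
```

In this model, with the partial record "opt1" and fresh data beginning
with a newline, leaving bp at the end of the unread data yields a
record start of 5, so "opt1" is lost; resetting bp to the start yields
0 and the partial record is kept.  If the fresh data does not begin
with a newline, both variants yield 0, which is why the failure is
rare.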

My gcc patch on Saturday that switched "cat foo bar | gawk ..." into
"gawk ... foo bar" just worked around the problem.  Reading
multiple files, rather than one large file, changed the probability
of encountering a newline at the start of a buffer read.

It looks like the safest, but least efficient, way to fix this is
the patch below, which resets bp to the beginning of the buffer.
Inspection of rs1_get_a_record and rsre_get_a_record reveals that
they don't suffer this problem as they don't assume bp points
to the very beginning of a record after the memmove.

My apologies to the gawk folks if they were already aware of this,
and have a bug fix awaiting their next major release.  The cygwin
folks might consider patching their version of gawk.  But I
suspect the best way out of this mess is for GCC to change the
format of lang.opt files, so that each command-line option's name
and attributes all appear on a single line, the first field
being the flag name and the remaining fields being the attributes.
This would avoid requiring RS="" mode, and allow the first gawk
step to be skipped entirely, immediately proceeding to the sort.
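
As a rough sketch of what a one-option-per-line format would mean for a
consumer, the following C function splits such a line into the flag
name and the attribute string.  The function name split_option_line and
the sample line "Wall C ObjC" are hypothetical; nothing here is an
existing GCC interface or an agreed file format.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: split "name attr1 attr2 ...\n" in place.
 * On success returns 1, with *name pointing at the first field and
 * *attrs at the rest of the line (possibly empty).  Returns 0 for a
 * blank line.  The line buffer is modified.
 */
int split_option_line(char *line, char **name, char **attrs)
{
    char *rest;

    assert(line != NULL && name != NULL && attrs != NULL);

    line += strspn(line, " \t");            /* skip leading blanks */
    if (*line == '\0' || *line == '\n')
        return 0;                           /* nothing on this line */

    *name = line;
    rest = line + strcspn(line, " \t\n");   /* end of the first field */
    if (*rest != '\0') {
        *rest = '\0';                       /* terminate the flag name */
        rest++;
    }
    rest += strspn(rest, " \t");            /* skip blanks before attributes */
    rest[strcspn(rest, "\n")] = '\0';       /* strip the trailing newline */
    *attrs = rest;
    return 1;
}
```

Because each record is then a single line, the default newline record
separator suffices and no blank-line (RS="") handling is involved.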


I hope this helps.


*** io.c.old	Mon Jun 16 22:00:48 2003
--- io.c	Mon Jun 16 22:01:18 2003
*************** rsnull_get_a_record(char **out, /* point
*** 2725,2734 ****
                                  size_t dataend_off = iop->dataend - iop->off;
                                  memmove(iop->buf, iop->off, dataend_off);
                                  iop->off = iop->buf;
!                                 bp = iop->dataend = iop->buf + dataend_off;

                                  /* <reset pointers>=                                                        */
!                                 bp = iop->dataend;
                          } else {
                                  /* <save position, grow buffer>=                                            */
                                  iop->scanoff = bp - iop->off;
--- 2725,2734 ----
                                  size_t dataend_off = iop->dataend - iop->off;
                                  memmove(iop->buf, iop->off, dataend_off);
                                  iop->off = iop->buf;
!                                 iop->dataend = iop->buf + dataend_off;

                                  /* <reset pointers>=                                                        */
!                                 bp = iop->off;
                          } else {
                                  /* <save position, grow buffer>=                                            */
                                  iop->scanoff = bp - iop->off;

Roger
--
Roger Sayle,                         E-mail: roger@eyesopen.com
OpenEye Scientific Software,         WWW: http://www.eyesopen.com/
Suite 1107, 3600 Cerrillos Road,     Tel: (+1) 505-473-7385
Santa Fe, New Mexico, 87507.         Fax: (+1) 505-473-0833

