This is the mail archive of the
fortran@gcc.gnu.org
mailing list for the GNU Fortran project.
Re: 200.sixtrack is miscomparing on x86_64 and i686 (-O2) since Mar 19th
On Thu, 22 Mar 2007, Jerry DeLisle wrote:
> Richard Guenther wrote:
> > On Thu, 22 Mar 2007, Richard Guenther wrote:
> >
> > > On Thu, 22 Mar 2007, Richard Guenther wrote:
> > >
> > > > On Thu, 22 Mar 2007, Richard Guenther wrote:
> > > >
> > > > > On Wed, 21 Mar 2007, Jerry DeLisle wrote:
> > > > >
> > > > > > Richard Guenther wrote:
> > > > > > > The regression was introduced between rev 123033 and 123047. The
> > > > > > > likely offender is
> > > > > > >
> > > > > > > Index: libgfortran/ChangeLog
> > > > > > > ===================================================================
> > > > > > > --- libgfortran/ChangeLog (revision 123033)
> > > > > > > +++ libgfortran/ChangeLog (revision 123047)
> > > > > > > @@ -1,3 +1,15 @@
> > > > > > > +2007-03-18 Jerry DeLisle <jvdelisle@gcc.gnu.org>
> > > > > > > +
> > > > > > > + PR libgfortran/31052
> > > > > > > + * io/file_position (st_rewind): Fix comments. Remove use of
> > > > > > > + test_endfile. Don't seek if already at 0 position. Use new
> > > > > > Richard,
> > > > > >
> > > > > > Can you confirm that this occurs only with -O2 ?
> > > > > No, it occurs also with -O3 (but -O2 is the lowest level I have
> > > > > checked).
> > > > >
> > > > > > Can you revert the patch on a local tree and confirm this is it?
> > > > > I'll do so later today.
> > > > I can confirm reverting the patch fixes the problem. (Just exchanging
> > > > the libgfortran runtime library for one with the patch reverted fixes
> > > > the problem.)
> > > >
> > > > > > Would you be willing to help me debug this? The patch is not that
> > > > > > complicated.
> > > > > >
> > > > > > If not is there anyone with SPEC that I can work with?
> > > > > >
> > > > > > Also, I do work out of my home and a Non-Disclosure Agreement is no
> > > > > > problem to me.
> > > > > I don't think an NDA will help here. I'll see if there's something
> > > > > obvious.
> > > > An strace difference shows (diff from good to bad)
> > > >
> > > > @@ -80532,6096 +80531,28 @@
> > > > write(1, " Alignment errors read"..., 54) = 54
> > > > write(1, "\n", 1) = 1
> > > > write(1, "\n", 1) = 1
> > > > -write(1, " From file fort.8 : "..., 56) = 56
> > > > +read(6, "", 8192) = 0
> > > > +read(6, "", 8192) = 0
> > > > write(1, "\n", 1) = 1
> > > > write(1, "\n", 1) = 1
> > > > -lseek(8, 0, SEEK_SET) = 0
> > > > -lseek(8, 0, SEEK_SET) = 0
> > > > -ftruncate(8, 0) = 0
> > > > -write(8, "\'QF9.R1\' 63.279998 "..., 161) = 161
> > > > ...
> > > >
> > > Which matches the following snippet from daten.f:
> > >
> > > write(6,*) ' Alignment errors read in ' ,
> > > + 'from external file'
> > > write(6,*)
> > > iexread=0
> > > ifiend8=0
> > > iexnum=0
> > > read(8,10020,end=1581)
> > > rewind 8
> > > do 1580 i=1,mper*mbloz
> > > ix=ic(i)
> > > if(ix.gt.nblo) then
> > > ix=ix-nblo
> > > if(iexread.eq.0) then
> > > ilm0(1)=' '
> > > C READ IN HORIZONTAL AND VERTICAL MISALIGNMENT AND TILT
> > > if(ifiend8.eq.0) then
> > > read(8,10020,end=1550,iostat=ierro) ch
> > > if(ierro.gt.0) call error(86)
> > > else
> > > goto 1550
> > > endif
> > > call intepr(1,1,ch,ch1)
> > > read(11,*) ilm0(1),alignx,alignz,tilt
> > > iexnum=iexnum+1
> > > bezext(iexnum)=ilm0(1)
> > > iexread=1
> > > goto 1570
> > > 1550 ifiend8=1
> > > if(iexnum.eq.0) call error(86)
> > >
> > > And here we bail out coming straight from
> > > read(8,10020,end=1550,iostat=ierro) ch
> > > so we are at EOF of 8 it seems. Wrongly so. Unit 8 is opened via
> > > maincr.f: open(unit=8, file="fort.8")
> > > 00000003> ls -l fort.8
> > > -rw-r--r-- 1 rguenther suse 0 Sep 29 1999 fort.8
> > > The calls above are the only reads from unit 8.
> >
> > The problem seems to be that the first read
> >
> > read(8,10020,end=1581)
> > rewind 8
> > do 1580 i=1,mper*mbloz
> >
> > does not detect EOF! (In the good case we jump right to 1581):
> >
> > 1581 continue
> > write(6,*) ' From file fort.8 :',iexnum,
> > + ' values read in.'
> > write(6,*)
> > endif
> >
> > (as we can see in the strace output -- though from the strace output
> > it seems we know it is at EOF without starting the read in the first place).
> >
> > ...
> >
> > > I don't know how I can trust the strace output, it shows in the gone-bad
> > > case only
> > >
> > > write(1, " Alignment errors read"..., 54) = 54
> > > write(1, "\n", 1) = 1
> > > write(1, "\n", 1) = 1
> > > read(6, "", 8192) = 0
> > > read(6, "", 8192) = 0
> > > write(1, "\n", 1) = 1
> > > write(1, "\n", 1) = 1
> > > write(1, "\n", 1) = 1
> > > write(1, " +++++++++++++++++++++++"..., 33) = 33
> > > write(1, "\n", 1) = 1
> > > write(1, " +++++ERROR DETECTED++++"..., 33) = 33
> > >
> > > where the error printing is from the call to error(). So I only
> > > see two reads from fd 6 (does that map 1:1 to the file numbers in the
> > > Fortran source? I suppose not as 1 == 6 for stdout here apparently)
> > >
> > > The format for 10020 is
> > >
> > > 10020 format(a80)
> > >
> > > Hope this helps.
> > >
> > > Richard.
> > >
> > >
> >
> I won't be able to study this until later tonight.
>
> I suspect that deleting the test_endfile in one of the other several places
> other than file_pos.c (st_rewind) is getting us. So we could try restoring
> that function and putting it back everywhere except st_rewind and see if that
> fixes it. The bug I fixed with the original patch was triggered in st_rewind,
> not anywhere else. I was probably overly zealous in intruding elsewhere with
> that patch.
>
> So, if you like, I will do this tonight: I will revert the patch on my local
> tree, then send you a new one that is less intrusive for you to test.
>
> Is this OK?
Yes, thanks. (Note that I left fortran@ out on purpose because I quoted
from S... - oh well ;))
Richard.
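
For readers following the strace output in this thread, here is a minimal
C sketch of what happens at the system-call level when a zero-length file
such as fort.8 is read, rewound and read again. It is not taken from
libgfortran: the helper names at_eof_after_read and
empty_file_hits_eof_twice, the temporary file path and the 8192-byte
buffer (chosen only to mirror the read size in the trace) are all
assumptions made for illustration.

```c
/* Illustrative sketch only; none of these names exist in libgfortran. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for mkstemp() under strict ISO C modes */
#endif
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: attempt one read on fd.
   Returns 1 if read() reports end-of-file (0 bytes), 0 if data was
   returned, and -1 if read() failed. */
int at_eof_after_read(int fd)
{
    char buf[8192];
    ssize_t n = read(fd, buf, sizeof buf);
    if (n < 0)
        return -1;
    return n == 0;
}

/* Hypothetical helper: create an empty temporary file, read from it,
   seek back to offset 0 (the system-call side of a rewind), and read
   again. Returns 1 if both reads report EOF, 0 if either returned data,
   and -1 on any error. */
int empty_file_hits_eof_twice(void)
{
    char path[] = "/tmp/fort8_sketch_XXXXXX";
    int fd = mkstemp(path);            /* creates a zero-length file */
    if (fd < 0)
        return -1;

    int first = at_eof_after_read(fd);   /* like read(6, "", 8192) = 0 */
    off_t pos = lseek(fd, 0, SEEK_SET);  /* like lseek(8, 0, SEEK_SET) = 0 */
    int second = at_eof_after_read(fd);  /* still EOF after the rewind */

    close(fd);
    unlink(path);

    if (first < 0 || second < 0 || pos != 0)
        return -1;
    return first && second;
}
```

On an empty file the kernel reports EOF on the very first read and again
after seeking back to 0, so empty_file_hits_eof_twice() should return 1.
Whether the Fortran program then takes the end= branch depends on the
runtime turning that zero-byte read into an end-of-file condition for the
unit, which is the part under suspicion in this thread.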