This is the mail archive of the fortran@gcc.gnu.org mailing list for the GNU Fortran project.



Re: 200.sixtrack is miscomparing on x86_64 and i686 (-O2) since Mar 19th


Richard Guenther wrote:
On Thu, 22 Mar 2007, Richard Guenther wrote:

On Thu, 22 Mar 2007, Richard Guenther wrote:

On Thu, 22 Mar 2007, Richard Guenther wrote:

On Wed, 21 Mar 2007, Jerry DeLisle wrote:

Richard Guenther wrote:
The regression was introduced between rev 123033 and 123047.  The likely
offender is

Index: libgfortran/ChangeLog
===================================================================
--- libgfortran/ChangeLog (revision 123033)
+++ libgfortran/ChangeLog (revision 123047)
@@ -1,3 +1,15 @@
+2007-03-18 Jerry DeLisle <jvdelisle@gcc.gnu.org>
+
+ PR libgfortran/31052
+ * io/file_position (st_rewind): Fix comments. Remove use of
+ test_endfile. Don't seek if already at 0 position. Use new
Richard,

Can you confirm that this occurs only with -O2 ?
No, it occurs also with -O3 (but -O2 is the lowest level I have checked).

Can you revert the patch on a local tree and confirm this is it?
I'll do so later today.
I can confirm reverting the patch fixes the problem.  (Just exchanging
the libgfortran runtime library for one with the patch reverted fixes
the problem.)

Would you be willing to help me debug this? The patch is not that complicated.

If not, is there anyone with SPEC that I can work with?

Also, I do work out of my home and a Non-Disclosure Agreement is no problem for
me.
I don't think an NDA will help here. I'll see if there's something obvious.
An strace difference shows (diff from good to bad)

@@ -80532,6096 +80531,28 @@
 write(1, "           Alignment errors read"..., 54) = 54
 write(1, "\n", 1)                       = 1
 write(1, "\n", 1)                       = 1
-write(1, "         From file fort.8 :     "..., 56) = 56
+read(6, "", 8192)                       = 0
+read(6, "", 8192)                       = 0
 write(1, "\n", 1)                       = 1
 write(1, "\n", 1)                       = 1
-lseek(8, 0, SEEK_SET)                   = 0
-lseek(8, 0, SEEK_SET)                   = 0
-ftruncate(8, 0)                         = 0
-write(8, "\'QF9.R1\'  63.279998             "..., 161) = 161
...

Which matches the following snippet from daten.f:

        write(6,*) ' Alignment errors read in ' ,
     +  'from external file'
        write(6,*)
        iexread=0
        ifiend8=0
        iexnum=0
        read(8,10020,end=1581)
        rewind 8
        do 1580 i=1,mper*mbloz
          ix=ic(i)
          if(ix.gt.nblo) then
            ix=ix-nblo
            if(iexread.eq.0) then
              ilm0(1)=' '
C READ IN HORIZONTAL AND VERTICAL MISALIGNMENT AND TILT
              if(ifiend8.eq.0) then
                read(8,10020,end=1550,iostat=ierro) ch
                if(ierro.gt.0) call error(86)
              else
                goto 1550
              endif
              call intepr(1,1,ch,ch1)
              read(11,*) ilm0(1),alignx,alignz,tilt
              iexnum=iexnum+1
              bezext(iexnum)=ilm0(1)
              iexread=1
              goto 1570
 1550         ifiend8=1
              if(iexnum.eq.0) call error(86)


And here we bail out coming straight from
read(8,10020,end=1550,iostat=ierro) ch
so we are at EOF of 8 it seems. Wrongly so. Unit 8 is opened via
maincr.f: open(unit=8, file="fort.8")
00000003> ls -l fort.8
-rw-r--r-- 1 rguenther suse 0 Sep 29 1999 fort.8
The calls above are the only reads from unit 8.
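
For reference, at the system-call level a read from a zero-length regular
file reports end of file on the very first call. The following is a minimal
sketch, not libgfortran code; the helper name first_read_on_empty_file and
the scratch file name in the note below are assumptions made up for
illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: create (or truncate to) an empty file at `path`,
   then return what the very first read(2) on it reports.
   Returns -1 if the file cannot be opened. */
long first_read_on_empty_file(const char *path)
{
    char buf[8192];                /* same request size as in the trace */
    int fd;
    ssize_t n;

    assert(path != NULL);

    /* O_TRUNC guarantees a zero-length file, like the 0-byte fort.8. */
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    n = read(fd, buf, sizeof buf); /* 0 means end of file, not an error */
    close(fd);
    return (long)n;
}
```

Calling first_read_on_empty_file("empty_demo.tmp") returns 0, the same
result as the read(6, "", 8192) = 0 lines in the trace.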

The problem seems to be that the first read in


        read(8,10020,end=1581)
        rewind 8
        do 1580 i=1,mper*mbloz

doesn't report EOF! (In the good case we jump right to 1581):

 1581   continue
        write(6,*) '        From file fort.8 :',iexnum,
     +  ' values read in.'
        write(6,*)
      endif

(as we can see in the strace output -- though from the strace output
it seems we know it is at EOF without starting the read in the first place).
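
The control flow daten.f relies on (one probing read, a jump past the loop
on end of file, otherwise a rewind and the real loop) can be sketched with
C stdio. This is only an analogue under stated assumptions: the helper name
count_records is hypothetical, records are assumed to be text lines of at
most 80 characters, and nothing here is taken from libgfortran.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper mirroring the daten.f pattern with C stdio.
   Returns the number of records (lines) read, or -1 if the file
   cannot be opened. */
int count_records(const char *path)
{
    char line[82];   /* room for an a80 record, newline and NUL */
    int n = 0;
    FILE *f;

    assert(path != NULL);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;

    /* Like read(8,10020,end=1581): an empty file must exit here. */
    if (fgets(line, sizeof line, f) == NULL)
        goto done;

    rewind(f);       /* like "rewind 8": start over from the first record */
    while (fgets(line, sizeof line, f) != NULL)
        n++;

done:                /* plays the role of label 1581 */
    fclose(f);
    return n;
}
```

For an empty file the probing read fails and the function returns 0 without
entering the loop, which is the behaviour of the good case described above.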


...

I don't know how I can trust the strace output, it shows in the gone-bad
case only

write(1, "           Alignment errors read"..., 54) = 54
write(1, "\n", 1)                       = 1
write(1, "\n", 1)                       = 1
read(6, "", 8192)                       = 0
read(6, "", 8192)                       = 0
write(1, "\n", 1)                       = 1
write(1, "\n", 1)                       = 1
write(1, "\n", 1)                       = 1
write(1, "         +++++++++++++++++++++++"..., 33) = 33
write(1, "\n", 1)                       = 1
write(1, "         +++++ERROR DETECTED++++"..., 33) = 33

where the error printing is from the call to error().  So I only
see two reads from fd 6 (does that map 1:1 to the file numbers in the
Fortran source?  I suppose not, as 1 == 6 for stdout here apparently)
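
On the descriptor question: Fortran unit numbers are bookkeeping inside the
runtime, while the numbers strace prints are kernel file descriptors. Unit 6
is preconnected to standard output, which is descriptor 1, and that matches
the write(1, ...) calls in the trace. Which file fd 6 refers to cannot be
read off the unit numbers; it depends on what the process had open at the
time. A small sketch of that kernel behaviour follows; the helper names are
hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open `path` read-only and return the descriptor
   number the kernel picked, or -1 on failure.  The caller closes it. */
int open_and_report_fd(const char *path)
{
    assert(path != NULL);
    return open(path, O_RDONLY);
}

/* Hypothetical helper: open the same file twice and report whether the
   second descriptor is numbered higher than the first.  open(2) returns
   the lowest free descriptor, so the numbers reflect how many files are
   already open, not any unit number chosen by the program. */
int second_fd_is_higher(const char *path)
{
    int a = open_and_report_fd(path);
    int b = open_and_report_fd(path);
    int higher = (a >= 0 && b > a);

    if (a >= 0)
        close(a);
    if (b >= 0)
        close(b);
    return higher;
}
```

second_fd_is_higher("/dev/null") returns 1 on a system where /dev/null
exists: the two opens get different, increasing descriptor numbers even
though they name the same file.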

The format for 10020 is

10020 format(a80)

Hope this helps.

Richard.



I won't be able to study this until later tonight.

I suspect that deleting test_endfile in one of the several places other than file_pos.c (st_rewind) is what is biting us. So we could try restoring that function and putting it back everywhere except st_rewind and see if that fixes it. The bug I fixed with the original patch was triggered in st_rewind, not anywhere else. I was probably overly zealous in intruding elsewhere with that patch.

So, if you like, I will do this tonight: I will revert the patch on my local tree, then send you a new one that is less intrusive for you to test.

Is this OK?

Jerry







