Bug 38439 - I/O PD edit descriptor inconsistency
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: fortran
Version: 4.4.0
Importance: P3 normal
Target Milestone: ---
Assignee: Jerry DeLisle
Keywords: rejects-valid
Reported: 2008-12-07 21:34 UTC by Tobias Burnus
Modified: 2009-10-12 13:42 UTC
Last reconfirmed: 2009-01-22 03:05:22


Description Tobias Burnus 2008-12-07 21:34:25 UTC
Found at http://groups.google.com/group/comp.lang.fortran/browse_thread/thread/2fc107eda65d9065

The following code
      WRITE (*,'(1PD24.15E4)') 1.0d0
      end
is rejected in gfortran 4.2 to 4.4 with:
  Error: Period required in format specifier at (1)
(and ifort, NAG f95 and g95 also reject it).

It is accepted with openf95, sunf95 and gfortran 4.1 [the latter with a warning matching the error message above], and it is said to be accepted with g77.

The compiled program then prints:
 1.000000000000000D+0000  (note the "D")


Using the run-time version of the format string, i.e.
      character(len=25) :: str
      str = '(1PD24.15E4)'
      WRITE (*,str) 1.0d0
      end
it works with gfortran (4.1 to 4.4) and with g95.


According to the Fortran standard, the PD edit descriptor was always invalid. Using '(1PE24.15E4)' works with all compilers and is valid (but it prints "...E+0000", not "D").


The question is now whether
a) One accepts it with, e.g., -std=legacy. (The error message should then point to the -std=legacy option)
-- or --
b) One checks whether one can get rid of the "D" support in libgfortran to save some microseconds; libgfortran should then also print a run-time error message.
Comment 1 Jerry DeLisle 2008-12-08 00:52:04 UTC
After thinking about it some, I think we should accept it with -std=legacy, since it is accepted by g77.
Comment 2 Jerry DeLisle 2008-12-25 19:45:49 UTC
g77 runtime accepts this and prints:

      character(len=25) :: str
      str = '(1PD24.15E4)'
      write (*,'(1PD24.15E4)') 1.0d0
      WRITE (*,str) 1.0d0
      end

$ g77 pr38439.f 
$ ./a.out 
   1.000000000000000E+00
   1.000000000000000E+00

gfortran 4.4:

$ gfc -std=legacy pr38439.f 
pr38439.f:3.72:

      write (*,'(1PD24.15E4)') 1.0d0                                    
                                                                        1
Warning: Period required in format specifier at (1)
[jerry@lenova pr38439]$ ./a.out 
 1.000000000000000D+0000
 1.000000000000000D+0000

I think I will make the run-time behaviour consistent with compile time as far as error messages go, giving a run-time warning if not -std=legacy. The 'D' exponent character is different from g77, but is standard conforming.
Comment 3 Jerry DeLisle 2009-01-21 06:51:32 UTC
I have been doing some digging.  The "PD" edit descriptor is clearly defined in the Fortran 66 standard in sections 7.2.3.1 and 7.2.3.5.  The form of the scale factor is nP.

The D designates that the internal representation of the I/O list item is double precision.  This may be represented externally (printed) with an E or D exponent designator.
Comment 4 Jerry DeLisle 2009-01-22 03:05:21 UTC
Further information:  PD is not the problem here at all.  The problem is that when using the D edit descriptor, one is not allowed to also specify the exponent digits.

Thus:  '(1pD24.15)'  is valid

While: '(1pD24.15e4)'  is invalid
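
The rule stated here can be sketched as a small standalone check. This is an illustration only and not libgfortran code: the function name descriptor_is_valid is hypothetical, it handles only the D and E descriptors, and it assumes the descriptor is passed without the scale factor (for example "D24.15E4").

```c
#include <ctype.h>

/* Hypothetical helper, not part of libgfortran.
   Takes one data edit descriptor of the form Dw.d or Ew.d[Ee],
   e.g. "D24.15" or "E24.15E4", and returns 1 if it is well formed
   under the rule above, 0 otherwise.  It does not check that the
   exponent width is positive. */
int descriptor_is_valid(const char *s)
{
    char kind = (char)toupper((unsigned char)*s);
    if (kind != 'D' && kind != 'E')
        return 0;
    s++;

    /* field width w */
    if (!isdigit((unsigned char)*s))
        return 0;
    while (isdigit((unsigned char)*s))
        s++;

    /* the period and the fraction digits d are mandatory */
    if (*s != '.')
        return 0;
    s++;
    if (!isdigit((unsigned char)*s))
        return 0;
    while (isdigit((unsigned char)*s))
        s++;

    if (*s == '\0')
        return 1;               /* plain w.d form */

    /* optional exponent width: allowed for E, never for D */
    if (toupper((unsigned char)*s) != 'E' || kind == 'D')
        return 0;
    s++;
    if (!isdigit((unsigned char)*s))
        return 0;
    while (isdigit((unsigned char)*s))
        s++;
    return *s == '\0';
}
```

With this sketch, "D24.15" and "E24.15E4" are accepted, while "D24.15E4" is rejected regardless of letter case.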
Comment 5 Tobias Burnus 2009-01-22 10:53:15 UTC
> Thus:  '(1pD24.15)'  is valid
Fully agreed - that version is valid and accepted with gfortran, ifort, NAG f95 etc.

> While: '(1pD24.15e4)'  is invalid

It is, but as written above, sunf95, openf95 and gfortran 4.1 accept it at compile time, and gfortran 4.x and g95 accept it at run time. The exponent is then printed with the requested number of digits ("D+0" for pD24.15e1 and "D+00000" for pD24.15e5). Still, the question remains whether one wants to allow it (at compile time) with some options, reject it at run time, or keep the status quo.
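
How the exponent width changes the printed field can be illustrated with a few lines of C. This is only a sketch of the observed output shape, not libgfortran's output routine; the function name format_exponent and its parameters are made up for the example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Illustration only, not libgfortran code.
   Writes the exponent part of a formatted real into buf:
   the exponent letter, a sign, and the magnitude zero-padded to
   `digits` positions.  Returns whatever snprintf returns. */
int format_exponent(char *buf, size_t size, char letter,
                    int exponent, int digits)
{
    return snprintf(buf, size, "%c%c%0*d",
                    letter,
                    exponent < 0 ? '-' : '+',
                    digits,
                    abs(exponent));
}
```

For the value 1.0d0 written with a 1P scale factor the exponent is 0, so a width of 1 gives "D+0" and a width of 5 gives "D+00000", matching the output quoted above.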

 * * *

The other question is: Why is the location marker ("1") in the error message (see comment 2) way off? If one tries something else, the location fits much better, e.g.
      WRITE (*,'(g0.3.4)') 1.0d0
                     1

A further question about error handling concerns '(1pd0.3)':
ifort, g95, and NAG f95 claim: "Error: Zero field width invalid for D edit descriptor"; gfortran accepts it but prints "*****", while openf95 accepts it and prints "1.0E+000". I think gfortran should diagnose it at compile time. (When passing it as a string, ifort and g95 print "1.000D+00" and f95 prints ""; I think printing '******' is also ok.)
Comment 6 Jerry DeLisle 2009-01-23 05:46:03 UTC
gfortran's current format parser is completely lost by the time an error is thrown.  I have a patch that detects the actual error and the locus is spot on.

I am fixing both compile time and run time to reject the exponent width with a D edit descriptor.

I will have a look at the d0 case as well.
Comment 7 Jerry DeLisle 2009-01-24 18:11:12 UTC
Regarding the question on location markers: If gfc_error or gfc_warning are used with the %C designator, only the current line is picked up.  The actual format token locus is saved in the format_locus pointer variable and should be used with the %L designator.  I am fixing some of these in the patch I am working up.
Comment 8 Jerry DeLisle 2009-04-23 01:51:47 UTC
Getting back to this.  We have a problem of choices here.  In format statements such as:
      WRITE (*,'(1PD24.15E4)') 1.0d0

Currently gfortran allows an extension of an optional comma separating format
specifiers.  This results in the format string above being seen as:
      '(1PD24.15,E4)'

The error message given in the original post is from the missing period after the E4.

We could choose to allow the optional comma only with -std=legacy and then these misleading situations would not occur.  I am leaning in favour of this more restrictive approach. Any opinions?
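
How the optional-comma extension produces the misleading message can be illustrated with a toy scanner. This is a sketch under simplifying assumptions and is not gfortran's parser: the names scan_item and count_items_optional_comma are hypothetical, the scale factor and parentheses are assumed to be stripped already, and only items of the form "letter w[.d]" are recognised.

```c
#include <ctype.h>

/* Toy scanner, NOT gfortran's parser.  Reads one "letter w[.d]"
   item from *p, advances *p past it, and returns:
     1  the item has the w.d form
     0  the item has a width but no period ("Period required")
    -1  no item could be read */
static int scan_item(const char **p)
{
    const char *s = *p;
    int has_period = 0;

    if (!isalpha((unsigned char)*s))
        return -1;
    s++;
    if (!isdigit((unsigned char)*s))
        return -1;
    while (isdigit((unsigned char)*s))
        s++;
    if (*s == '.') {
        has_period = 1;
        s++;
        while (isdigit((unsigned char)*s))
            s++;
    }
    *p = s;
    return has_period;
}

/* Counts the items found when the comma between items is optional.
   Stores in *period_missing the 1-based index of the first item
   that lacks a period, or 0 if every item has one.
   Returns -1 if the string cannot be scanned. */
int count_items_optional_comma(const char *s, int *period_missing)
{
    int n = 0;

    *period_missing = 0;
    while (*s != '\0') {
        int r = scan_item(&s);
        if (r < 0)
            return -1;
        n++;
        if (r == 0 && *period_missing == 0)
            *period_missing = n;
        if (*s == ',')
            s++;            /* the comma is accepted but not required */
    }
    return n;
}
```

Given "D24.15E4", this scanner reports two items, exactly as it does for "D24.15,E4", and flags the second one (E4) as lacking a period. That mirrors how the original error message ends up pointing at a missing period rather than at the exponent width.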

Comment 9 kargls 2009-04-23 02:52:48 UTC
(In reply to comment #8)
> Getting back to this.  We have a problem of choices here.  In format statements
> such as:
>       WRITE (*,'(1PD24.15E4)') 1.0d0
> 
> Currently gfortran allows an extension of an optional comma separating format
> specifiers.  This results in the format string above being seen as:
>       '(1PD24.15,E4)'
> 
> The error message given in the original post is from the missing period after
> the E4.
> 
> We could choose to allow the optional comma only with -std=legacy and then
> these misleading situations would not occur.  I am leaning in favour of this
> more restrictive approach. Any opinions?
> 

Conforming to the Standard is always good.  I vote for -std=legacy.
Comment 10 Jerry DeLisle 2009-10-11 17:38:03 UTC
Subject: Bug 38439

Author: jvdelisle
Date: Sun Oct 11 17:37:50 2009
New Revision: 152644

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=152644
Log:
2009-10-11  Jerry DeLisle  <jvdelisle@gcc.gnu.org>

	PR libgfortran/38439
	* io/format.c (parse_format_list): Add check for tokens not allowed
	after P specifier. Fix comments.  Remove un-needed code. Fix the
	default exponent list. Correct pointer assignment error.

Modified:
    trunk/libgfortran/ChangeLog
    trunk/libgfortran/io/format.c

Comment 11 Jerry DeLisle 2009-10-11 17:41:39 UTC
Subject: Bug 38439

Author: jvdelisle
Date: Sun Oct 11 17:41:23 2009
New Revision: 152645

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=152645
Log:
2009-10-11 Jerry DeLisle  <jvdelisle@gcc.gnu.org>

	PR fortran/38439
	* io.c (check_format): Fix locus for error messages and fix a comment.

Modified:
    trunk/gcc/fortran/ChangeLog
    trunk/gcc/fortran/io.c

Comment 12 Jerry DeLisle 2009-10-11 19:20:47 UTC
Fixed enough I think. Closing.
Comment 13 Jerry DeLisle 2009-10-12 00:53:01 UTC
Subject: Bug 38439

Author: jvdelisle
Date: Mon Oct 12 00:52:45 2009
New Revision: 152657

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=152657
Log:
2009-10-11  Jerry DeLisle  <jvdelisle@gcc.gnu.org>

	PR libgfortran/38439
	* io/format.c (parse_format_list): Correct logic for FMT_F reading vs
	writing. Code clean-up.

Modified:
    trunk/libgfortran/ChangeLog
    trunk/libgfortran/io/format.c

Comment 14 Jerry DeLisle 2009-10-12 00:54:22 UTC
Subject: Bug 38439

Author: jvdelisle
Date: Mon Oct 12 00:54:11 2009
New Revision: 152658

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=152658
Log:
2009-10-11  Jerry DeLisle  <jvdelisle@gcc.gnu.org>

	PR libgfortran/38439
	* gfortran.dg/fmt_error_9.f: New test.
	* gfortran.dg/fmt_error_10.f: New test.

Added:
    trunk/gcc/testsuite/gfortran.dg/fmt_error_10.f
    trunk/gcc/testsuite/gfortran.dg/fmt_error_9.f
Modified:
    trunk/gcc/testsuite/ChangeLog

Comment 15 Dominique d'Humieres 2009-10-12 07:39:38 UTC
The polyhedron test linpk.f90 now fails with:

[ibook-dhum] lin/test% linpk
     norm. resid      resid           machep         x(1)          x(n)
At line 38 of file linpk.f90 (unit = 6, file = 'stdout')
Fortran runtime error: Comma required after P descriptor
(1P5d16.8)
   ^

Although this format is not explicitly allowed by F95, it is for F2003/2008:

F95 standard:

10.1.1 FORMAT statement

R1001 format-stmt is FORMAT format-specification

R1002 format-specification is ( [ format-item-list ] )

Constraint: The format-stmt shall be labeled.

Constraint: The comma used to separate format-items in a format-item-list
may be omitted as follows:

(1) Between a P edit descriptor and an immediately following F, E, EN, ES, D, or G edit descriptor (10.6.5)

(2) Before a slash edit descriptor when the optional repeat specification is not present (10.6.2)

(3) After a slash edit descriptor

(4) Before or after a colon edit descriptor (10.6.3)

F2003/2008 standard:

10.3 Form of a format item list

10.3.1 Syntax

R1003 format-items is format-item [ [ , ] format-item ] ...

R1004 format-item is [ r ] data-edit-desc
                  or control-edit-desc
                  or char-string-edit-desc
                  or [ r ] ( format-items )

R1005 unlimited-format-item is * ( format-items )

R1006 r is int-literal-constant

C1002 (R1003) The optional comma shall not be omitted except

  between a P edit descriptor and an immediately following F, E, EN, ES, D, or G edit descriptor (10.8.5), 
  possibly preceded by a repeat specification,
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  before a slash edit descriptor when the optional repeat specification does
  not appear (10.8.2), after a slash edit descriptor, or

  before or after a colon edit descriptor (10.8.3)

C1003 (R1006) r shall be positive.

C1004 (R1006) A kind parameter shall not be specified for r.

1 The integer literal constant r is called a repeat specification.

Comment 16 Jerry DeLisle 2009-10-12 12:42:19 UTC
Interestingly, I removed this previously:

  switch (t)
     {
-    case FMT_P:
-      t = format_lex (fmt);
-      if (t == FMT_POSINT)
-	{
-	  fmt->error = "Repeat count cannot follow P descriptor";
-	  goto finished;
-	}
-
-      fmt->saved_token = t;
-      get_fnode (fmt, &head, &tail, FMT_P);
-
-      goto optional_comma;
-

It was dead code and wrong.  Let's open a new PR.
Comment 17 Dominique d'Humieres 2009-10-12 13:31:52 UTC
I think the problem is here (around line 706 in the last commit):

      if (t == FMT_F || t == FMT_EN || t == FMT_ES || t == FMT_D
          || t == FMT_G || t == FMT_E)
        {
          repeat = 1;
          goto data_desc;
        }

      if (t != FMT_COMMA && t != FMT_RPAREN && t != FMT_SLASH)
        {
          fmt->error = "Comma required after P descriptor";
          goto finished;
        }

There is no provision for a repeat count before a D, E, EN, ES, F or G
descriptor when the comma before it is omitted.
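
The F2003 wording quoted in comment 15 could be checked along the following lines. This is a hedged sketch, not the libgfortran fix (which is tracked in the follow-up PR): the function name comma_optional_after_p is hypothetical, and it works on a plain string rather than on format_lex tokens.

```c
#include <ctype.h>

/* Hypothetical helper, not libgfortran code.  `rest` points just past
   the P of a scale factor such as "1P".  Returns 1 if an F, E, EN, ES,
   D or G edit descriptor follows directly, possibly preceded by a
   repeat specification, so that the comma may be omitted (F2003 C1002).
   Returns 0 otherwise.  It does not check that the repeat
   specification is positive (C1003). */
int comma_optional_after_p(const char *rest)
{
    /* skip an optional repeat specification */
    while (isdigit((unsigned char)*rest))
        rest++;

    switch (toupper((unsigned char)*rest)) {
    case 'F':
    case 'E':               /* also covers EN and ES */
    case 'D':
    case 'G':
        return 1;
    default:
        return 0;
    }
}
```

For the format from linpk.f90, "(1P5d16.8)", the text after the P is "5d16.8)", for which the sketch returns 1; for something like "5I5)" it returns 0.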
Comment 18 Jerry DeLisle 2009-10-12 13:42:11 UTC
See PR 41683 and continue there.