Bug 39670 - dollar sign in entities is not recognized when it is first symbol
Status: RESOLVED WONTFIX
Alias: None
Product: gcc
Classification: Unclassified
Component: fortran
Version: 4.5.0
Importance: P3 minor
Target Milestone: ---
Assignee: Not yet assigned to anyone
Keywords: diagnostic
Reported: 2009-04-07 07:55 UTC by Alexander Nickolsky
Modified: 2009-04-08 19:29 UTC

Description Alexander Nickolsky 2009-04-07 07:55:30 UTC
With option -fdollar-ok the compiler does not recognize symbols that start with $

test case:

	program test
	a$a = 12
	$a = 12    ! error
	end

I currently have a lot of legacy code with identifiers starting with $.
Comment 1 Daniel Franke 2009-04-07 08:42:55 UTC
Confirmed. Checked 4.3.4 and 4.5.0, both complain about '$a'.

The question is whether this is allowed at all. For comparison: digits are allowed in function names, but not as the first character; 'FUNCTION f3()' is valid, 'FUNCTION 3f()' is not. Does this apply to '$' as well?
Comment 2 Alexander Nickolsky 2009-04-07 08:59:33 UTC
As I already said, I have code, being compiled with MS Fortran, that has
a lot of variable names starting with $. MS Fortran allows it.

My personal opinion is that the Fortran compiler's primary use is support
of the legacy code. That means that the implementation of any extension 
or even strange behaviour of existing compilers could be useful and will
save hours of hard work. Forcing standards on existing code is illogical.

Comment 3 Dominique d'Humieres 2009-04-07 09:12:03 UTC
> Question is, if this is allowed at all. In comparison: digits are allowed in
> function names, but not as the first character; 'FUNCTION f3()' is valid,
> 'FUNCTION 3f()' is not. 

My Fortran book says:

names must consist of between 1 and 31 alphanumeric characters (letters, underscores, and numerals) of which the first MUST BE A LETTER.
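
To make the rule concrete, here is a minimal sketch in Python (not gfortran's actual parser). The function name `is_valid_name` and the `dollar_ok` flag are made up for illustration; the 31-character limit is taken from the book quote above, and the `dollar_ok` branch only mirrors what this report describes for -fdollar-ok ('a$a' accepted, '$a' rejected).

```python
import re

def is_valid_name(name: str, dollar_ok: bool = False) -> bool:
    """Hypothetical check: first character a letter, at most 31 characters.

    With dollar_ok=True, '$' is additionally allowed in every position
    except the first, which is the behaviour reported in this bug.
    """
    tail = r"[A-Za-z0-9_$]" if dollar_ok else r"[A-Za-z0-9_]"
    # One leading letter followed by up to 30 further name characters.
    return re.fullmatch(rf"[A-Za-z]{tail}{{0,30}}", name) is not None

print(is_valid_name("f3"))                   # letter first, digit later
print(is_valid_name("3f"))                   # digit first
print(is_valid_name("a$a", dollar_ok=True))  # '$' in the middle
print(is_valid_name("$a", dollar_ok=True))   # '$' first
```

Under these assumptions only 'f3' and 'a$a' pass; '3f' and '$a' fail for the same reason, namely that the first character is not a letter.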

> Does this apply to '$' as well?

Since '$' is an extension, you know the answer! It is outside the scope of the standard. 
BTW what should be the implicit type of a variable starting with a $?
I am also not sure whether, in the relics of the past, names starting with a '$' had some side effects.

I don't think this extension is likely to change in gfortran. For the legacy code, the workaround depends on the kind of the variables starting with a '$'. If they are all of the same kind (say integer), then it is trivial with your favorite editor to replace all the '$name' by 'izzname' or 'i$name' (maybe with some filters if you have $ elsewhere). 

> My personal opinion is that the Fortran compiler's primary use is support of the legacy code.

This is your personal opinion, not mine and probably not that of anyone coding in Fortran.
In addition, I have always been very suspicious about the validity of ports of such "legacy" codes.

Comment 4 Dominique d'Humieres 2009-04-07 10:18:20 UTC
Note that '$a' is also rejected by g77.
Comment 5 Tobias Burnus 2009-04-07 13:43:39 UTC
Many compilers support $ signs as an extension (ISO standard Fortran does not). However, only a few support a leading $ sign. One of the questions that immediately comes up is which data type $foo has (implicit typing).

I think the issue has come up before and the PRs were closed as wont-fix. This can of course be reconsidered, but implicit typing is a real issue here.


(In reply to comment #2)
> As I already said, I have code, being compiled with MS Fortran, that has
> a lot of variable names starting with $. MS Fortran allows it. My personal 
> opinion is that the Fortran compiler's primary use is support of the legacy
> code. That means that the implementation of any extension 
> or even strange behaviour of existing compilers could be useful and will
> save hours of hard work. Forcing standards on existing code is illogical.

I wholeheartedly disagree. I think the primary point of a compiler is to be standard compliant. It does not help if I have to use extension A with compilers B and C, but extension E with compiler F, while compiler G does not have the feature at all and H's syntax is like B's but does something different. My impression is that nowadays all compiler vendors and most compiler customers think likewise. (Still, supporting some old vendor extensions is seen as important by both.) There is enough newly written software, updated software and old software that is standard compliant, and thus also requires that the compiler is compliant.

Having said that, I am not opposed to supporting vendor extensions, given that they (a) don't clash with the standard [though that is difficult to predict with regard to future standards], (b) are reasonably widely used and (c) cannot easily be replaced by something standard conforming.  (The [long-term] implementation burden can be large even for a seemingly simple addition.)

For '$' as first (!) character I think (a) is fulfilled, (b) and (c) probably not.

I think gfortran does fairly well in this regard compared with other compilers; except for DEC structures, all major extensions should be there.

One big issue with vendor extensions is that there is a huge number of them - and some even conflict with each other! Do you want to support the one of Microsoft, or of IBM, or of Intel, Digital, Sun, SGI, Cray, Pathscale, Portland Group, Absoft, Fujitsu, g77, f2c, ...? And for vendor X - the one of version 4.0 or the one of 5.0 or ...?

> implementation of [...] even strange behaviour [...] will save hours of
> hard work.

I sincerely doubt that. I think it will cost application developers a lot of time and will lead to strange bugs. Relying on strange behaviour can also bite you if it was unintended and was then fixed in a new version of the same compiler.

Nevertheless, one can reconsider the $, but the implicit typing issue has to be solved. It would help if you could make a survey (e.g. based on the documentation) and see how the few other compilers that support it handle that. (I think IBM does; the xlf90 documentation could be a starting point.) By posting the result you (a) show how implicit typing is handled, which is essential, (b) prove that this is a feature supported by several vendors and (c) show that you are really interested in that feature.
Comment 6 kargls 2009-04-07 14:46:03 UTC
(In reply to comment #2)
> As I already said, I have code, being compiled with MS Fortran, that has
> a lot of variable names starting with $. MS Fortran allows it.
> 
> My personal opinion is that the Fortran compiler's primary use is support
> of the legacy code. That means that the implementation of any extension 
> or even strange behaviour of existing compilers could be useful and will
> save hours of hard work. Forcing standards on existing code is illogical.

In other words, you want to shift the hard work of implementing all
of these extensions onto the compiler writers.  Sorry, but that is illogical,
particularly when there are only 5 or so people writing the compiler,
whenever they can come up with some free time.

Learn how to use sed and globally replace $ with whatever string 
you want.
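
As a rough illustration of that kind of global replacement, here is a sketch in Python rather than sed. The helper name `rename_leading_dollar` and the default prefix are invented for this example. It deliberately ignores string literals, comments and FORMAT statements, and it does not check whether the new name collides with an existing one, so treat it as a starting point only. Note that the chosen prefix also decides the implicit type of the renamed variable (for example, a prefix starting with 'i' gives INTEGER, as suggested in comment 3).

```python
import re

# An identifier whose FIRST character is '$', e.g. "$a" but not the '$'
# inside "a$a". The lookbehind rejects a '$' that follows a name character.
LEADING_DOLLAR = re.compile(r"(?<![A-Za-z0-9_$])\$([A-Za-z0-9_$]+)")

def rename_leading_dollar(source: str, prefix: str = "zz") -> str:
    """Rewrite every $name as prefix + name (hypothetical helper)."""
    return LEADING_DOLLAR.sub(lambda m: prefix + m.group(1), source)

legacy = "      $a = 12\n      a$a = $a + $i\n"
print(rename_leading_dollar(legacy))
```

With the default prefix, '$a' and '$i' become 'zza' and 'zzi', while 'a$a' is left alone because its '$' is not the first character.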

Comment 7 kargls 2009-04-07 14:54:03 UTC
(In reply to comment #5)
> Many compilers support $ signs as extension (ISO standard Fortran does not).
> However, only a few support a leading $ sign. One of the questions which
> immediately come up, which data type is $foo (implicit typing).
> 
> I think the issue had come up before and the PRs were closed as wont-fix. This
> can of course be reconsidered, but implicit typing is a real issue here.

Yes, that is the reason previous PRs were closed.  I suppose Alexander will
counter that we should do whatever MS Fortran did with implicit typing.  This
may have been a compelling argument except that MS Fortran hasn't been sold 
in the last 15 years or so.  

This should be closed with WONTFIX.
Comment 8 Alexander Nickolsky 2009-04-08 07:12:39 UTC
>One of the questions which immediately come up, 
>which data type is $foo (implicit typing)

One interpretation of implicit typing of Fortran is:
I-N are integers, everything else is real.

>I think the issue had come up before and the PRs were closed as wont-fix.

Sorry, but I could not find it. I searched the database before posting.

>I think the primary point of a compiler is to be standard compliant.
>My impression is that nowadays all compiler vendors and most of the compiler
>customers think likewise.

That's correct, but Fortran (and possibly also Cobol and Algol, if anybody cares about Algol, which I doubt) is probably an exception. It is not widely
used nowadays, and there are two reasons to keep it: there is a lot of legacy code and there are people who do not want to learn modern languages.

>I think gfortran does fairly well in this regard compared with other 
>compilers, except of DEC structures all major extensions should be there.

You can see a survey at http://www.polyhedron.com/pb05-win32-language0html
and find out that gfortran and g95 are somewhere in the middle, neither
best nor worst. The leader is Intel Fortran. It also produces the fastest
code according to the same source.

>It would help if you could make a survey (e.g. based on the
>documentation) and see how the few other compilers, which support it, are
>handling that. (I think IBM does, the xlf90 documentation could be a starting
>point.)

I already know that the following compilers do support $ as the first symbol:

Intel Fortran 9.1
Open Watcom Fortran 1.8
MS Fortran for DOS 5.1

Open Watcom Fortran implicit type for $ is REAL.

(arrogant comments in the style "go learn SED" are ignored)

Speaking of $ as a legal symbol, I can only add that this extension is already supported by gfortran, but in a way that is different from all other implementors. Clearly it is impossible to implement all possible extensions from all vendors that ever existed, but there is probably a reason to make an already implemented extension compatible with the others.
Comment 9 Alexander Nickolsky 2009-04-08 07:39:27 UTC
Update regarding implicit rules:

The FORTRAN 77 standard clearly says that:
"A first letter of I, J, K, L, M, or N implies type integer and ANY OTHER letter implies type real"

The FORTRAN 66 standard has a similar statement.

This means the following: if the -fdollar-ok extension accepts the currency sign as a valid LETTER in a symbolic name, then it should be treated as REAL according to the standard. The only argument against this would be that the -fdollar-ok extension accepts the currency sign as a DIGIT. Does it?
Comment 10 kargls 2009-04-08 13:33:48 UTC
(In reply to comment #8)
> 
> I already know that the following compilers do support $ as the first symbol:
> 
> Intel Fortran 9.1
> Open Watcom Fortran 1.8
> MS Fortran for DOS 5.1
> 
> Open Watcom Fortran implicit type for $ is REAL.

Then use one of those compilers.

> 
> (arrogant comments in the style "go learn SED" are ignored)
> 

It's not arrogance.  It is a method to make your nonstandard
code conform to the standard.  It would take 5 minutes to
write the script to do so.
Comment 11 Dominique d'Humieres 2009-04-08 14:06:42 UTC
Note that the following code

        program test
        a$a = 12
        $a = 12    ! error
        $i = 11
        $b = 0.5
        print *, a$a, $a, $b, $i
        end

when compiled with ifort:

   12.00000              12           0          11

i.e., the variables with a name starting with a $ are integers. So, if you are interested in getting reliable results with your "legacy" code, you have to carefully read the manual of the compiler with which it was supposed to work, and this could take much longer than the "5 minutes to write the script" of comment #10!

Comment 12 Daniel Franke 2009-04-08 14:14:03 UTC
How about this addition to the docs?

Index: invoke.texi
===================================================================
--- invoke.texi (revision 145538)
+++ invoke.texi (working copy)
@@ -256,7 +256,9 @@ the default width of @code{DOUBLE PRECIS
 @cindex $
 @cindex symbol names
 @cindex character set
-Allow @samp{$} as a valid character in a symbol name.
+Allow @samp{$} as a valid non-first character in a symbol name. Symbols
+that start with @samp{$} are rejected since it is unclear which rules to
+apply to implicit typing as different vendors implement different rules.

 @item -fbackslash
 @opindex @code{backslash}
Comment 13 Dominique d'Humieres 2009-04-08 14:17:01 UTC
> How about this addition to the docs? ...

Nice! then close as wontfix.

Comment 14 Tobias Burnus 2009-04-08 17:26:30 UTC
(In reply to comment #12)
> +++ invoke.texi (working copy)
> -Allow @samp{$} as a valid character in a symbol name.
> +Allow @samp{$} as a valid non-first character in a symbol name. Symbols
> +that start with @samp{$} are rejected since it is unclear which rules to
> +apply to implicit typing as different vendors implement different rules.

Daniel, that patch is pre-approved and OK for the trunk.

  * * *

(In reply to comment #9)
> FORTRAN 77 standard clearly says that

Well, that does not count - the Fortran standards (66 to 2008) all say that $ is not allowed.

(In reply to comment #8)
> >I think the primary point of a compiler is to be standard compliant.
> 
> That's correct, but Fortran is probably an exception. It is not widely
> used nowadays, and there are two reasons to keep it: there is a lot of legacy
> code and there are people who do not want to learn modern languages.

In (theoretical) physics there is a huge amount of Fortran code around which is based on Fortran 90 and continues to be developed. I don't know what percentage is old code or completely new code, but I know projects which started from scratch. And I wouldn't call Fortran 90, and even less so Fortran 2003 and 2008, an old language.  Note: With vendors I meant Fortran compiler vendors.


> >I think gfortran does fairly well in this regard compared with other 
> >compilers, except of DEC structures all major extensions should be there.
> You can see a survey at http://www.polyhedron.com/pb05-win32-language0html
> and find out that gfortran and g95 are somewhere in the middle, neither
> best nor worst.

Well, it does not show all extensions - and there are some which gfortran has and ifort doesn't. The goal is also not to have all extensions but only to have the most important ones. The VAX structures mentioned there are also rather hard to implement as they partially clash with Fortran 90's user-defined operators. (It took Intel several years to get rid of all the bugs.)

> >It would help if you could make a survey (e.g. based on the
> >documentation) and see how the few other compilers, which support it, are
> >handling that. (I think IBM does, the xlf90 documentation could be a starting
> >point.)
> 
> I already know that the following compilers do support $ as the first symbol:
> 
> Intel Fortran 9.1
> Open Watcom Fortran 1.8
> MS Fortran for DOS 5.1

Well, the following compilers do not support it as first character but allow it in the middle of a symbol:

- g95
- NAG f95
- SUN Studio sunf95
- Open64 openf95
(and presumably: Pathscale as it seems to be based on Open64, which is in turn based on SGI if I remember correctly)

> Open Watcom Fortran implicit type for $ is REAL.
Ditto for IBM xlf90; however, as Dominique pointed out, ifort treats it as INTEGER.

The other question is whether "IMPLICIT INTEGER($-A)" is allowed or "IMPLICIT INTEGER(A-$)" (i.e. how the IMPLICIT statement treats $ and where $ comes in the letter sequence).
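
To spell the ambiguity out, here is a small sketch in Python (purely illustrative, not how any compiler implements it). It encodes the default rule quoted from FORTRAN 77 in comment 9 and, separately, the leading-'$' behaviour as reported in this thread, which has not been re-verified here. The names `default_implicit_type`, `implicit_type` and `LEADING_DOLLAR_TYPE` are invented for the example.

```python
import string

def default_implicit_type(name: str) -> str:
    """Default rule: first letter I..N -> INTEGER, any other letter -> REAL."""
    first = name[0].upper()
    if first not in string.ascii_uppercase:
        # '$' is not a letter, so the standard rule gives no answer.
        raise ValueError(f"no standard implicit type for a leading {first!r}")
    return "INTEGER" if first in "IJKLMN" else "REAL"

# Implicit type of a name starting with '$', as reported in this thread.
LEADING_DOLLAR_TYPE = {
    "ifort": "INTEGER",      # comment 11
    "Open Watcom": "REAL",   # comment 8
    "IBM xlf90": "REAL",     # this comment
}

def implicit_type(name: str, compiler: str) -> str:
    """Hypothetical lookup combining the default rule with vendor behaviour."""
    if name.startswith("$"):
        return LEADING_DOLLAR_TYPE[compiler]
    return default_implicit_type(name)

for compiler in LEADING_DOLLAR_TYPE:
    print(compiler, implicit_type("$b", compiler))
```

The same source line '$b = 0.5' would therefore store 0.5 under two of these compilers and 0 under the third, which is why a rule has to be chosen before a leading $ can be accepted.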

> Speaking on $ as a legal symbol I can only add that this extension is already
> supported by gfortran, but in the way that is different from all other
> implementors.

No, that's not true - see above.
Comment 15 Daniel Franke 2009-04-08 17:42:47 UTC
Subject: Bug 39670

Author: dfranke
Date: Wed Apr  8 17:42:32 2009
New Revision: 145764

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=145764
Log:
2009-04-08  Daniel Franke  <franke.daniel@gmail.com>

       PR fortran/39670
       * invoke.texi (fdollar-ok): Clarify limitations.


Modified:
    trunk/gcc/fortran/ChangeLog
    trunk/gcc/fortran/invoke.texi

Comment 16 Janne Blomqvist 2009-04-08 18:24:24 UTC
Subject: Bug 39670

Author: jb
Date: Wed Apr  8 18:23:55 2009
New Revision: 145767

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=145767
Log:
2009-04-08  Janne Blomqvist  <jb@gcc.gnu.org>

	PR fortran/39670
	* invoke.texi (fdollar-ok): Fix typo.


Modified:
    trunk/gcc/fortran/ChangeLog
    trunk/gcc/fortran/invoke.texi

Comment 17 Jakub Jelinek 2009-04-08 19:29:07 UTC
Note also that in the AT&T assembly style for i386/x86_64 (the default in gcc), a leading $ changes what an instruction does.
  movl a_, %eax
means read the value from a_ variable into eax,
  movl $a_, %eax
means set eax to the address of a_ variable.