Bug 38839 - BIND(C): Allow non-digit/underscore/alphabetic binding names
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: fortran
Version: 4.4.0
Importance: P3 normal
Target Milestone: 5.0
Assignee: Francois-Xavier Coudert
Keywords: rejects-valid
Blocks: 32630
Reported: 2009-01-14 15:37 UTC by Tobias Burnus
Modified: 2014-06-29 14:16 UTC

Last reconfirmed: 2009-03-29 08:46:40


Description Tobias Burnus 2009-01-14 15:37:03 UTC
Motivated by http://groups.google.com/group/comp.lang.fortran/browse_thread/thread/d977a668b0316119

Currently, BIND(C, NAME=binding-name) checks for ISALPHA and "_". However, the C standard allows more (from ISO/IEC 9899:1999(E) = C99, Section 6.4.2 Identifiers):

identifier:
   identifier-nondigit
   identifier   identifier-nondigit
   identifier   digit

identifier-nondigit:
   nondigit
   universal-character-name  <<<<<<<<
   other implementation-defined characters  <<<<<<

nondigit: One of
   _ a b c (...) z A B C (...) Z

digit:
  0 1 (...) 9
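
As a rough illustration of the unmarked part of the grammar above, here is a minimal sketch of a checker that accepts only nondigit/digit identifiers. The function name is hypothetical, ASCII letter ranges are assumed, and this is not the code gfortran uses:

```c
#include <stdbool.h>
#include <stddef.h>

/* C99 "nondigit": underscore or a basic Latin letter (ASCII assumed). */
static bool is_nondigit(char c)
{
    return c == '_' || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

/* C99 "digit": 0 through 9. */
static bool is_digit(char c)
{
    return c >= '0' && c <= '9';
}

/* Hypothetical helper: true if s is built only from nondigit and digit
   characters and starts with a nondigit. Universal character names and
   implementation-defined characters are deliberately rejected. */
bool is_basic_c_identifier(const char *s)
{
    if (s == NULL || !is_nondigit(s[0]))
        return false;               /* empty, or starts with a digit/other */
    for (size_t i = 1; s[i] != '\0'; i++)
        if (!is_nondigit(s[i]) && !is_digit(s[i]))
            return false;
    return true;
}
```

A check of this shape accepts "my_func2" but rejects anything that relies on the two marked productions.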


Note the items marked by "<<<<<". I'm not sure whether the current restriction causes any problem in practice, but I could imagine that there is a legitimate use for such names, though I could not find a good example.

Currently, the binding name is abused to call STDCALL procedures, which get a "@<n>" appended to their name, where "<n>" is related to how much needs to be popped from the stack. The proper solution is to support STDCALL (PR 34112); thus I don't think this is a proper use.
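
To make the "@<n>" suffix concrete, a small sketch of how such a decorated name is composed. The helper name is hypothetical, and taking <n> to be the byte count of the arguments is an assumption based on the usual 32-bit Windows convention:

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: writes "<name>@<n>" into out.
   arg_bytes is assumed to be the number of argument bytes the callee pops.
   Returns 0 on success, -1 if the result does not fit in outsz bytes. */
int stdcall_suffix_name(char *out, size_t outsz,
                        const char *name, unsigned arg_bytes)
{
    int n = snprintf(out, outsz, "%s@%u", name, arg_bytes);
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

A name of this form contains '@', so it is rejected by a check that only permits letters, digits and underscore.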

Does it make sense to change the error into a warning? Or to allow certain other characters?

 * * *

Seemingly $ is allowed by "gcc -c -std=c99 -Wall -pedantic":

void t$FAst(void)
{
}
Comment 1 Tobias Burnus 2009-01-14 16:00:59 UTC
And for the universal-character-name, the following compiles with Intel's icc

void \u01ac(void) {
}

and should be valid C99. ICC generates the identifier "_u01ac". With "gcc -fextended-identifiers", it shows up in the .o file as (readelf -a):
     8: 0000000000000000     6 FUNC    GLOBAL DEFAULT    1 ^^F^354


In the Fortran 2003 standard one finds:

R509 language-binding-spec is BIND (C [, NAME = scalar-char-initialization-expr ])
C540 (R509) The scalar-char-initialization-expr shall be of default character kind.

That makes it a bit difficult to use UCNs ... From Fortran 2008 (below R508):

"NOTE 5.5
The C International Standard provides a facility for creating C identifiers whose characters are not restricted to the C basic character set. Such a C identifier is referred to as a universal character name (6.4.3 of the C International Standard). The name of such a C identifier might include characters that are not part of the representation method used by the processor for default character. If so, the C entity cannot be referenced from Fortran."

Thus currently we only need to worry about '$' and maybe some others. Optionally supporting non-default character strings and thus UCN might be done later on (cf. also PR 38838 comment 3).
Comment 2 Tobias Burnus 2009-01-14 16:09:09 UTC
For UCN see also PR 9449.
Comment 3 Daniel Franke 2009-12-10 19:11:31 UTC
See PR36275 for more possibilities on binding labels.
Comment 4 Tobias Burnus 2009-12-10 19:52:11 UTC
For "$" one should check whether it is allowed for the given target, cf. DOLLARS_IN_IDENTIFIERS (-> gcc/c-opts.c and dollars_in_ident in libcpp/).
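
A sketch of what a target-dependent check could look like, assuming a boolean flag that stands in for the target's DOLLARS_IN_IDENTIFIERS setting. The function name and the dollars_ok parameter are hypothetical, not existing gfortran or libcpp interfaces:

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: accepts letters, digits (not in first position)
   and underscore; '$' is accepted only when dollars_ok is true. */
bool is_label_with_optional_dollar(const char *s, bool dollars_ok)
{
    static const char letters[] =
        "_abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    static const char digits[] = "0123456789";

    if (s == NULL || s[0] == '\0')
        return false;
    for (size_t i = 0; s[i] != '\0'; i++) {
        char c = s[i];
        bool ok = strchr(letters, c) != NULL
               || (i > 0 && strchr(digits, c) != NULL)
               || (dollars_ok && c == '$');
        if (!ok)
            return false;
    }
    return true;
}
```

With the flag set, "t$FAst" passes; without it, the same label is rejected.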

For UCN (universal-character name), the ASCII characters $, @ and ` are allowed. Cf. C99 6.4.3 and libcpp _cpp_valid_ucn.
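
The C99 6.4.3 constraint can be sketched as follows. This is only an illustration with a hypothetical function name, not libcpp's _cpp_valid_ucn, and it ignores the further per-range restrictions that C99 places on UCNs inside identifiers:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: applies the C99 6.4.3 constraint to a code point.
   Values below 00A0 are rejected except 0024 ($), 0040 (@) and 0060 (`);
   the surrogate range D800..DFFF is rejected as well. */
bool ucn_value_allowed(uint32_t cp)
{
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return false;
    if (cp < 0x00A0)
        return cp == 0x0024 || cp == 0x0040 || cp == 0x0060;
    return true;
}
```

So \u0024 is permitted while \u0041 (plain "A") is not, since the latter must be written directly.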

One could also think of supporting UCN with character kind=4 (cf. PR36275 and PR 9449) - at least when -fextended-identifiers is specified - as a vendor extension. As gfortran uses libcpp and has full support of wide chars, it should not be difficult (but it shall produce an error with -std=f2008).
Comment 5 Francois-Xavier Coudert 2014-06-08 22:37:39 UTC
Should be fixed at the same time as this patch here: https://gcc.gnu.org/ml/fortran/2014-06/msg00090.html , which reworks the parsing of binding labels. I intend to find what the consensus is on this issue, incorporate it into the patch, and close this.
Comment 6 Francois-Xavier Coudert 2014-06-29 14:14:50 UTC
Author: fxcoudert
Date: Sun Jun 29 14:14:16 2014
New Revision: 212123

URL: https://gcc.gnu.org/viewcvs?rev=212123&root=gcc&view=rev
Log:
	PR fortran/36275
	PR fortran/38839

	* decl.c (check_bind_name_identifier): New function.
	(gfc_match_bind_c): Match any constant expression as binding
	label.
	* match.c (gfc_match_name_C): Remove.

	* gfortran.dg/binding_label_tests_2.f03: Adjust error messages.
	* gfortran.dg/binding_label_tests_27.f90: New file.

Added:
    trunk/gcc/testsuite/gfortran.dg/binding_label_tests_27.f90
Modified:
    trunk/gcc/fortran/ChangeLog
    trunk/gcc/fortran/decl.c
    trunk/gcc/fortran/match.c
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gfortran.dg/binding_label_tests_2.f03
Comment 7 Francois-Xavier Coudert 2014-06-29 14:16:40 UTC
Fixed on trunk.