[Bug preprocessor/9449] UCNs not recognized in identifiers (c++/c99)

jsm28 at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Wed Nov 5 16:20:00 GMT 2014


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=9449

--- Comment #47 from Joseph S. Myers <jsm28 at gcc dot gnu.org> ---
Author: jsm28
Date: Wed Nov  5 16:19:10 2014
New Revision: 217144

URL: https://gcc.gnu.org/viewcvs?rev=217144&root=gcc&view=rev
Log:
Enable -fextended-identifiers by default.

As proposed at <https://gcc.gnu.org/ml/gcc/2014-11/msg00014.html>,
this patch enables -fextended-identifiers by default for all standard
versions including this feature (all C++ versions, C99 and above for
C, but not C90 / C94 / gnu89 / preprocessing assembler).  It adds a
couple of tests for areas where I previously noted testsuite coverage
for extended identifiers was lacking, removes -fextended-identifiers
from existing tests, adds -g to various such tests to verify that
extended identifiers don't break debug info generation and removes the
test that was only there to verify that the feature was off by
default.
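
As a rough illustration of what the new default means for user code, here
is a minimal sketch of a C99 translation unit that spells identifiers
with UCNs.  The identifier names are hypothetical and are not taken from
the tests touched by this patch.

```c
/* Illustrative sketch; the identifier names are hypothetical and not
   taken from the GCC testsuite. */
#include <assert.h>

/* \u00e9 is U+00E9 (LATIN SMALL LETTER E WITH ACUTE), which C99 and
   later allow in identifiers. */
static int twice_\u00e9(int x)
{
    int r\u00e9sultat = 2 * x;
    return r\u00e9sultat;
}

/* Small self-check that can be called from a test driver. */
int ucn_demo(void)
{
    int v = twice_\u00e9(21);
    assert(v == 42);
    return v;
}
```

With a compiler that includes this revision, something like
"gcc -std=c99 -g -c" on such a file should no longer need
-fextended-identifiers; under -std=c90 the same file is still expected
to be rejected, since the feature stays off there.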

The current state of the feature may not correspond exactly to any
particular checklist from 2004/5 (see bug 9449) of what was wanted
before enabling the feature by default, but I don't think it's any
worse than plenty of other features supported by default before every
corner case is fully functional, and I think problems can readily be
fixed incrementally.

The following aspects of extended identifiers could still do with more
work (and should be straightforward):

* C -aux-info (output should use UCNs).

* ObjC -gen-decls (output should use UCNs; associated diagnostics from
  the ObjC front end should use extended characters or UCNs as
  appropriate to the locale, via using %qE or identifier_to_locale).

* Use DW_AT_use_UTF8 in DWARF-3 debug info for compilation units built
  with extended identifiers enabled (or unconditionally).

* cpplib diagnostics (outputting characters or UCNs as appropriate
  depending on the locale, as done for identifiers in non-cpplib
  diagnostics).

* C++ test for UCN linking with C and extern "C".

* Check GDB support / file issues for support if needed.

* Actual UTF-8 in identifiers (?).  (Be careful about not affecting
  performance for the normal fast path of lexing identifiers, if
  possible.)

The following may be trickier:

* cpplib spelling preservation (required to diagnose macro
  redefinition with different spellings of the same identifier in the
  definition or argument names; different spellings of the name of the
  macro itself are OK, however; also required for correct handling of
  multiple stringizing in C++); correct output for -d (UCNs), DWARF
  debug info for macros (UCNs), PCH and PCH tests.  (Spelling
  preservation is the issue that needs fixing to remove references to
  corner cases in the documentation of -std=c99 and -std=c11 and in
  c99status.html.)  The idea would be to add a second pointer to
  cpp_identifier that stores the original spelling (whether for
  extended identifiers only, or for all identifiers); this does not
  enlarge cpp_token because the resulting larger cpp_identifier
  structure is no bigger than cpp_string.

* C++ translation of extended characters (including $@` and various
  control characters) to UCNs in phase 1.  Note that diagnostics are
  thus needed, except in C++11, for control characters in strings /
  character constants, as the corresponding UCNs are invalid.  A likely
  implementation approach is to do the translation when identifiers /
  strings / character constants are lexed, together with errors for
  stray $@` / control characters elsewhere in the program, as these
  are not valid UCNs in identifiers ($ only if it is not accepted in
  identifiers).  This translation should not take place inside raw
  string literals.

Bootstrapped with no regressions on x86_64-unknown-linux-gnu.

libcpp:
    PR preprocessor/9449
    * init.c (lang_defaults): Enable extended identifiers for C++ and
    C99-based standards.

gcc:
    PR preprocessor/9449
    * doc/cpp.texi (Character sets, Tokenization)
    (Implementation-defined behavior): Don't refer to UCNs in
    identifiers requiring -fextended-identifiers.
    * doc/cppopts.texi (-fextended-identifiers): Document as enabled
    by default for C99 and later and C++.
    * doc/invoke.texi (-std=c99, -std=c11): Don't refer to extended
    identifiers needing -fextended-identifiers.

gcc/testsuite:
    PR preprocessor/9449
    * lib/target-supports.exp (check_effective_target_ucn_nocache):
    Don't use -fextended-identifiers.
    * c-c++-common/cpp/normalize-3.c, c-c++-common/cpp/ucnid-2011-1.c,
    g++.dg/cpp/ucn-1.C, g++.dg/cpp/ucnid-1.C, g++.dg/other/ucnid-1.C,
    gcc.dg/cpp/normalize-1.c, gcc.dg/cpp/normalize-2.c,
    gcc.dg/cpp/normalize-4.c: Don't use -fextended-identifiers.
    * gcc.dg/cpp/ucnid-1.c: Don't use -fextended-identifiers.  Use
    -g3.
    * gcc.dg/cpp/ucnid-10.c, gcc.dg/cpp/ucnid-2.c,
    gcc.dg/cpp/ucnid-3.c, gcc.dg/cpp/ucnid-4.c, gcc.dg/cpp/ucnid-5.c,
    gcc.dg/cpp/ucnid-7.c, gcc.dg/cpp/ucnid-9.c,
    gcc.dg/cpp/warn-normalized-1.c, gcc.dg/cpp/warn-normalized-2.c,
    gcc.dg/cpp/warn-normalized-3.c: Don't use -fextended-identifiers.
    * gcc.dg/ucnid-1.c, gcc.dg/ucnid-2.c, gcc.dg/ucnid-3.c,
    gcc.dg/ucnid-4.c, gcc.dg/ucnid-5.c, gcc.dg/ucnid-6.c: Don't use
    -fextended-identifiers.  Use -g.
    * gcc.dg/ucnid-7.c, gcc.dg/ucnid-8.c: Don't use
    -fextended-identifiers.
    * gcc.dg/ucnid-9.c: Don't use -fextended-identifiers.  Use -g.
    * gcc.dg/ucnid-10.c: Don't use -fextended-identifiers.
    * gcc.dg/ucnid-11.c, gcc.dg/ucnid-12.c: Don't use
    -fextended-identifiers.  Use -g.
    * gcc.dg/ucnid-13.c: Don't use -fextended-identifiers.
    * gcc.dg/cpp/ucnid-8.c: Remove test.
    * gcc.dg/cpp/ucnid-10.c, gcc.dg/ucnid-14.c: New tests.

Added:
    trunk/gcc/testsuite/gcc.dg/cpp/ucnid-10.c
    trunk/gcc/testsuite/gcc.dg/ucnid-14.c
Removed:
    trunk/gcc/testsuite/gcc.dg/cpp/ucnid-8.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/doc/cpp.texi
    trunk/gcc/doc/cppopts.texi
    trunk/gcc/doc/invoke.texi
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/c-c++-common/cpp/normalize-3.c
    trunk/gcc/testsuite/c-c++-common/cpp/ucnid-2011-1.c
    trunk/gcc/testsuite/g++.dg/cpp/ucn-1.C
    trunk/gcc/testsuite/g++.dg/cpp/ucnid-1.C
    trunk/gcc/testsuite/g++.dg/other/ucnid-1.C
    trunk/gcc/testsuite/gcc.dg/cpp/normalize-1.c
    trunk/gcc/testsuite/gcc.dg/cpp/normalize-2.c
    trunk/gcc/testsuite/gcc.dg/cpp/normalize-4.c
    trunk/gcc/testsuite/gcc.dg/cpp/ucnid-1.c
    trunk/gcc/testsuite/gcc.dg/cpp/ucnid-2.c
    trunk/gcc/testsuite/gcc.dg/cpp/ucnid-3.c
    trunk/gcc/testsuite/gcc.dg/cpp/ucnid-4.c
    trunk/gcc/testsuite/gcc.dg/cpp/ucnid-5.c
    trunk/gcc/testsuite/gcc.dg/cpp/ucnid-7.c
    trunk/gcc/testsuite/gcc.dg/cpp/ucnid-9.c
    trunk/gcc/testsuite/gcc.dg/cpp/warn-normalized-1.c
    trunk/gcc/testsuite/gcc.dg/cpp/warn-normalized-2.c
    trunk/gcc/testsuite/gcc.dg/cpp/warn-normalized-3.c
    trunk/gcc/testsuite/gcc.dg/ucnid-1.c
    trunk/gcc/testsuite/gcc.dg/ucnid-10.c
    trunk/gcc/testsuite/gcc.dg/ucnid-11.c
    trunk/gcc/testsuite/gcc.dg/ucnid-12.c
    trunk/gcc/testsuite/gcc.dg/ucnid-13.c
    trunk/gcc/testsuite/gcc.dg/ucnid-2.c
    trunk/gcc/testsuite/gcc.dg/ucnid-3.c
    trunk/gcc/testsuite/gcc.dg/ucnid-4.c
    trunk/gcc/testsuite/gcc.dg/ucnid-5.c
    trunk/gcc/testsuite/gcc.dg/ucnid-6.c
    trunk/gcc/testsuite/gcc.dg/ucnid-7.c
    trunk/gcc/testsuite/gcc.dg/ucnid-8.c
    trunk/gcc/testsuite/gcc.dg/ucnid-9.c
    trunk/gcc/testsuite/lib/target-supports.exp
    trunk/libcpp/ChangeLog
    trunk/libcpp/init.c


