This is the mail archive of the mailing list for the GCC project.


Re: Implementing Universal Character Names in identifiers

Neil Booth writes:

> We should definitely accept it.  Why should UCNs be different from
> everything else?  I can see that C++ calls it undefined behaviour, but
> C99 appears to require it.   It's also important, to me at least, from
> a QOI perspective.

I don't think C99 requires it. says

# If, as a result, a character sequence that matches the syntax of a
# universal character name is produced, the behavior is undefined.

I think UCNs are rightfully different from nearly everything else;
they are quite similar to multi-byte characters. If you have an
escaped newline in the middle of a multi-byte character, you would not
expect concatenation to create a new multi-byte character, either,
would you?
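
To make the case concrete, here is a minimal sketch (in Python, not GCC
code) that simulates line splicing and reports any sequence matching the
UCN syntax that exists only because a backslash-newline was removed from
its middle. The names splice_lines and ucns_created_by_splicing are
hypothetical, and the sketch ignores context such as an escaped backslash
inside a string literal.

```python
import re

# Syntax of a universal character name: \u + 4 hex digits or \U + 8 hex digits.
UCN = re.compile(r"\\u[0-9A-Fa-f]{4}|\\U[0-9A-Fa-f]{8}")


def splice_lines(source):
    """Delete each backslash-newline pair, as line splicing does.

    Returns the spliced text and the offsets (in the spliced text) at
    which a backslash-newline pair was removed.
    """
    out = []
    joins = []
    i = 0
    while i < len(source):
        if source.startswith("\\\n", i):
            joins.append(len(out))  # remember where two lines were joined
            i += 2
        else:
            out.append(source[i])
            i += 1
    return "".join(out), joins


def ucns_created_by_splicing(source):
    """Return UCN-shaped sequences that have a splice point strictly inside."""
    spliced, joins = splice_lines(source)
    return [
        m.group()
        for m in UCN.finditer(spliced)
        if any(m.start() < j < m.end() for j in joins)
    ]
```

Under these assumptions, an identifier written as \u00 followed by a
backslash-newline and then c0 is reported, while an unbroken \u00c0 is
not. A front end could use a check of this kind to issue the error
argued for here.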

I cannot see any important use cases for such a
feature. Implementations are allowed to reject this case, and rejecting
it simplifies the implementation, so I really see no reason to make
life more complicated than necessary. Producing an error now still
leaves the opportunity to provide an extension later.

Notice that the compiler deliberately abstains from giving undefined
behaviour a well-defined meaning in some cases, to point out
portability issues. Users often complain that GCC provides too many
extensions, so I think every single extension must be judged very

> A backslash is a token; so is u00c0.  Your example is indeed an
> error, but was not what I had in mind.  I suspect pasting just works,
> anyway.

Can you please give an example for what you had in mind?

