Universal Character Names, v2

Joseph S. Myers jsm28@cam.ac.uk
Sat Nov 30 05:03:00 GMT 2002

On Sat, 30 Nov 2002, Neil Booth wrote:

> If we find \U in a file, we should assume it is a UCN.  There is little
> use for \ as a separate token if followed by U.  If there is a syntax
> error in a UCN with an invalid char, there is no obviously right thing to
> do to recover; certainly I don't think backing up to the \U and making
> two tokens out of it is a good idea.  It might not even be worth the

It's clearly required, however, by the rule that each preprocessing token is
the longest sequence of characters that will form one.  There isn't a rule 
that isolated \ as a preprocessing token is undefined, whereas there is 
for isolated ' and " (which allowed the old multiline strings extension).  
(But I'm aware GCC deliberately doesn't implement the rule for
#include <foo.h
which, not matching a header name for lack of closing >, ought to be 
parsed as multiple tokens, of which the h might then be macro expanded to 
h> (see past discussions on comp.std.c).)
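
To make the longest-match point concrete, here is a minimal sketch in C of how a lexer following that rule would treat a malformed UCN. It is not cpplib code: the names ucn_length and first_token_length are hypothetical, only identifiers and stray single characters are handled, and the sketch ignores the restrictions on which UCNs are permitted in identifiers.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Length of a well-formed UCN at s (\uXXXX or \UXXXXXXXX), or 0 if
   the text at s is not one.  Hypothetical helper for illustration. */
static size_t ucn_length(const char *s)
{
    size_t digits, i;

    if (s[0] != '\\' || (s[1] != 'u' && s[1] != 'U'))
        return 0;
    digits = (s[1] == 'u') ? 4 : 8;
    for (i = 0; i < digits; i++)
        if (!isxdigit((unsigned char)s[2 + i]))
            return 0;           /* too few hex digits: not a UCN */
    return 2 + digits;
}

/* Length of the first preprocessing token at s under the longest-match
   rule.  Only identifiers (possibly containing UCNs) and single stray
   characters are recognised; s must not start with white space. */
static size_t first_token_length(const char *s)
{
    size_t len = 0, n;

    assert(s != NULL);
    for (;;) {
        unsigned char c = (unsigned char)s[len];

        if (isalpha(c) || c == '_' || (len > 0 && isdigit(c)))
            len++;              /* ordinary identifier character */
        else if ((n = ucn_length(s + len)) != 0)
            len += n;           /* a complete UCN extends the identifier */
        else
            break;              /* nothing longer can be formed */
    }
    if (len > 0)
        return len;
    /* No identifier here: a character that fits no other category,
       such as an isolated \, is a preprocessing token by itself. */
    return s[0] != '\0' ? 1 : 0;
}
```

Under these assumptions the input \u00C0x lexes as a single identifier of seven characters, whereas \U00zz yields a one-character \ token followed by the identifier U00zz, which is the two-token result described above.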

Joseph S. Myers
