Universal Character Names, v2

Neil Booth neil@daikokuya.co.uk
Fri Nov 29 13:35:00 GMT 2002


Martin v. Löwis wrote:-

> +      if (utf8)
> +	{
> +	  result->flags |= NODE_USES_EXTENDED_CHARACTERS;
> +#ifndef HAVE_AS_UTF8
> +	  cpp_error (pfile, DL_ERROR, 
> +		     "Non-ASCII identifiers not supported by your assembler");
> +#endif
> +	}

This doesn't belong here.  Someone who is only preprocessing would
not be too happy to see this message.

I suggest this should only be a warning (the output could come from -S
and be fed to a different assembler, or be used for some other purpose),
be emitted only once per translation unit, and be moved to c_lex().
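
To illustrate the "once per translation unit" part, here is a minimal
sketch.  It assumes a hypothetical per-TU state structure and a plain
fprintf in place of the real diagnostic machinery; none of these names
exist in cpplib or the C front end.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-translation-unit state; not a real GCC structure.  */
struct tu_state
{
  bool warned_non_ascii;	/* Set once the warning has been issued.  */
};

/* Warn about a non-ASCII identifier unless the assembler handles UTF-8
   or we have already warned in this translation unit.  Returns true
   if this call issued the warning.  */
static bool
warn_non_ascii_once (struct tu_state *tu, bool assembler_has_utf8)
{
  if (assembler_has_utf8 || tu->warned_non_ascii)
    return false;

  tu->warned_non_ascii = true;
  fprintf (stderr, "warning: non-ASCII identifiers may not be "
		   "supported by your assembler\n");
  return true;
}
```

The lexer would call this each time it sees an identifier with the
extended-characters flag set; only the first call in a TU says anything.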

> +	{
> +          const unsigned char *s = NODE_NAME (token->val.node);
> +          int len = NODE_LEN (token->val.node);
> +          while (len)
> +            {
> +              if (*s < 128)
> +                {
> +                  *buffer++ = *s++;
> +                  len--;
> +                }
> +              else
> +                {
> +                  const unsigned char *old = s;
> +                  cppchar_t code = utf8_to_char (&s);
> +                  if (code < 0x10000)
> +                    buffer += sprintf ((char*)buffer, "\\u%.4x", code);
> +                  else
> +                    buffer += sprintf ((char*)buffer, "\\U%.8x", code);
> +                  len -= s - old;
> +                }
> +            }
> +	}

This should be in a subroutine to avoid code duplication.  (I know the
surrounding code doesn't generally follow that rule, but we're not in
the fast path when doing UCS's.  One day I hope to have solved the
performance issue, and then there will only be a single copy of the lot.)
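
Something along these lines is what I mean by a subroutine.  This is
only a sketch: decode_utf8 is a simplified stand-in for the patch's
utf8_to_char (it assumes well-formed UTF-8 and does no validation), and
spell_ident_as_ucns is a made-up name.

```c
#include <stddef.h>
#include <stdio.h>

/* Decode the multi-byte UTF-8 sequence at *PS and advance *PS past it.
   Simplified stand-in for utf8_to_char; assumes well-formed input.  */
static unsigned long
decode_utf8 (const unsigned char **ps)
{
  const unsigned char *s = *ps;
  unsigned long code;
  int extra;

  if (*s >= 0xf0)
    code = *s & 0x07, extra = 3;
  else if (*s >= 0xe0)
    code = *s & 0x0f, extra = 2;
  else
    code = *s & 0x1f, extra = 1;
  s++;

  /* Each continuation byte contributes its low six bits.  */
  while (extra-- > 0)
    code = (code << 6) | (*s++ & 0x3f);

  *ps = s;
  return code;
}

/* Copy the LEN bytes of UTF-8 at NAME into BUFFER, spelling each
   non-ASCII character as \uXXXX or \UXXXXXXXX.  BUFFER needs room for
   3 * LEN + 1 bytes.  Returns a pointer just past the last byte
   written.  */
static unsigned char *
spell_ident_as_ucns (unsigned char *buffer, const unsigned char *name,
		     size_t len)
{
  const unsigned char *end = name + len;

  while (name < end)
    {
      if (*name < 128)
	*buffer++ = *name++;
      else
	{
	  unsigned long code = decode_utf8 (&name);

	  if (code < 0x10000)
	    buffer += sprintf ((char *) buffer, "\\u%.4lx", code);
	  else
	    buffer += sprintf ((char *) buffer, "\\U%.8lx", code);
	}
    }

  return buffer;
}
```

Each place that currently open-codes the loop would then reduce to one
call passing NODE_NAME and NODE_LEN.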

> +
> +static int
> +maybe_read_ucs_reader (pfile, pc)
> +     cpp_reader *pfile;
> +     cppchar_t *pc;

Can I suggest that, instead of doing this, you have a routine that
reads a UCS's digits (4 or 8) into a uchar[8] buffer, and that you
re-use maybe_read_ucs() on this buffer?  maybe_read_ucs() might
need a few small tweaks.  Again, this would avoid duplication.
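
Roughly, the split I have in mind looks like the sketch below.  The
names collect_ucn_digits and ucn_value are hypothetical, and ucn_value
merely stands in for the shared conversion step; it does not reproduce
the actual maybe_read_ucs() interface or its validity checks.

```c
#include <ctype.h>
#include <stddef.h>

/* Copy up to WANT (4 or 8) hex digits from the AVAIL bytes at SRC into
   DIGITS, stopping at the first non-hex character.  Returns the number
   of digits copied; fewer than WANT means a malformed UCN.  */
static size_t
collect_ucn_digits (const unsigned char *src, size_t avail, size_t want,
		    unsigned char digits[8])
{
  size_t n = 0;

  if (want > 8)
    want = 8;
  while (n < want && n < avail && isxdigit (src[n]))
    {
      digits[n] = src[n];
      n++;
    }
  return n;
}

/* Convert the N hex digits in DIGITS to a code point.  Stand-in for
   the routine that both callers would share.  */
static unsigned long
ucn_value (const unsigned char *digits, size_t n)
{
  unsigned long code = 0;
  size_t i;

  for (i = 0; i < n; i++)
    {
      unsigned char c = digits[i];
      unsigned int v;

      if (c >= '0' && c <= '9')
	v = c - '0';
      else
	v = (unsigned int) (tolower (c) - 'a') + 10;
      code = (code << 4) | v;
    }
  return code;
}
```

The reader-side code would only have to fill the buffer; everything
after that goes through the one shared routine.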

Thanks,

Neil.


