This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: Universal Character Names, v2


Martin v. Löwis wrote:-

> +      if (utf8)
> +	{
> +	  result->flags |= NODE_USES_EXTENDED_CHARACTERS;
> +#ifndef HAVE_AS_UTF8
> +	  cpp_error (pfile, DL_ERROR, 
> +		     "Non-ASCII identifiers not supported by your assembler");
> +#endif
> +	}

This doesn't belong here.  Someone doing preprocessing only would not
be too happy to see this message.

I suggest this should only be a warning (the user could be compiling
with -S and feeding the output to a different assembler, or using it
for some other purpose), only be emitted once per translation unit,
and be moved to c_lex().
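
A minimal sketch of the once-per-translation-unit warning, under stated
assumptions: struct tu_state, maybe_warn_extended_ident() and the
USES_EXTENDED_CHARS bit are hypothetical names made up for illustration
(the bit stands in for the patch's NODE_USES_EXTENDED_CHARACTERS), and
the real code would report through cpplib's diagnostics rather than
fprintf.

```c
#include <assert.h>
#include <stdio.h>

/* Placeholder bit standing in for the patch's
   NODE_USES_EXTENDED_CHARACTERS; the real value is not shown here.  */
#define USES_EXTENDED_CHARS 0x1u

/* Hypothetical per-translation-unit state.  */
struct tu_state
{
  int warned_extended_ident;
};

/* Warn about a non-ASCII identifier at most once per translation unit.
   Returns 1 if a warning was printed, 0 otherwise.  */
static int
maybe_warn_extended_ident (struct tu_state *tu, unsigned int node_flags,
                           int assembler_has_utf8)
{
  assert (tu != NULL);
  if (!(node_flags & USES_EXTENDED_CHARS)
      || assembler_has_utf8
      || tu->warned_extended_ident)
    return 0;

  tu->warned_extended_ident = 1;
  fprintf (stderr, "warning: non-ASCII identifiers may not be supported"
                   " by your assembler\n");
  return 1;
}
```

Keeping the flag in per-translation-unit state rather than in a
function-local static means it resets naturally if more than one
translation unit is processed in the same run.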

> +	{
> +          const unsigned char *s = NODE_NAME (token->val.node);
> +          int len = NODE_LEN (token->val.node);
> +          while (len)
> +            {
> +              if (*s < 128)
> +                {
> +                  *buffer++ = *s++;
> +                  len--;
> +                }
> +              else
> +                {
> +                  const unsigned char *old = s;
> +                  cppchar_t code = utf8_to_char (&s);
> +                  if (code < 0x10000)
> +                    buffer += sprintf ((char*)buffer, "\\u%.4x", code);
> +                  else
> +                    buffer += sprintf ((char*)buffer, "\\U%.8x", code);
> +                  len -= s - old;
> +                }
> +            }
> +	}

This should be in a subroutine to avoid code duplication.  (I know the
rest of this code doesn't generally avoid duplication, but we're not in
the fast path when handling UCNs.  One day I hope to have solved the
performance issue, and then there will be only a single copy of the
lot.)
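
For illustration, the quoted loop could be factored out roughly as
follows.  This is only a sketch: spell_ident_as_ucn() and decode_utf8()
are hypothetical names, decode_utf8() is a simplified stand-in for the
patch's utf8_to_char() that assumes well-formed input, and the caller
is assumed to supply a buffer of at least 3 * len + 1 bytes (a two-byte
UTF-8 sequence grows to six bytes, and sprintf writes a trailing NUL).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Simplified stand-in for utf8_to_char(): decode one multi-byte UTF-8
   sequence at *PS and advance *PS past it.  Assumes well-formed
   input; no validation is done.  */
static unsigned long
decode_utf8 (const unsigned char **ps)
{
  const unsigned char *s = *ps;
  unsigned long code;
  int extra;

  assert (*s >= 0xc0);          /* Must be a lead byte.  */
  if (*s >= 0xf0)
    code = *s & 0x07, extra = 3;
  else if (*s >= 0xe0)
    code = *s & 0x0f, extra = 2;
  else
    code = *s & 0x1f, extra = 1;
  s++;
  while (extra-- > 0)
    code = (code << 6) | (*s++ & 0x3f);
  *ps = s;
  return code;
}

/* Copy the LEN bytes of UTF-8 at NAME to BUFFER, spelling each
   non-ASCII character as \uXXXX or \UXXXXXXXX.  Returns a pointer
   just past the last byte written.  */
static unsigned char *
spell_ident_as_ucn (unsigned char *buffer, const unsigned char *name,
                    size_t len)
{
  const unsigned char *end = name + len;

  while (name < end)
    {
      if (*name < 128)
        *buffer++ = *name++;
      else
        {
          unsigned long code = decode_utf8 (&name);
          if (code < 0x10000)
            buffer += sprintf ((char *) buffer, "\\u%.4lx", code);
          else
            buffer += sprintf ((char *) buffer, "\\U%.8lx", code);
        }
    }
  return buffer;
}
```

Each place that currently carries a copy of the loop would then reduce
to a single call such as
buffer = spell_ident_as_ucn (buffer, NODE_NAME (node), NODE_LEN (node)).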

> +
> +static int
> +maybe_read_ucs_reader (pfile, pc)
> +     cpp_reader *pfile;
> +     cppchar_t *pc;

Can I suggest that, instead of doing this, you have a routine that
reads a UCN's digits (4 or 8) into a uchar[8] buffer, and that you
re-use maybe_read_ucs() on this buffer?  maybe_read_ucs() might
need a few small tweaks.  Again, this would avoid duplication.
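
A rough sketch of that split, under stated assumptions: every name
below is hypothetical, ucn_from_digits() merely stands in for a tweaked
maybe_read_ucs() and does none of the range checks a real
implementation would need, and the next_char_fn callback stands in for
however the reader actually fetches its next character.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef unsigned char uchar;

/* Hypothetical character source: returns the next character, or -1.  */
typedef int (*next_char_fn) (void *ctx);

/* Stand-in for a tweaked maybe_read_ucs(): convert the N hex digits
   in DIGITS to a code point in *PC.  Returns 0 on success, 1 if a
   non-hex character is found.  */
static int
ucn_from_digits (const uchar *digits, size_t n, unsigned long *pc)
{
  unsigned long code = 0;
  size_t i;

  for (i = 0; i < n; i++)
    {
      int c = digits[i];
      if (!isxdigit (c))
        return 1;
      c = isdigit (c) ? c - '0' : tolower (c) - 'a' + 10;
      code = (code << 4) | (unsigned long) c;
    }
  *pc = code;
  return 0;
}

/* Reader side: fetch the 4 (\u) or 8 (\U) characters following the
   introducer into DIGITS and return how many were stored.  Validation
   is left to ucn_from_digits().  */
static size_t
read_ucn_digits (next_char_fn next, void *ctx, int kind, uchar digits[8])
{
  size_t n = kind == 'U' ? 8 : 4;
  size_t i;

  assert (kind == 'u' || kind == 'U');
  for (i = 0; i < n; i++)
    digits[i] = (uchar) next (ctx);
  return n;
}

/* Glue: one reader-specific routine, one shared conversion.  */
static int
read_ucn (next_char_fn next, void *ctx, int kind, unsigned long *pc)
{
  uchar digits[8];
  size_t n = read_ucn_digits (next, ctx, kind, digits);
  return ucn_from_digits (digits, n, pc);
}

/* Example source: characters from a NUL-terminated string.  */
static int
string_next (void *ctx)
{
  const char **p = ctx;
  return **p ? (uchar) *(*p)++ : -1;
}
```

Only read_ucn_digits() depends on where the characters come from, so
the conversion logic exists in one place.  Note that this simplified
reader always consumes all 4 or 8 characters, even when one of them
turns out not to be a hex digit.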

Thanks,

Neil.

