This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH PR/42686] Align the help text output

On Thu, 25 Feb 2010, Shujing Zhao wrote:

> Hi,
> This patch is to align the help text output when it includes the characters
> that use three bytes in UTF-8 in some languages, such as Chinese.
> The option name is left-aligned with the width 27 or the length of itself from
> the 3th column. The help text is left-aligned with the width 50 (default) from
> the 31th column.

I can't understand this description of what you intend to achieve here.

> +  width_diff = item_width - gcc_gettext_width(item);

Missing space before "(".

> @@ -1183,10 +1184,20 @@ wrap_help (const char *help,
>  		       && help[i + 1] != ' '
>  		       && i > 0 && ISALPHA (help[i - 1]))
>  		len = i + 1;
> +	      /* For the characters use three bytes in UTF-8 in some languages,
> + 		 such as Chinese, move three bytes everytime. */
> +	      else if (help[i] < 0 && help[i + 1] < 0 && help[i + 2] < 0)
> +		{
> +		  i = i + 2;
> +		  len = i + 1;
> +		  room = room + 1;

This does not make sense to me.  Checking whether a "char" is negative is 
not meaningful in portable code since char may be either signed or 
unsigned, the locale may not use UTF-8, and some characters may have other 

You have a translated help text, in whatever the locale character set is.  
You need to find a suitable point to break a line.  In so doing, you need 
to step by multibyte characters (using interfaces such as mbrtowc to do so 
- put wrappers in intl.c that fall back to single characters if 
--disable-nls or the relevant interfaces are not available).  You can then 
examine the wide characters - not the multibyte sequences - to see what is 
a suitable point to wrap the line, and determine the actual display width 
of each wide character in the process.  I don't know if comparing with 
L'-', L'/' and L' ' is really appropriate for all languages, but don't 
have a better option.  However, the locale-independent ISALPHA, used in 
the present code, clearly seems inappropriate; using iswalpha (again with 
a fallback for --disable-nls or when this function is not present) would 
be better.

I'd be inclined to put the entire logic for testing whether something is a 
suitable point for wrapping, and for determining what characters to skip 
after wrapping, in a function in intl.c.  (The code that supports wrapping 
lines for diagnostics suffers the same problems with counting bytes rather 
than display width, and might be able to share some of the fix.)

> -      printf( "  %-*.*s %.*s\n", col_width, item_width, item, len, help);
> +      printf ("  %-*.*s %.*s\n", 
> +	      col_width + width_diff, item_width, item, len, help);
> +      width_diff = 0;

Please note that both width and precision in printf formats count *bytes* 
not *multibyte characters* or *screen columns*.  So once you've corrected 
the logic above and know both how wide the text you want to print is in 
screen columns, and how many bytes it is, you need to think carefully 
about what the correct width and precision parameters are.

> +	  buf = (char *)alloca (len + 1);

Missing space after ")" in cast.

Joseph S. Myers

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]