This is the mail archive of the gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH PR/42686] Align the help text output
- From: "Joseph S. Myers" <joseph at codesourcery dot com>
- To: Shujing Zhao <pearly dot zhao at oracle dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, Paolo Carlini <paolo dot carlini at oracle dot com>
- Date: Thu, 15 Apr 2010 21:51:03 +0000 (UTC)
- Subject: Re: [PATCH PR/42686] Align the help text output
On Tue, 13 Apr 2010, Shujing Zhao wrote:
> > > + else if (nbytes > 1)
> > I still see no reason you should need conditionals on the number of bytes in
> > a character, instead of always working with characters regardless of the
> > number of bytes in them.
> This condition allows the string to be broken after a multibyte wide
> alphabetic or punctuation character; the line can be broken after any such
> character. I think nbytes can distinguish whether it is a wide character
> before decoding. The letter 'a' is recognized as a wide character after
> decoding, but the line can't be broken after an 'a' in the string "after". I
> think the difference between a one-byte character and a multibyte wide
> character is nbytes.
> Would it be better to change it to
> + else if ((iswalpha (wc) || iswpunct (wc)) && nbytes > 1)
I still cannot make sense of what the logical condition is that this code
is trying to implement; that is, the logical properties of a pair of
characters and what the conclusion is from those properties about whether
a break is or is not permitted at a particular location in relation to
those characters. You have a series of conditions that might be described
something like (and it's possible I'm not understanding your intent):
/* We can break at the end of the string if it is narrow enough. */
/* If there is a space, we can break before the space (and not print the
   space). */
/* Break after '-' or '/' if the previous character was alphabetical, but
   not in the middle of "--" (for example). */
But what in plain English is the rule this code is trying to implement for
a condition on breaking or not breaking the string?
Whatever the condition is, it relates to some logical properties of the
characters in question. The number of bytes is not a logical property;
it's a physical property of the particular locale character set and may
vary from character set to character set (gettext will automatically
translate using iconv to the LC_CTYPE character set).
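
As an illustration only, the conditions listed above can be written purely
in terms of decoded wide characters, with no reference to how many bytes
each one occupied before decoding. This is a minimal sketch and not the code
from the patch: the name find_break is hypothetical, the input is assumed to
be already decoded (for example with mbrtowc), and every character is
assumed to occupy one column, where real code would consult wcwidth.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper, not the code from the patch.  Return how many
   wide characters of S to print on the current line, given WIDTH
   columns.  Assumes one column per character.  */
size_t
find_break (const wchar_t *s, size_t width)
{
  size_t len, i, best = 0;

  assert (s != NULL && width > 0);
  len = wcslen (s);

  /* We can break at the end of the string if it is narrow enough.  */
  if (len <= width)
    return len;

  /* Here len > width, so s[i] is a valid character for all i <= width.  */
  for (i = 1; i <= width; i++)
    {
      /* Break before a space; the caller is expected to skip the space.  */
      if (iswspace ((wint_t) s[i]))
        best = i;
      /* Break after '-' or '/' if the previous character was
         alphabetical, but not in the middle of "--".  */
      else if ((s[i - 1] == L'-' || s[i - 1] == L'/')
               && i >= 2
               && iswalpha ((wint_t) s[i - 2])
               && s[i] != L'-')
        best = i;
    }

  /* No permitted break point found: fall back to a hard break.  */
  return best != 0 ? best : width;
}
```

Every test here is a property of the characters themselves (space,
alphabetical, a particular punctuation mark), so the result does not change
when the same text arrives in a different LC_CTYPE character set.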
Joseph S. Myers