This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
On 09/09/2016 07:59 AM, Joseph Myers wrote:
> On Thu, 8 Sep 2016, Martin Sebor wrote:
>> PS I used hexadecimal based on what c-format.c does but now that I checked more carefully how %qE formats string literals I see it uses octal. I think hexadecimal is preferable because it avoids ambiguity but I'm open to changing it to octal if there's a strong
>
> I'm not clear what you mean about ambiguity. In C strings, an octal escape sequence has up to three characters, so if it has three characters it's unambiguous, whereas a hex escape sequence can have any number of characters, so if the unprintable character is followed by a valid hex digit then in C you need to represent that as an escape (or use string constant concatenation, etc.). The patch doesn't try to do that as far as I can see.
>
> Now, presumably the output isn't intended to be interpreted as C strings anyway (if it was, you'd need to escape " and \ as well), so the patch is OK, but I don't think it avoids ambiguity (and there's a clear case that it shouldn't - that if the string passed to %qs is printable, it should be printed as-is even if it contains escape sequences that could also result from a non-printable string passed to %qs).
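For reference, the C rule described above (an octal escape stops after at most three digits, a hex escape consumes every hex digit that follows) can be checked with a small sketch. The array and function names are made up for illustration and are not taken from the patch.

```c
#include <stddef.h>

/* Octal escapes take at most three digits, so this literal is the
   character '\123' followed by the ordinary character '4'.  */
static const char octal_then_digit[] = "\1234";

/* A hex escape has no length limit, so the character 0x12 followed by
   "34" has to be split with string literal concatenation.  */
static const char hex_then_digits[] = "\x12" "34";

/* Number of characters, excluding the terminating NUL.  */
size_t octal_len(void) { return sizeof octal_then_digit - 1; }
size_t hex_len(void) { return sizeof hex_then_digits - 1; }
```

Here `octal_len()` yields 2 and `hex_len()` yields 3. Writing the second literal as a single "\x1234" would instead ask for one character with value 0x1234, which does not fit in a plain `char`.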
Thank you. I tried to be clear about it in the description of the changes but I see the PS caused some confusion. Let me clarify that the patch has nothing to do with ambiguity (perceived or real) in the representation of the escape sequences. The only purpose of the change is to avoid printing non-printable characters or excessively large escape sequences in GCC diagnostics.

I mentioned the hex vs octal notation to invite input on which of the two people would prefer to see used by the %qc and %qs directives, and whether it's worth considering changing the %qE directive to use the same notation as well, for consistency (and to help with readability if there is consensus that one is clearer than the other).

What I meant by ambiguity is for example a string like "\1234" where it's not obvious where the octal sequence ends. Is it '\1' followed by "234" or '\12' followed by "34" or '\123' followed by "4"? (It's only possible to tell if one knows that GCC always uses three digits for the octal character, but not everyone knows that.) To be clear: I'm talking about the GCC output and not necessarily about what the standard has to say about it.

In contrast to the octal notation, I find the string "\x1234" clearer. It can only mean '\x1' followed by "234" or '\x12' followed by "34" and I think more people will expect it to be the latter because representing characters using two hex digits is more common. But this is just my own perception and YMMV.

Martin
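To make the two notations concrete, here is a minimal sketch of the kind of escaping under discussion. It is not the code from the patch: `escape_nonprintable` and its parameters are hypothetical, and the fixed widths (two hex digits, three octal digits) are assumptions taken from the discussion above.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not GCC's implementation.  Copies SRC into DST,
   replacing each non-printable character with a fixed-width escape:
   "\xHH" when USE_HEX is nonzero, "\OOO" otherwise.  Output that does
   not fit in DSTSIZE bytes is dropped.  Returns the length written.  */
size_t escape_nonprintable(const char *src, char *dst, size_t dstsize,
                           int use_hex)
{
    size_t used = 0;

    if (dstsize == 0)
        return 0;

    for (; *src != '\0'; ++src) {
        unsigned char c = (unsigned char)*src;
        char piece[5];   /* room for "\xHH" or "\OOO" plus NUL */
        int n;

        if (isprint(c))
            n = snprintf(piece, sizeof piece, "%c", c);
        else if (use_hex)
            n = snprintf(piece, sizeof piece, "\\x%02x", (unsigned)c);
        else
            n = snprintf(piece, sizeof piece, "\\%03o", (unsigned)c);

        if (used + (size_t)n >= dstsize)
            break;       /* keep room for the terminating NUL */
        memcpy(dst + used, piece, (size_t)n);
        used += (size_t)n;
    }
    dst[used] = '\0';
    return used;
}
```

Given the character with value 1 followed by "234", this sketch produces the text \001234 in octal mode and \x01234 in hex mode; in both cases the reader has to know the fixed width to tell where the escape ends. A printable input that already contains a backslash is copied through unchanged, so the output alone cannot distinguish it from an escaped one.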