Ping: [PATCH] Ensure colorization doesn't corrupt multibyte sequences in diagnostics

Jeff Law law@redhat.com
Fri Nov 13 22:27:45 GMT 2020


On 1/14/20 5:05 PM, Lewis Hyatt wrote:
> Hello-
>
> I thought I might ping this short patch please, just in case it may
> make sense to include in GCC 10 along with the other UTF-8-related
> fixes to diagnostics. Thanks!
>
> https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00915.html

This is fine for the trunk.  Note that due to the changes to handle
tabs/control bytes will require this patch to be updated.  It may be as
simple as moving the c = dw.next_byte() statement up.


Go ahead and do the necessary update and retest & repost the patch for
archival purposes.  If you have commit privs, go ahead and commit the
updated patch, else indicate in the patch repost that someone needs to
apply it for you.


Thanks for your patience,

Jeff


>> #1. diagnostic_show_locus() should be sure it will not corrupt output in
>> this way, regardless of what ranges it is given to work with.

Yes.


>>
>> #2. libcpp should probably generate a range that includes the whole UTF-8
>> character. Actually in other ways the range seems not ideal, for example
>> if an invalid character appears in the middle of the identifier, the
>> diagnostic still points to the first byte of the identifier.

Probably.  We haven't traditionally worried a  lot about multitbyte
sequences, so I'm not surprised we're not handling them particularly well.


>>
>> The attached patch fixes #1. It's essentially a one-line change, plus a
>> new selftest. Would you please have a look at it sometime? bootstrap
>> and testsuite were done on linux x86-64.
>>
>> Other questions that I have:
>>
>> - I am not quite clear when a selftest is preferred vs a dejagnu test. In
>>   this case I stuck with the selftest because color diagnostics don't seem
>>   to work well with dg-error etc, and it didn't seem worth creating a new
>>   plugin-based test like g++.dg/plugin just for this. (I also considered
>>   using the existing g++.dg plugin, but it seems this test should run for
>>   gcc as well.)

It varies and there's cases that are fine in either and I suspect there
are many tests in the dejagnu suite that would be better as selftests --
selftests are a fairly new concept.


The guidance I would give is the more a particular test is tied to the
internals of the code, the more likely a selftest is the right
approach.  THe more the test needs an end-to-end run through passes of
the compiler, the more it belongs in the dejagnu suite.



>>
>> - I wasn't sure if I should create a PR for an issue such as this, if
>>   there is already a patch readily available. And if I did create a PR,
>>   not sure if it's preferred to post the patch to gcc-patches, or as an
>>   attachment to the PR.

We still prefer patches to go to gcc-patches -- I personally don't troll
BZ looking for attached patches.


>>
>> - Does it seem worth me looking into #2? I think the patch to address #1 is
>>   appropriate in any case, because it handles generically all potential
>>   cases where this may arise, but still perhaps the ranges coming out of
>>   libcpp could be improved?

I don't think it can hurt to look into the difficulty in addressing #2.


jeff



More information about the Gcc-patches mailing list