Ping: [PATCH] Ensure colorization doesn't corrupt multibyte sequences in diagnostics

Lewis Hyatt lhyatt@gmail.com
Sat Nov 14 20:33:14 GMT 2020


On Fri, Nov 13, 2020 at 5:27 PM Jeff Law <law@redhat.com> wrote:
>
>
> On 1/14/20 5:05 PM, Lewis Hyatt wrote:
> > Hello-
> >
> > I thought I might ping this short patch please, just in case it may
> > make sense to include in GCC 10 along with the other UTF-8-related
> > fixes to diagnostics. Thanks!
> >
> > https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00915.html
>
> This is fine for the trunk.  Note that due to the changes to handle
> tabs/control bytes will require this patch to be updated.  It may be as
> simple as moving the c = dw.next_byte() statement up.
>
>
> Go ahead and do the necessary update and retest & repost the patch for
> archival purposes.  If you have commit privs, go ahead and commit the
> updated patch, else indicate in the patch repost that someone needs to
> apply it for you.
>
>
> Thanks for your patience,
>
> Jeff
>
>
> >> #1. diagnostic_show_locus() should be sure it will not corrupt output in
> >> this way, regardless of what ranges it is given to work with.
>
> Yes.
>
>
> >>
> >> #2. libcpp should probably generate a range that includes the whole UTF-8
> >> character. Actually in other ways the range seems not ideal, for example
> >> if an invalid character appears in the middle of the identifier, the
> >> diagnostic still points to the first byte of the identifier.
>
> Probably.  We haven't traditionally worried a  lot about multitbyte
> sequences, so I'm not surprised we're not handling them particularly well.
>
>
> >>
> >> The attached patch fixes #1. It's essentially a one-line change, plus a
> >> new selftest. Would you please have a look at it sometime? bootstrap
> >> and testsuite were done on linux x86-64.
> >>
> >> Other questions that I have:
> >>
> >> - I am not quite clear when a selftest is preferred vs a dejagnu test. In
> >>   this case I stuck with the selftest because color diagnostics don't seem
> >>   to work well with dg-error etc, and it didn't seem worth creating a new
> >>   plugin-based test like g++.dg/plugin just for this. (I also considered
> >>   using the existing g++.dg plugin, but it seems this test should run for
> >>   gcc as well.)
>
> It varies and there's cases that are fine in either and I suspect there
> are many tests in the dejagnu suite that would be better as selftests --
> selftests are a fairly new concept.
>
>
> The guidance I would give is the more a particular test is tied to the
> internals of the code, the more likely a selftest is the right
> approach.  THe more the test needs an end-to-end run through passes of
> the compiler, the more it belongs in the dejagnu suite.
>
>
>
> >>
> >> - I wasn't sure if I should create a PR for an issue such as this, if
> >>   there is already a patch readily available. And if I did create a PR,
> >>   not sure if it's preferred to post the patch to gcc-patches, or as an
> >>   attachment to the PR.
>
> We still prefer patches to go to gcc-patches -- I personally don't troll
> BZ looking for attached patches.
>
>
> >>
> >> - Does it seem worth me looking into #2? I think the patch to address #1 is
> >>   appropriate in any case, because it handles generically all potential
> >>   cases where this may arise, but still perhaps the ranges coming out of
> >>   libcpp could be improved?
>
> I don't think it can hurt to look into the difficulty in addressing #2.
>
>
> jeff
>

Thanks very much for the detailed comments, that's all very useful to
me. This particular patch was subsumed by r11-2092, which added the
support for tab expansion, since this whole function was redone and
now handles multibyte correctly. Sorry I probably should have updated
the thread for this old patch in addition to mentioning in the new
one, to save you some time. I will try to take a look sometime at the
ranges that libcpp outputs too. Thanks again!

-Lewis


More information about the Gcc-patches mailing list