This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH 1/3] (v2) On-demand locations within string-literals
On Fri, 2016-07-29 at 08:22 -0600, Martin Sebor wrote:
> > Currently all that we need from the C family of frontends is the
> > cpp_reader and the string concatenation records. I think we can
> > reconstruct the cpp_reader if we have the options, though presumably
> > that's per TU, so to support all this we'd need to capture e.g. the
> > per-TU encoding information in the LTO records, for the case where one
> > TU is UTF-8 encoded source to UTF-8 execution, and another TU is
> > EBCDIC-encoded source to UCS-4 execution (or whatever). And there's an
> > issue if different TUs compiled the same header with different
> > encoding options.
> > Or... we could not bother. This is a Quality of Implementation thing,
> > for improving diagnostics, and in each case, the diagnostic is
> > required to cope with substring location information not being
> > available (and the code I posted in patch 2 of the kit makes it
> > trivial to handle that case from a diagnostic). So we could simply
> > have LTO use the fallback mode.
> > There are two high-level approaches I've tried:
> > (a) capture the substring location information in the lexer/parser in
> > the frontend as it runs, and store it somehow.
> > (b) regenerate it "on-demand" when a diagnostic needs it.
> > Approach (b) is inherently going to be prone to the LTO issues you
> > describe, but it avoids adding to the CPU cycles/memory consumption
> > for the common case of not needing the information.
> > Is approach (b) acceptable?
> If (b) means potentially reduced quality of the location ranges
> in the -Wformat-length pass (e.g., with funky C++ format strings)
> then I don't think that's enough of a problem to worry about, at
> least not for this warning.
> If it means not being able to use the solution you're working
> on in the middle end at all (unless I misunderstood, that doesn't
> seem to be what you're implying, but just to be sure) then that
> would seem like a serious shortcoming. I would continue to use
> the code I copied from c-format.c (assuming that will still work),
> but as more warnings are implemented in later passes it would
> lead to duplicating code or reinventing the wheel just to get
> around the limitation (or simply worse quality diagnostics).
It'll work fine for the middle-end within cc1 and cc1plus.
I'm specifically referring to LTO here, and it would be fixable from
LTO if we can encode information about the TU encoding options into the
LTO data stream, and capture the string concatenation records there too
(but that would be followup work).
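To make the shape of approach (b) with a fallback concrete, here is a
minimal standalone sketch. It is not the code from the patch kit: Range,
TuContext, substring_range_on_demand and diagnostic_range are all
hypothetical names invented for illustration, and the column arithmetic
assumes one source column per byte with no escapes and no string
concatenation, which the real code cannot assume. It needs C++17 for
std::optional.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical stand-in for a source range (not GCC's location_t).
// Columns are inclusive; for a literal they cover both quote characters.
struct Range {
  std::size_t start_col;
  std::size_t end_col;
};

// Hypothetical per-TU context. Within cc1/cc1plus the frontend state needed
// to re-lex a literal is assumed to be present; under LTO it is assumed
// to be missing.
struct TuContext {
  bool have_frontend_state;
};

// Approach (b): work out the range of bytes [first, last] inside a string
// literal only when a diagnostic asks for it. Returns std::nullopt when the
// information cannot be regenerated.
std::optional<Range> substring_range_on_demand(const TuContext &tu,
                                               const Range &literal,
                                               std::size_t first,
                                               std::size_t last) {
  assert(first <= last);
  if (!tu.have_frontend_state)
    return std::nullopt;  // e.g. the LTO case: nothing to re-lex with

  // Simplifying assumption: one column per byte, so byte N of the string
  // sits N + 1 columns after the opening quote.
  const std::size_t start = literal.start_col + 1 + first;
  const std::size_t end = literal.start_col + 1 + last;
  if (end >= literal.end_col)
    return std::nullopt;  // would run into or past the closing quote
  return Range{start, end};
}

// What a diagnostic would do: prefer the precise substring range, and fall
// back to the range of the whole literal when it is not available.
Range diagnostic_range(const TuContext &tu, const Range &literal,
                       std::size_t first, std::size_t last) {
  return substring_range_on_demand(tu, literal, first, last).value_or(literal);
}
```

With a literal spanning columns 10 to 20, asking for bytes 2 to 3 gives
columns 13 to 14 when the frontend state is available, and the whole
literal (10 to 20) when it is not. Nothing is computed or stored unless a
diagnostic actually asks.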
> > Thanks
> > Dave
> >  with the exception of the string concatenation records, but I
> > believe those are tiny