[PATCH v4 4/8] diagnostics: Support obtaining source code lines from generated data buffers

Lewis Hyatt lhyatt@gmail.com
Wed Aug 23 19:41:59 GMT 2023


On Tue, Aug 15, 2023 at 04:08:47PM -0400, Lewis Hyatt wrote:
> On Tue, Aug 15, 2023 at 3:46 PM David Malcolm <dmalcolm@redhat.com> wrote:
> >
> > On Tue, 2023-08-15 at 14:15 -0400, Lewis Hyatt wrote:
> > > On Tue, Aug 15, 2023 at 12:15:15PM -0400, David Malcolm wrote:
> > > > On Wed, 2023-08-09 at 18:14 -0400, Lewis Hyatt wrote:
> > > > > This patch enhances location_get_source_line(), which is the
> > > > > primary
> > > > > interface provided by the diagnostics infrastructure to obtain
> > > > > the line of
> > > > > source code corresponding to a given location, so that it
> > > > > understands
> > > > > generated data locations in addition to normal file-based
> > > > > locations. This
> > > > > involves changing the argument to location_get_source_line() from
> > > > > a plain
> > > > > file name, to a source_id object that can represent either type
> > > > > of location.
> > > > >
> >
> > [...]
> >
> > > > >
> > > > >
> > > > > diff --git a/gcc/input.cc b/gcc/input.cc
> > > > > index 9377020b460..790279d4273 100644
> > > > > --- a/gcc/input.cc
> > > > > +++ b/gcc/input.cc
> > > > > @@ -207,6 +207,28 @@ private:
> > > > >    void maybe_grow ();
> > > > >  };
> > > > >
> > > > > +/* This is the implementation of cache_data_source for generated
> > > > > +   data that is already in memory.  */
> > > > > +class data_cache_slot final : public cache_data_source
> > > >
> > > > It occurred to me: why are we caching accessing a buffer that's
> > > > already
> > > > in memory - but we're also caching the line-splitting information,
> > > > and
> > > > providing the line-splitting algorithm with a consistent interface
> > > > to
> > > > the data, right?
> > > >
> > >
> > > Yeah, for the current _Pragma use case, multi-line buffers are not
> > > going to
> > > be common, but they can occur. I was mainly motivated by the
> > > consistent
> > > interface, and by the assumption that the overhead is not critical
> > > given a
> > > diagnostic is being issued.
> >
> > (nods)
> >
> > >
> > > > [...snip...]
> > > >
> > > > > @@ -397,6 +434,15 @@ diagnostics_file_cache_forcibly_evict_file
> > > > > (const char *file_path)
> > > > >    global_dc->m_file_cache->forcibly_evict_file (file_path);
> > > > >  }
> > > > >
> > > > > +void
> > > > > +diagnostics_file_cache_forcibly_evict_data (const char *data,
> > > > > +                                           unsigned int
> > > > > data_len)
> > > > > +{
> > > > > +  if (!global_dc->m_file_cache)
> > > > > +    return;
> > > > > +  global_dc->m_file_cache->forcibly_evict_data (data, data_len);
> > > >
> > > > Maybe we should rename diagnostic_context's m_file_cache to
> > > > m_source_cache?  (and class file_cache for that matter?)  But if
> > > > so,
> > > > that can/should be a followup/separate patch.
> > > >
> > >
> > > Yes, we should. Believe it or not, I was trying to minimize the size
> > > of the
> > > patch :)
> >
> > :)
> >
> > Thanks for splitting it up, BTW.
> >
> > [...]
> >
> >
> > > >
> > > > > @@ -912,26 +1000,22 @@ cache_data_source::read_line_num (size_t
> > > > > line_num,
> > > > >     If the function fails, a NULL char_span is returned.  */
> > > > >
> > > > >  char_span
> > > > > -location_get_source_line (const char *file_path, int line)
> > > > > +location_get_source_line (source_id src, int line)
> > > > >  {
> > > > > -  const char *buffer = NULL;
> > > > > -  ssize_t len;
> > > > > -
> > > > > -  if (line == 0)
> > > > > -    return char_span (NULL, 0);
> > > > > -
> > > > > -  if (file_path == NULL)
> > > > > -    return char_span (NULL, 0);
> > > > > +  const char_span fail (nullptr, 0);
> > > > > +  if (!src || line <= 0)
> > > > > +    return fail;
> > > >
> > > > Looking at source_id's operator bool, are there effectively three
> > > > kinds
> > > > of source_id?
> > > >
> > > > (a) file names
> > > > (b) generated buffer
> > > > (c) NULL == m_filename_or_buffer
> > > >
> > > > What does (c) mean?  Is it a "something's gone wrong/error" state?
> > > > Or
> > > > is this more a special-case of (a)? (in that the m_len for such a
> > > > case
> > > > would be zero)
> > > >
> > > > Should source_id's 2-param ctor have an assert that the ptr is non-
> > > > NULL?
> > > >
> > > > [...snip...]
> > > >
> > > > The patch is OK for trunk as-is, but note the question about the
> > > > source_id ctor above.
> > > >
> > >
> > > Thanks. (c) has the same meaning as a NULL file name currently does,
> > > so a
> > > default-constructed source_id is not an in-memory buffer, but is
> > > rather a
> > > NULL filename. linemap_add() for instance, will interpret a NULL
> > > filename
> > > for an LC_LEAVE map, as a request to copy it from the natural values
> > > being
> > > returned to. I think the source_id constructor needs to accept a NULL
> > > filename to remain backwards compatible. With the current design of
> > > source_id, it is safe always to change a 'const char*' file name
> > > argument to
> > > a source_id argument instead; it will work just how it did before
> > > because it
> > > has an implicit constructor. But if the constructor would assert on a
> > > non-NULL pointer, that would necessitate changing all call sites that
> > > currently expect they can pass a NULL pointer there. (For example,
> > > there are
> > > several calls to _cpp_do_file_change() within libcpp that take
> > > advantage of
> > > being able to pass a NULL filename to linemap_add.)
> >
> > Yes, it's OK for this ctor to accept NULL;
> >    source_id (const char *filename = nullptr)
> > and I see you added the default arg.
> >
> > I was referring to this ctor:
> >    source_id (const char *buffer, unsigned buffer_len)
> > Is it ever OK for "buffer" to be NULL in this 2-param ctor, or can we
> > assert that it's non-NULL in this ctor?  Does the generated data case
> > ever return NULL?
> >
> 
> Oh, I see. This should never be NULL and I can add an assert for that.
> 

This tweak (incremental to patch 1/8) accomplishes that.

-Lewis

-- >8 --

diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
index 395c4612dbe..ad20b140cce 100644
--- a/libcpp/include/line-map.h
+++ b/libcpp/include/line-map.h
@@ -604,7 +604,7 @@ public:
     : m_filename_or_buffer (buffer),
       m_len (buffer_len)
   {
-    linemap_assert (buffer_len > 0);
+    linemap_assert (buffer && buffer_len > 0);
   }
 
   explicit operator bool () const { return m_filename_or_buffer; }


More information about the Gcc-patches mailing list