This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Factorize the two read_line functions present in gcov.c and input.c


I was a bit too hasty in committing this patch, thinking the review was
done.  I have reverted it yesterday night.

Let's try again after a night of sleep.

Bernd Edlinger <bernd.edlinger@hotmail.de> writes:

> I still do not see how this is supposed to work:
>
> If the previous invocation of get_line already had read some
> characters of the following line(s), how is that information
> recovered?

It's lost, you are right.  I felt under my radar that the file position
indicator adjustment that should be done in get_line after it finds a
line delimiter felt through the cracks in the transformation made from
the getline of glibc.  Hopefully in this version, the file position
indicator should be properly set when get_line finishes.

So am proposing two patches that I think should be committed in pair.

The first one is the one we've been discussing lately; it's amended to
make get_line remember update the file position indicator.

The second one is to factorize the two read_line (from gcov.c and
input.c) and make gcov.c just use one from input.c.

Nathan Sidwell <nathan@acm.org> writes:

> How is input.o linked into gcov?

I didn't changed the way input.o is linked into gcov.  I just launched
the debugger on some of the regression tests of gcov in the suite, to
see that the read_line from input.c was being picked up.  But now that
you are asking, here is what I see from gcc/Makefile.in:

input.o is part of OBJS-libcommon:

    OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o input.o version.o

$(OBJS-libcommon) ends up in libcommon.a which itself is part of LIBS
(and LIBDEPS)

And the gcov executable is linked with LIBS as per:

    GCOV_OBJS = gcov.o
    gcov$(exeext): $(GCOV_OBJS) $(LIBDEPS)
	    +$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_OBJS) $(LIBS) -o $@

> The former's a compiler proper source file. Furthermore, if it is
> linked in, doesn't it drag in a whole bunch of the compiler itself
> into gcov?

>From the above, what I can say is that input.o was already linked with
gcov.  But I think it's minimal enough to only drag libcpp and the
diagnostic subsystem.

> I think that means read_line needs to be broken out into a separate
> source file -- or at least one that's already common between gcov
> and cc1 (etc).

Hopefully this concern should be addressed by the above.


> Changelog doesn't match diff. (function names wrong, no mention of gcov)

Woops, sorry about that.  The second patch attached is an update that
should address that.  And with the changes in the first patch, this
new factorization patch is smaller.

Jakub Jelinek <jakub@redhat.com> writes:

> On Thu, Nov 07, 2013 at 07:17:19AM +0000, Nathan Sidwell wrote:
>> On 11/06/13 21:25, Dodji Seketeli wrote:
>> >It appeared that gcov.c and input.c both have a static function named
>> >read_line.  I guess we ought to keep just one.
>> 
>> As a general rule, that's good.
>
> Generally yes, but IMHO not in this case, both because of the problems you
> are mentioning, because it already now needs to do quite different things
> and because I think in the future the version for caret diagnostics should
> do much more complicated things (caching of the lines for faster further
> caret diagnostics).  Furthermore, ftell/fseek aren't something that should
> be used unless strictly necessary.

OK, let's say I am sending these patches here just for the record so
that it doesn't get lost :-)  I'll devise another scheme right away to
avoid the ftell/seek that you don't like.

I agree that, as you proposed at the beginning of the initial thread,
the caret diagnostics system should avoid opening/closing one file just
to cycle over it to find one line, repeatedly.  I just thought that
would come as a later patch after we fix this PR specifically.

Cheers.

>From 75cea3f6883f192e9472fd7905f42d84d875e4e7 Mon Sep 17 00:00:00 2001
From: dodji <dodji@138bc75d-0d04-0410-961f-82ee72b054a4>
Date: Wed, 6 Nov 2013 11:33:52 +0000
Subject: [PATCH 1/2] PR preprocessor/58580 - preprocessor goes OOM with
 warning for zero literals

gcc/ChangeLog:

	* input.h (location_get_source_line): Take an additional line_size
	parameter.
	* input.c (get_line): New static function definition.
	(read_line): Take an additional line_length output parameter to be
	set to the size of the line.  Use the new get_line function do the
	actual line reading.
	(location_get_source_line): Take an additional output line_len
	parameter.  Update the use of read_line to pass it the line_len
	parameter.
	* diagnostic.c (adjust_line): Take an additional input parameter
	for the length of the line, rather than calculating it with
	strlen.
	(diagnostic_show_locus): Adjust the use of
	location_get_source_line and adjust_line with respect to their new
	signature.  While displaying a line now, do not stop at the first
	null byte.  Rather, display the zero byte as a space and keep
	going until we reach the size of the line.

gcc/testsuite/ChangeLog:

	* c-c++-common/cpp/warning-zero-in-literals-1.c: New test file.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@204453 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/diagnostic.c                                   |  17 +--
 gcc/input.c                                        | 119 ++++++++++++++++-----
 gcc/input.h                                        |   3 +-
 .../c-c++-common/cpp/warning-zero-in-literals-1.c  | Bin 0 -> 240 bytes
 4 files changed, 105 insertions(+), 34 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c

diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 36094a1..e0c5d9d 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -259,12 +259,13 @@ diagnostic_build_prefix (diagnostic_context *context,
    MAX_WIDTH by some margin, then adjust the start of the line such
    that the COLUMN is smaller than MAX_WIDTH minus the margin.  The
    margin is either 10 characters or the difference between the column
-   and the length of the line, whatever is smaller.  */
+   and the length of the line, whatever is smaller.  The length of
+   LINE is given by LINE_WIDTH.  */
 static const char *
-adjust_line (const char *line, int max_width, int *column_p)
+adjust_line (const char *line, int line_width,
+	     int max_width, int *column_p)
 {
   int right_margin = 10;
-  int line_width = strlen (line);
   int column = *column_p;
 
   right_margin = MIN (line_width - column, right_margin);
@@ -284,6 +285,7 @@ diagnostic_show_locus (diagnostic_context * context,
 		       const diagnostic_info *diagnostic)
 {
   const char *line;
+  int line_width;
   char *buffer;
   expanded_location s;
   int max_width;
@@ -297,22 +299,25 @@ diagnostic_show_locus (diagnostic_context * context,
 
   context->last_location = diagnostic->location;
   s = expand_location_to_spelling_point (diagnostic->location);
-  line = location_get_source_line (s);
+  line = location_get_source_line (s, &line_width);
   if (line == NULL)
     return;
 
   max_width = context->caret_max_width;
-  line = adjust_line (line, max_width, &(s.column));
+  line = adjust_line (line, line_width, max_width, &(s.column));
 
   pp_newline (context->printer);
   saved_prefix = pp_get_prefix (context->printer);
   pp_set_prefix (context->printer, NULL);
   pp_space (context->printer);
-  while (max_width > 0 && *line != '\0')
+  while (max_width > 0 && line_width > 0)
     {
       char c = *line == '\t' ? ' ' : *line;
+      if (c == '\0')
+	c = ' ';
       pp_character (context->printer, c);
       max_width--;
+      line_width--;
       line++;
     }
   pp_newline (context->printer);
diff --git a/gcc/input.c b/gcc/input.c
index a141a92..ee3f31c 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -87,53 +87,118 @@ expand_location_1 (source_location loc,
   return xloc;
 }
 
-/* Reads one line from file into a static buffer.  */
-static const char *
-read_line (FILE *file)
+/* This function reads a line that might contain bytes whose value is
+   zero.  It returns the number of bytes read.  The 'end-of-line'
+   character found at the end of the line is not contained in the
+   returned buffer.  Note that this function has been adapted from
+   getline() and _IO_getdelim() GNU C library functions.  It's been
+   duplicated here because the getline() function is not necessarily
+   present on all platforms.
+
+   LINEPTR points to a buffer that is to contain the line read.
+
+   N points to the size of the the LINEPTR buffer.
+
+   FP points to the file to consider.  */
+
+static ssize_t
+get_line (char **lineptr, size_t *n, FILE *fp)
 {
-  static char *string;
-  static size_t string_len;
-  size_t pos = 0;
-  char *ptr;
+  ssize_t cur_len = 0, len;
+  char buf[16384];
+  long prev_pos;
+
+  if (lineptr == NULL || n == NULL)
+    return -1;
 
-  if (!string_len)
+  if (*lineptr == NULL || *n == 0)
     {
-      string_len = 200;
-      string = XNEWVEC (char, string_len);
+      *n = 120;
+      *lineptr = XNEWVEC (char, *n);
     }
 
-  while ((ptr = fgets (string + pos, string_len - pos, file)))
-    {
-      size_t len = strlen (string + pos);
+  prev_pos = ftell (fp);
+  len = fread (buf, 1, sizeof buf, fp);
+  if (ferror (fp))
+    return -1;
 
-      if (string[pos + len - 1] == '\n')
+  for (;;)
+    {
+      size_t needed;
+      char *t = (char*) memchr (buf, '\n', len);
+      if (t != NULL) len = (t - buf) + 1;
+      if (__builtin_expect (len >= SSIZE_MAX - cur_len, 0))
+	return -1;
+      needed = cur_len + len + 1;
+      if (needed > *n)
 	{
-	  string[pos + len - 1] = 0;
-	  return string;
+	  char *new_lineptr;
+	  if (needed < 2 * *n)
+	    needed = 2 * *n;
+	  new_lineptr = XRESIZEVEC (char, *lineptr, needed);
+	  *lineptr = new_lineptr;
+	  *n = needed;
 	}
-      pos += len;
-      string = XRESIZEVEC (char, string, string_len * 2);
-      string_len *= 2;
+      memcpy (*lineptr + cur_len, buf, len);
+      cur_len += len;
+      fseek (fp, prev_pos + cur_len, SEEK_SET);
+      if (t != NULL)
+	break;
+      prev_pos += cur_len;
+      len = fread (buf, 1, sizeof buf, fp);
+      if (ferror (fp))
+	return -1;
+      if (len == 0)
+	break;
     }
-      
-  return pos ? string : NULL;
+
+  if (cur_len)
+    (*lineptr)[cur_len - 1] = '\0';
+  return cur_len;
+}
+
+/* Reads one line from FILE into a static buffer.  If LINE_LENGTH is
+ *non-null LINE_LENGTH, will be set by this function to the length of
+ *the returned line.  Note that the returned line can contain several
+ *zero bytes.  Also note that the returned string is allocated in
+ *static storage that is going to be re-used by subsequent invocations
+ *of read_line.  */
+static const char *
+read_line (FILE *file, int *line_length)
+{
+  static char *string;
+  static size_t string_len;
+  int len;
+
+  len = get_line (&string, &string_len, file);
+  if (line_length)
+    *line_length = len;
+  return len ? string : NULL;
 }
 
 /* Return the physical source line that corresponds to xloc in a
    buffer that is statically allocated.  The newline is replaced by
-   the null character.  */
+   the null character.  Note that the line can contain several null
+   characters, so LINE_LEN, if non-null, points to the actual length
+   of the line.  */
 
 const char *
-location_get_source_line (expanded_location xloc)
+location_get_source_line (expanded_location xloc,
+			  int *line_len)
 {
-  const char *buffer;
-  int lines = 1;
+  const char *buffer = NULL, *ptr;
+  int lines = 0, len = 0;
   FILE *stream = xloc.file ? fopen (xloc.file, "r") : NULL;
   if (!stream)
     return NULL;
 
-  while ((buffer = read_line (stream)) && lines < xloc.line)
-    lines++;
+  while ((ptr = read_line (stream, &len)) && lines < xloc.line)
+    {
+      buffer = ptr;
+      lines++;
+      if (line_len)
+	*line_len = len;
+    }
 
   fclose (stream);
   return buffer;
diff --git a/gcc/input.h b/gcc/input.h
index 8fdc7b2..128e28c 100644
--- a/gcc/input.h
+++ b/gcc/input.h
@@ -37,7 +37,8 @@ extern char builtins_location_check[(BUILTINS_LOCATION
 				     < RESERVED_LOCATION_COUNT) ? 1 : -1];
 
 extern expanded_location expand_location (source_location);
-extern const char *location_get_source_line (expanded_location xloc);
+extern const char *location_get_source_line (expanded_location xloc,
+					     int *line_size);
 extern expanded_location expand_location_to_spelling_point (source_location);
 extern source_location expansion_point_location_if_in_system_header (source_location);
 
diff --git a/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c b/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c
new file mode 100644
index 0000000000000000000000000000000000000000..ff2ed962ac96e47ae05b0b040f4e10b8e09637e2
GIT binary patch
literal 240
zcmdPbSEyD<N!LxuS12e-Ehx%QPAx80sO92PVo*}h*HVDUmM0eFW#*+TDCL#r<R~O(
UBo-wmm!uXcDby-x=?^KT09Xk|)&Kwi

literal 0
HcmV?d00001

-- 
1.8.1.4

>From 4b60ab711372874bb6b4b54b17143f89841a3685 Mon Sep 17 00:00:00 2001
From: Dodji Seketeli <dodji@seketeli.org>
Date: Thu, 7 Nov 2013 09:09:05 +0100
Subject: [PATCH 2/2] Factorize the two read_line functions present in gcov.c
 and input.c

	* input.h (read_line): Declare this here.
	* input.c (read_line): This is not static anymore.
	* gcov.c (read_line): Remove.
	(output_lines): Adjust for the use of the new read_line signature.
---
 gcc/gcov.c  | 35 ++---------------------------------
 gcc/input.c |  2 +-
 gcc/input.h |  1 +
 3 files changed, 4 insertions(+), 34 deletions(-)

diff --git a/gcc/gcov.c b/gcc/gcov.c
index 9458812..da53ebe 100644
--- a/gcc/gcov.c
+++ b/gcc/gcov.c
@@ -2364,37 +2364,6 @@ output_branch_count (FILE *gcov_file, int ix, const arc_t *arc)
 
 }
 
-static const char *
-read_line (FILE *file)
-{
-  static char *string;
-  static size_t string_len;
-  size_t pos = 0;
-  char *ptr;
-
-  if (!string_len)
-    {
-      string_len = 200;
-      string = XNEWVEC (char, string_len);
-    }
-
-  while ((ptr = fgets (string + pos, string_len - pos, file)))
-    {
-      size_t len = strlen (string + pos);
-
-      if (string[pos + len - 1] == '\n')
-	{
-	  string[pos + len - 1] = 0;
-	  return string;
-	}
-      pos += len;
-      string = XRESIZEVEC (char, string, string_len * 2);
-      string_len *= 2;
-    }
-      
-  return pos ? string : NULL;
-}
-
 /* Read in the source file one line at a time, and output that line to
    the gcov file preceded by its execution count and other
    information.  */
@@ -2455,7 +2424,7 @@ output_lines (FILE *gcov_file, const source_t *src)
 	}
 
       if (retval)
-	retval = read_line (source_file);
+	retval = read_line (source_file, NULL);
 
       /* For lines which don't exist in the .bb file, print '-' before
 	 the source line.  For lines which exist but were never
@@ -2503,7 +2472,7 @@ output_lines (FILE *gcov_file, const source_t *src)
      last line of code.  */
   if (retval)
     {
-      for (; (retval = read_line (source_file)); line_num++)
+      for (; (retval = read_line (source_file, NULL)); line_num++)
 	fprintf (gcov_file, "%9s:%5u:%s\n", "-", line_num, retval);
     }
 
diff --git a/gcc/input.c b/gcc/input.c
index ee3f31c..5da718b 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -163,7 +163,7 @@ get_line (char **lineptr, size_t *n, FILE *fp)
  *zero bytes.  Also note that the returned string is allocated in
  *static storage that is going to be re-used by subsequent invocations
  *of read_line.  */
-static const char *
+const char *
 read_line (FILE *file, int *line_length)
 {
   static char *string;
diff --git a/gcc/input.h b/gcc/input.h
index 128e28c..afbe3d5 100644
--- a/gcc/input.h
+++ b/gcc/input.h
@@ -66,4 +66,5 @@ extern location_t input_location;
 
 void dump_line_table_statistics (void);
 
+const char * read_line (FILE *, int *);
 #endif
-- 
1.8.1.4

-- 
		Dodji

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]