[PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals

Jakub Jelinek jakub@redhat.com
Thu Oct 31 18:26:00 GMT 2013


On Thu, Oct 31, 2013 at 04:00:01PM +0100, Dodji Seketeli wrote:
> Jakub Jelinek <jakub@redhat.com> writes:
> 
> > On Thu, Oct 31, 2013 at 03:36:07PM +0100, Bernd Edlinger wrote:
> >> if you want to read zero-chars, why don't you simply use fgetc,
> >> optionally replacing '\0' with ' ' in read_line?
> >
> > Because it is too slow?
> >
> > getline(3) would be much better for this purpose, though of course
> > it is a GNU extension in glibc and so we'd need some fallback, which
> > very well could be the fgetc or something similar.
> 
> So would getline (+ the current patch as a fallback) be acceptable?

I think the patch is too expensive even as a fallback.
I'd say the best approach would be to write a getline API compatible
function and just use it: fread into a fixed size buffer (say 4KB or
similar), look for the line terminator among the characters fread
returned, and allocate/copy into the dynamically allocated buffer.  A
slight complication is what to do on mingw/cygwin and other environments
with DOS or Mac style line endings; I have no idea what exactly fgets does
there.  But, ignoring the DOS/Mac style line endings, it would be roughly
as follows (partially from glibc iogetdelim.c):

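#include <limits.h>	/* SSIZE_MAX */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>	/* ssize_t */

ssize_t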
getline_fallback (char **lineptr, size_t *n, FILE *fp)
{
  ssize_t cur_len = 0, len;
  char buf[16384];

  if (lineptr == NULL || n == NULL)
    return -1;

  if (*lineptr == NULL || *n == 0)
    {
      *n = 120;
      *lineptr = (char *) malloc (*n);
      if (*lineptr == NULL)
	return -1;
    }

  len = fread (buf, 1, sizeof buf, fp);
  if (ferror (fp))
    return -1;

  for (;;)
    {
      size_t needed;
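      char *t = (char *) memchr (buf, '\n', len);
      if (t != NULL)
	{
	  /* Give back the bytes read past the newline, so that the next
	     call starts at the following line (needs a seekable stream).  */
	  ssize_t used = (t - buf) + 1;
	  if (used < len
	      && fseek (fp, (long) used - (long) len, SEEK_CUR) != 0)
	    return -1;
	  len = used;
	}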
      if (__builtin_expect (len >= SSIZE_MAX - cur_len, 0))
	return -1;
      needed = cur_len + len + 1;
      if (needed > *n)
	{
	  char *new_lineptr;
	  if (needed < 2 * *n)
	    needed = 2 * *n;
	  new_lineptr = (char *) realloc (*lineptr, needed);
	  if (new_lineptr == NULL)
	    return -1;
	  *lineptr = new_lineptr;
	  *n = needed;
	}
      memcpy (*lineptr + cur_len, buf, len);
      cur_len += len;
      if (t != NULL)
	break;
      len = fread (buf, 1, sizeof buf, fp);
      if (ferror (fp))
	return -1;
      if (len == 0)
	break;
    }
  (*lineptr)[cur_len] = '\0';
  /* Like getline, return -1 when end of file is hit with nothing read.  */
  return cur_len > 0 ? cur_len : -1;
}

For the DOS/Mac style line endings, you probably want to look at what
exactly libcpp does with them.

BTW, we probably want to do something about the speed of the caret
diagnostics too.  Right now it opens the file again for every single line
printed in caret diagnostics and reads all lines up to the right one, so
imagine how slow it is to print many warnings on almost adjacent lines
near the end of a file that is many megabytes long.
Perhaps we could remember the last file we've opened for caret diagnostics
and not fclose it right away, but only when a new request is for a
different file.  We could also keep a vector of line start offsets (say
the starting byte of every 100th line or similar) and remember the last
line read.  Then, if a new request is for the same file but a higher line
than the last one, we can just keep calling getline; if it is for a lower
line, we look up the offset for line / 100, fseek to it and getline only
the remaining line % 100 lines.  Maybe we should keep not just one, but 2
or 4 open files as a cache (again, each with its vector of line start
offsets).
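
To make the caching idea concrete, here is a rough sketch of a single-file
cache with an offset remembered for every 100th line.  Everything in it is
an assumption for illustration: struct line_cache, cache_get_line and the
STRIDE value are hypothetical names, not existing GCC code, and it reads
with getc only to keep the sketch short, where a real version would
presumably use getline or a fallback for it.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define STRIDE 100  /* lines between remembered offsets (assumed value) */

/* Hypothetical single-file cache; not an existing GCC structure.  */
struct line_cache
{
  FILE *fp;            /* stays open between requests */
  char *path;          /* file this cache currently describes */
  long *checkpoints;   /* checkpoints[k]: offset of line k * STRIDE + 1 */
  size_t n_checkpoints, cap;
  size_t next_line;    /* 1-based line that starts at the file position */
};

static void
cache_close (struct line_cache *c)
{
  if (c->fp != NULL)
    fclose (c->fp);
  free (c->path);
  free (c->checkpoints);
  c->fp = NULL;
  c->path = NULL;
  c->checkpoints = NULL;
  c->n_checkpoints = c->cap = c->next_line = 0;
}

/* Make C describe PATH, reopening only when the file changes.  */
static int
cache_open (struct line_cache *c, const char *path)
{
  if (c->path != NULL && strcmp (c->path, path) == 0)
    return 0;
  cache_close (c);
  c->path = (char *) malloc (strlen (path) + 1);
  c->fp = fopen (path, "rb");
  if (c->path == NULL || c->fp == NULL)
    {
      cache_close (c);
      return -1;
    }
  strcpy (c->path, path);
  c->next_line = 1;
  return 0;
}

/* Read the line at the current position into OUT, dropping the newline
   and truncating to OUTSZ - 1 bytes.  Returns -1 at end of file.  */
static int
cache_read_next (struct line_cache *c, char *out, size_t outsz)
{
  size_t k = (c->next_line - 1) / STRIDE, used = 0;
  int ch, got = 0;

  /* Remember the offset the first time line k * STRIDE + 1 is reached.  */
  if ((c->next_line - 1) % STRIDE == 0 && k == c->n_checkpoints)
    {
      if (c->n_checkpoints == c->cap)
        {
          size_t cap = c->cap ? 2 * c->cap : 16;
          long *p = (long *) realloc (c->checkpoints, cap * sizeof (long));
          if (p == NULL)
            return -1;
          c->checkpoints = p;
          c->cap = cap;
        }
      c->checkpoints[c->n_checkpoints++] = ftell (c->fp);
    }
  while ((ch = getc (c->fp)) != EOF)
    {
      got = 1;
      if (ch == '\n')
        break;
      if (used < outsz - 1)
        out[used++] = (char) ch;
    }
  if (!got)
    return -1;
  out[used] = '\0';
  c->next_line++;
  return 0;
}

/* Fetch 1-based LINE of PATH into OUT.  Returns 0 on success.  */
int
cache_get_line (struct line_cache *c, const char *path, size_t line,
                char *out, size_t outsz)
{
  if (line == 0 || outsz == 0 || cache_open (c, path) != 0)
    return -1;
  if (line < c->next_line)
    {
      /* Going backwards: restart from the checkpoint at or before LINE.  */
      size_t k = (line - 1) / STRIDE;
      if (fseek (c->fp, c->checkpoints[k], SEEK_SET) != 0)
        return -1;
      c->next_line = k * STRIDE + 1;
    }
  while (c->next_line <= line)
    if (cache_read_next (c, out, outsz) != 0)
      return -1;
  return 0;
}
```

A caller would start from a zero-initialized struct line_cache and call
cache_get_line for each diagnostic; requests for increasing lines in the
same file never reopen or rewind it, and a backwards request rereads at
most STRIDE lines.  Keeping 2 or 4 such structs would give the small
multi-file cache mentioned above.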

	Jakub


