gcc reports Internal Error

Zack Weinberg zack@codesourcery.com
Mon Jul 15 12:44:00 GMT 2002


On Sat, Jul 13, 2002 at 10:09:10AM +0100, Neil Booth wrote:
> Zack Weinberg wrote:-
> > On Fri, Jul 12, 2002 at 07:50:06AM +0100, Neil Booth wrote:
> > > Suppose we read in a chunk of size N.  If we can find the last newline,
> > > we can put a NUL after it (remembering what was there originally).
> > > If we can't find a newline, keep reading chunks until we do.  Because
> > > of the range of newlines we find, it's best to actually replace the first
> > > newline char in the last string of newline chars, if you see what I
> > > mean.
> > 
> > Right, that was roughly what I had imagined doing too.
> 
> On further reflection, I don't think it's a great idea.  It's not
> multibyte safe, for example.

It could be made multibyte safe, with extra work that we might end
up doing anyway as part of the multibyte implementation.  Your idea of
a line-at-a-time prescan to take care of translation phases 1-3, for
instance, could work well with block-by-block file reading.
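
A minimal sketch of the block-reading scheme, purely for illustration
(the names below are invented, not cpplib's, and it ignores the
refinement about runs of newline characters): read fixed-size chunks
until one contains a newline, then drop a NUL sentinel just after the
last newline and remember the byte it replaced.

#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define CHUNK 8192

/* Fill *BUF with whole lines from FD: read CHUNK-sized blocks until
   one of them contains a newline (or we hit EOF), then put a NUL
   sentinel just after the last newline, saving the byte it overwrites
   in *SAVED.  Returns the number of bytes before the sentinel, or -1
   on error.  */
static ssize_t
read_lines (int fd, char **buf, char *saved)
{
  size_t cap = 0, len = 0;
  char *p = NULL, *nl;
  ssize_t n;

  do
    {
      if (len + CHUNK + 1 > cap)
        {
          char *q = realloc (p, cap = len + CHUNK + 1);
          if (q == NULL)
            {
              free (p);
              return -1;
            }
          p = q;
        }
      n = read (fd, p + len, CHUNK);
      if (n < 0)
        {
          free (p);
          return -1;
        }
      len += (size_t) n;
    }
  while (n > 0 && memchr (p + len - n, '\n', (size_t) n) == NULL);

  /* Walk back to just after the last newline; if there is none
     (EOF in the middle of a line), keep the whole buffer.  */
  nl = p + len;
  while (nl > p && nl[-1] != '\n')
    nl--;
  if (nl == p)
    nl = p + len;

  *saved = (nl == p + len) ? '\0' : *nl;
  *nl = '\0';
  *buf = p;
  return nl - p;
}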

On the other hand, having the entire file in a contiguous buffer
guarantees that we never have to deal with iconv(3) reporting an
incomplete multibyte sequence, unless the input file is ill-formed.
The iconv interface makes it relatively easy to handle an incomplete
sequence, though, and we have to recover from ill-formed multibyte
input in any case, so that may turn out not to be a win after all.
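
The distinction iconv(3) draws makes that concrete; a minimal sketch,
assuming the usual POSIX behaviour (the wrapper itself is invented for
illustration): EINVAL flags an incomplete sequence at the end of the
input block, EILSEQ an ill-formed one, E2BIG a full output buffer.

#include <errno.h>
#include <iconv.h>

/* Convert one block of input.  Returns 0 if the block was consumed,
   1 if an incomplete sequence is left at the end (carry those bytes
   over to the next block), -1 for ill-formed input (diagnose and
   recover), -2 if the output buffer needs to grow.  */
static int
convert_block (iconv_t cd, char **inp, size_t *inleft,
               char **outp, size_t *outleft)
{
  if (iconv (cd, inp, inleft, outp, outleft) != (size_t) -1)
    return 0;
  if (errno == EINVAL)
    return 1;                   /* incomplete multibyte sequence at end */
  if (errno == EILSEQ)
    return -1;                  /* ill-formed multibyte sequence */
  return -2;                    /* E2BIG: output buffer full */
}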

> Maybe we're best leaving the caching to the O/S?  I imagine the number
> of headers that we want to process twice is quite limited (2%?).

I'm not sure what the exact numbers in each category are, but it's a
bimodal distribution: either a file is processed once, or it's
processed many times.  Examples of files processed many times are
stddef.h, rtl.def, etc.  The existing heuristic - files are cached if
they lack multiple-include guards - works pretty well.
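
In sketch form, that heuristic amounts to something like the
following (the structure and names are made up for illustration;
cpplib's real bookkeeping differs):

struct include_file
{
  const char *name;
  const char *guard;    /* multiple-include guard macro, or NULL */
  char *cache;          /* cached contents, or NULL */
};

/* A guarded header is not re-read once its guard macro is defined, so
   caching its contents buys little; an unguarded file such as rtl.def
   may be processed many times, so keeping it in memory pays off.  */
static int
should_cache (const struct include_file *f)
{
  return f->guard == NULL;
}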

zw


