[PATCH] Speed up LEX line cleaning a bit...

David Miller davem@davemloft.net
Sun Mar 14 23:16:00 GMT 2010


From: "Joseph S. Myers" <joseph@codesourcery.com>
Date: Sun, 14 Mar 2010 16:23:39 +0000 (UTC)

> If empty lines are very common it may be useful to check for initial
> newline before trying the vectorized loop.

I'm not too hot on this idea.  Part of the win of the vectorized code
is that we have a bit of integer setup code with which to buffer the
load latency for that critical first load.

So it's probably essentially free.

But yes, something to investigate for sure.
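
For anyone reading without the patch at hand, the general shape of a
word-at-a-time scan can be sketched roughly as below.  This is not the
code from the patch, only an illustration: the function names, the
4-byte word size, and the set of "interesting" characters (newline,
carriage return, backslash, '?') are all assumptions made for the
example.  Note that an empty line is caught by the very first word
load, with no separate check in front of the loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero iff at least one byte of v is zero (well-known bit trick). */
static uint32_t has_zero_byte(uint32_t v)
{
    return (v - 0x01010101u) & ~v & 0x80808080u;
}

/* Nonzero iff at least one byte of v equals c. */
static uint32_t has_byte(uint32_t v, unsigned char c)
{
    /* XOR with c replicated into every byte turns matches into zeros. */
    return has_zero_byte(v ^ (0x01010101u * c));
}

/* Hypothetical helper: return a pointer to the first '\n', '\r', '\\'
   or '?' in [s, end), or end if there is none.  One 4-byte load covers
   four characters; the byte loop only runs on the tail or on the word
   that is known to contain a hit.  */
static const unsigned char *
find_special(const unsigned char *s, const unsigned char *end)
{
    assert(s <= end);

    while (end - s >= 4) {
        uint32_t w;
        memcpy(&w, s, sizeof w);   /* avoids alignment/aliasing issues */
        if (has_byte(w, '\n') | has_byte(w, '\r')
            | has_byte(w, '\\') | has_byte(w, '?'))
            break;
        s += 4;
    }

    for (; s < end; s++)
        if (*s == '\n' || *s == '\r' || *s == '\\' || *s == '?')
            break;
    return s;
}
```

The constants in has_byte() are the sort of integer setup that can be
computed while the first load is still in flight.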

> See also Zack's ideas on speeding up _cpp_clean_line that I posted in 
> <http://gcc.gnu.org/ml/gcc/2007-05/msg00741.html>.  It's not clear if they 
> could be effectively combined with vectorization, or what would help 
> performance more, or whether several simple passes or one more complicated 
> combined pass would actually be better.

I think the state machine would rule out a vectorization
optimization like the one being described here.

Unless you want to use a state machine that can transition
on four characters at a time, which I don't think can fit in
main memory all at once :-)
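
To put a rough number on that, here is a back-of-the-envelope
calculation.  The helper name and the state count used below are made
up for illustration; the real state machine may have more or fewer
states.

```c
#include <assert.h>
#include <stdint.h>

/* Number of entries in a dense transition table indexed by
   (current state, next bytes_per_step input bytes).  Hypothetical
   helper; it assumes one table entry per (state, input) pair.  */
static uint64_t table_entries(unsigned states, unsigned bytes_per_step)
{
    assert(bytes_per_step <= 4);   /* keeps the product within 64 bits */

    uint64_t n = states;
    for (unsigned i = 0; i < bytes_per_step; i++)
        n *= 256;                  /* 256 possible values per input byte */
    return n;
}
```

With, say, 8 states and one byte per entry, table_entries(8, 1) is a
2 KiB table, while table_entries(8, 4) is 2^35 entries, i.e. 32 GiB.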

> You shouldn't really need to check for backslash for every character; if 
> you find a newline you could then check if the line was nonempty and what 
> came before was a backslash.

Hmmm, the code seems to backtrack over any number of whitespace
characters:

	p = d;
	while (is_nvspace (p[-1]))
	  --p;
	if (p - 1 != pbackslash)
	  goto done;

so I don't think checking just one character behind the found newline
would work.

And this would incur more loads, decreasing the effectiveness of the
vectorization, which aims to minimize the number of loads.
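
To make the backtracking concern concrete, a check done only after a
newline has been found might look something like the sketch below.
It is a simplification, not the libcpp code: is_nvspace_sketch() is a
stand-in whose character set is an assumption, the function name is
hypothetical, and it re-reads the preceding bytes instead of comparing
against a remembered pbackslash position as the snippet quoted above
does.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for is_nvspace; the exact set of characters is assumed. */
static bool is_nvspace_sketch(unsigned char c)
{
    return c == ' ' || c == '\t' || c == '\f' || c == '\v';
}

/* True if the newline at nl is escaped, i.e. the line ends in a
   backslash followed by any amount of horizontal whitespace.
   start is the first byte of the buffer, so we never read before it.  */
static bool
newline_is_escaped(const unsigned char *start, const unsigned char *nl)
{
    assert(start <= nl && *nl == '\n');

    const unsigned char *p = nl;
    /* Every iteration here is one extra load behind the newline. */
    while (p > start && is_nvspace_sketch(p[-1]))
        --p;
    return p > start && p[-1] == '\\';
}
```

The loop is where the extra loads come from: one per trailing
whitespace character, plus one more for the backslash test.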


