This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[CFT, v4] Vectorized _cpp_clean_line


On 08/13/2010 12:28 AM, Andi Kleen wrote:
>> static inline bool
>> search_line_fast (s, end, out)
>> {
>>   if (fast_impl == 0)
>>     return search_line_sse42 (s, end, out);
>>   else if (fast_impl == 1)
>>     return search_line_sse2 (s, end, out);
>>   else
>>     return search_line_acc_char (s, end, out);
>> }
>>
>> where FAST_IMPL is set up appropriately by init_vectorized_lexer.
>>
>> The question being, are three predicted jumps faster than one
>> indirect jump on a processor without the proper predictor?
> 
> Yes usually, especially if you don't have to go through all three
> on average.

This is the version I plan to commit Monday or Tuesday,
barring further feedback.  It uses direct branches as above,
with tweaks to allow inlining when possible (e.g. on 64-bit
x86, which always has SSE2).
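
For readers without the attachment, here is a minimal sketch of what
direct-branch dispatch with a compile-time shortcut might look like.
It is not the committed code.  The bool/out contract, the use of
__SSE2__ as the compile-time test, and the scalar stand-ins for the
two vector routines are all assumptions made for illustration; only
the function names and the fast_impl numbering come from the quoted
snippet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned char uchar;

/* Hypothetical selector, set once at start-up.  The numbering follows
   the quoted snippet: 0 = SSE4.2, 1 = SSE2, anything else = portable.  */
static int fast_impl = 2;

/* Portable fallback: scan for the first '\n', '\r', '\\' or '?'.
   Assumed contract: store the stop position in *OUT and return true
   if one of those characters was found before END.  */
static bool
search_line_acc_char (const uchar *s, const uchar *end, const uchar **out)
{
  for (; s < end; s++)
    if (*s == '\n' || *s == '\r' || *s == '\\' || *s == '?')
      {
        *out = s;
        return true;
      }
  *out = end;
  return false;
}

/* Stand-ins for the vector versions.  Real ones would use SSE
   intrinsics; these only forward to the scalar loop.  */
static bool
search_line_sse2 (const uchar *s, const uchar *end, const uchar **out)
{
  return search_line_acc_char (s, end, out);
}

static bool
search_line_sse42 (const uchar *s, const uchar *end, const uchar **out)
{
  return search_line_acc_char (s, end, out);
}

/* Direct-branch dispatch: every call target is known statically, so
   the compiler is free to inline any of them.  */
static inline bool
search_line_fast (const uchar *s, const uchar *end, const uchar **out)
{
  assert (s <= end);
  if (fast_impl == 0)
    return search_line_sse42 (s, end, out);
#ifdef __SSE2__
  /* SSE2 is guaranteed at compile time (always so on x86-64), so the
     second test disappears and the fallback is never reached.  */
  return search_line_sse2 (s, end, out);
#else
  if (fast_impl == 1)
    return search_line_sse2 (s, end, out);
  return search_line_acc_char (s, end, out);
#endif
}
```

A caller passes the buffer bounds and reads the stop position back
through the third argument.  With a function pointer the call target
would be opaque to the compiler; here the worst case is two
well-predicted compares.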

I've also bootstrapped on powerpc64-linux and ia64-linux.
Those test machines are loaded, so testing is proceeding
rather slowly.  I'd appreciate it if dje and sje could
give it a go on aix and ia64-hpux to (1) confirm that it
works with the big-endian, ilp32 hpux, and (2) if at all
possible, report some performance results.


r~

Attachment: searchline-4
Description: Text document

