This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


[CFT, v4] Vectorized _cpp_clean_line

From: Richard Henderson <rth at redhat dot com>
To: gcc-patches at gcc dot gnu dot org
Cc: Andi Kleen <andi at firstfloor dot org>, sje at cup dot hp dot com, edelsohn at gnu dot org
Date: Sat, 14 Aug 2010 10:00:28 -0700
Subject: [CFT, v4] Vectorized _cpp_clean_line
References: <4C601691.1000303@moene.org> <4C601E08.4020303@google.com> <4C6035C2.9020505@moene.org> <4C60378B.4060303@google.com> <4C603AC2.5070403@moene.org> <45B5C4E0-DFA5-413E-8FC8-E13077862245@apple.com> <877hjy8qwk.fsf@basil.nowhere.org> <4C64699B.20804@redhat.com> <20100812220708.GC7058@basil.fritz.box> <4C647448.6080707@redhat.com> <20100813070300.GA12885@gargoyle.fritz.box>

On 08/13/2010 12:28 AM, Andi Kleen wrote:
>> static inline bool
>> search_line_fast (s, end, out)
>> {
>>   if (fast_impl == 0)
>>     return search_line_sse42 (s, end, out);
>>   else if (fast_impl == 1)
>>     return search_line_sse2 (s, end, out);
>>   else
>>     return search_line_acc_char (s, end, out);
>> }
>>
>> where FAST_IMPL is set up appropriately by init_vectorized_lexer.
>>
>> The question being, are three predicted jumps faster than one
>> indirect jump on a processor without the proper predictor?
> 
> Yes usually, especially if you don't have to go through all three
> on average.

This is the version I plan to commit Monday or Tuesday,
barring further feedback.  It uses direct branches as above,
with tweaks to allow inlining when possible (e.g. on 64-bit
x86, which always has SSE2).
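
For readers without the attachment, here is a minimal sketch of the
direct-branch dispatch discussed above.  It is not the committed patch.
The signatures, the meaning of the return value (true when a '\n', '\r',
'\\' or '?' is found), the MIN_IMPL_IS_SSE2 macro and the use of
__builtin_cpu_supports are all assumptions made for illustration, and
the three search routines are scalar stand-ins rather than real SSE code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned char uchar;

/* Scalar stand-in shared by all three variants in this sketch.
   Sets *OUT to the first interesting byte in [S, END), or to END.  */
static bool
scan_scalar (const uchar *s, const uchar *end, const uchar **out)
{
  assert (s <= end);
  for (; s < end; s++)
    if (*s == '\n' || *s == '\r' || *s == '\\' || *s == '?')
      {
        *out = s;
        return true;
      }
  *out = end;
  return false;
}

/* Hypothetical placeholders; real versions would use SSE4.2, SSE2 and
   word-at-a-time tricks respectively.  */
static bool
search_line_sse42 (const uchar *s, const uchar *end, const uchar **out)
{ return scan_scalar (s, end, out); }

static bool
search_line_sse2 (const uchar *s, const uchar *end, const uchar **out)
{ return scan_scalar (s, end, out); }

static bool
search_line_acc_char (const uchar *s, const uchar *end, const uchar **out)
{ return scan_scalar (s, end, out); }

/* 0 = SSE4.2, 1 = SSE2, 2 = generic.  Chosen once at startup.  */
static int fast_impl = 2;

static void
init_vectorized_lexer (void)
{
#if defined (__GNUC__) && (defined (__x86_64__) || defined (__i386__))
  __builtin_cpu_init ();
  if (__builtin_cpu_supports ("sse4.2"))
    fast_impl = 0;
  else if (__builtin_cpu_supports ("sse2"))
    fast_impl = 1;
  else
    fast_impl = 2;
#else
  fast_impl = 2;
#endif
}

/* When the target guarantees SSE2 at compile time, the generic
   fallback becomes dead code and one runtime test disappears.  */
#ifdef __SSE2__
# define MIN_IMPL_IS_SSE2 1
#else
# define MIN_IMPL_IS_SSE2 0
#endif

static inline bool
search_line_fast (const uchar *s, const uchar *end, const uchar **out)
{
  if (fast_impl == 0)
    return search_line_sse42 (s, end, out);
  else if (MIN_IMPL_IS_SSE2 || fast_impl == 1)
    return search_line_sse2 (s, end, out);
  else
    return search_line_acc_char (s, end, out);
}
```

A caller would run init_vectorized_lexer once and then call
search_line_fast for each buffer.  Every call target is a direct,
statically known function, so the compiler can inline the chosen
variant, which an indirect call through a function pointer would
generally prevent.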

I've also bootstrapped on powerpc64-linux and ia64-linux.
Those test machines are loaded, so testing is proceeding
rather slowly.  I'd appreciate it if dje and sje could
give it a go on aix and ia64-hpux to (1) check that it works
with the big-endian, ilp32 hpux, and (2) if at all possible,
report some performance results.

r~

Attachment: searchline-4
Description: Text document

Follow-Ups:
- Re: [CFT, v4] Vectorized _cpp_clean_line
  - From: Steve Ellcey
- Re: [CFT, v4] Vectorized _cpp_clean_line
  - From: Tom Tromey

