This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
On 08/13/2010 12:28 AM, Andi Kleen wrote:
>> static inline bool
>> search_line_fast (s, end, out)
>> {
>>   if (fast_impl == 0)
>>     return search_line_sse42 (s, end, out);
>>   else if (fast_impl == 1)
>>     return search_line_sse2 (s, end, out);
>>   else
>>     return search_line_acc_char (s, end, out);
>> }
>>
>> where FAST_IMPL is set up appropriately by init_vectorized_lexer.
>>
>> The question being, are three predicted jumps faster than one
>> indirect jump on a processor without the proper predictor?
>
> Yes usually, especially if you don't have to go through all three
> on average.

This is the version I plan to commit Monday or Tuesday, barring further feedback. It uses direct branches as above, with tweaks to allow inlining when possible (e.g. 64-bit which always has SSE2).

I've also bootstrapped on powerpc64-linux and ia64-linux. Those test machines are loaded, so testing is proceeding rather slowly.

I'd appreciate it if dje and sje could give it a go on aix and ia64-hpux and see that (1) it works with the big-endian, ilp32 hpux, and (2) if at all possible report some performance results.

r~
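For readers without the attachment, the direct-branch dispatch discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the patch: the argument types, the `scan_bytes` helper, and the `init_vectorized_lexer_stub` setter are hypothetical, and all three "implementations" share one portable loop that stops at a newline so the sketch runs on any target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical portable scan: report the first '\n' in [s, end).
   Stores the stop position in *out and returns whether one was found.  */
static bool
scan_bytes (const unsigned char *s, const unsigned char *end,
            const unsigned char **out)
{
  for (; s < end; s++)
    if (*s == '\n')
      {
        *out = s;
        return true;
      }
  *out = end;
  return false;
}

/* Stand-ins for the three implementations named in the message.  A real
   build would use SSE4.2 and SSE2 code here; these only forward to the
   portable loop (assumption made for the sake of a runnable sketch).  */
static bool
search_line_sse42 (const unsigned char *s, const unsigned char *end,
                   const unsigned char **out)
{
  return scan_bytes (s, end, out);
}

static bool
search_line_sse2 (const unsigned char *s, const unsigned char *end,
                  const unsigned char **out)
{
  return scan_bytes (s, end, out);
}

static bool
search_line_acc_char (const unsigned char *s, const unsigned char *end,
                      const unsigned char **out)
{
  return scan_bytes (s, end, out);
}

/* 0 = SSE4.2, 1 = SSE2, anything else = generic.  Set once at startup.  */
static int fast_impl = 2;

/* Hypothetical setter; the real initializer would probe the CPU instead
   of taking the choice as an argument.  */
static void
init_vectorized_lexer_stub (int impl)
{
  assert (impl >= 0 && impl <= 2);
  fast_impl = impl;
}

/* Direct-branch dispatch: at most two well-predicted conditional jumps
   and a direct call, rather than one indirect call through a pointer.  */
static inline bool
search_line_fast (const unsigned char *s, const unsigned char *end,
                  const unsigned char **out)
{
  if (fast_impl == 0)
    return search_line_sse42 (s, end, out);
  else if (fast_impl == 1)
    return search_line_sse2 (s, end, out);
  else
    return search_line_acc_char (s, end, out);
}
```

Because every call target is named directly, the compiler is free to inline a branch whose condition is known at compile time, which is presumably the kind of case the "tweaks to allow inlining" remark refers to.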
Attachment:
searchline-4
Description: Text document