This is the mail archive of the mailing list for the GCC project.


Re: [PATCH] Optimize manual byte swap implementations -refreshed

On Fri, Jun 12, 2009 at 12:28:11PM +0200, Andreas Krebbel wrote:
> Hi,
> here is a refreshed version of the bswap optimization pass.  I again
> tried to do some measurements with builds of the Linux kernel. This
> time I've also tried to evaluate how much of the time is consumed by
> just walking the statements.
> Unfortunately I must admit that there is quite a deviation in the
> results.  At least for the statement walk measurements the standard
> deviation is so high that you should look at it with care. My
> statistics teacher probably would kick me through the hallway for
> these numbers ;)
> With the latest version - after integrating some more comments from
> Richard - the overhead went down to 0.24%.
> I've built the Linux kernel with -j4 (version 2.6.28) 5 times. The
> timings show the total time spent in user space measured with the
> \time command - not the bash builtin.
> x86_64 Intel Quad Core 9550 8GB 2.83 GHz
> GCC svn revision: 147107
>            clean            stmt walk only    optimized
> run 1      3599.21s         3599.23s          3607.28s
> run 2      3604.17s         3609.16s          3608.33s
> run 3      3600.32s         3601.75s          3610.47s
> run 4      3600.49s         3608.81s          3611.62s
> run 5      3601.26s         3604.6s           3611.51s
> std. dev.  +-1.87s +-0.05%  +-4.34s +-0.12%   +-1.95s +-0.05%
> mean       3601.09s         3604.71s +0.10%   3609.84s +0.24%
> Bootstrapped on x86_64. No regressions.
> Ok for mainline?

I'm just wondering out loud whether it would be useful to add tests of bswap
from a memory location in addition to a register (rs6000's main bswap is a
load from and store to memory, and it looks like the s390 also has a bswap
from memory).  Most of the other bswap targets, like the x86, are register
only, so presumably it would work there as well.

Now, I can certainly put in extra powerpc tests, but I'm wondering if it would
be useful to add the tests as global tests.
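
As a rough sketch of what such a test could exercise, the following puts the
same hand-written 32-bit swap idiom in three shapes: operand in a register,
operand loaded from memory, and result stored to memory.  The function names
are hypothetical and are not taken from the patch, and nothing here claims
which of these forms the pass actually recognizes; a real testsuite entry
would also need the appropriate dg directives.

```c
#include <assert.h>
#include <stdint.h>

/* Hand-written 32-bit byte swap; the operand is a plain value.  */
uint32_t
swap32_reg (uint32_t x)
{
  return ((x & 0x000000ffu) << 24)
	 | ((x & 0x0000ff00u) << 8)
	 | ((x & 0x00ff0000u) >> 8)
	 | ((x & 0xff000000u) >> 24);
}

/* Same idiom, but the operand is loaded from memory first.  */
uint32_t
swap32_load (const uint32_t *p)
{
  uint32_t x = *p;
  return ((x & 0x000000ffu) << 24)
	 | ((x & 0x0000ff00u) << 8)
	 | ((x & 0x00ff0000u) >> 8)
	 | ((x & 0xff000000u) >> 24);
}

/* Same idiom, with the swapped result stored to memory.  */
void
swap32_store (uint32_t *p, uint32_t x)
{
  *p = ((x & 0x000000ffu) << 24)
       | ((x & 0x0000ff00u) << 8)
       | ((x & 0x00ff0000u) >> 8)
       | ((x & 0xff000000u) >> 24);
}

/* Runtime check that all three variants agree on a known value.  */
void
check_swap32 (void)
{
  uint32_t in = 0x12345678u;
  uint32_t out = 0;

  assert (swap32_reg (in) == 0x78563412u);
  assert (swap32_load (&in) == 0x78563412u);
  swap32_store (&out, in);
  assert (out == 0x78563412u);
}
```

Calling check_swap32 () from main gives a target-independent execution test;
a target-specific variant could additionally scan the generated assembly.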

Also, getting back to the issue raised in my latest bswap patches, I wonder
whether it would be useful to move bswap16 to generic, and add bswaphi support
to your patches?  I know the powerpc would find it useful, and IIRC, you could
do a 16-bit rotate on x86.
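
To illustrate the 16-bit case: swapping the two bytes of a halfword is the
same operation as rotating it by 8 bits, which is why a 16-bit rotate can
implement it.  The sketch below uses hypothetical function names and only
shows the equivalence in C; it makes no claim about what any target pattern
expands to.

```c
#include <assert.h>
#include <stdint.h>

/* Hand-written 16-bit byte swap using masks and shifts.  */
uint16_t
swap16_manual (uint16_t x)
{
  return (uint16_t) (((x & 0x00ffu) << 8) | ((x & 0xff00u) >> 8));
}

/* The same result expressed as a rotate by 8 bits.  */
uint16_t
swap16_rotate (uint16_t x)
{
  return (uint16_t) ((x << 8) | (x >> 8));
}

/* Exhaustively check that both forms agree for every 16-bit value.  */
void
check_swap16 (void)
{
  uint32_t i;

  for (i = 0; i <= 0xffffu; i++)
    assert (swap16_manual ((uint16_t) i) == swap16_rotate ((uint16_t) i));
}
```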

Michael Meissner, IBM
4 Technology Place Drive, MS 2203A, Westford, MA, 01886, USA
