[PATCH] Optimize manual byte swap implementations v3

Mark Mitchell mark@codesourcery.com
Thu Feb 12 17:33:00 GMT 2009


Andreas Krebbel wrote:

>> What is the performance improvement provided by this pass on what
>> benchmarks on what CPUs?  You may have indicated this elsewhere, in
>> which case I apologize for asking a redundant question; please just
>> point me at the URL.
> 
> Unfortunately I'm not able to come up with performance numbers.

I think that we shouldn't turn on an optimization pass by default with
-O2/-O3 if we can't measure its improvement on some benchmark.  Here,
benchmark doesn't have to be something like SPEC or EEMBC; it could be
"the Linux kernel on an x86 CPU" or "Firefox running on Windows" or
whatever.  But, optimization is a numbers game, and compile times do
matter.  If we have an optimization pass, then its benefit is how much
better it makes generated code, and its cost is how long it takes to
run, and those need to be in balance.

Note that I say "by default".  If we have an optimization pass that is
disabled by default, it's not a problem because the cost to most users
is zero.  Obviously, we still want the code that implements the pass to
be of high quality, we'd like to know that it's useful to somebody
somewhere, etc.  But, it doesn't have to meet as high a threshold on the
cost/benefit side, since the costs are basically zero.

> But how would you recommend an application developer should implement a
> byte swap which is neither bound to GCC (even to a specific version) by
> using the bswap builtins nor bound to GCC and a specific CPU by using
> inline assemblies?

As you say, if you're constrained to use any compiler on any CPU, then
you have no choice but to write generic code.  But, even then, you can
write yourself a bswap macro and call it -- and then put some #ifdef's
in for architectures or compilers.
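
For illustration, a minimal sketch of what such a macro could look like
follows.  The names my_bswap32 and my_bswap32_generic, and the particular
compiler checks, are assumptions made for this example; they are not
taken from the patch under discussion.

```c
#include <stdint.h>

/* Hypothetical portable 32-bit byte swap wrapper (illustrative names). */
#if defined(__GNUC__) && \
    (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3))
/* GCC 4.3 and later provide __builtin_bswap32. */
#  define my_bswap32(x) __builtin_bswap32(x)
#elif defined(_MSC_VER)
/* MSVC provides _byteswap_ulong in <stdlib.h>. */
#  include <stdlib.h>
#  define my_bswap32(x) ((uint32_t)_byteswap_ulong(x))
#else
/* Generic fallback: plain shifts and masks, valid on any C99 compiler. */
static inline uint32_t my_bswap32_generic(uint32_t x)
{
    return (x >> 24)
         | ((x >> 8) & 0x0000ff00u)
         | ((x << 8) & 0x00ff0000u)
         | (x << 24);
}
#  define my_bswap32(x) my_bswap32_generic(x)
#endif
```

Callers would then write my_bswap32(value) everywhere, and only this one
header would need to change when a new compiler or architecture gets a
faster path.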

I understand the optimization and I'm sure there are cases where it's
going to result in smaller, faster code.  (For example, for code size
alone, I bet it's beneficial on ARM.)  But that in and of itself isn't a
justification for being on by default with -O2/-O3.  (Of course, if we
find it makes code 0.5% smaller, then that would probably be a good
justification for being on by default with -Os.)

Thanks,

-- 
Mark Mitchell
CodeSourcery
mark@codesourcery.com
(650) 331-3385 x713


