This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Optimize manual byte swap implementations v3
- From: Andreas Krebbel <krebbel at linux dot vnet dot ibm dot com>
- To: Richard Guenther <richard dot guenther at gmail dot com>
- Cc: Mark Mitchell <mark at codesourcery dot com>, Andrew Haley <aph at redhat dot com>, gcc-patches at gcc dot gnu dot org, rdsandiford at googlemail dot com
- Date: Mon, 27 Apr 2009 10:03:28 +0200
- Subject: Re: [PATCH] Optimize manual byte swap implementations v3
- References: <20090209145520.GA32536@bart> <49931E43.1050307@codesourcery.com> <499414EA.9010204@linux.vnet.ibm.com> <49945590.50606@codesourcery.com> <873aehgxn1.fsf@firetop.home> <49971B86.4000702@codesourcery.com> <499727DB.6020704@redhat.com> <49986FE4.8020604@codesourcery.com> <84fc9c000904241354n50c1b65s6369e4272a76bb98@mail.gmail.com>
Hi Richard,
As proposed by Mark, I did some experiments with the ARM target. I think
the byte swap optimizer is especially beneficial for targets where a
"bswap" instruction was only added at a later CPU level. For these
targets it is especially ugly to enhance code with inline assembly,
since that would require CPU level checks. For ARM the instruction was
added with ARMv6. Currently neither the Linux kernel nor newlib uses
the "rev" instruction to implement byte swaps.
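For readers following along, here is a minimal sketch of the kind of
hand-written byte swap this is about, next to GCC's existing
__builtin_bswap32. The function names are hypothetical and only for
illustration; whether the pass recognizes this exact shift-and-mask
form depends on the patch under review.

```c
#include <assert.h>
#include <stdint.h>

/* Hand-written 32-bit byte swap using shifts and masks (hypothetical
   name). Portable C, no CPU level checks and no inline assembly. */
uint32_t manual_bswap32(uint32_t x)
{
    return ((x & 0x000000ffu) << 24) |
           ((x & 0x0000ff00u) << 8)  |
           ((x & 0x00ff0000u) >> 8)  |
           ((x & 0xff000000u) >> 24);
}

/* The same operation through the GCC builtin (hypothetical wrapper
   name). The back-end chooses how to implement it for the target. */
uint32_t builtin_bswap32(uint32_t x)
{
    return __builtin_bswap32(x);
}

/* Both forms must agree for any input value. */
void check_equivalence(uint32_t x)
{
    assert(manual_bswap32(x) == builtin_bswap32(x));
}
```

The point of writing the swap in plain C is that the source stays free
of per-CPU special cases, and the compiler is left to pick a single
instruction where the target has one.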
To enable the ARM byte swap instruction at all, I wrote a small
patch adding the "rev" instruction to the ARM back-end and a kernel
patch removing the bswap special handling in the Linux kernel.
With the byte swap optimizer pass enabled, the vmlinux.o object file
contained 178 "rev" instructions, shrinking the .text section by several
kB (I don't have the exact number on this machine). A bit surprising to
me was the shrinkage of the debug information: the .debug* sections
together became about 40kB smaller.
What is still missing is a look at the performance overhead. I
consider the kernel a kind of worst-case scenario, and I will try
to do some benchmarking with it on my x86_64 machine.
Bye,
-Andreas-