This is the mail archive of the mailing list for the GCC project.


Re: [PATCH] Optimize manual byte swap implementations v3

> What is lacking is to have a look at the performance overhead. I would  
> consider the kernel as some kind of worst case scenario and I will try  
> to do some benchmarking on my x86_64 machine with it.

I built the Linux kernel (version 2.6.28) with -j4 five times each, with
and without the bswap patch:

x86_64 Intel Quad Core 9550 8GB 2.83 GHz

GCC svn revision: 147107

Without the bswap patch:

user	system	elapsed
3602.04	314.83	17:02.93
3604.14	314.14	17:01.82
3602.02	315.20	17:07.43
3603.38	314.94	17:04.00
3603.64	314.67	17:03.13


With the bswap patch:

user	system	elapsed
3619.95	314.29	17:10.17
3618.13	316.58	17:11.39
3616.37	316.41	17:09.57
3619.10	318.04	17:12.32
3622.39	316.74	17:15.53

Average user time: 3619.19 (+0.45% over the 3603.04 average of the
first set of runs)

In the process, 4786 byte swap implementations were recognized and
folded.  Although the x86 kernel code uses inline assembly for byte
swaps, the software implementation nevertheless stays visible to
GCC. So I think this is a valid test also representative for other

I think the run-time of the pass is dominated by the time needed to
walk over the SSA statements.  I doubt that tweaks to the bswap
recognition itself will buy much.  Merging the pass with sincos would
help here, but would cost us the flexibility to move the pass
somewhere else.  I can certainly do that if this is considered the way
to go.


