This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: gcc will become the best optimizing x86 compiler
On Fri, Jul 25, 2008 at 9:08 AM, Agner Fog <agner@agner.org> wrote:
> Raksit Ashok wrote:
>>There is a more optimized version for 64-bit:
>>http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libc/amd64/gen/memcpy.s
>>I think this looks similar to your implementation, Agner.
>
> Yes it is similar to my code.
A 3164-line source file which implements memcpy().
You've got to be kidding.
How much of the L1 icache does it blow away in the process?
I bet it performs wonderfully on microbenchmarks though.
2991 .balign 16 # sadistic alignment strikes again
2992 L(bkPxQx): .int L(bkP0Q0)-L(bkPxQx) # why use two bytes when we can use four?
Seriously. What possible reason can there be to align
a randomly accessed data table to 16 bytes?
4 bytes I understand, but 16?
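
To put rough numbers on the two complaints above, here is a minimal C sketch. It is not taken from the OpenSolaris source: the helper names, the 64-byte cache line, and any entry count you feed it are assumptions for illustration only. It shows how many padding bytes an alignment directive costs at a given address, how big an offset table is at 2 versus 4 bytes per entry, and that a 4-byte entry at a 4-byte-aligned address never crosses a cache line, so wider alignment buys nothing for single-entry lookups. Whether 2-byte offsets would actually reach every target in that file is not something this sketch checks.

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_LINE 64u  /* assumed line size, typical for x86-64 parts */

/* Padding bytes an alignment directive inserts ahead of a table at addr. */
unsigned pad_bytes(uintptr_t addr, unsigned align)
{
    return (unsigned)((align - addr % align) % align);
}

/* Size in bytes of an offset table with n entries of entry_size bytes. */
unsigned table_bytes(unsigned n, unsigned entry_size)
{
    return n * entry_size;
}

/* Nonzero if an entry of `size` bytes starting at addr crosses a line. */
int straddles_line(uintptr_t addr, unsigned size)
{
    return addr / CACHE_LINE != (addr + size - 1) / CACHE_LINE;
}

/* Number of 4-byte entries, out of n starting at base, that cross a line. */
unsigned count_straddles(uintptr_t base, unsigned n)
{
    unsigned hits = 0;
    for (unsigned i = 0; i < n; i++)
        hits += (unsigned)straddles_line(base + 4u * i, 4);
    return hits;
}
```

For a hypothetical table that would otherwise start at 0x1004, pad_bytes(0x1004, 16) is 12 while pad_bytes(0x1004, 4) is 0, and count_straddles(0x1004, 64) is 0: with 4-byte alignment no entry is split across lines. A 64-entry table is 256 bytes with .int entries and 128 bytes with 2-byte ones.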
--
vda