This is the mail archive of the
gcc-help@gcc.gnu.org
mailing list for the GCC project.
.p2align
- From: Mihai Donțu <mihai dot dontu at gmail dot com>
- To: gcc-help at gcc dot gnu dot org
- Date: Sun, 26 Aug 2007 22:22:03 +0300
- Subject: .p2align
Hi,
The other day I wrote a few routines in assembler (using the Win64 calling
convention). In practice it was more like writing the code in C, compiling
it with gcc, running `objdump -D a.out | less`, then taking the output and
making the necessary changes (saving/restoring %rdi and %rsi on entry/exit).
All was great. Still, in my search for speed I noticed that gcc generated
a lot of stuff like:
...
data16
data16
nop
nop
...
which is the result of ".p2align 4,,15" (on the net this is apparently, and
I quote, "like a "turbo" switch on some benchmarks"). I said to myself: "good
to know" and made the necessary changes in my "*.S" files.
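For reference, here is a minimal sketch in C of the arithmetic behind
".p2align 4,,15", assuming the behaviour described in the GNU as manual: the
first argument is the power of two to align to, the empty second argument
means the default fill, and the third is the maximum number of bytes that may
be skipped (if more would be needed, no padding is emitted at all). The
function name p2align_padding and its parameters are made up for this
illustration; they are not part of gas or gcc.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a gas or gcc API): returns how many padding
   bytes ".p2align power,,max_skip" would insert when the location
   counter is at addr. */
uint64_t p2align_padding(uint64_t addr, unsigned power, uint64_t max_skip)
{
    assert(power < 64);                       /* keep the shift well defined */
    uint64_t alignment = (uint64_t)1 << power;
    /* Bytes needed to reach the next multiple of alignment (0 if aligned). */
    uint64_t needed = (alignment - (addr & (alignment - 1))) & (alignment - 1);
    /* If more than max_skip bytes are needed, the directive does nothing. */
    return needed <= max_skip ? needed : 0;
}
```

With power 4 and max_skip 15 the limit never triggers, since at most 15 bytes
are ever needed to reach a 16-byte boundary. For example,
p2align_padding(0x1003, 4, 15) gives 13, while a stricter limit such as
p2align_padding(0x1003, 4, 7) gives 0 because 13 bytes would exceed it.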
Indeed, what used to be nasty unaligned code now sits nicely on a
16-byte boundary. However, to my disappointment, this did not make the code
run faster :(. "Au contraire", it made it run slower. So why is gcc using it?
Or am I missing something?
I've tested this on an AMD64 (Turion @ 2.2GHz) machine.
--
Mihai Donțu