This is the mail archive of the gcc-help@gcc.gnu.org mailing list for the GCC project.



.p2align


Hi,

The other day I wrote a few routines in assembler (using the Win64 calling
convention). It was really more like writing the code in C, compiling
it with gcc, running `objdump -D a.out | less`, then taking the code and
making the necessary changes (saving/restoring %rdi and %rsi on entry/exit).
All was great. Still, in my search for speed I noticed that gcc generated
a lot of stuff like:

  ...
  data16
  data16
  nop
  nop
  ...

which is the result of ".p2align 4,,15" (on the net, apparently this is, and
I quote, "like a "turbo" switch on some benchmarks"). I said to myself "good
to know" and made the necessary changes in my "*.S" files.
Indeed, what used to be nasty unaligned code is now nicely placed on a
16-byte boundary. However, to my disappointment, this did not make the code
run faster :(. "Au contraire", it made it run slower. So why is gcc using it?
Or am I missing something?

I've tested this on an AMD64 (Turion @ 2.2GHz) machine.
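
For reference, here is a small C sketch of what I understand
".p2align 4,,15" to mean: align to a 2^4 = 16 byte boundary, use the
default fill (nops in a code section), and skip the alignment altogether
if it would need more than 15 bytes of padding. The function name
p2align_padding is made up for illustration and is not part of gas or gcc.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of padding bytes the assembler would insert
 * for ".p2align power,,max_skip" when the location counter is at addr.
 * Returns 0 if addr is already aligned, or if the padding needed exceeds
 * max_skip (in which case the directive does nothing). */
unsigned p2align_padding(uint64_t addr, unsigned power, unsigned max_skip)
{
    assert(power < 64);                      /* keep the shift well defined */

    uint64_t boundary = (uint64_t)1 << power;
    uint64_t misalign = addr & (boundary - 1);
    uint64_t pad      = (boundary - misalign) & (boundary - 1);

    return pad <= max_skip ? (unsigned)pad : 0;
}
```

With these assumptions, p2align_padding(0x1003, 4, 15) gives 13, so up to
15 bytes of nops can land in front of a label, whereas a smaller limit
such as ".p2align 4,,7" would leave the same address unaligned.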

-- 
Mihai Donțu

