[REVISED][PATCH/RFT] Fix PR middle-end/PR28690, modify swap_commutative_operands_p
H. J. Lu
hjl@lucon.org
Wed Jun 20 06:20:00 GMT 2007
On Tue, Jun 19, 2007 at 05:38:14PM +0200, Paolo Bonzini wrote:
>
> >>> So instead of writing
> >>>
> >>> .p2align 4,,7
> >>>
> >>>can't you write the sequence:
> >>>
> >>> .p2align 4,,7
> >>> .p2align 3,,7
> >>>
> >>>which will have the exact effect you're asking for there?
> >>Yeah, that's the effect I'm looking for. However, my comment was
> >>more of a hypothetical "Why doesn't it work that way?" type of
> >>question, rather than a "I'm really interested in this and am
> >>going to fix this!". I guess I just thought it curious.
> >
> >Well, why should it work that way? It's designed to give you a
> >tradeoff between alignment and saving code space. As long as you can
> >specify precisely what you want--and, as Dave shows, you can--it
> >should let you do that, rather than guessing what you might want.
>
> I think Peter is reading ".p2align 4,,7" as "give me *as much alignment
> as you can* (up to 2^4) with a 7-byte sequence". Instead, it is "give
> me 2^4 alignment *if you can* with a 7-byte sequence". Two remarks:
>
> 1) I guess emitting an additional ".p2align 3,,7" is better anyway,
> because it would improve performance.

For .p2align, assemblers dated after 2006-06-23 will generate a single
nop of up to 10 bytes:

  0x66,0x2e,0x0f,0x1f,0x84,0x00,0x00,0x00,0x00,0x00

by default for 64bit, and with -march=i686 or above for 32bit, while
older assemblers can only generate a single nop of up to 7 bytes:

  0x8d,0xb4,0x26,0x00,0x00,0x00,0x00

I think we should use ".p2align 4,,10" instead of ".p2align 4,,7"
for 64bit. With a limit of 10, paddings of 8 to 10 bytes are also
aligned, and each still costs only a single nop.

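
To make the difference concrete, here is a rough C model of the padding
decision. It is only a sketch: the helper names (p2align_pad,
count_aligned, align_16_then_8) are made up for illustration and are not
taken from gas or gcc. It assumes the documented behaviour of the third
.p2align operand, namely that no padding is emitted at all when more
than the given maximum would be needed.

```c
#include <assert.h>
#include <stdint.h>

/* Padding emitted by ".p2align pow,,max" at address ADDR.
   Hypothetical helper: if more than MAX bytes would be needed to
   reach a 2^POW boundary, nothing is emitted.  */
unsigned
p2align_pad (uint64_t addr, unsigned pow, unsigned max)
{
  assert (pow < 32);
  uint64_t mask = ((uint64_t) 1 << pow) - 1;
  unsigned need = (unsigned) (-addr & mask);   /* bytes to next boundary */
  return need <= max ? need : 0;
}

/* Number of start offsets (mod 2^POW) that end up on a 2^POW
   boundary after ".p2align pow,,max".  */
unsigned
count_aligned (unsigned pow, unsigned max)
{
  uint64_t size = (uint64_t) 1 << pow;
  unsigned n = 0;
  for (uint64_t off = 0; off < size; off++)
    if (((off + p2align_pad (off, pow, max)) & (size - 1)) == 0)
      n++;
  return n;
}

/* Address after ".p2align 4,,7" followed by ".p2align 3,,7".  */
uint64_t
align_16_then_8 (uint64_t addr)
{
  addr += p2align_pad (addr, 4, 7);
  addr += p2align_pad (addr, 3, 7);
  return addr;
}
```

Under this model, count_aligned (4, 7) is 8 and count_aligned (4, 10)
is 11 out of the 16 possible start offsets, and align_16_then_8 always
returns an address that is at least 8-byte aligned.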
H.J.