This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug target/39942] Nonoptimal code - leaveq; xchg %ax,%ax; retq
- From: "hjl dot tools at gmail dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 15 May 2009 14:35:18 -0000
- References: <bug-39942-17483@http.gcc.gnu.org/bugzilla/>
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
------- Comment #40 from hjl dot tools at gmail dot com 2009-05-15 14:35 -------
(In reply to comment #37)
> This patch looks very wrong. It assumes that min_insn_size gives exact insn
> sizes (current min_insn_size is very far from that, but even get_attr_length
> isn't exact), doesn't take into account label alignments nor branch shortening
> which can change the insn sizes afterwards and assumes that a p2align always
> aligns to 16 bytes (it does not).
> While the previous algorithm works with estimated 16 consecutive bytes rather
> than 16 byte pages 0xXXXX0 ... 0xXXXXF, that's because during machine reorg
> you simply can't know in most cases where exactly the 16 byte page will start,
> so you assume it can start (almost) anywhere (and use .p2align behavior to
> align when needed).
>
There is no perfect solution here. Let's list pros/cons:
The current algorithm:
Pros:
1. Very conservative. It catches most cases of 4 branches in a 16-byte window.
Cons:
1. It works on a 16-byte window, not a 16-byte page.
2. When it gets it wrong, it increases code size by adding unnecessary
nops.
My proposal:
Pros:
1. It works on 16-byte pages.
2. Even when it gets it wrong, it doesn't increase code size.
Cons:
1. It relies on inaccurate instruction length data.
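
To make the window/page distinction concrete, here is a minimal sketch in Python. It is not GCC code: the function names and the instruction layout are hypothetical, and it assumes a simplified model in which every address is known exactly and a branch is counted by the address of its first byte. It counts branches in any 16 consecutive bytes (the window view) and in each aligned 16-byte page 0xXXXX0 ... 0xXXXXF (the page view).

```python
# Simplified illustration only; names and layout are hypothetical, not GCC's.
# Each instruction is (start_address, is_branch). Addresses are assumed exact,
# which real machine reorg cannot guarantee.

def max_branches_in_window(insns, window=16):
    """Largest number of branch starts found in any `window` consecutive bytes."""
    starts = sorted(addr for addr, is_branch in insns if is_branch)
    best = 0
    for i, first in enumerate(starts):
        # Count branches starting in [first, first + window).
        n = sum(1 for s in starts[i:] if s < first + window)
        best = max(best, n)
    return best

def branches_per_page(insns, page=16):
    """Number of branch starts in each aligned `page`-byte block, keyed by base address."""
    counts = {}
    for addr, is_branch in insns:
        if is_branch:
            base = addr - addr % page
            counts[base] = counts.get(base, 0) + 1
    return counts

# Four 2-byte branches that straddle the page boundary at 0x10.
insns = [(0x0A, True), (0x0C, True), (0x0E, True), (0x10, True)]

window_max = max_branches_in_window(insns)
page_max = max(branches_per_page(insns).values())
```

With this layout the window view sees 4 branches within 16 consecutive bytes and would pad with nops, while the page view sees at most 3 branches in any one aligned page and would leave the code alone. The sketch only holds if the addresses are right, which is exactly the con listed for the page-based approach.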
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39942