[Bug target/56309] -O3 optimizer generates conditional moves instead of compare and branch resulting in almost 2x slower code
steven at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Thu Feb 14 16:59:00 GMT 2013
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56309
Steven Bosscher <steven at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |steven at gcc dot gnu.org
--- Comment #11 from Steven Bosscher <steven at gcc dot gnu.org> 2013-02-14 16:59:04 UTC ---
(In reply to comment #8)
> I wonder if instead of emitting this sequence
>
> shr $0x20,%rdi
> and $0xffffffff,%ecx
> cmp %r8,%rdx
> cmovbe %r11,%rdi
> add $0x1,%rax
> cmp %r8,%rdx
> cmovbe %rdx,%rcx
>
> it would do this instead
>
> shr $0x20,%rdi
> and $0xffffffff,%ecx
> add $0x1,%rax
> cmp %r8,%rdx
> cmovbe %r11,%rdi
> cmovbe %rdx,%rcx
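GCC fails to do so because the flags register is clobbered between the
two cmovs: the add (insn 199) clobbers flags:CC, so the compare has to
be redone (insn 201) before the second cmov. That prevents the code
motion needed to group the two cmovs: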
197: r116:DI={(gtu(flags:CC,0))?r125:DI:r233:DI}
199: {r110:DI=r110:DI+0x1;clobber flags:CC;}
201: flags:CC=cmp(r124:DI,r235:DI)
202: r221:DI={(gtu(flags:CC,0))?r126:DI:r124:DI}
If you make this change manually (compile with -S, "fix" the .s file
and assemble it), does that speed up your code?
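
For reference, here is a rough C-level model of what the two sequences
from comment #8 compute. It is only a sketch: the struct and function
names are hypothetical, the variables are merely named after the
registers in the quoted assembly, and the leading shr/and instructions
are left out because they are identical in both versions. It is meant
to show that the increment does not depend on the comparison, so
hoisting it above a single cmp should not change the result.

```c
#include <stdint.h>

/* Hypothetical model of the registers written by the quoted sequence. */
struct state {
    uint64_t rdi, rcx, rax;
};

/* As emitted: the add sits between the two selects and clobbers the
   flags, so the comparison is evaluated a second time. */
static struct state as_emitted(struct state s, uint64_t rdx,
                               uint64_t r8, uint64_t r11)
{
    if (rdx <= r8) s.rdi = r11;   /* cmp %r8,%rdx ; cmovbe %r11,%rdi */
    s.rax += 1;                   /* add $0x1,%rax */
    if (rdx <= r8) s.rcx = rdx;   /* cmp %r8,%rdx ; cmovbe %rdx,%rcx */
    return s;
}

/* As proposed: the add is done first, then one unsigned comparison
   feeds both selects. */
static struct state as_proposed(struct state s, uint64_t rdx,
                                uint64_t r8, uint64_t r11)
{
    s.rax += 1;                   /* add $0x1,%rax */
    int below_or_equal = rdx <= r8;   /* cmp %r8,%rdx */
    if (below_or_equal) s.rdi = r11;  /* cmovbe %r11,%rdi */
    if (below_or_equal) s.rcx = rdx;  /* cmovbe %rdx,%rcx */
    return s;
}
```

Both functions return the same state for any inputs; whether the
regrouped form is actually faster is exactly what the manual .s
experiment would have to show.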