[Bug target/56309] -O3 optimizer generates conditional moves instead of compare and branch resulting in almost 2x slower code

steven at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Thu Feb 14 16:59:00 GMT 2013


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56309

Steven Bosscher <steven at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |steven at gcc dot gnu.org

--- Comment #11 from Steven Bosscher <steven at gcc dot gnu.org> 2013-02-14 16:59:04 UTC ---
(In reply to comment #8)
> I wonder if instead of emitting this sequence
> 
>    shr    $0x20,%rdi
>    and    $0xffffffff,%ecx
>    cmp    %r8,%rdx
>    cmovbe %r11,%rdi
>    add    $0x1,%rax
>    cmp    %r8,%rdx
>    cmovbe %rdx,%rcx
> 
> it would do this instead
> 
>    shr    $0x20,%rdi
>    and    $0xffffffff,%ecx
>    add    $0x1,%rax
>    cmp    %r8,%rdx
>    cmovbe %r11,%rdi
>    cmovbe %rdx,%rcx

GCC fails to do so because the add between the two cmovs (insn 199)
clobbers the flags, so the compare has to be redone (insn 201) before
the second cmov and code motion cannot group the two cmovs:

  197: r116:DI={(gtu(flags:CC,0))?r125:DI:r233:DI}
  199: {r110:DI=r110:DI+0x1;clobber flags:CC;}
  201: flags:CC=cmp(r124:DI,r235:DI)
  202: r221:DI={(gtu(flags:CC,0))?r126:DI:r124:DI}
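
For illustration only, here is a minimal C sketch of the kind of source
pattern that could lead to RTL shaped like the dump above: two selects
guarded by the same unsigned comparison, with a counter increment between
them. The struct, the function and all names are hypothetical and are not
taken from the reporter's code, and whether a given GCC version emits
cmovs for it is an assumption, not something verified here.

```c
#include <stdint.h>

/* Hypothetical loop state; the names are for illustration only. */
struct state {
    uint64_t hi;     /* plays the role of %rdi in the quoted asm */
    uint64_t lo;     /* plays the role of %rcx in the quoted asm */
    uint64_t count;  /* plays the role of %rax in the quoted asm */
};

/* One step: both selects test the same condition (x <= limit, unsigned),
   but the increment sits between them in source order.  On x86 the add
   instruction writes the flags, so a compare result computed before it
   does not survive to the second select. */
static void step(struct state *s, uint64_t x, uint64_t limit,
                 uint64_t alt_hi)
{
    s->hi = (x <= limit) ? alt_hi : s->hi;  /* candidate for first cmovbe  */
    s->count += 1;                          /* add: clobbers the flags     */
    s->lo = (x <= limit) ? x : s->lo;       /* candidate for second cmovbe */
}
```

Moving the increment ahead of both selects in such a sketch mirrors the
reordering proposed in comment #8: one compare could then feed both
conditional moves.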

If you make this change by hand (compile with -S, "fix" the .s file
and assemble it), does that speed up your code?


