[Bug rtl-optimization/49095] Horrible code generation for trivial decrement with test

Sat May 21 21:33:00 GMT 2011

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49095

--- Comment #3 from Linus Torvalds <torvalds@linux-foundation.org> 2011-05-21 20:42:26 UTC ---
Hmm. Looking at that code generation, it strikes me that even with the odd
load/store situation, why do we have that "test" instruction?

   c:    8b 10                    mov    (%eax),%edx
   e:    83 ea 01                 sub    $0x1,%edx
  11:    85 d2                    test   %edx,%edx
  13:    89 10                    mov    %edx,(%eax)
  15:    74 09                    je     20 <main+0x20>

IOW, regardless of any complexities of the store, that "sub + test" is just
odd. Gcc knows to simplify that particular sequence in other situations, why
doesn't it simplify it here?

IOW, I can make gcc generate code like

   c:    83 e8 01                 sub    $0x1,%eax
   f:    75 07                    jne    18 <main+0x18>

with no real problem when it's in registers. No "test" instruction after the
sub. Why does that store matter so much?
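
For reference, the two shapes being compared can be sketched in C roughly as
below. This is only an illustration, not the test case from the bug report:
put_ref, repeat and release are hypothetical names, and what gcc actually
emits for them depends on version, target and flags.

```c
/* Hypothetical side effect guarded by the branch. */
static int released;
static void release(void) { released++; }

/* Counter lives in memory: load, decrement, store, branch on zero.
 * This is the shape that picked up the extra "test" in the listing above. */
void put_ref(int *count)
{
    if (--*count == 0)
        release();
}

/* Counter lives in a register: the decrement can feed the branch directly
 * ("sub; jne"), since sub already sets the zero flag. Requires n >= 1. */
void repeat(int n, void (*fn)(void))
{
    do {
        fn();
    } while (--n != 0);
}
```

Both functions test the result of a decrement against zero; the only
difference is whether the decremented value has to go back to memory first.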

It looks like the combine is being driven by the conditional branch, and then
when the previous instruction from the conditional branch is that store,
everything kind of goes to hell.

Would it be possible to have a peephole for the "arithmetic/logical +
compare-with-zero" case (causing us to just drop the compare), and then have a
separate peephole optimization that triggers on the "load + op + store with
dead reg" case and turns that into an "op to mem"?
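
To make the two-step idea concrete, here is a toy model of both rewrites in C.
It is a sketch under loud assumptions: the toy_* types and functions are
made up for illustration and have nothing to do with gcc's RTL or the MD
peephole syntax, register deadness is simply passed in by the caller, and the
flag consumer is assumed to look only at the zero flag (je/jne). Dropping the
test is safe in the listing above because the intervening mov does not modify
the flags.

```c
#include <stddef.h>

/* Toy instruction set, just enough to model the sequence in question. */
enum toy_op { TOY_LOAD, TOY_SUB_IMM, TOY_TEST, TOY_STORE, TOY_SUB_MEM, TOY_JE };

struct toy_insn {
    enum toy_op op;
    int reg;   /* data register, -1 if unused */
    int base;  /* address register for memory operands, -1 if unused */
    int imm;   /* immediate for the subtract forms */
};

/* Remove n instructions starting at index at; returns the new length. */
static size_t toy_delete(struct toy_insn *code, size_t len, size_t at, size_t n)
{
    for (size_t i = at; i + n < len; i++)
        code[i] = code[i + n];
    return len - n;
}

/* Step 1: "sub $imm,%r ; test %r,%r" -> "sub $imm,%r".
 * Assumes every later flag user only needs ZF; a real pass must check that. */
size_t toy_drop_test(struct toy_insn *code, size_t len)
{
    for (size_t i = 0; i + 1 < len; i++) {
        if (code[i].op == TOY_SUB_IMM && code[i + 1].op == TOY_TEST
            && code[i].reg == code[i + 1].reg)
            len = toy_delete(code, len, i + 1, 1);
    }
    return len;
}

/* Step 2: "load (%a),%r ; sub $imm,%r ; store %r,(%a)" with %r dead
 * afterwards -> "sub $imm,(%a)". reg_is_dead stands in for real liveness. */
size_t toy_fuse_rmw(struct toy_insn *code, size_t len, int reg_is_dead)
{
    for (size_t i = 0; i + 2 < len; i++) {
        struct toy_insn *ld = &code[i], *sub = &code[i + 1], *st = &code[i + 2];
        if (reg_is_dead
            && ld->op == TOY_LOAD && sub->op == TOY_SUB_IMM && st->op == TOY_STORE
            && ld->reg == sub->reg && sub->reg == st->reg
            && ld->base == st->base && ld->base != ld->reg) {
            struct toy_insn fused = { TOY_SUB_MEM, -1, ld->base, sub->imm };
            code[i] = fused;
            len = toy_delete(code, len, i + 1, 2);
        }
    }
    return len;
}
```

Running toy_drop_test and then toy_fuse_rmw over the five-instruction
"load, sub, test, store, je" sequence leaves two instructions: the subtract
to memory followed by the je. The order matters in this model, since the
second pattern only matches once the test is out of the way.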

The MD files do make me confused, so maybe there is some fundamental limitation
to the peephole patterns that makes this impossible?