Bug 37810 - Bad store sinking job
Summary: Bad store sinking job
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 4.4.0
Importance: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
Keywords: alias, missed-optimization
Depends on:
Reported: 2008-10-12 15:13 UTC by Carlo Wood
Modified: 2009-04-03 12:34 UTC
CC List: 4 users

See Also:
Known to work:
Known to fail:
Last reconfirmed: 2009-04-03 12:34:44


Description Carlo Wood 2008-10-12 15:13:31 UTC
The following code snippet:

void g();

struct A {
  int n;
  int m;

  A& operator++(void)
  {
    if (__builtin_expect(n == m, false))
      g();
    else
      ++n;
    return *this;
  }

  A() : n(0), m(0) { }

  friend bool operator!=(A const& a1, A const& a2) { return a1.n != a2.n; }
};

void testfunction(A& iter)
{
  A const end;
  while (iter != end)
    ++iter;
}
Results in the following assembly code, using maximum optimization:

        movl    (%rdi), %eax
        jmp     .L6
.L4:
        cmpl    %eax, 4(%rdi)     // n == m ?
        je      .L8               // unlikely jump
        addl    $1, %eax          // ++n
        movl    %eax, (%rdi)      // *** store result to memory ***
.L6:
        testl   %eax, %eax        // iter != end ?
        jne     .L4               // continue while loop

The storing (back) of %eax to (%rdi) remains inside the inner
loop no matter what I try. It could/should be moved outside
the loop, since nothing inside the L4 loop is accessing (%rdi)
or could possibly be accessing that memory.

This loop has two exits: below the last jne .L4, and the
jump to .L8. The store could be sunk to both exits.
This grows the code, but it seems reasonable to do for
a loop with a very small body, especially if one of the
exits is marked as unlikely :p.
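
For illustration, here is a minimal hand-written sketch of the transformation being asked for, with the store sunk to both loop exits. It is not compiler output. The names testfunction_ref and testfunction_sunk, the globals g_target, g_seen_n and g_calls, and the body of g() are all assumptions made up for this example; in the report g() is only declared and is opaque to the compiler.

```cpp
#include <cassert>

struct A {
  int n;
  int m;
};

// Hypothetical stand-in for the opaque g(): it records the value of n that
// is visible in memory at the call and changes m so the loop can progress.
A* g_target = nullptr;
int g_seen_n = 0;
int g_calls = 0;

void g()
{
  assert(g_target != nullptr);
  ++g_calls;
  g_seen_n = g_target->n;
  g_target->m = 1000;
}

// The loop from the report, with operator++ and operator!= inlined by hand
// (end.n is always 0).
void testfunction_ref(A& iter)
{
  while (iter.n != 0) {
    if (__builtin_expect(iter.n == iter.m, false))
      g();
    else
      ++iter.n;              // load, add, store on every iteration
  }
}

// Same behaviour, but n lives in a local and is written back only at the
// two loop exits.
void testfunction_sunk(A& iter)
{
  int n = iter.n;            // single load before the loop
  while (n != 0) {
    if (__builtin_expect(n == iter.m, false)) {
      iter.n = n;            // store sunk to the unlikely exit, before the call
      g();                   // may read or write iter.n
      n = iter.n;            // reload, since g() may have changed it
    } else {
      ++n;                   // no memory traffic in the hot path
    }
  }
  iter.n = n;                // store sunk to the normal exit
}
```

With this stub, both versions leave the same final state and g() observes the same value of n, which is the property the sunk stores have to preserve.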
Comment 1 Richard Biener 2008-10-12 15:20:19 UTC
store-sinking doesn't do its job because it thinks that ref 0 is dependent:

Memory reference 0: iter_1(D)->n
Memory reference 1: iter_1(D)->m
Querying dependencies of ref 0 in loop 1: dependent
Comment 2 Richard Biener 2008-10-12 15:25:41 UTC
The original testcase (from an IRC discussion) reduced to a C testcase is:

struct A {
  int n;
  int m;
};

void g();

void test (struct A* iter)
{
  struct A end = { 0, 0 };
  while (iter->n != end.n)
    {
      iter->n = iter->n + 1;
      if (iter->n == iter->m)
        g();
    }
}

where there is an optimization possibility to sink the store to iter->n to
before the call and apply load-store motion to iter->n for the remaining loop.
Comment 3 Richard Biener 2008-10-12 15:29:17 UTC
It looks like the testcase in comment #2 should be fixed by SSUPRE?  We have

  *p = ...;
  if ()
    foo();

where foo() is an "implicit" store to *p.  Still store sinking should be applied
to the subloop.
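
To illustrate why the call has to be treated as an implicit access to *p, here is a minimal sketch. The global shared, the body of foo() and both function names are assumptions invented for this example; the point is only that an opaque callee may reach the same object, so the store cannot simply be delayed past the call.

```cpp
#include <cassert>

// Hypothetical global through which foo() can reach the object p points to.
int* shared = nullptr;

// Stand-in for an opaque call. The compiler sees only a declaration and has
// to assume a body like this one, which reads and writes *shared.
void foo()
{
  if (shared)
    *shared += 10;
}

// The store is ahead of the conditional call, as in the snippet above.
int store_before_call(int* p, bool cond)
{
  assert(p != nullptr);
  *p = 1;
  if (cond)
    foo();                   // sees the stored value and may modify it
  return *p;
}

// What an incorrect transformation would produce: the store is moved past
// the call, so foo() no longer sees it and its update is overwritten.
int store_after_call(int* p, bool cond)
{
  assert(p != nullptr);
  if (cond)
    foo();
  *p = 1;
  return *p;
}
```

When p and shared point to the same int and the call is taken, the two functions return different values, so the store may only be sunk onto the path that does not contain the call, or to just before the call.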
Comment 4 Carlo Wood 2008-10-12 15:32:56 UTC
Note that the original code was:

  A& operator++(void)
  {
    ++n;
    if (__builtin_expect(n == m, false))
      g();
    return *this;
  }

but g++ fails to optimize that by decrementing m outside
the loop (so I'm decrementing m myself now and use the
former code). The former code has one advantage, namely
that the result of the addl $1,%eax can be used for the
conditional jump. However, gcc ALSO doesn't do that: in
the above assembly it follows the add with a redundant
testl %eax,%eax.

Anyway, using the operator++ given in this comment,
the assembly code is:

        movl    (%rdi), %eax
        jmp     .L3
.L4:
        addl    $1, %eax
        cmpl    4(%rdi), %eax
        movl    %eax, (%rdi)
        je      .L8
.L3:
        testl   %eax, %eax
        jne     .L4

which is essentially the same, except now the
testl %eax,%eax is indeed "needed" ...
Comment 5 Richard Biener 2009-04-03 12:34:44 UTC