This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[Bug tree-optimization/48837] [4.4/4.5/4.6/4.6 Regression] Wrong optimization of recursive function calls


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48837

--- Comment #5 from Steven Bosscher <steven at gcc dot gnu.org> 2011-05-01 19:24:06 UTC ---

Function foo from .143.expand dump:

;; Function int foo(int, int) (_Z3fooii)

int foo(int, int) (int a, int b)
{
  int acc_tmp.13;
  int add_acc.12;
  int D.2091;
  int D.2085;
  int D.2081;

  # BLOCK 2 freq:6950
  # PRED: ENTRY [100.0%]  (fallthru,exec)
  # a_1 = PHI <a_2(D)(0)>
  # b_7 = PHI <b_3(D)(0)>
  # add_acc.12_21 = PHI <0(0)>
  # SUCC: 3 [100.0%]  (fallthru)

  # BLOCK 3 freq:10000
  # PRED: 2 [100.0%]  (fallthru) 5 [100.0%]  (fallthru,dfs_back,exec)
  # a_14 = PHI <a_1(2), a_23(5)>
  # b_22 = PHI <b_7(2), b_25(5)>
  # add_acc.12_19 = PHI <add_acc.12_21(2), add_acc.12_12(5)>
  D.2085_4 = b_22 | a_14;
  if (D.2085_4 != 0)
    goto <bb 4>;
  else
    goto <bb 5>;
  # SUCC: 4 [39.0%]  (true,exec) 5 [61.0%]  (false,exec)

  # BLOCK 4 freq:3900
  # PRED: 3 [39.0%]  (true,exec)
  D.2081_5 = baz (); [tail call]
  acc_tmp.13_10 = D.2081_5 + add_acc.12_19;
  return acc_tmp.13_10;
  # SUCC: EXIT [100.0%]

  # BLOCK 5 freq:3050
  # PRED: 3 [61.0%]  (false,exec)
Invalid sum of incoming frequencies 6100, should be 3050
  D.2091_8 = foo (0, 1);
  add_acc.12_12 = add_acc.12_19 + D.2091_8;
  a_23 = 1;
  b_25 = 0;
  goto <bb 3>;
  # SUCC: 3 [100.0%]  (fallthru,dfs_back,exec)

}
Invalid sum of incoming frequencies 3900, should be 6950

(Note, someone is not updating the CFG properly, see incorrect profile info.
This is caused by VRP, the mismatch appears first in the .vrp1 dump.)


The PHI nodes in basic block 3 show that the "+ foo(1,0)" part of foo() has
been transformed from a recursive call into a loop inside foo.

We enter the function with a==0 and b==0 so control flow is as follows:

BB2 -> BB3 -> BB5 (because (a | b) == 0) -> 
D.2091_8 = foo(0, 1)                                // D.2091_8 = 1
add_acc = 0 (from the PHI in BB2) + 1 = 1
-> BB3 with a=1 and b=0 -> BB4 ->
D.2081_5 = baz () [tail call]                       // D.2081_5 = 1
acc_tmp.13_10 = 1 + add_acc.12_19 = 1 + 1 = 2
return acc_tmp.13_10                                // return 2

That looks correct to me, but foo(0,0) returns 1 instead of 2. Both foo(0,1)
and foo(1,0) both return 1 as expected.

I think the problem is in RTL expansion of the "tail call" in basic block 4,
which doesn't look like a tail call to me. The RTL generated for basic block 4
is this:

;; Generating RTL for gimple basic block 4

;; D.2081_5 = baz (); [tail call]

(call_insn/u/j 13 12 14 4 (set (reg:SI 0 ax)
        (call (mem:QI (symbol_ref:DI ("_Z3bazv") [flags 0x3]  <function_decl
0x7ffff722c100 baz>) [0 baz S1 A8])
            (const_int 0 [0]))) t.cc:14 -1
     (expr_list:REG_EH_REGION (const_int 0 [0])
        (nil))
    (nil))

(barrier 14 13 0)

The "  acc_tmp.13_10 = D.2081_5 + add_acc.12_19;" part is lost.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]