[Bug tree-optimization/48837] [4.4/4.5/4.6/4.6 Regression] Wrong optimization of recursive function calls

steven at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Sun May 1 19:27:00 GMT 2011


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48837

--- Comment #5 from Steven Bosscher <steven at gcc dot gnu.org> 2011-05-01 19:24:06 UTC ---

Function foo from .143.expand dump:

;; Function int foo(int, int) (_Z3fooii)

int foo(int, int) (int a, int b)
{
  int acc_tmp.13;
  int add_acc.12;
  int D.2091;
  int D.2085;
  int D.2081;

  # BLOCK 2 freq:6950
  # PRED: ENTRY [100.0%]  (fallthru,exec)
  # a_1 = PHI <a_2(D)(0)>
  # b_7 = PHI <b_3(D)(0)>
  # add_acc.12_21 = PHI <0(0)>
  # SUCC: 3 [100.0%]  (fallthru)

  # BLOCK 3 freq:10000
  # PRED: 2 [100.0%]  (fallthru) 5 [100.0%]  (fallthru,dfs_back,exec)
  # a_14 = PHI <a_1(2), a_23(5)>
  # b_22 = PHI <b_7(2), b_25(5)>
  # add_acc.12_19 = PHI <add_acc.12_21(2), add_acc.12_12(5)>
  D.2085_4 = b_22 | a_14;
  if (D.2085_4 != 0)
    goto <bb 4>;
  else
    goto <bb 5>;
  # SUCC: 4 [39.0%]  (true,exec) 5 [61.0%]  (false,exec)

  # BLOCK 4 freq:3900
  # PRED: 3 [39.0%]  (true,exec)
  D.2081_5 = baz (); [tail call]
  acc_tmp.13_10 = D.2081_5 + add_acc.12_19;
  return acc_tmp.13_10;
  # SUCC: EXIT [100.0%]

  # BLOCK 5 freq:3050
  # PRED: 3 [61.0%]  (false,exec)
Invalid sum of incoming frequencies 6100, should be 3050
  D.2091_8 = foo (0, 1);
  add_acc.12_12 = add_acc.12_19 + D.2091_8;
  a_23 = 1;
  b_25 = 0;
  goto <bb 3>;
  # SUCC: 3 [100.0%]  (fallthru,dfs_back,exec)

}
Invalid sum of incoming frequencies 3900, should be 6950

(Note, someone is not updating the CFG properly, see incorrect profile info.
This is caused by VRP, the mismatch appears first in the .vrp1 dump.)


The PHI nodes in basic block 3 show that the "+ foo(1,0)" part of foo() has
been transformed from a recursive call into a loop inside foo.

We enter the function with a==0 and b==0 so control flow is as follows:

BB2 -> BB3 -> BB5 (because (a | b) == 0) -> 
D.2091_8 = foo(0, 1)                                // D.2091_8 = 1
add_acc = 0 (from the PHI in BB2) + 1 = 1
-> BB3 with a=1 and b=0 -> BB4 ->
D.2081_5 = baz () [tail call]                       // D.2081_5 = 1
acc_tmp.13_10 = 1 + add_acc.12_19 = 1 + 1 = 2
return acc_tmp.13_10                                // return 2

That looks correct to me, but foo(0,0) returns 1 instead of 2. Both foo(0,1)
and foo(1,0) both return 1 as expected.

I think the problem is in RTL expansion of the "tail call" in basic block 4,
which doesn't look like a tail call to me. The RTL generated for basic block 4
is this:

;; Generating RTL for gimple basic block 4

;; D.2081_5 = baz (); [tail call]

(call_insn/u/j 13 12 14 4 (set (reg:SI 0 ax)
        (call (mem:QI (symbol_ref:DI ("_Z3bazv") [flags 0x3]  <function_decl
0x7ffff722c100 baz>) [0 baz S1 A8])
            (const_int 0 [0]))) t.cc:14 -1
     (expr_list:REG_EH_REGION (const_int 0 [0])
        (nil))
    (nil))

(barrier 14 13 0)

The "  acc_tmp.13_10 = D.2081_5 + add_acc.12_19;" part is lost.



More information about the Gcc-bugs mailing list