I found this while looking into PR 8361 for missed optimization. The following two programs should produce the same asm: int f(int k, int i1, int j1) { int *f1; if(k) f1 = &i1; else f1 = &j1; return *f1; } int g(int k, int i1, int j1) { int *f1; if(k) return i1; else return j1; }
...and of course you are going to show what you *actually* get now?
This is what I get with the mainline on ppc-darwin: _f: cmpwi cr7,r3,0 stw r4,28(r1) stw r5,32(r1) addi r3,r1,28 beq- cr7,L7 lwz r3,0(r3) blr L7: addi r3,r1,32 lwz r3,0(r3) blr .align 2 .globl _g _g: cmpwi cr7,r3,0 mr r3,r4 bnelr+ cr7 mr r3,r5 blr Note the stores in the f case but non otherwise.
This comes from "std::_Deque_base<_Tp, _Alloc>::_M_initialize_map" in libstdc++. Here is the last SSA tree dump: f (k, i1, j1) { int * f1; int D.1116; <bb 0>: if (k_2 != 0) goto <L2>; else goto <L4>; <L4>:; # f1_1 = PHI <&i1(0), &j1(1)>; <L2>:; # VUSE <i1_7>; # VUSE <j1_8>; D.1116_3 = *f1_1; return D.1116_3; } And the .final_cleanup one: f (k, i1, j1) { int * f1; <bb 0>: if (k != 0) goto <L5>; else goto <L4>; <L5>:; f1 = &i1; goto <bb 3> (<L2>); <L4>:; f1 = &j1; <L2>:; return *f1; }
Subject: Re: New: missed optimization with ifs and deferencing On Thu, 2005-01-13 at 21:38 +0000, pinskia at gcc dot gnu dot org wrote: > I found this while looking into PR 8361 for missed optimization. > The following two programs should produce the same asm: > int f(int k, int i1, int j1) > { > int *f1; > if(k) > f1 = &i1; > else > f1 = &j1; > return *f1; > } The only way you are going to get this to optimize is to make it see we are derefencing a phi whose arguments are all is_gimple_min_invariant, substitute the value in, merge it, and then use the that at the derefence point. This is actually just a special case of noticing that we are really doing *&<something> in both cases, and removing the *& part, while leaving the rest. IE transform it like so Step 0: f (k, i1, j1) { int * f1; int D.1116; <bb 0>: if (k_2 != 0) goto <L2>; else goto <L4>; <L4>:; # f1_1 = PHI <&i1(0), &j1(1)>; <L2>:; # VUSE <i1_7>; # VUSE <j1_8>; D.1116_3 = *f1_1; return D.1116_3; } Step 1 (new blocks added for clarity) f (k, i1, j1) { int * f1; int D.1116; <bb 0>: if (k_2 != 0) goto <L3>; else goto <L4>; <L4>:; newtemp_2 = i1; goto L2 <L3> newtemp_1 = j1; f1_1 = PHI<&i1(0), &j1(1)> <L2>:; # VUSE <i1_7>; # VUSE <j1_8>; D.1116_3 = *f1_1; return D.1116_3; } Step 2: f (k, i1, j1) { int * f1; int D.1116; <bb 0>: if (k_2 != 0) goto <L3>; else goto <L4>; <L4>:; newtemp_2 = i1; goto L2 <L3> newtemp_1 = j1; newphi = PHI<newtemp_2(0), newtemp_1(1)> <L2>:; # VUSE <i1_7>; # VUSE <j1_8>; D.1116_3 = *f1_1; return D.1116_3; } Step 3: f (k, i1, j1) { int * f1; int D.1116; <bb 0>: if (k_2 != 0) goto <L3>; else goto <L4>; <L4>:; newtemp_2 = i1; goto L2 <L3> newtemp_1 = j1; newphi = PHI<newtemp_2(0), newtemp_1(1)> <L2>:; D.1116_3 = newphi; return D.1116_3; } Notice that all i've done is simply match up that the dereference was of a phi node whoes arguments are all &something. In that specific case, we can remove all the addressing operations and the dereference use, and just replace it with the values.
I should mention this shows up with std::min/std::max like: int main(int argc) { int a = std::min(8, argc*2); return a; }
*** Bug 30261 has been marked as a duplicate of this bug. ***
Maybe we can do something with this in PHIopt...
*** Bug 30738 has been marked as a duplicate of this bug. ***
I have a patch^Whack.
one more testcase: #include <tr1/functional> #include <algorithm> extern void assign( long* variable, long v ) { std::transform( variable, variable + 1, variable, std::tr1::bind( std::plus< long >(), 0L, v ) ); } extern void assign( long& variable, long v ) { std::transform( &variable, &variable + 1, &variable, std::tr1::bind( std::plus< long >(), 0L, v ) ); } at -O3 gcc 4.2 produces: assign(long&, long): movq -40(%rsp), %rax movq %rsi, -24(%rsp) movq $0, -32(%rsp) movq $0, -64(%rsp) movq %rsi, -56(%rsp) movq %rsi, (%rdi) movq %rax, -72(%rsp) ret assign(long*, long): movq %rsi, (%rdi) ret
Bonus points if you can make that self-contained ;)
(In reply to comment #11) > Bonus points if you can make that self-contained ;) -fdump-tree-optimized shows the same code for both variants and there is the if-with-dereferencing :) (...) if (variable != variable + 8B) goto <L6>; else goto <L2>; (...)
> if (variable != variable + 8B) goto <L6>; else goto <L2>; That issue was fixed just yesterday by PR 30951.
Not really as this is only exposed by TER. Though I remember forwprop catching similiar stuff before. I'll have a look.
We start with D.59982_3 = variable_2(D) + 4B; __unary_op = __unary_op.65; goto <bb 4> (<L1>); <L0>:; D.61146_10 = __unary_op._M_arg1; D.61147_11 = __unary_op._M_arg2; D.61093_12 = D.61146_10 + D.61147_11; *variable_13 = D.61093_12; variable_14 = variable_7 + 4B; variable_15 = variable_13 + 4B; # variable_7 = PHI <variable_2(D)(2), variable_14(3)> # variable_13 = PHI <variable_2(D)(2), variable_15(3)> <L1>:; if (D.59982_3 != variable_7) goto <L0>; else goto <L2>; and loop header copying and DOM make it optimizable: D.59982_3 = variable_2(D) + 4B; __unary_op.65._M_arg2 = v_1(D); __unary_op.65._M_arg1 = 0; __unary_op = __unary_op.65; __unary_op$_M_arg2_54 = __unary_op._M_arg2; __unary_op$_M_arg1_53 = __unary_op._M_arg1; if (variable_2(D) != D.59982_3) goto <L7>; else goto <L2>; <L2>:; return; # variable_49 = PHI <variable_2(D)(2)> # variable_27 = PHI <variable_2(D)(2)> <L7>:; __unary_op$_M_arg1_25 = __unary_op$_M_arg1_53; __unary_op$_M_arg2_17 = __unary_op$_M_arg2_54; D.61093_18 = __unary_op$_M_arg1_53 + __unary_op$_M_arg2_54; *variable_49 = D.61093_18; variable_20 = variable_27 + 4B; variable_21 = variable_49 + 4B; goto <bb 3> (<L2>); so a tweaked forwprop fixes it at least before PRE. But dunno if we want to do tree combining on conditions here.
Created attachment 13109 [details] patch to fix testcase in comment 10
I split the issue in comment #10 to PR30965 as it is a slightly different issue.
Subject: Bug 19431 Author: rguenth Date: Wed Apr 18 12:45:09 2007 New Revision: 123946 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=123946 Log: 2007-04-18 Richard Guenther <rguenther@suse.de> PR tree-optimization/19431 PR tree-optimization/21463 * tree-pass.h (pass_phiprop): Declare. * passes.c (init_optimization_passes): New phiprop pass. * tree-ssa-forwprop.c (struct phiprop_d): New structure. (phivn_valid_p): New helper function. (phiprop_insert_phi): Likewise. (propagate_with_phi): Likewise. (tree_ssa_phiprop): New propagator propagating loads through phi nodes if profitable. * gcc.c-torture/execute/20070212-1.c: New testcase. * gcc.c-torture/execute/20070212-2.c: Likewise. * gcc.c-torture/execute/20070212-3.c: Likewise. * gcc.dg/tree-ssa/pr19431.c: Likewise. * gcc.dg/tree-ssa/pr21463.c: Likewise. * g++.dg/tree-ssa/pr21463.C: Likewise. * g++.dg/tree-ssa/pr30738.C: Likewise. Added: trunk/gcc/testsuite/g++.dg/tree-ssa/pr21463.C trunk/gcc/testsuite/g++.dg/tree-ssa/pr30738.C trunk/gcc/testsuite/gcc.c-torture/execute/20070212-1.c trunk/gcc/testsuite/gcc.c-torture/execute/20070212-2.c trunk/gcc/testsuite/gcc.c-torture/execute/20070212-3.c trunk/gcc/testsuite/gcc.dg/tree-ssa/pr19431.c trunk/gcc/testsuite/gcc.dg/tree-ssa/pr21463.c Modified: trunk/gcc/ChangeLog trunk/gcc/passes.c trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-pass.h trunk/gcc/tree-ssa-forwprop.c
Fixed.
(In reply to comment #19) > Fixed. > will it be backported to 4.2? it improves perf. of big stl-based apps.
Ha, sure not ;)
(In reply to comment #21) > Ha, sure not ;) > and wait another years for 4.3 release. it sux.
(In reply to comment #22) > and wait another years for 4.3 release. it sux. GCC 4.3 is the one of the few compilers to do this optimization.