There are cases in which phiopt doesn't recognize MAX_EXPRs or MIN_EXPRs patterns. In particular, source codes that look very similar at first sight may induce phiopt to behave differently. Let's consider the following two functions: ----------------------------- int minmax_correct(int a) { if (a > 32767) a = 32767; else if (a < -32768) a = -32768; return a; } int minmax_wrong(int a) { if (a > 32767) a = 32767; if (a < -32768) a = -32768; return a; } ----------------------------- MIN_EXPRs and MAX_EXPRs are generated for the first function, but not for the second. Here is the contents of trace file minmax.c.042t.phicprop1: ----------------------------- ;; Function minmax_correct (minmax_correct) minmax_correct (a) { <bb 2>: if (a_2 > 32767) goto <L3>; else goto <L1>; <L1>:; if (a_2 < -32768) goto <L2>; else goto <L3>; <L2>:; # a_1 = PHI <32767(2), a_2(3), -32768(4)>; <L3>:; <retval> = a_1; return <retval>; } ;; Function minmax_wrong (minmax_wrong) Removing basic block 6 minmax_wrong (a) { <bb 2>: if (a_3 > 32767) goto <L6>; else goto <L1>; <L1>:; if (a_3 < -32768) goto <L3>; else goto <L6>; # a_9 = PHI <a_3(3), 32767(2)>; <L6>:; # a_2 = PHI <a_9(4), -32768(3)>; <L3>:; <retval> = a_2; return <retval>; } ----------------------------- And here is minmax.c.043t.phiopt1: ----------------------------- ;; Function minmax_correct (minmax_correct) Removing basic block 4 Removing basic block 3 Merging blocks 2 and 5 minmax_correct (a) { <bb 2>: a_7 = MAX_EXPR <-32768, a_2>; a_8 = MIN_EXPR <a_7, 32767>; <retval> = a_8; return <retval>; } ;; Function minmax_wrong (minmax_wrong) minmax_wrong (a) { <bb 2>: if (a_3 > 32767) goto <L6>; else goto <L1>; <L1>:; if (a_3 < -32768) goto <L3>; else goto <L6>; # a_9 = PHI <a_3(3), 32767(2)>; <L6>:; # a_2 = PHI <a_9(4), -32768(3)>; <L3>:; <retval> = a_2; return <retval>; } -----------------------------
Created attachment 12378 [details] Source code of the example
Actually this is a case where jump threading gets in the way of PHI-OPT. Really PHI-OPT should be put before VRP and DOM, I don't know why it was not.
I tested what happens if the first PHI-OPT pass is moved right before the first VRP pass in gcc/passes.c It looks like PHI-OPT should be run both before and after VRP and DOM. The example reported shows that, when PHI-OPT is run after VRP and DOM, it misses some MAX_EXPR and MIN_EXPR generation. gcc/testsuite/gcc.dg/tree-ssa/phi-opt-5.c shows that, when PHI-OPT anticipates VRP and DOM, it misses some MAX_EXPR and MIN_EXPR generation as well. Should an additional PHI-OPT pass be added before VRP and DOM? (In this case, a few testsuite examples should also be changed in order to consider what happens after PHI-OPT2 instead of PHI-OPT1)
If we do a merge_phi before the phiopt we will find it correctly. I have a few patches which adds another merge_phi right before the last phiopt and adds one right before the first phiopt.
This is the patch which I am testing: Index: gcc/passes.c =================================================================== --- gcc/passes.c (revision 183381) +++ gcc/passes.c (working copy) @@ -1300,11 +1300,11 @@ init_optimization_passes (void) NEXT_PASS (pass_phiprop); NEXT_PASS (pass_fre); NEXT_PASS (pass_copy_prop); - NEXT_PASS (pass_merge_phi); NEXT_PASS (pass_vrp); NEXT_PASS (pass_dce); NEXT_PASS (pass_cselim); NEXT_PASS (pass_tree_ifcombine); + NEXT_PASS (pass_merge_phi); NEXT_PASS (pass_phiopt); NEXT_PASS (pass_tail_recursion); NEXT_PASS (pass_ch); @@ -1401,6 +1401,7 @@ init_optimization_passes (void) NEXT_PASS (pass_late_warn_uninitialized); NEXT_PASS (pass_dse); NEXT_PASS (pass_forwprop); + NEXT_PASS (pass_merge_phi); NEXT_PASS (pass_phiopt); NEXT_PASS (pass_fold_builtins); NEXT_PASS (pass_optimize_widening_mul);
(In reply to comment #5) > This is the patch which I am testing: I Have a slightly different one since we should do a few more things before the mergephi and also we ccannot remove the first mergephi as that causes vpr to lose some optimizations.
(In reply to comment #6) > (In reply to comment #5) > > This is the patch which I am testing: > > I Have a slightly different one since we should do a few more things before the > mergephi and also we ccannot remove the first mergephi as that causes vpr to > lose some optimizations. I think it makes sense to do mergephi before phiopt. mergephi should not be required for VRP - can you file a bug with a testcase?
This may be worth filing as another PR (let me know if you think I should), but another case of VRP stymieing phiopt is: void bar (int); void foo (int a, int b) { if (!b) bar (1); else { int c; if (a) c = a; else c = 0; if (c == b) bar (2); } } Before vrp1 we have: <bb 2> [100.0%]: if (b_3(D) == 0) goto <bb 3>; [29.6%] else goto <bb 4>; [70.4%] <bb 3> [29.6%]: bar (1); goto <bb 8>; [100.0%] <bb 4> [70.4%]: if (a_4(D) != 0) goto <bb 6>; [50.0%] else goto <bb 5>; [50.0%] <bb 5> [35.2%]: <bb 6> [70.4%]: # c_1 = PHI <a_4(D)(4), 0(5)> if (c_1 == b_3(D)) goto <bb 7>; [22.9%] else goto <bb 8>; [77.0%] <bb 7> [16.2%]: bar (2); <bb 8> [100.0%]: return; phiopt would tell that c_1 is equal to a_4, and after making that substitution, cfgcleanup would remove blocks 4 and 5, leaving two comparisons rather than three. However, VRP runs first and threads 4->5 to 4->8, giving: <bb 2> [100.0%]: if (b_3(D) == 0) goto <bb 3>; [29.6%] else goto <bb 4>; [70.4%] <bb 3> [29.6%]: bar (1); goto <bb 7>; [100.0%] <bb 4> [70.4%]: if (a_4(D) != 0) goto <bb 5>; [50.0%] else goto <bb 8>; [50.0%] <bb 5> [35.2%]: # c_1 = PHI <a_4(D)(4)> if (c_1 == b_3(D)) goto <bb 6>; [45.9%] else goto <bb 7>; [54.1%] <bb 6> [16.2%]: bar (2); <bb 7> [100.0%]: return; <bb 8> [35.2%]: # c_11 = PHI <0(4)> goto <bb 7>; [100.0%] We never "recover" from that and so the three comparisons survive into the final output. Removing the first !b comparison leaves no jump threading opportunities, so phiopt kicks in as expected. This is a cut-down version of what happens with things like the safe_as_a calls in NEXT_INSN. E.g. in the basic-block iterator: #define FOR_BB_INSNS(BB, INSN) \ for ((INSN) = BB_HEAD (BB); \ (INSN) && (INSN) != NEXT_INSN (BB_END (BB)); \ (INSN) = NEXT_INSN (INSN)) we never get rid of the p null test in: template <typename T, typename U> inline T safe_as_a (U *p) { if (p) { gcc_checking_assert (is_a <T> (p)); return is_a_helper <T>::cast (p); } else return NULL; } even though in release builds the function should reduce to "return p".
This actually has been fixed fully since r9-3606.
(In reply to rsandifo@gcc.gnu.org from comment #8) > This may be worth filing as another PR (let me know if you > think I should), but another case of VRP stymieing phiopt is: > > void bar (int); > void > foo (int a, int b) > { > if (!b) > bar (1); > else > { > int c; > if (a) > c = a; > else > c = 0; > if (c == b) > bar (2); > } > } This was testcase was fixed on trunk by a combo of patches to tree-ssa-phiopt.c and match.pd (r12-2041, r12-2040, r12-2039, and r12-1152 [there might have been a few more required which I missed]) which allows early phiopt to change if (a) c = a else c = 0; to just c = a.