Consider the 2 codes below. int mymax1(int *it, int *end) { int max = *it++; for (; it != end; it++) if (*it > max) max = *it; return max; } int mymax2(int *it, int *end) { int max = *it; while (++it != end) if (*it > max) max = *it; return max; } Compiled with g++ -Ofast, the first one is vectorized but not the second one. It gives up immediately because of "latch block not empty". That seems rather fragile :-( <bb 3>: max_8 = *it_6; max_10 = MAX_EXPR <max_2, max_8>; <bb 4>: # it_1 = PHI <it_6(3), it_4(D)(2)> # max_2 = PHI <max_10(3), max_5(2)> it_6 = it_1 + 4; if (it_6 != end_7(D)) goto <bb 3>; else goto <bb 5>; There are several related bugs (like PR 33447), feel free to mark as a dup if you can identify one for sure. (code taken from http://stackoverflow.com/q/25622109/1918193 apparently there may be other issues with what gcc produces)
Confirmed. For some reason loop header copying is not applied to the 2nd variant.
It looks like do_while_loop_p is fooled by the CFG structure and hits static bool do_while_loop_p (struct loop *loop) { gimple stmt = last_stmt (loop->latch); /* If the latch of the loop is not empty, it is not a do-while loop. */ if (stmt && gimple_code (stmt) != GIMPLE_LABEL) return false; the loop at CH time looks like | v /-> <bb 5> | exit test ---> | | | <bb 3> | ... | | \ <bb 4> -- (empty but with a PHI) phiopt introduces this singleton PHI: <bb 3>: max_8 = *it_6; - if (max_8 > max_2) - goto <bb 5>; - else - goto <bb 4>; + max_10 = MAX_EXPR <max_2, max_8>; <bb 4>: + # max_9 = PHI <max_10(3)> <bb 5>: - # max_9 = PHI <max_2(4), max_8(3)> - - <bb 6>: - # it_1 = PHI <it_6(5), it_4(D)(2)> - # max_2 = PHI <max_9(5), max_5(2)> + # it_1 = PHI <it_6(4), it_4(D)(2)> + # max_2 = PHI <max_9(4), max_5(2)> it looks like the CH predicate relies on copy-propagated form (a latch block with an SSA name copy should also be considered empty IMHO). I think that cfgcleanup should merge <bb A> .... | single-succ/pred <bb B> # var = PHI <arg> and rewrite the PHIs to copies. Its comment even says so: /* Merging the blocks may create new opportunities for folding conditional branches (due to the elimination of single-valued PHI nodes). */ if (single_succ_p (bb) && can_merge_blocks_p (bb, single_succ (bb))) { merge_blocks (bb, single_succ (bb)); return true; } Ah, but /* Checks whether we can merge block B into block A. */ static bool gimple_can_merge_blocks_p (basic_block a, basic_block b) { ... /* Protect the loop latches. */ if (current_loops && b->loop_father->latch == b) return false; triggers here (merge_blocks doesn't handle merging with a latch). Mine.
*** Bug 28643 has been marked as a duplicate of this bug. ***
Fixed.
Author: rguenth Date: Thu Oct 16 10:13:52 2014 New Revision: 216304 URL: https://gcc.gnu.org/viewcvs?rev=216304&root=gcc&view=rev Log: 2014-10-16 Richard Biener <rguenther@suse.de> PR tree-optimization/63168 * tree-cfg.c (gimple_can_merge_blocks_p): Only protect latches if after merging they are no longer simple. * cfghooks.c (merge_blocks): Handle merging a latch block into another block. * gcc.dg/tree-ssa/loop-40.c: New testcase. Added: trunk/gcc/testsuite/gcc.dg/tree-ssa/loop-40.c Modified: trunk/gcc/ChangeLog trunk/gcc/cfghooks.c trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-cfg.c