Bug 118468 - vectorizer: if conversion does not handle early exit well
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 15.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks: vectorizer
Reported: 2025-01-14 07:06 UTC by Andi Kleen
Modified: 2025-01-14 17:23 UTC (History)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2025-01-14 00:00:00


Description Andi Kleen 2025-01-14 07:06:45 UTC
This is forked from PR116126 to cover another early-exit problem.

const unsigned char *search_line_fast2 (const unsigned char *s, const unsigned char *end)
{
        while (s < end) {
                if (*s == '\n' || *s == '\r' 
#ifdef MORE
                                || *s == '\\' || *s == '?'
#endif
                                )
                        return s;
                s++;
        }
        return s;
}

When built with

cc1 -O3 -mavx10.2 search-line-fast-short.c -quiet -fopt-info-all -o x.s -fdump-tree-vect

the loop vectorizes, but with -DMORE=1 added it gives:
search-line-fast-short.c:3:18: missed: couldn't vectorize loop
search-line-fast-short.c:3:18: missed: not vectorized: unsupported control flow in loop.

The message comes from this check in tree-vect-loop.cc:
  /* Check if we have any control flow that doesn't leave the loop.  */
  basic_block *bbs = get_loop_body (loop);
  for (unsigned i = 0; i < loop->num_nodes; i++)
    if (EDGE_COUNT (bbs[i]->succs) != 1
        && (EDGE_COUNT (bbs[i]->succs) != 2
            || !loop_exits_from_bb_p (bbs[i]->loop_father, bbs[i])))
      {
        free (bbs);
        return opt_result::failure_at (vect_location,
                                       "not vectorized:"
                                       " unsupported control flow in loop.\n");



tree-if-conv converts the if, but then generates:


  _13 = _5 | prephitmp_26;
  if (_13 != 0)
    goto <bb 12>; [8.03%]
  else
    goto <bb 6>; [91.97%]

(still inside loop)
  <bb 12> [local count: 83800317]:
  # s_24 = PHI <s_15(5)>
  goto <bb 7>; [100.00%]  -> leaving loop


So the empty basic block with the extra PHI prevents vectorization.

The question is what to do about it. Either if-conv could avoid it, or the vectorizer could handle this case.
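For illustration only, here is a hand-written sketch of the loop shape the question is about: the four comparisons are combined with non-short-circuit `|` so that the body has no internal branch and only the early-exit test remains. The names search_line_branchy and search_line_flat are made up for this sketch. Whether GCC actually vectorizes the second form was not checked here; it only shows the intended equivalence at the source level.

```c
#include <stddef.h>

/* The loop from the report with MORE defined: the short-circuit ||
   chain is what produces control flow inside the loop body.  */
static const unsigned char *
search_line_branchy (const unsigned char *s, const unsigned char *end)
{
  while (s < end)
    {
      if (*s == '\n' || *s == '\r' || *s == '\\' || *s == '?')
        return s;
      s++;
    }
  return s;
}

/* Hypothetical hand-if-converted variant: every compare is evaluated
   unconditionally and OR-ed together, leaving a single early-exit
   test per iteration.  */
static const unsigned char *
search_line_flat (const unsigned char *s, const unsigned char *end)
{
  while (s < end)
    {
      unsigned char c = *s;
      int hit = (c == '\n') | (c == '\r') | (c == '\\') | (c == '?');
      if (hit)
        break;
      s++;
    }
  return s;
}
```

Both functions return a pointer to the first matching byte, or end when nothing matches.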
Comment 1 Richard Biener 2025-01-14 09:10:15 UTC
That bb 12 isn't inside the loop.  The issue is that if-conversion fails
to if-convert

  <bb 3> [local count: 1044213920]:  (loop header)
  # s_15 = PHI <s_10(9), s_7(D)(8)>
  _1 = *s_15;
  if (_1 > 63)
    goto <bb 11>; [50.00%]
  else 
    goto <bb 4>; [50.00%]
    
  <bb 11> [local count: 522106960]:
  goto <bb 5>; [100.00%]

  <bb 4> [local count: 522106960]:
  _14 = (int) _1;
  _17 = 9223372036854785024 >> _14;
  _18 = _17 & 1;
  _19 = _18 == 0;
  _25 = ~_19;

  <bb 5> [local count: 1044213920]:
  # prephitmp_26 = PHI <_25(4), 0(11)>

because

"Can not ifcvt due to multiple exits"

That check is a stop-gap to avoid spending time on if-conversion that would be
useless in the end.  We also have

"basic block after exit bb but before latch"

So basically if-conversion doesn't know how to if-convert between
two exits: it always tries to remove _all_ control flow and does not
know how to keep its hands off exit tests.
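
For readers decoding the GIMPLE quoted in this comment: the constant 9223372036854785024 is a 64-bit mask with bits 10, 13 and 63 set, i.e. '\n', '\r' and '?'. The branch on `_1 > 63` in bb 3 guards the shift, and it is that in-loop branch which if-conversion leaves in place. The C sketch below mirrors that structure. It assumes that `_5` in the earlier dump is the separate compare against '\\' (92, which does not fit in the mask); the dump does not show its definition. The function names are made up.

```c
#include <stdint.h>

/* Plain form of the -DMORE predicate from the report.  */
static int
is_stop_plain (unsigned char c)
{
  return c == '\n' || c == '\r' || c == '\\' || c == '?';
}

/* Bit-test form mirroring bb 3/4/5: characters above 63 skip the
   shift (the 0 PHI argument coming from bb 11), the others index
   into the mask.  */
static int
is_stop_bittest (unsigned char c)
{
  const uint64_t mask = 9223372036854785024ULL; /* bits 10, 13, 63 */
  int low = 0;                                  /* prephitmp_26 */
  if (c <= 63)
    low = (int) ((mask >> c) & 1);
  /* Assumption: _5 is the '\\' compare, OR-ed in as _13.  */
  return low | (c == '\\');
}

/* Returns 1 if both forms agree for every byte value.  */
static int
stop_predicates_agree (void)
{
  for (int c = 0; c < 256; c++)
    if (is_stop_plain ((unsigned char) c)
        != is_stop_bittest ((unsigned char) c))
      return 0;
  return 1;
}
```

The `if (c <= 63)` here corresponds to the two-successor block whose edges both stay inside the loop, which is the shape the "unsupported control flow in loop" check rejects.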
Comment 2 Andi Kleen 2025-01-14 17:23:31 UTC
Thanks. That makes sense. Fixing the summary.