Bug 65946 - Simple loop with if-statement not vectorized
Summary: Simple loop with if-statement not vectorized
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end
Version: 5.0
Importance: P3 normal
Target Milestone: ---
Assignee: Alan Lawrence
URL:
Keywords: missed-optimization
Depends on:
Blocks: vectorizer
Reported: 2015-04-30 12:00 UTC by Alan Lawrence
Modified: 2015-07-02 11:59 UTC

See Also:
Host:
Target: x86_64
Build:
Known to work:
Known to fail:
Last reconfirmed: 2015-04-30 00:00:00


Description Alan Lawrence 2015-04-30 12:00:44 UTC
This testcase:

#define N 32

int a[N], b[N];

int
foo ()
{
  for (int i = 0; i < N ; i++)
  {
    int m = (a[i] & i) ? 5 : 4;
    b[i] = a[i] * m;
  }
}

does not vectorize at -O3 on x86_64 or other platforms. After dom1, jump threading has partially peeled the first iteration of the loop, giving:


  <bb 2>:
  goto <bb 8>;

  <bb 3>:
  # i_11 = PHI <i_9(6)>
  _5 = a[i_11];
  _6 = i_11 & _5;
  if (_6 != 0)
    goto <bb 4>;
  else
    goto <bb 5>;

  <bb 4>:

  <bb 5>:
  # m_14 = PHI <5(4), 4(3)>

  <bb 6>:
  # m_2 = PHI <m_14(5), 4(8)>
  # _15 = PHI <_5(5), _10(8)>
  # i_16 = PHI <i_11(5), i_1(8)>
  _7 = m_2 * _15;
  b[i_16] = _7;
  i_9 = i_16 + 1;
  if (i_9 != 32)
    goto <bb 3>;
  else
    goto <bb 7>;

  <bb 7>:
  return;

  <bb 8>:
  # i_1 = PHI <0(2)>
  _10 = a[i_1];
  _3 = i_1 & _10;
  goto <bb 6>;

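This form cannot be if-converted, because tree-if-conv.c rejects any loop whose header has an exit edge (here bb 6, which is both the loop header and the block holding the exit test):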

  /* If one of the loop header's edge is an exit edge then do not
     apply if-conversion.  */
  FOR_EACH_EDGE (e, ei, loop->header->succs)
    if (loop_exit_edge_p (loop, e))
      return false;

and even if it could be, the vectorizer cannot handle the PHI nodes at loop entry (m_2, _15 and i_16 in bb 6).
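
For readers less used to GIMPLE dumps, the control flow above can be approximated in C source. This is a hand-written sketch, not compiler output: the names foo_original, foo_threaded, b_ref and check_forms_agree are made up for illustration, and the labels only mirror the basic block numbers in the dump. It shows the first iteration entering the loop in the middle with m already known to be 4 (since a[0] & 0 is always 0), which leaves the loop header holding the exit test.

```c
#include <assert.h>
#include <string.h>

#define N 32

int a[N], b[N], b_ref[N];

/* The loop from the testcase, writing to b_ref for comparison.  */
void
foo_original (void)
{
  for (int i = 0; i < N; i++)
    {
      int m = (a[i] & i) ? 5 : 4;
      b_ref[i] = a[i] * m;
    }
}

/* Hand-written approximation of the dump after jump threading.  */
void
foo_threaded (void)
{
  int i, m, v;

  /* bb 8: partially peeled first iteration.  i is 0, so the
     condition is known to be false and m is the constant 4.  */
  i = 0;
  v = a[i];
  m = 4;
  goto bb6;

bb3:
  /* bb 3 to bb 5: the condition, evaluated for every later iteration.  */
  v = a[i];
  m = (i & v) ? 5 : 4;

bb6:
  /* bb 6: loop header.  It merges the two entry paths (the PHI nodes)
     and also contains the exit test.  */
  b[i] = m * v;
  i = i + 1;
  if (i != N)
    goto bb3;
}

/* Fill a[] with arbitrary values and check both forms agree.  */
void
check_forms_agree (void)
{
  for (int i = 0; i < N; i++)
    a[i] = 3 * i + 1;
  foo_original ();
  foo_threaded ();
  assert (memcmp (b, b_ref, sizeof b) == 0);
}
```

The jump into the middle of the loop body is what produces the three PHI nodes in bb 6: each of m, the loaded value and i has one incoming value from the peeled entry path and one from the loop proper.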
Comment 1 Alan Lawrence 2015-04-30 12:01:56 UTC
Discussion here: https://gcc.gnu.org/ml/gcc/2015-04/msg00351.html

The suggestion is to use loop header copying to rotate the loop into a form that both if-conversion and the vectorizer can handle.
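
As a rough source-level picture of the rotated form, the sketch below writes the testcase loop as a guarded do-while, so that the exit test sits only at the bottom of the loop and the header no longer has an exit edge. It is a hand-written illustration under stated assumptions, not the output of any GCC pass: foo_rotated, b_ref and check_rotated_agrees are hypothetical names, and the mask-based select is only a scalar model of what an if-converted, vectorizable body could look like.

```c
#include <assert.h>
#include <string.h>

#define N 32

int a[N], b[N], b_ref[N];

/* The loop from the testcase, writing to b_ref for comparison.  */
void
foo_original (void)
{
  for (int i = 0; i < N; i++)
    {
      int m = (a[i] & i) ? 5 : 4;
      b_ref[i] = a[i] * m;
    }
}

/* Rotated loop: the copied header test guards entry, and the only
   exit test is at the bottom of the body.  */
void
foo_rotated (void)
{
  int i = 0;

  if (i < N)                          /* copied header condition */
    do
      {
        int v = a[i];
        /* Branch-free select modelling an if-converted body:
           mask is all ones when (v & i) != 0, otherwise zero.  */
        int mask = -((v & i) != 0);
        int m = (5 & mask) | (4 & ~mask);
        b[i] = v * m;
        i++;
      }
    while (i < N);                    /* single exit test in the latch */
}

/* Fill a[] with arbitrary values and check both forms agree.  */
void
check_rotated_agrees (void)
{
  for (int i = 0; i < N; i++)
    a[i] = 3 * i + 1;
  foo_original ();
  foo_rotated ();
  assert (memcmp (b, b_ref, sizeof b) == 0);
}
```

In this shape the body is a single straight-line block with one entry and one exit test, with no values merged from a peeled first iteration.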
Comment 2 Alan Lawrence 2015-07-02 11:59:35 UTC
Author: alalaw01
Date: Thu Jul 2 12:47:31 2015
New Revision: 225311

URL: https://gcc.gnu.org/viewcvs?rev=225311&root=gcc&view=rev
Log:
gcc/:

        * tree-pass.h (make_pass_ch_vect): New.
        * passes.def: Add pass_ch_vect just before pass_if_conversion.

        * tree-ssa-loop-ch.c (ch_base, pass_ch_vect, pass_data_ch_vect,
        pass_ch::process_loop_p, pass_ch_vect::process_loop_p,
        make_pass_ch_vect): New.
        (pass_ch): Extend ch_base.

        (pass_ch::execute): Move all but loop_optimizer_init/finalize to...
        (ch_base::copy_headers): ...here.

gcc/testsuite/:

        * gcc.dg/vect/vect-strided-a-u16-i4.c (main1): Narrow scope of x,y,z,w.
        * gcc.dg/vect/vect-ifcvt-11.c: New testcase.