[Bug tree-optimization/56812] Simple loop is not SLP-vectorized after r196872

ysrumyan at gmail dot com gcc-bugzilla@gcc.gnu.org
Mon Apr 8 14:03:00 GMT 2013


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812

--- Comment #12 from Yuri Rumyantsev <ysrumyan at gmail dot com> 2013-04-08 14:03:45 UTC ---
Richard,

We found another issue related to your fix (r196872): for the attached
test case t1.c, vect_gen_niters_for_prolog_loop() now uses a non-invariant
pointer (v1) to compute the number of prolog iterations, whereas before your
fix it used an invariant pointer (x), so all of these computations could be
hoisted out of the outermost loop:

Before your fix:

  <bb 6>:
  niters.3_17 = (unsigned int) len_7;
  vect_px.4_4 = x_24(D);
  _119 = (unsigned long) vect_px.4_4;
  _118 = _119 & 31;
  _117 = _118 >> 2;
  _116 = -_117;
  _115 = (unsigned int) _116;
  _114 = _115 & 7;
  prolog_loop_niters.5_52 = MIN_EXPR <niters.3_17, _114>;

After your fix:

  <bb 6>:
  niters.3_17 = (unsigned int) len_7;
  vect_pv1.4_4 = v1_16;
  _119 = (unsigned long) vect_pv1.4_4;

This leads to a 7% performance regression on 482.sphinx3 from SPEC CPU2006,
because the prolog computation is no longer hoisted and the outer loop's
iteration count (4096) is much greater than the inner loop's (13).
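
For reference, the arithmetic in the "before" dump can be written out in C.
This is only a sketch of what the GIMPLE statements compute, not the code in
vect_gen_niters_for_prolog_loop(). The function name prolog_niters and its
parameters are made up for illustration. The constants 31, 2 and 7 are taken
from the dump, and I read them as 32-byte alignment, 4-byte elements and a
vectorization factor of 8.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the GIMPLE sequence from the dump:
     _118 = addr & 31;  _117 = _118 >> 2;  _116 = -_117;
     _114 = (unsigned) _116 & 7;  MIN_EXPR <niters, _114>
   addr   - address of the first element accessed by the loop
   niters - total number of scalar iterations of the loop  */
static unsigned int
prolog_niters (uintptr_t addr, unsigned int niters)
{
  unsigned long misalign_bytes = (unsigned long) addr & 31; /* offset in a 32-byte block */
  unsigned long misalign_elems = misalign_bytes >> 2;       /* same offset in 4-byte elements */
  unsigned int peel = (unsigned int) (-misalign_elems) & 7; /* elements up to the next boundary */

  return niters < peel ? niters : peel;                     /* MIN_EXPR <niters, peel> */
}
```

When addr depends only on x, the whole function is loop-invariant and needs
to be evaluated once. When it depends on v1, which changes in the outer loop,
it has to be re-evaluated on each outer iteration.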

This can be reproduced with the following options:

  -O3 -funroll-loops -ffast-math -march=corei7 -mavx
