This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Scheduling an early complete loop unrolling pass?


Hi,

currently with -ftree-vectorize we generate for

  for (i=0; i<3; ++i)
  # SFT.4346_507 = VDEF <SFT.4346_504(D)>
  # SFT.4347_508 = VDEF <SFT.4347_505(D)>
  # SFT.4348_509 = VDEF <SFT.4348_506(D)>
    d[i] = 0.0;

  for (j=0; j<n; ++j)
    x[j] = d;

(that is, zero a small vector and use that to initialize an array
of vectors)

<L266>:;
  vect_cst_.4501_723 = { 0.0, 0.0 };
  vect_p.4506_724 = (vector double *) &D.76822;
  vect_p.4502_725 = vect_p.4506_724;

  # ivtmp.4508_728 = PHI <0(6), ivtmp.4508_729(11)>
  # ivtmp.4507_726 = PHI <vect_p.4502_725(6), ivtmp.4507_727(11)>
  # ivtmp.4461_601 = PHI <3(6), ivtmp.4461_485(11)>
  # SFT.4348_612 = PHI <SFT.4348_506(D)(6), SFT.4348_509(11)>
  # SFT.4347_611 = PHI <SFT.4347_505(D)(6), SFT.4347_508(11)>
  # SFT.4346_610 = PHI <SFT.4346_504(D)(6), SFT.4346_507(11)>
  # i_582 = PHI <0(6), i_118(11)>
<L131>:;
  # SFT.4346_507 = VDEF <SFT.4346_610>
  # SFT.4347_508 = VDEF <SFT.4347_611>
  # SFT.4348_509 = VDEF <SFT.4348_612>
  *ivtmp.4507_726 = vect_cst_.4501_723;
  i_118 = i_582 + 1;
  ivtmp.4461_485 = ivtmp.4461_601 - 1;
  ivtmp.4507_727 = ivtmp.4507_726 + 16B;
  ivtmp.4508_729 = ivtmp.4508_728 + 1;
  if (ivtmp.4508_729 < 1) goto <L171>; else goto <L263>;

  # i_722 = PHI <i_118(7)>
  # ivtmp.4461_717 = PHI <ivtmp.4461_485(7)>
<L263>:;

  # ivtmp.4461_706 = PHI <ivtmp.4461_715(10), 1(8)>
  # SFT.4348_707 = PHI <SFT.4348_713(10), SFT.4348_509(8)>
  # SFT.4347_708 = PHI <SFT.4347_712(10), SFT.4347_508(8)>
  # SFT.4346_709 = PHI <SFT.4346_711(10), SFT.4346_507(8)>
  # i_710 = PHI <i_714(10), 2(8)>
<L260>:;
  # SFT.4346_711 = VDEF <SFT.4346_709>
  # SFT.4347_712 = VDEF <SFT.4347_708>
  # SFT.4348_713 = VDEF <SFT.4348_707>
  D.76822.D.44378.values[i_710] = 0.0;
  i_714 = i_710 + 1;
  ivtmp.4461_715 = ivtmp.4461_706 - 1;
  if (ivtmp.4461_715 != 0) goto <L259>; else goto <L264>;

...

and we are later unable to constant-propagate the zeros into the
second loop, which we can do if we first completely unroll such
small loops.
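
To make the difference concrete, here is a minimal source-level sketch
of the two shapes. The type vec3 and the function names are hypothetical
stand-ins (the dump only shows a member called "values"), and nothing
here is compiler output. It illustrates what complete unrolling would
expose, not what any particular GCC pass emits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the small fixed-size vector in the dump. */
typedef struct { double values[3]; } vec3;

/* As written in the source: a 3-iteration init loop followed by the
   loop that copies d into every element of x.  */
void fill_rolled(vec3 *x, size_t n)
{
    assert(x != NULL || n == 0);
    vec3 d;
    for (int i = 0; i < 3; ++i)
        d.values[i] = 0.0;
    for (size_t j = 0; j < n; ++j)
        x[j] = d;
}

/* Hand-unrolled equivalent: the init loop becomes three straight-line
   stores of a constant, so the contents of d are plainly known before
   the second loop starts.  */
void fill_unrolled(vec3 *x, size_t n)
{
    assert(x != NULL || n == 0);
    vec3 d;
    d.values[0] = 0.0;
    d.values[1] = 0.0;
    d.values[2] = 0.0;
    for (size_t j = 0; j < n; ++j)
        x[j] = d;
}
```

Both functions compute the same result; the second form simply has no
loop-carried state between the stores to d and the copy into x.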

As we also only vectorize innermost loops, I believe doing a
complete unrolling pass early will help in general (I pushed
for this some time ago).

Thoughts?

Thanks,
Richard.

-- 
Richard Guenther <rguenther@suse.de>
Novell / SUSE Labs

