This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[PATCH GCC]Improve alias check code generation in vectorizer


Hi,
Taking subroutine "DACOP" from spec2k/200.sixtrack as an example, the loop needs to be versioned for vectorization because of possible aliasing.  The alias checks for the data references look like:

//pair 1
dr_a:
(Data Ref: 
  bb: 8 
  stmt: _92 = da.cc[_27];
  ref: da.cc[_27];
)
dr_b:
(Data Ref: 
  bb: 8 
  stmt: da.cc[_93] = _92;
  ref: da.cc[_93];
)
//pair 2
dr_a:
(Data Ref: 
  bb: 8 
  stmt: pretmp_29 = da.i2[_27];
  ref: da.i2[_27];
)
dr_b:
(Data Ref: 
  bb: 8 
  stmt: da.i2[_93] = pretmp_29;
  ref: da.i2[_93];
)
//pair 3
dr_a:
(Data Ref: 
  bb: 8 
  stmt: pretmp_28 = da.i1[_27];
  ref: da.i1[_27];
)
dr_b:
(Data Ref: 
  bb: 8 
  stmt: da.i1[_93] = pretmp_28;
  ref: da.i1[_93];
)

The code generated for the alias checks is as below:

  <bb 23>:
  # iy_186 = PHI <_413(22), 2(2)>
  # ivtmp_1050 = PHI <ivtmp_1049(22), 512(2)>
  _155 = iy_186 + -2;
  _156 = _155 * 516;
  _241 = iy_186 + -1;
  _242 = _241 * 516;
  _328 = iy_186 * 516;
  _413 = iy_186 + 1;
  _414 = _413 * 516;
  _499 = iy_186 + 2;
  _500 = _499 * 516;
  _998 = iy_186 * 516;
  _997 = (sizetype) _998;
  _996 = _997 + 6;
  _995 = _996 * 4;
  _994 = global_Output.2_16 + _995;
  _993 = iy_186 * 516;
  _992 = (long unsigned int) _993;
  _991 = _992 * 4;
  _990 = _991 + 18446744073709547488;
  _989 = global_Input.0_153 + _990;
  _886 = _989 >= _994;
  _885 = iy_186 * 516;
  _884 = (sizetype) _885;
  _883 = _884 + 1040;
  _882 = _883 * 4;
  _881 = global_Input.0_153 + _882;
  _880 = (sizetype) _998;
  _879 = _880 + 2;
  _878 = _879 * 4;
  _877 = global_Output.2_16 + _878;
  _876 = _877 >= _881;
  _875 = _876 | _886;
  _874 = iy_186 * 516;
  _873 = (sizetype) _874;
  _872 = _873 + 514;
  _871 = _872 * 4;
  _870 = global_Output.2_16 + _871;
  _869 = local_Filter_33 >= _870;
  _868 = local_Filter_33 + 100;
  _867 = (sizetype) _874;
  _866 = _867 + 2;
  _865 = _866 * 4;
  _864 = global_Output.2_16 + _865;
  _863 = _864 >= _868;
  _862 = _863 | _869;
  _861 = _862 & _875;
  if (_861 != 0)
    goto <bb 7>;
  else
    goto <bb 4>;

It contains quite a lot of redundant computation.  The root cause is that the vectorizer simply translates alias checks into comparisons of full address expressions, and the CSE opportunities are hidden by the folder.  This patch improves function vect_create_cond_for_alias_checks by simplifying the comparison: it compares DR_BASE_ADDRESS/DR_INIT of both data references at compilation time.  It also simplifies conditions of the form:
  (addr_a_min + addr_a_length) <= addr_b_min || (addr_b_min + addr_b_length) <= addr_a_min
into the form below:
  cond_expr = addr_b_min - addr_a_min
  cond_expr >= addr_a_length || cond_expr <= -addr_b_length
if the comparison is done in a signed type.  This can be further simplified by the folder if addr_a_length and addr_b_length are equal/constant, which is quite common.
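
To make the rewrite concrete, here is a minimal standalone C sketch of the two forms of the condition.  It is not code from the patch: the function names and the brute-force checker are made up for illustration, and addresses are modelled as intptr_t values rather than trees.  The third function shows one possible way the equal-length case could fold into a single unsigned range test; that folding is an assumption about what the folder may do, not something this message claims.

```c
#include <stdbool.h>
#include <stdint.h>

/* Original form: two additions and two comparisons on full addresses.
   True when segments [a_min, a_min + a_len) and [b_min, b_min + b_len)
   do not overlap.  */
static bool no_alias_full(intptr_t a_min, intptr_t a_len,
                          intptr_t b_min, intptr_t b_len)
{
  return a_min + a_len <= b_min || b_min + b_len <= a_min;
}

/* Simplified form: one subtraction, compared in a signed type.  */
static bool no_alias_diff(intptr_t a_min, intptr_t a_len,
                          intptr_t b_min, intptr_t b_len)
{
  intptr_t cond_expr = b_min - a_min;
  return cond_expr >= a_len || cond_expr <= -b_len;
}

/* Hypothetical further folding when both lengths equal LEN (LEN > 0):
   the two signed comparisons collapse into one unsigned range check.  */
static bool no_alias_equal_len(intptr_t a_min, intptr_t b_min, intptr_t len)
{
  intptr_t cond_expr = b_min - a_min;
  return (uintptr_t) (cond_expr + len - 1) >= (uintptr_t) (2 * len - 1);
}

/* Brute-force comparison of the three forms over a small range of
   addresses and lengths.  Returns the number of disagreements.  */
static int count_mismatches(void)
{
  int mismatches = 0;
  for (intptr_t a_min = -8; a_min <= 8; a_min++)
    for (intptr_t b_min = -8; b_min <= 8; b_min++)
      for (intptr_t a_len = 1; a_len <= 6; a_len++)
        for (intptr_t b_len = 1; b_len <= 6; b_len++)
          {
            bool full = no_alias_full(a_min, a_len, b_min, b_len);
            if (full != no_alias_diff(a_min, a_len, b_min, b_len))
              mismatches++;
            if (a_len == b_len
                && full != no_alias_equal_len(a_min, b_min, a_len))
              mismatches++;
          }
  return mismatches;
}
```

Calling count_mismatches() from a small driver should report zero disagreements over the tested range.  The sketch ignores overflow of the address arithmetic, which is why the signed-type requirement above matters in the real code.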
I looked into the generated assembly; this patch does introduce small regressions in some cases, but overall I think it's good.  Bootstrapped and tested on x86_64 and AArch64.  Is it OK?
Thanks,
bin

2016-06-08  Bin Cheng  <bin.cheng@arm.com>

	* tree-vect-loop-manip.c (vect_create_cond_for_alias_checks): New
	parameter.  Simplify alias check conditions at compilation time.
	(vect_loop_versioning): Pass new argument to above function.

Attachment: alias-check-condition-20160611.txt

