This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Hi,
Take subroutine "DACOP" from spec2k/200.sixtrack as an example: the loop needs to be versioned for vectorization because of possible aliasing. The alias checks for the data references look like:
//pair 1
dr_a:
(Data Ref:
bb: 8
stmt: _92 = da.cc[_27];
ref: da.cc[_27];
)
dr_b:
(Data Ref:
bb: 8
stmt: da.cc[_93] = _92;
ref: da.cc[_93];
)
//pair 2
dr_a:
(Data Ref:
bb: 8
stmt: pretmp_29 = da.i2[_27];
ref: da.i2[_27];
)
dr_b:
(Data Ref:
bb: 8
stmt: da.i2[_93] = pretmp_29;
ref: da.i2[_93];
)
//pair 3
dr_a:
(Data Ref:
bb: 8
stmt: pretmp_28 = da.i1[_27];
ref: da.i1[_27];
)
dr_b:
(Data Ref:
bb: 8
stmt: da.i1[_93] = pretmp_28;
ref: da.i1[_93];
)
The code generated for the alias checks is as below:
<bb 23>:
# iy_186 = PHI <_413(22), 2(2)>
# ivtmp_1050 = PHI <ivtmp_1049(22), 512(2)>
_155 = iy_186 + -2;
_156 = _155 * 516;
_241 = iy_186 + -1;
_242 = _241 * 516;
_328 = iy_186 * 516;
_413 = iy_186 + 1;
_414 = _413 * 516;
_499 = iy_186 + 2;
_500 = _499 * 516;
_998 = iy_186 * 516;
_997 = (sizetype) _998;
_996 = _997 + 6;
_995 = _996 * 4;
_994 = global_Output.2_16 + _995;
_993 = iy_186 * 516;
_992 = (long unsigned int) _993;
_991 = _992 * 4;
_990 = _991 + 18446744073709547488;
_989 = global_Input.0_153 + _990;
_886 = _989 >= _994;
_885 = iy_186 * 516;
_884 = (sizetype) _885;
_883 = _884 + 1040;
_882 = _883 * 4;
_881 = global_Input.0_153 + _882;
_880 = (sizetype) _998;
_879 = _880 + 2;
_878 = _879 * 4;
_877 = global_Output.2_16 + _878;
_876 = _877 >= _881;
_875 = _876 | _886;
_874 = iy_186 * 516;
_873 = (sizetype) _874;
_872 = _873 + 514;
_871 = _872 * 4;
_870 = global_Output.2_16 + _871;
_869 = local_Filter_33 >= _870;
_868 = local_Filter_33 + 100;
_867 = (sizetype) _874;
_866 = _867 + 2;
_865 = _866 * 4;
_864 = global_Output.2_16 + _865;
_863 = _864 >= _868;
_862 = _863 | _869;
_861 = _862 & _875;
if (_861 != 0)
goto <bb 7>;
else
goto <bb 4>;
It contains quite a lot of redundant computation. The root cause is that the vectorizer simply translates alias checks into comparisons of full address expressions, and CSE opportunities are covered by the folder. This patch improves function vect_create_cond_for_alias_checks by simplifying the comparison: it compares DR_BASE_ADDRESS/DR_INIT of both data references at compilation time. It also simplifies conditions of the form:
(addr_a_min + addr_a_length) <= addr_b_min || (addr_b_min + addr_b_length) <= addr_a_min
into the form below:
cond_expr = addr_b_min - addr_a_min
cond_expr >= addr_a_length || cond_expr <= -addr_b_length
if the comparison is done in a signed type. This can be further simplified by the folder if addr_a_length and addr_b_length are equal or constant, which is quite common.
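To illustrate why the two forms agree, here is a minimal C sketch (not taken from the patch). It models addresses and lengths as signed 64-bit integers and assumes no overflow; the function names no_alias_original, no_alias_simplified and count_mismatches are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Original form: two additions and two comparisons on full addresses. */
bool no_alias_original(int64_t a_min, int64_t a_len,
                       int64_t b_min, int64_t b_len)
{
    return (a_min + a_len) <= b_min || (b_min + b_len) <= a_min;
}

/* Simplified form: one signed difference compared against the lengths. */
bool no_alias_simplified(int64_t a_min, int64_t a_len,
                         int64_t b_min, int64_t b_len)
{
    int64_t diff = b_min - a_min;   /* must be computed in a signed type */
    return diff >= a_len || diff <= -b_len;
}

/* Brute-force comparison of both forms over a small range of inputs.
   Returns the number of inputs on which the two forms disagree. */
int count_mismatches(int64_t range, int64_t max_len)
{
    assert(range >= 0 && max_len >= 0);
    int mismatches = 0;
    for (int64_t a_min = -range; a_min <= range; a_min++)
        for (int64_t b_min = -range; b_min <= range; b_min++)
            for (int64_t a_len = 0; a_len <= max_len; a_len++)
                for (int64_t b_len = 0; b_len <= max_len; b_len++)
                    if (no_alias_original(a_min, a_len, b_min, b_len)
                        != no_alias_simplified(a_min, a_len, b_min, b_len))
                        mismatches++;
    return mismatches;
}
```

The point of the rewrite is that addr_b_min - addr_a_min is computed once, so when both references share a base address the difference can collapse to a constant, which is presumably where the folder gets its chance.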
I looked into the generated assembly; this patch does introduce small regressions in some cases, but overall I think it's good. Bootstrapped and tested on x86_64 and AArch64. Is it OK?
Thanks,
bin
2016-06-08 Bin Cheng <bin.cheng@arm.com>
* tree-vect-loop-manip.c (vect_create_cond_for_alias_checks): New
parameter.  Simplify alias check conditions at compilation time.
(vect_loop_versioning): Pass new argument to above function.

Attachment: alias-check-condition-20160611.txt