This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |
Other format: | [Raw text] |
Hi, Take subroutine "DACOP" from spec2k/200.sixtrack as an example, the loop needs to be versioned for vectorization because of possibly alias. The alias check for data-references are like: //pair 1 dr_a: (Data Ref: bb: 8 stmt: _92 = da.cc[_27]; ref: da.cc[_27]; ) dr_b: (Data Ref: bb: 8 stmt: da.cc[_93] = _92; ref: da.cc[_93]; ) //pair 2 dr_a: (Data Ref: bb: 8 stmt: pretmp_29 = da.i2[_27]; ref: da.i2[_27]; ) dr_b: (Data Ref: bb: 8 stmt: da.i2[_93] = pretmp_29; ref: da.i2[_93]; ) //pair 3 dr_a: (Data Ref: bb: 8 stmt: pretmp_28 = da.i1[_27]; ref: da.i1[_27]; ) dr_b: (Data Ref: bb: 8 stmt: da.i1[_93] = pretmp_28; ref: da.i1[_93]; ) The code generated for alias checks are as below: <bb 23>: # iy_186 = PHI <_413(22), 2(2)> # ivtmp_1050 = PHI <ivtmp_1049(22), 512(2)> _155 = iy_186 + -2; _156 = _155 * 516; _241 = iy_186 + -1; _242 = _241 * 516; _328 = iy_186 * 516; _413 = iy_186 + 1; _414 = _413 * 516; _499 = iy_186 + 2; _500 = _499 * 516; _998 = iy_186 * 516; _997 = (sizetype) _998; _996 = _997 + 6; _995 = _996 * 4; _994 = global_Output.2_16 + _995; _993 = iy_186 * 516; _992 = (long unsigned int) _993; _991 = _992 * 4; _990 = _991 + 18446744073709547488; _989 = global_Input.0_153 + _990; _886 = _989 >= _994; _885 = iy_186 * 516; _884 = (sizetype) _885; _883 = _884 + 1040; _882 = _883 * 4; _881 = global_Input.0_153 + _882; _880 = (sizetype) _998; _879 = _880 + 2; _878 = _879 * 4; _877 = global_Output.2_16 + _878; _876 = _877 >= _881; _875 = _876 | _886; _874 = iy_186 * 516; _873 = (sizetype) _874; _872 = _873 + 514; _871 = _872 * 4; _870 = global_Output.2_16 + _871; _869 = local_Filter_33 >= _870; _868 = local_Filter_33 + 100; _867 = (sizetype) _874; _866 = _867 + 2; _865 = _866 * 4; _864 = global_Output.2_16 + _865; _863 = _864 >= _868; _862 = _863 | _869; _861 = _862 & _875; if (_861 != 0) goto <bb 7>; else goto <bb 4>; It contains quite a lot redundant computations. Root cause is vectorizer simply translates alias checks into full address expressions comparison, and CSE opportunities are covered by foler. This patch improves function vect_create_cond_for_alias_checks by simplifying the comparison by comparing DR_BASE_ADDRESS/DR_INIT of both data-reference at compilation time. It also simplifies conditions: (addr_a_min + addr_a_length) <= addr_b_min || (addr_b_min + addr_b_length) <= addr_a_min into below form: cond_expr = addr_b_min - addr_a_min cond_expr >= addr_a_length || cond_expr <= -addr_b_length if the comparison is done in signed type. And this can be further simplified by folder if addr_a_length and addr_b_lengnth are equal/const, which is quite common. I looked into generated assembly, this patch does introduces small regression in some cases, but overall I think it's good. Bootstrap and test on x86_64 and AArch64. Is it OK? Thanks, bin 2016-06-08 Bin Cheng <bin.cheng@arm.com> * tree-vect-loop-manip.c (vect_create_cond_for_alias_checks): New Parameter. Simplify alias check conditions at compilation time. (vect_loop_versioning): Pass new argument to above function.
Attachment:
alias-check-condition-20160611.txt
Description: alias-check-condition-20160611.txt
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |