Created attachment 41319 [details]
Preprocessed source generated by gcc -v -save-temps -O3 -mavx2 thecode.c

I ran into a problem with strange results when compiling with -O3 -mavx2 and have been able to reduce it to the following small test code:

========================================
#include <stdio.h>
int main() {
  const int N = 8;
  int v[N];
  for(int k = 0; k < N; k++)
    v[k] = k;
  v[0] = 77;
  int found_index = -1;
  for(int k = 0; k < N; k++) {
    if(v[k] == 77)
      found_index = k;
  }
  printf("found_index = %d\n", found_index);
}
========================================

If compiled correctly, running this code should give "found_index = 0".

When compiling it like this:

gcc -O3 -mavx2 thecode.c

then running the resulting a.out executable gives:

$ ./a.out
found_index = -1

which is wrong.

The output of "gcc -v -save-temps -O3 -mavx2 thecode.c" looks as follows:

========================================
$ gcc -v -save-temps -O3 -mavx2 thecode.c
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/6.3.1/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,objc,obj-c++,fortran,ada,go,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --disable-libgcj --with-isl --enable-libmpx --enable-gnu-indirect-function --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 6.3.1 20161221 (Red Hat 6.3.1-1) (GCC)
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-O3' '-mavx2' '-mtune=generic' '-march=x86-64'
 /usr/libexec/gcc/x86_64-redhat-linux/6.3.1/cc1 -E -quiet -v thecode.c -mavx2 -mtune=generic -march=x86-64 -O3 -fpch-preprocess -o thecode.i
ignoring nonexistent directory "/usr/lib/gcc/x86_64-redhat-linux/6.3.1/include-fixed"
ignoring nonexistent directory "/usr/lib/gcc/x86_64-redhat-linux/6.3.1/../../../../x86_64-redhat-linux/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/lib/gcc/x86_64-redhat-linux/6.3.1/include
 /usr/local/include
 /usr/include
End of search list.
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-O3' '-mavx2' '-mtune=generic' '-march=x86-64'
 /usr/libexec/gcc/x86_64-redhat-linux/6.3.1/cc1 -fpreprocessed thecode.i -quiet -dumpbase thecode.c -mavx2 -mtune=generic -march=x86-64 -auxbase thecode -O3 -version -o thecode.s
GNU C11 (GCC) version 6.3.1 20161221 (Red Hat 6.3.1-1) (x86_64-redhat-linux)
        compiled by GNU C version 6.3.1 20161221 (Red Hat 6.3.1-1), GMP version 6.1.1, MPFR version 3.1.5, MPC version 1.0.2, isl version 0.14 or 0.13
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
GNU C11 (GCC) version 6.3.1 20161221 (Red Hat 6.3.1-1) (x86_64-redhat-linux)
        compiled by GNU C version 6.3.1 20161221 (Red Hat 6.3.1-1), GMP version 6.1.1, MPFR version 3.1.5, MPC version 1.0.2, isl version 0.14 or 0.13
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 67626b9d441eed376539391e660a9413
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-O3' '-mavx2' '-mtune=generic' '-march=x86-64'
 as -v --64 -o thecode.o thecode.s
GNU assembler version 2.26.1 (x86_64-redhat-linux) using BFD version version 2.26.1-1.fc25
COMPILER_PATH=/usr/libexec/gcc/x86_64-redhat-linux/6.3.1/:/usr/libexec/gcc/x86_64-redhat-linux/6.3.1/:/usr/libexec/gcc/x86_64-redhat-linux/:/usr/lib/gcc/x86_64-redhat-linux/6.3.1/:/usr/lib/gcc/x86_64-redhat-linux/
LIBRARY_PATH=/usr/lib/gcc/x86_64-redhat-linux/6.3.1/:/usr/lib/gcc/x86_64-redhat-linux/6.3.1/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:/usr/lib/gcc/x86_64-redhat-linux/6.3.1/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-O3' '-mavx2' '-mtune=generic' '-march=x86-64'
 /usr/libexec/gcc/x86_64-redhat-linux/6.3.1/collect2 -plugin /usr/libexec/gcc/x86_64-redhat-linux/6.3.1/liblto_plugin.so -plugin-opt=/usr/libexec/gcc/x86_64-redhat-linux/6.3.1/lto-wrapper -plugin-opt=-fresolution=thecode.res -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lc -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s --build-id --no-add-needed --eh-frame-hdr --hash-style=gnu -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 /usr/lib/gcc/x86_64-redhat-linux/6.3.1/../../../../lib64/crt1.o /usr/lib/gcc/x86_64-redhat-linux/6.3.1/../../../../lib64/crti.o /usr/lib/gcc/x86_64-redhat-linux/6.3.1/crtbegin.o -L/usr/lib/gcc/x86_64-redhat-linux/6.3.1 -L/usr/lib/gcc/x86_64-redhat-linux/6.3.1/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -L/usr/lib/gcc/x86_64-redhat-linux/6.3.1/../../.. thecode.o -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/x86_64-redhat-linux/6.3.1/crtend.o /usr/lib/gcc/x86_64-redhat-linux/6.3.1/../../../../lib64/crtn.o
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-O3' '-mavx2' '-mtune=generic' '-march=x86-64'
========================================

I have tested this with a few different gcc versions:

gcc 4.8.3 --> OK
gcc 4.9.4 --> OK
gcc 5.3.0 --> OK
gcc 5.4.0 --> OK
gcc 6.1.0 --> WRONG
gcc 6.2.0 --> WRONG
gcc 6.3.1 --> WRONG
gcc 7.1.0 --> WRONG

I don't know what goes wrong, but it seems somehow related to the beginning of the list v in the code: if I change v[0]=77 to e.g. v[3]=77, then that gives found_index=3 as it should. It is only v[0] that is somehow missed.
Started with r230297.

Note that in C,
  const int N = 8;
  int v[N];
declares a variable length array, which is unnecessarily pessimizing; you need to use #define N 8, enum { N = 8 };, or something similar instead for it to be a non-VLA. In C++ it is not a VLA. But fixing that doesn't help here.
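A minimal sketch of the non-VLA form of the reduced testcase, assuming the enum variant suggested above. The function name find_last_77 is made up for illustration and does not appear in the report; as noted, this change alone is not expected to avoid the wrong code on affected compilers.

```c
enum { N = 8 };  /* integer constant expression, so v below is a plain array, not a VLA */

/* Hypothetical helper wrapping the loop from the original report.
   Returns the last index k with v[k] == 77, or -1 if there is none. */
int find_last_77(void)
{
    int v[N];
    for (int k = 0; k < N; k++)
        v[k] = k;
    v[0] = 77;

    int found_index = -1;
    for (int k = 0; k < N; k++) {
        if (v[k] == 77)
            found_index = k;
    }
    return found_index;
}
```

A correctly compiled call to find_last_77() returns 0, matching the expected "found_index = 0" from the report.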
GCC 6.4 is being released, adjusting target milestone.
More complete testcase:

int v[8] = { 77, 1, 79, 3, 4, 5, 6, 7 };

__attribute__((noipa)) void
foo ()
{
  int k, r = -1;
  for (k = 0; k < 8; k++)
    if (v[k] == 77)
      r = k;
  if (r != 0)
    __builtin_abort ();
}

__attribute__((noipa)) void
bar ()
{
  int k, r = 4;
  for (k = 0; k < 8; k++)
    if (v[k] == 79)
      r = k;
  if (r != 2)
    __builtin_abort ();
}

int
main ()
{
  foo ();
  bar ();
  return 0;
}

The conditional reduction handling is buggy.
In foo we emit:
  vect_cst__21 = { 8, 8, 8, 8, 8, 8, 8, 8 };
  vect_cst__28 = { 77, 77, 77, 77, 77, 77, 77, 77 };
  vect_cst__30 = { -1, -1, -1, -1, -1, -1, -1, -1 };

  <bb 3> [local count: 119292720]:
...
  # vect_vec_iv_.0_22 = PHI <vect_vec_iv_.0_23(9), { 0, 1, 2, 3, 4, 5, 6, 7 }(2)>
  # vect_r_3.1_24 = PHI <vect_r_3.6_29(9), { 0, 0, 0, 0, 0, 0, 0, 0 }(2)>
  # vectp_v.2_25 = PHI <vectp_v.2_26(9), &v(2)>
...
  vect_vec_iv_.0_23 = vect_vec_iv_.0_22 + vect_cst__21;
  vect__1.4_27 = MEM[(int *)vectp_v.2_25];
  vect_r_3.6_29 = VEC_COND_EXPR <vect__1.4_27 == vect_cst__28, vect_vec_iv_.0_22, vect_r_3.1_24>;
...
  <bb 18> [local count: 119292720]:
  # vect_r_3.6_31 = PHI <vect_r_3.6_29(3)>
  stmp_r_3.7_32 = REDUC_MAX (vect_r_3.6_31);
  stmp_r_3.7_33 = stmp_r_3.7_32 == 0 ? -1 : stmp_r_3.7_32;

vect_cst__30, which seems to be the initial value of the reduction var r as a vector, is unused.
The problem is that by starting with a zero vector for vect_r_3.1_24 there is no difference between a condition match on the first iteration and no match at all: both result in REDUC_MAX of 0, and the emitted code assumes REDUC_MAX of 0 means no match.

In this case (if the first iteration iterator is constant and bigger than the minimum value of the type), it should be enough to initialize with a vector containing any value smaller than the first iteration IV and adjust:
  stmp_r_3.7_33 = stmp_r_3.7_32 == 0 ? -1 : stmp_r_3.7_32;
to
  stmp_r_3.7_33 = stmp_r_3.7_32 == the_chosen_value ? -1 : stmp_r_3.7_32;
or, especially in the case when the reduction var is previously initialized to a value smaller than the minimum, we could build a vector of those values and avoid the COND_EXPR on the REDUC_MAX value.

Now, in case the first iteration iterator is constant, but is the minimum value, we can't use this trick. Perhaps we could in that case just bias it by one, say if the reduction is with unsigned type emit e.g.:
  # vect_vec_iv_.0_22 = PHI <vect_vec_iv_.0_23(9), { 1, 2, 3, 4, 5, 6, 7, 8 }(2)>
  # vect_r_3.1_24 = PHI <vect_r_3.6_29(9), { 0, 0, 0, 0, 0, 0, 0, 0 }(2)>
  # vectp_v.2_25 = PHI <vectp_v.2_26(9), &v(2)>
...
  vect_vec_iv_.0_23 = vect_vec_iv_.0_22 + vect_cst__21;
  vect__1.4_27 = MEM[(int *)vectp_v.2_25];
  vect_r_3.6_29 = VEC_COND_EXPR <vect__1.4_27 == vect_cst__28, vect_vec_iv_.0_22, vect_r_3.1_24>;
...
  <bb 18> [local count: 119292720]:
  # vect_r_3.6_31 = PHI <vect_r_3.6_29(3)>
  stmp_r_3.7_32 = REDUC_MAX (vect_r_3.6_31);
  stmp_r_3.7_34 = stmp_r_3.7_32 - 1;
  stmp_r_3.7_33 = stmp_r_3.7_32 == 0 ? <original_r_value> : stmp_r_3.7_34;

For the non-constant IV first value we actually emit really weird code:

int v[8] = { 77, 1, 79, 3, 4, 5, 6, 7 };

__attribute__((noipa)) void
foo (int *v, int f)
{
  int k, r = -1;
  for (k = f; k < f + 8; k++)
    if (v[k] == 77)
      r = k;
  if (r != 0)
    __builtin_abort ();
}

__attribute__((noipa)) void
bar (int *v, int f)
{
  int k, r = 4;
  for (k = f; k < f + 8; k++)
    if (v[k] == 79)
      r = k;
  if (r != 2)
    __builtin_abort ();
}

int
main ()
{
  foo (v, 0);
  bar (v, 0);
  return 0;
}

where we emit 2 VEC_COND_EXPRs and 2 REDUC_MAX. While that testcase passes, I'm not really sure if it is correct generally, and furthermore, it seems unnecessarily complicated to me. Can't we just emit what we'd emit for an unsigned conditional reduction with first iteration 1, and only after the vectorized loop adjust it?
So, say for the foo in the second case, emit:

  vect_cst__21 = { 8, 8, 8, 8, 8, 8, 8, 8 };
  vect_cst__28 = { 77, 77, 77, 77, 77, 77, 77, 77 };

  <bb 3> [local count: 119292720]:
...
  # vect_vec_iv_.0_22 = PHI <vect_vec_iv_.0_23(9), { 1, 2, 3, 4, 5, 6, 7, 8 }(2)>
  # vect_r_3.1_24 = PHI <vect_r_3.6_29(9), { 0, 0, 0, 0, 0, 0, 0, 0 }(2)>
  # vectp_v.2_25 = PHI <vectp_v.2_26(9), &v(2)>
...
  vect_vec_iv_.0_23 = vect_vec_iv_.0_22 + vect_cst__21;
  vect__1.4_27 = MEM[(int *)vectp_v.2_25];
  vect_r_3.6_29 = VEC_COND_EXPR <vect__1.4_27 == vect_cst__28, vect_vec_iv_.0_22, vect_r_3.1_24>;
...
  <bb 18> [local count: 119292720]:
  # vect_r_3.6_31 = PHI <vect_r_3.6_29(3)>
  stmp_r_3.7_32 = REDUC_MAX (vect_r_3.6_31);
  stmp_r_3.7_34 = f_9(D) + (stmp_r_3.7_32 - 1) * step;
  stmp_r_3.7_33 = stmp_r_3.7_32 == 0 ? <r_value_before_loop> : stmp_r_3.7_34;

where _22, _24, _29 would all be in vectors of unsigned_type_for (r)?
Or for signed start with { min, min, ... } as the condition-never-seen value, and a { min+1, min+2, min+3, ... } vector as the initial _22 value?
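The ambiguity described above can be reproduced in plain C by emulating the vector lanes with small arrays. This is only a rough sketch under stated assumptions: the function names and the LANES constant are made up for illustration, the lane loop stands in for VEC_COND_EXPR and the final scan for REDUC_MAX, and none of it is the code GCC actually generates. The third function follows the "bias it by one" idea in unsigned arithmetic.

```c
#include <assert.h>
#include <stddef.h>

enum { LANES = 8 };  /* assumed vector width: eight ints, as with AVX2 */

/* Scalar reference: the last k with v[k] == key, else init. */
int last_match_scalar(const int *v, size_t n, int key, int init)
{
    int r = init;
    for (size_t k = 0; k < n; k++)
        if (v[k] == key)
            r = (int)k;
    return r;
}

/* Emulates the flawed scheme: lanes start at 0, matching lanes take the
   unbiased index, and a maximum of 0 is read as "no match". */
int last_match_zero_init(const int *v, size_t n, int key, int init)
{
    assert(n % LANES == 0);  /* sketch handles whole vectors only */
    int acc[LANES] = { 0 };
    for (size_t base = 0; base < n; base += LANES)
        for (size_t l = 0; l < LANES; l++)
            if (v[base + l] == key)          /* per-lane select */
                acc[l] = (int)(base + l);
    int max = acc[0];                         /* max reduction over lanes */
    for (size_t l = 1; l < LANES; l++)
        if (acc[l] > max)
            max = acc[l];
    return max == 0 ? init : max;             /* index 0 is lost here */
}

/* Emulates the biased variant: lanes record index + 1 as unsigned, so 0
   can only mean "never matched"; the bias is removed after the reduction. */
int last_match_biased(const int *v, size_t n, int key, int init)
{
    assert(n % LANES == 0);
    unsigned acc[LANES] = { 0 };
    for (size_t base = 0; base < n; base += LANES)
        for (size_t l = 0; l < LANES; l++)
            if (v[base + l] == key)
                acc[l] = (unsigned)(base + l) + 1u;
    unsigned max = acc[0];
    for (size_t l = 1; l < LANES; l++)
        if (acc[l] > max)
            max = acc[l];
    return max == 0 ? init : (int)(max - 1u);
}
```

With v = { 77, 1, 79, 3, 4, 5, 6, 7 }, key 77 and init -1, the zero-initialized emulation returns -1 while the scalar and biased versions return 0, which mirrors the failure in foo.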
On December 8, 2017 4:56:12 PM GMT+01:00, "jakub at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org> wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80631
>
>Jakub Jelinek <jakub at gcc dot gnu.org> changed:
>
>           What    |Removed                     |Added
>----------------------------------------------------------------------------
>                 CC|                            |rguenth at gcc dot gnu.org
>
>--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
>[...]
There's a dup for this (the existing vect.exp execute fail) and there is an approved patch for it.
Related to PR81179 and http://gcc.gnu.org/ml/gcc-patches/2017-11/msg02054.html As the patch doesn't apply cleanly, can't easily verify it.
Created attachment 42840 [details] gcc8-pr80631.patch Untested fix.
Author: jakub
Date: Tue Dec 12 08:55:02 2017
New Revision: 255574

URL: https://gcc.gnu.org/viewcvs?rev=255574&root=gcc&view=rev
Log:
        PR tree-optimization/80631
        * tree-vect-loop.c (get_initial_def_for_reduction): Fix comment typo.
        (vect_create_epilog_for_reduction): Add INDUC_VAL and INDUC_CODE
        arguments, for INTEGER_INDUC_COND_REDUCTION use INDUC_VAL instead of
        hardcoding zero as the value if COND_EXPR is never true.  For
        INTEGER_INDUC_COND_REDUCTION don't emit the final COND_EXPR if
        INDUC_VAL is equal to INITIAL_DEF, and use INDUC_CODE instead of
        hardcoding MAX_EXPR as the reduction operation.
        (is_nonwrapping_integer_induction): Allow negative step.
        (vectorizable_reduction): Compute INDUC_VAL and INDUC_CODE for
        vect_create_epilog_for_reduction, if no value is suitable, don't
        use INTEGER_INDUC_COND_REDUCTION for now.  Formatting fixes.

        * gcc.dg/vect/pr80631-1.c: New test.
        * gcc.dg/vect/pr80631-2.c: New test.
        * gcc.dg/vect/pr65947-13.c: Expect integer induc cond reduction
        vectorization.

Added:
    trunk/gcc/testsuite/gcc.dg/vect/pr80631-1.c
    trunk/gcc/testsuite/gcc.dg/vect/pr80631-2.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.dg/vect/pr65947-13.c
    trunk/gcc/tree-vect-loop.c
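One way to read the INDUC_CODE and negative-step items in the log above: when the induction variable counts down, the last matching iteration carries the smallest index, so a minimum reduction with a "never matched" marker above every index is needed instead of a maximum. The following is a rough scalar emulation of that reasoning only. The names, the VF constant, and the choice of INT_MAX as marker are assumptions for illustration and are not taken from tree-vect-loop.c.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

enum { VF = 8 };  /* assumed number of lanes per emulated vector */

/* Scalar reference: k runs from n - 1 down to 0 and r keeps the last k
   that matched in iteration order, i.e. the smallest matching index. */
int last_match_down_scalar(const int *v, size_t n, int key, int init)
{
    int r = init;
    for (size_t k = n; k-- > 0; )
        if (v[k] == key)
            r = (int)k;
    return r;
}

/* Lane emulation for the decreasing loop: lanes start at a marker larger
   than any index, matching lanes take the index, and the lanes are
   combined with a minimum instead of a maximum. */
int last_match_down_lanes(const int *v, size_t n, int key, int init)
{
    assert(n % VF == 0 && n <= (size_t)INT_MAX);
    const int never = INT_MAX;  /* hypothetical "condition never true" marker */
    int acc[VF];
    for (size_t l = 0; l < VF; l++)
        acc[l] = never;

    for (size_t hi = n; hi > 0; hi -= VF)
        for (size_t l = 0; l < VF; l++) {
            size_t k = hi - 1 - l;            /* lane l sees index hi-1-l */
            if (v[k] == key)
                acc[l] = (int)k;
        }

    int min = acc[0];                          /* min reduction over lanes */
    for (size_t l = 1; l < VF; l++)
        if (acc[l] < min)
            min = acc[l];
    return min == never ? init : min;
}
```

For v = { 77, 1, 79, 3, 77, 5, 6, 7 } and key 77, both functions return 0; a maximum reduction over the same lanes would have produced 4 instead.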
Fixed on the trunk so far.
Author: jakub
Date: Fri Dec 15 17:51:36 2017
New Revision: 255701

URL: https://gcc.gnu.org/viewcvs?rev=255701&root=gcc&view=rev
Log:
        PR tree-optimization/80631
        * gcc.target/i386/avx2-pr80631.c: New test.

Added:
    trunk/gcc/testsuite/gcc.target/i386/avx2-pr80631.c
Modified:
    trunk/gcc/testsuite/ChangeLog
Author: jakub
Date: Fri Dec 15 22:12:46 2017
New Revision: 255726

URL: https://gcc.gnu.org/viewcvs?rev=255726&root=gcc&view=rev
Log:
        Backported from mainline
        2017-12-12  Jakub Jelinek  <jakub@redhat.com>

        PR tree-optimization/80631
        * tree-vect-loop.c (get_initial_def_for_reduction): Fix comment typo.
        (vect_create_epilog_for_reduction): Add INDUC_VAL argument, for
        INTEGER_INDUC_COND_REDUCTION use INDUC_VAL instead of hardcoding
        zero as the value if COND_EXPR is never true.  For
        INTEGER_INDUC_COND_REDUCTION don't emit the final COND_EXPR if
        INDUC_VAL is equal to INITIAL_DEF.
        (vectorizable_reduction): Compute INDUC_VAL for
        vect_create_epilog_for_reduction, if no value is suitable, don't
        use INTEGER_INDUC_COND_REDUCTION for now.  Formatting fixes.

        * gcc.dg/vect/pr80631-1.c: New test.
        * gcc.dg/vect/pr80631-2.c: New test.

        PR tree-optimization/80631
        * gcc.target/i386/avx2-pr80631.c: New test.

Added:
    branches/gcc-7-branch/gcc/testsuite/gcc.dg/vect/pr80631-1.c
    branches/gcc-7-branch/gcc/testsuite/gcc.dg/vect/pr80631-2.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/avx2-pr80631.c
Modified:
    branches/gcc-7-branch/gcc/ChangeLog
    branches/gcc-7-branch/gcc/testsuite/ChangeLog
    branches/gcc-7-branch/gcc/tree-vect-loop.c
Fixed for 7.3+ too.
Author: jakub
Date: Tue Dec 19 07:39:24 2017
New Revision: 255804

URL: https://gcc.gnu.org/viewcvs?rev=255804&root=gcc&view=rev
Log:
        PR tree-optimization/80631
        * tree-vect-loop.c (vect_create_epilog_for_reduction): Compare
        induc_code against MAX_EXPR or MIN_EXPR instead of reduc_fn against
        IFN_REDUC_MAX or IFN_REDUC_MIN.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-vect-loop.c
GCC 6 branch is being closed, fixed in 7.x.