Created attachment 37986
Reproducer.

Testcase produces ICE with -O3 -march=skylake-avx512. Everything works fine with -O2.

Error:
> g++ -O3 -march=skylake-avx512 -S repr.cpp
repr.cpp: In function ‘void foo()’:
repr.cpp:7:6: internal compiler error: in vect_get_vec_def_for_stmt_copy, at tree-vect-stmts.c:1490
 void foo () {
      ^~~
0xc50433 vect_get_vec_def_for_stmt_copy(vect_def_type, tree_node*)
        ../.././gcc/tree-vect-stmts.c:1490
0xc55e8c vectorizable_condition(gimple*, gimple_stmt_iterator*, gimple**, tree_node*, int, _slp_tree*)
        ../.././gcc/tree-vect-stmts.c:7640
0xc65c26 vect_transform_stmt(gimple*, gimple_stmt_iterator*, bool*, _slp_tree*, _slp_instance*)
        ../.././gcc/tree-vect-stmts.c:8212
0xc6960a vect_transform_loop(_loop_vec_info*)
        ../.././gcc/tree-vect-loop.c:6885
0xc843bc vectorize_loops()
        ../.././gcc/tree-vectorizer.c:554
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.

GCC version:
> g++ -v
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/export/users/vlivinsk/gcc-trunk/bin/libexec/gcc/x86_64-pc-linux-gnu/6.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /export/users/vlivinsk/gcc-trunk/gcc/configure --with-arch=corei7 --with-cpu=corei7 --enable-clocale=gnu --with-system-zlib --enable-shared --with-demangler-in-ld --enable-cloog-backend=isl --with-fpmath=sse --enable-checking=release --enable-languages=c,c++,lto --with-gmp=/export/users/vlivinsk/gcc-trunk/gmp-6.1.0/bin --with-mpfr=/export/users/vlivinsk/gcc-trunk/mpfr-3.1.3/bin --with-mpc=/export/users/vlivinsk/gcc-trunk/mpc-1.0.3/bin --prefix=/export/users/vlivinsk/gcc-trunk/bin
Thread model: posix
gcc version 6.0.0 20160315 (experimental) (Revision=234226)

Test:
extern unsigned char a [150];
extern unsigned char b [150];
extern unsigned char c [150];
extern unsigned char d [150];
extern unsigned char e [150];

void foo () {
  for (int i = 92; i <= 141; i += 2) {
    int tmp = (d [i] && b [i]) <= (a [i] > c [i]);
    e [i] = tmp >> b [i];
  }
}
Confirmed.
This seems to be caused by an incorrect mask conversion. The problem is that a scalar mask of 4 elements and a scalar mask of 8 elements both have QImode. That makes us think we can get vec<bool>(4) by using vec_unpack_[lo|hi]_expr on vec<bool>(16).
For now we have only two scalar masks sharing the same mode. That means we could handle them by changing the appropriate optabs from direct to conversion type and having separate entries for the QI<->QI and QI<->HI cases. But if a two-element scalar mask is added for some target, that approach can't work. We also have common code used for multiple conversion cases, and changing one of the optabs would break it. For now we should just reject conversion cases that involve different masks sharing the same mode (we don't have conversion patterns for them anyway).
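To illustrate the reasoning above, here is a minimal standalone sketch of why a mode-only comparison accepts the bad conversion and how an additional element-count comparison rejects it. This is not GCC code: MaskMode, BoolVec, mask_mode_for, unpack_half and both widening_ok_* helpers are hypothetical names invented for this example, and the mode assignment is a simplified model of AVX-512 scalar masks.

```cpp
// Hypothetical model of scalar boolean-vector masks; none of these
// names exist in GCC.
enum class MaskMode { QI, HI, SI, DI };

// Simplified assumption: one bit per element, stored in the smallest
// integer mode that holds them, with QImode as the minimum. Masks of
// 4 and 8 elements therefore share QImode.
inline MaskMode mask_mode_for(unsigned nelts) {
    return nelts <= 8  ? MaskMode::QI
         : nelts <= 16 ? MaskMode::HI
         : nelts <= 32 ? MaskMode::SI
                       : MaskMode::DI;
}

struct BoolVec {
    unsigned nelts;  // number of boolean elements
    MaskMode mode;   // machine mode of the scalar mask
};

inline BoolVec make_bool_vec(unsigned nelts) {
    return {nelts, mask_mode_for(nelts)};
}

// Model of vec_unpack_[lo|hi]_expr on a mask: the result has half as
// many elements as the input.
inline BoolVec unpack_half(BoolVec in) {
    return make_bool_vec(in.nelts / 2);
}

// Mode-only check: cannot tell an 8-element mask from a 4-element one,
// because both are QImode.
inline bool widening_ok_mode_only(BoolVec in, BoolVec wanted) {
    return unpack_half(in).mode == wanted.mode;
}

// Check that also compares element counts, so that masks which merely
// share a mode are rejected.
inline bool widening_ok_checked(BoolVec in, BoolVec wanted) {
    BoolVec result = unpack_half(in);
    return result.mode == wanted.mode && result.nelts == wanted.nelts;
}
```

In this model, unpacking vec<bool>(16) really yields vec<bool>(8). The mode-only check still reports success when vec<bool>(4) is requested, while the check that also compares element counts rejects that request and continues to accept vec<bool>(8).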
Author: ienkovich
Date: Fri Mar 18 09:36:32 2016
New Revision: 234323

URL: https://gcc.gnu.org/viewcvs?rev=234323&root=gcc&view=rev
Log:
gcc/

        PR tree-optimization/70252
        * tree-vect-stmts.c (supportable_widening_operation): Check resulting
        boolean vector has a proper number of elements.
        (supportable_narrowing_operation): Likewise.

gcc/testsuite/

        PR tree-optimization/70252
        * gcc.dg/pr70252.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/pr70252.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-vect-stmts.c
Fixed