Compiling the attached testcase at -O3 results in an ICE. GCC is configured as: --target=arm-none-linux-gnueabihf --with-mode=arm --with-cpu=cortex-a9 --with-fpu=neon Compiling at -O2 -Wall warns that several local variables are unintialized. Initializing val1=0 makes the ICE disappear. arm-none-linux-gnueabihf-gcc -Wall gcc_vect_bug.c -c -O3 gcc_vect_bug.c: In function 'fail': gcc_vect_bug.c:22:5: internal compiler error: in vect_get_vec_def_for_stmt_copy, at tree-vect-stmts.c:1598 int fail ( const RMColorData * pInColor, ^ 0xfb2c34 vect_get_vec_def_for_stmt_copy(vect_def_type, tree_node*) /media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/gcc/tree-vect-stmts.c:1598 0xfb2d3d vect_get_vec_defs_for_stmt_copy /media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/gcc/tree-vect-stmts.c:1624 0xfbeac3 vectorizable_operation /media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/gcc/tree-vect-stmts.c:4910 0xfc658d vect_transform_stmt(gimple_statement_base*, gimple_stmt_iterator*, bool*, _slp_tree*, _slp_instance*) /media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/gcc/tree-vect-stmts.c:7273 0xfdbcf3 vect_transform_loop(_loop_vec_info*) /media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/gcc/tree-vect-loop.c:6156 0xff1deb vectorize_loops() /media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/gcc/tree-vectorizer.c:497 0xee638f execute /media/lyon/9be1a707-5b7f-46da-9106-e084a5dbb011/ssd/src/GCC/sources/gcc-fsf/trunk/gcc/tree-ssa-loop.c:295
Created attachment 34599 [details] testcase
Confirmed on all release branches.
Doesn't reproduce with a cross from x86_64-linux on trunk. > ./xgcc -B. t.c -O3 -v -Wall Reading specs from ./specs COLLECT_GCC=./xgcc COLLECT_LTO_WRAPPER=./lto-wrapper Target: arm-none-linux-gnueabihf Configured with: /space/rguenther/src/svn/trunk2/configure --target=arm-none-linux-gnueabihf --with-mode=arm --with-cpu=cortex-a9 --with-fpu=neon ... t.c: In function ?fail?: t.c:55:19: warning: ?val2? may be used uninitialized in this function [-Wmaybe-uninitialized] result += (val2 - val1) * interpFrac2; ^ t.c:55:19: warning: ?val1? may be used uninitialized in this function [-Wmaybe-uninitialized] t.c:55:27: warning: ?interpFrac2? may be used uninitialized in this function [-Wmaybe-uninitialized] result += (val2 - val1) * interpFrac2; ^ t.c:56:19: warning: ?val3? may be used uninitialized in this function [-Wmaybe-uninitialized] result += (val3 - val2) * interpFrac3; ^ t.c:56:27: warning: ?interpFrac3? may be used uninitialized in this function [-Wmaybe-uninitialized] result += (val3 - val2) * interpFrac3; ^ t.c:54:27: warning: ?interpFrac1? may be used uninitialized in this function [-Wmaybe-uninitialized] result += (val1 - val0) * interpFrac1; ^ COLLECT_GCC_OPTIONS='-B' '.' '-O3' '-v' '-Wall' '-mcpu=cortex-a9' '-mfpu=neon' '-marm' '-mtls-dialect=gnu' ./as -v -mcpu=cortex-a9 -mfpu=neon -meabi=5 -o /tmp/cciyGoik.o /tmp/cc4vjgVC.s ./as: line 106: exec: -v: invalid option exec: usage: exec [-cl] [-a name] [command [arguments ...]] [redirection ...] are you sure it isn't already fixed?
> > are you sure it isn't already fixed? Don't think so. It ICEs for me with r220203. I'm using the options -mfpu=neon -mfloat-abi=hard -O3 -march=armv7-a. The -mfloat-abi=hard is important, it doesn't reproduce without it
Ok, reproduce with > ./cc1 -quiet t.c -O3 -mfloat-abi=hard -I include -march=armv7-a -mfpu=neon t.c: In function ?fail?: t.c:22:5: internal compiler error: in vect_get_vec_def_for_stmt_copy, at tree-vect-stmts.c:1598 int fail ( const RMColorData * pInColor, ^ 0xf89822 vect_get_vec_def_for_stmt_copy(vect_def_type, tree_node*) /space/rguenther/src/svn/trunk2/gcc/tree-vect-stmts.c:1598 0xf8991c vect_get_vec_defs_for_stmt_copy /space/rguenther/src/svn/trunk2/gcc/tree-vect-stmts.c:1624 0xf94dd8 vectorizable_operation /space/rguenther/src/svn/trunk2/gcc/tree-vect-stmts.c:4910 0xf9c3b3 vect_transform_stmt(gimple_statement_base*, gimple_stmt_iterator*, bool*, _slp_tree*, _slp_instance*) /space/rguenther/src/svn/trunk2/gcc/tree-vect-stmts.c:7273 0xfb0c26 vect_transform_loop(_loop_vec_info*) /space/rguenther/src/svn/trunk2/gcc/tree-vect-loop.c:6156 0xfc5daf vectorize_loops() /space/rguenther/src/svn/trunk2/gcc/tree-vectorizer.c:497 0xec3f97 execute /space/rguenther/src/svn/trunk2/gcc/tree-ssa-loop.c:295 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <http://gcc.gnu.org/bugs.html> for instructions. Must be related to t.c:48:2: note: vect_recog_widen_shift_pattern: detected: t.c:48:2: note: patt_42 = _41 w<< 4; t.c:48:2: note: pattern recognized: patt_42 = _41 w<< 4; t.c:48:2: note: additional pattern stmt: _41 = (unsigned short) _13; ... t.c:48:2: note: type of def: 3. t.c:48:2: note: mark relevant 4, live 0. t.c:48:2: note: last stmt in pattern. don't mark relevant/live. t.c:48:2: note: worklist: examine stmt: _41 = (unsigned short) _13; ... t.c:48:2: note: type of def: 3. t.c:48:2: note: mark relevant 4, live 0. t.c:48:2: note: last stmt in pattern. don't mark relevant/live. t.c:48:2: note: worklist: examine stmt: patt_42 = _41 w<< 4; I will have a closer look.
In fact we recognize a widening shift but it gets dropped on the floor: t.c:48:2: note: ------>vectorizing statement: result_15 = val0_14 << 4; t.c:48:2: note: ------>vectorizing statement: _17 = val1_16(D) - val0_14; huh? The issue that we hit is that we vectorized the def of val0_14 with ncopies == 1 but now require two copies of it when vectorizing val1_16(D) - val0_14. So maybe the shift issue is unrelated. Sofar we have vect__13.17_155 = MEM[(const Uint8 *)vectp_pCornerPoint0.15_153]; _13 = *pCornerPoint0_48; vect__41.18_156 = [vec_unpack_lo_expr] vect__13.17_155; vect__41.18_157 = [vec_unpack_hi_expr] vect__13.17_155; val0_14 = (Sint32) _13; Thus we widen a char load to an int but the vectorized version widens to an unsigned short only. That's probably because of the consumer result_15 = val0_14 << 4; which was pattern detected as widening shift. But unfortunately that isn't the only one and we forgot _17 = val1_16(D) - val0_14; which then results in the ICE. Index: gcc/tree-vect-patterns.c =================================================================== --- gcc/tree-vect-patterns.c (revision 220205) +++ gcc/tree-vect-patterns.c (working copy) @@ -1732,9 +1732,11 @@ vect_recog_widen_shift_pattern (vec<gimp if (TREE_CODE (oprnd0) != SSA_NAME || TREE_CODE (oprnd1) != INTEGER_CST) return NULL; - /* Check operand 0: it has to be defined by a type promotion. */ - if (!type_conversion_p (oprnd0, last_stmt, false, &half_type0, &def_stmt0, - &promotion) + /* Check operand 0: it has to be defined by a type promotion and it + should be only used by the shift. */ + if (!has_single_use (oprnd0) + || !type_conversion_p (oprnd0, last_stmt, false, &half_type0, &def_stmt0, + &promotion) || !promotion) return NULL;
> > Index: gcc/tree-vect-patterns.c > =================================================================== > --- gcc/tree-vect-patterns.c (revision 220205) > +++ gcc/tree-vect-patterns.c (working copy) > @@ -1732,9 +1732,11 @@ vect_recog_widen_shift_pattern (vec<gimp > if (TREE_CODE (oprnd0) != SSA_NAME || TREE_CODE (oprnd1) != INTEGER_CST) > return NULL; > > - /* Check operand 0: it has to be defined by a type promotion. */ > - if (!type_conversion_p (oprnd0, last_stmt, false, &half_type0, &def_stmt0, > - &promotion) > + /* Check operand 0: it has to be defined by a type promotion and it > + should be only used by the shift. */ > + if (!has_single_use (oprnd0) > + || !type_conversion_p (oprnd0, last_stmt, false, &half_type0, > &def_stmt0, > + &promotion) > || !promotion) > return NULL; That fixes the ICE in the testcase for me. I'll give it a test on arm-none-eabi and an arm bootstrap
Author: rguenth Date: Fri Jan 30 09:22:17 2015 New Revision: 220275 URL: https://gcc.gnu.org/viewcvs?rev=220275&root=gcc&view=rev Log: 2015-01-30 Richard Biener <rguenther@suse.de> PR tree-optimization/64829 * tree-vect-patterns.c (vect_handle_widen_op_by_const): Do not add a widening conversion pattern but hand off extra widenings to callers. (vect_recog_widen_mult_pattern): Handle extra widening produced by vect_handle_widen_op_by_const. (vect_recog_widen_shift_pattern): Likewise. (vect_pattern_recog_1): Remove excess vertical space in dumping. * tree-vect-stmts.c (vect_mark_stmts_to_be_vectorized): Likewise. (vect_init_vector_1): Likewise. (vect_get_vec_def_for_operand): Likewise. (vect_finish_stmt_generation): Likewise. (vectorizable_load): Likewise. (vect_analyze_stmt): Likewise. (vect_is_simple_use): Likewise. * gcc.dg/vect/pr64829.c: New testcase. Added: trunk/gcc/testsuite/gcc.dg/vect/pr64829.c Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-vect-patterns.c trunk/gcc/tree-vect-stmts.c
Fixed.