Created attachment 50321
test case

GCC 11 fails to build OpenCV for aarch64 due to an assembler error on the attached sample program.

% arch64-yoe-linux-g++ -c a.cpp -O
/tmp/ccxRnB71.s: Assembler messages:
/tmp/ccxRnB71.s:113: Error: immediate value out of range 1 to 8 at operand 3 -- `shrn v1.8b,v1.8h,15'
/tmp/ccxRnB71.s:249: Error: immediate value out of range 1 to 8 at operand 3 -- `shrn v1.8b,v1.8h,14'
/tmp/ccxRnB71.s:385: Error: immediate value out of range 1 to 8 at operand 3 -- `shrn v1.8b,v1.8h,13'
/tmp/ccxRnB71.s:521: Error: immediate value out of range 1 to 8 at operand 3 -- `shrn v1.8b,v1.8h,12'
/tmp/ccxRnB71.s:657: Error: immediate value out of range 1 to 8 at operand 3 -- `shrn v1.8b,v1.8h,11'
/tmp/ccxRnB71.s:793: Error: immediate value out of range 1 to 8 at operand 3 -- `shrn v1.8b,v1.8h,10'
/tmp/ccxRnB71.s:929: Error: immediate value out of range 1 to 8 at operand 3 -- `shrn v1.8b,v1.8h,9'

The version of GCC in use is gcc version 11.0.1 20210306 (experimental) (GCC).
//(insn 148 147 149 (set (reg:V16QI 33 v1 [orig:153 _101 ] [153])
//        (vec_concat:V16QI (truncate:V8QI (lshiftrt:V8HI (reg:V8HI 33 v1 [orig:165 _113 ] [165])
//                    (const_vector:V8HI [
//                            (const_int 15 [0xf]) repeated x8
//                        ])))
//            (const_vector:V8QI [
//                    (const_int 0 [0]) repeated x8
//                ]))) "/mnt/b/yoe/master/build/tmp/work/cortexa57-yoe-linux/opencv/4.5.1-r0/recipe-sysroot-native/usr/lib/aarch64-yoe-linux/gcc/aarch64-yoe-linux/11.0.1/include/arm_neon.h":6548:53 1917 {aarch64_shrnv8hi_insn_le}
//     (nil))
        shrn    v1.8b, v1.8h, 15        // 148 [c=4 l=4]  aarch64_shrnv8hi_insn_le

Confirmed, reducing ....
Trying 144, 146 -> 148:
  144: r159:V8HI=r165:V8HI 0>>const_vector
      REG_DEAD r165:V8HI
  146: r155:V8QI=trunc(r159:V8HI)
      REG_DEAD r159:V8HI
  148: r153:V16QI=vec_concat(r155:V8QI,const_vector)
      REG_DEAD r155:V8QI
Successfully matched this instruction:
(set (reg:V16QI 153 [ _101 ])
    (vec_concat:V16QI (truncate:V8QI (lshiftrt:V8HI (reg:V8HI 165 [ _113 ])
                (const_vector:V8HI [
                        (const_int 15 [0xf]) repeated x8
                    ])))
        (const_vector:V8QI [
                (const_int 0 [0]) repeated x8
            ])))

I have not reduced it yet, but the above shows that the problem is introduced in combine. I think the constraints/predicates on the const_vector (shift amount) operand of aarch64_shrnv8hi_insn_le are incorrect.
Confirmed:

#include <arm_neon.h>

uint8x16_t
foo (uint16x8_t a, uint8x8_t b)
{
  return vcombine_u8 (vmovn_u16 (vshrq_n_u16 (a, 9)), b);
}

Testing a patch.
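For readers following along, here is a rough per-lane model in portable C of why a shift of 9 is fine at the source level but cannot be emitted as a single SHRN. The helper names are made up for illustration, and the 1 to 8 range is taken from the assembler message in the report rather than from the GCC sources.

```c
#include <stdint.h>

/* Hypothetical per-lane model of vmovn_u16 (vshrq_n_u16 (a, n)):
   logical right shift of one 16-bit lane, then keep the low 8 bits.  */
static uint8_t
shift_then_narrow_lane (uint16_t lane, unsigned n)
{
  return (uint8_t) (lane >> n);
}

/* Hypothetical check: a SHRN from 16-bit to 8-bit lanes only accepts an
   immediate in the range 1 to 8, per the assembler error above.  */
static int
shrn_h_to_b_encodable (unsigned n)
{
  return n >= 1 && n <= 8;
}
```

With n = 9 the two-step computation is still well defined (0xFFFF becomes 0x7F), so the separate shift and narrow must stay separate. Fusing them is only valid when shrn_h_to_b_encodable (n) holds.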
The master branch has been updated by Kyrylo Tkachov <ktkachov@gcc.gnu.org>:

https://gcc.gnu.org/g:0d9a70ea3881c284b7689b691d54d047b55b486d

commit r11-7556-g0d9a70ea3881c284b7689b691d54d047b55b486d
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Mon Mar 8 15:05:21 2021 +0000

    aarch64: Fix PR99437 - tighten shift predicates for narrowing shift patterns

    In this bug combine forms the (R)SHRN(2) instructions with an invalid shift amount.
    The intrinsic expanders for these patterns validate the right shift amount but if the
    final patterns end up being matched by combine (or other RTL passes I suppose) they
    still let the wrong const_vector through.
    This patch tightens up the predicates for the instructions involved by using predicates
    for the right shift amount const_vectors.

    gcc/ChangeLog:

            PR target/99437
            * config/aarch64/predicates.md (aarch64_simd_shift_imm_vec_qi): Define.
            (aarch64_simd_shift_imm_vec_hi): Likewise.
            (aarch64_simd_shift_imm_vec_si): Likewise.
            (aarch64_simd_shift_imm_vec_di): Likewise.
            * config/aarch64/aarch64-simd.md (aarch64_shrn<mode>_insn_le): Use
            predicate from above.
            (aarch64_shrn<mode>_insn_be): Likewise.
            (aarch64_rshrn<mode>_insn_le): Likewise.
            (aarch64_rshrn<mode>_insn_be): Likewise.
            (aarch64_shrn2<mode>_insn_le): Likewise.
            (aarch64_shrn2<mode>_insn_be): Likewise.
            (aarch64_rshrn2<mode>_insn_le): Likewise.
            (aarch64_rshrn2<mode>_insn_be): Likewise.

    gcc/testsuite/ChangeLog:

            PR target/99437
            * gcc.target/aarch64/simd/pr99437.c: New test.
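As a sketch of what "tightening the predicate" amounts to, the check below models a shift-amount predicate in plain C: the constant vector is accepted only if all elements are equal and the shared value is in range. This is not the GCC implementation. The function names are hypothetical, and the 1 to 8 bound for narrowing to 8-bit lanes is an assumption taken from the assembler message in the report.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model (not GCC source): accept a constant shift vector only
   if it is non-empty, every element is identical, and the shared value
   lies in [lo, hi].  */
static bool
all_same_in_range (const int *elts, size_t n, int lo, int hi)
{
  if (n == 0)
    return false;
  for (size_t i = 1; i < n; i++)
    if (elts[i] != elts[0])
      return false;
  return elts[0] >= lo && elts[0] <= hi;
}

/* Assumed range for narrowing to 8-bit lanes, taken from the assembler
   message "out of range 1 to 8".  */
static bool
valid_shrn_to_qi_shift (const int *elts, size_t n)
{
  return all_same_in_range (elts, n, 1, 8);
}
```

Under this model the vector of eight 15s that combine matched in the dump would be rejected, so the pattern would not match and the shift and the narrowing would remain separate instructions.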
I can confirm that the above commit fixes the assembler error.