Bug 99437 - [11 Regression] Error: immediate value out of range 1 to 8 at operand 3 -- `shrn v1.8b,v1.8h,15'
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 11.0
Importance: P1 normal
Target Milestone: 11.0
Assignee: ktkachov
URL:
Keywords: assemble-failure
Depends on:
Blocks:
 
Reported: 2021-03-06 23:09 UTC by Khem Raj
Modified: 2021-03-08 18:57 UTC

See Also:
Host:
Target: aarch64
Build:
Known to work: 10.2.0
Known to fail: 11.0
Last reconfirmed: 2021-03-07 00:00:00


Attachments
test case (130.18 KB, application/x-xz)
2021-03-06 23:09 UTC, Khem Raj

Description Khem Raj 2021-03-06 23:09:44 UTC
Created attachment 50321
test case

GCC 11 fails to build OpenCV for aarch64 because of an assembler error. The attached sample program reproduces it.

% aarch64-yoe-linux-g++ -c a.cpp -O

/tmp/ccxRnB71.s: Assembler messages:
/tmp/ccxRnB71.s:113: Error: immediate value out of range 1 to 8 at operand 3 -- `shrn v1.8b,v1.8h,15'
/tmp/ccxRnB71.s:249: Error: immediate value out of range 1 to 8 at operand 3 -- `shrn v1.8b,v1.8h,14'
/tmp/ccxRnB71.s:385: Error: immediate value out of range 1 to 8 at operand 3 -- `shrn v1.8b,v1.8h,13'
/tmp/ccxRnB71.s:521: Error: immediate value out of range 1 to 8 at operand 3 -- `shrn v1.8b,v1.8h,12'
/tmp/ccxRnB71.s:657: Error: immediate value out of range 1 to 8 at operand 3 -- `shrn v1.8b,v1.8h,11'
/tmp/ccxRnB71.s:793: Error: immediate value out of range 1 to 8 at operand 3 -- `shrn v1.8b,v1.8h,10'
/tmp/ccxRnB71.s:929: Error: immediate value out of range 1 to 8 at operand 3 -- `shrn v1.8b,v1.8h,9'
Comment 1 Khem Raj 2021-03-06 23:13:35 UTC
the version of gcc in use is

gcc version 11.0.1 20210306 (experimental) (GCC)
Comment 2 Andrew Pinski 2021-03-07 00:07:42 UTC
//(insn 148 147 149 (set (reg:V16QI 33 v1 [orig:153 _101 ] [153])
//        (vec_concat:V16QI (truncate:V8QI (lshiftrt:V8HI (reg:V8HI 33 v1 [orig:165 _113 ] [165])
//                    (const_vector:V8HI [
//                            (const_int 15 [0xf]) repeated x8
//                        ])))
//            (const_vector:V8QI [
//                    (const_int 0 [0]) repeated x8
//                ]))) "/mnt/b/yoe/master/build/tmp/work/cortexa57-yoe-linux/opencv/4.5.1-r0/recipe-sysroot-native/usr/lib/aarch64-yoe-linux/gcc/aarch64-yoe-linux/11.0.1/include/arm_neon.h":6548:53 1917 {aarch64_shrnv8hi_insn_le}
//     (nil))
        shrn    v1.8b, v1.8h, 15        // 148  [c=4 l=4]  aarch64_shrnv8hi_insn_le

Confirmed, reducing ....
Comment 3 Andrew Pinski 2021-03-07 00:29:06 UTC
Trying 144, 146 -> 148:
  144: r159:V8HI=r165:V8HI 0>>const_vector
      REG_DEAD r165:V8HI
  146: r155:V8QI=trunc(r159:V8HI)
      REG_DEAD r159:V8HI
  148: r153:V16QI=vec_concat(r155:V8QI,const_vector)
      REG_DEAD r155:V8QI
Successfully matched this instruction:
(set (reg:V16QI 153 [ _101 ])
    (vec_concat:V16QI (truncate:V8QI (lshiftrt:V8HI (reg:V8HI 165 [ _113 ])
                (const_vector:V8HI [
                        (const_int 15 [0xf]) repeated x8
                    ])))
        (const_vector:V8QI [
                (const_int 0 [0]) repeated x8
            ])))

I have not reduced it yet, but the above shows that the problem is introduced inside combine. I think the constraints/predicates for aarch64_shrnv8hi_insn_le on the const_vector (shift) operand are incorrect.
Comment 4 ktkachov 2021-03-07 14:04:55 UTC
Confirmed:
#include <arm_neon.h>

uint8x16_t
foo (uint16x8_t a, uint8x8_t b)
{
  return vcombine_u8 (vmovn_u16 (vshrq_n_u16 (a, 9)), b);
}

Testing a patch.
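
For readers without an aarch64 toolchain, the reduced test case can be modelled lane by lane in portable C. This is only an illustrative sketch: the helper name foo_model and the array representation are assumptions, not anything from the report. A shift of 9 is accepted by vshrq_n_u16 at the source level, but once combine fuses the shift and the truncate into a single SHRN from 16-bit to 8-bit lanes, the amount falls outside the 1 to 8 range the assembler reports.

```c
#include <stdint.h>

/* Scalar model of the reduced test case (hypothetical helper name).
   out[0..7]  holds the low 8 bits of (a[i] >> 9), i.e. vshrq_n_u16
              followed by vmovn_u16.
   out[8..15] holds b[i], i.e. the high half supplied to vcombine_u8.  */
static void
foo_model (const uint16_t a[8], const uint8_t b[8], uint8_t out[16])
{
  for (int i = 0; i < 8; i++)
    {
      out[i] = (uint8_t) (a[i] >> 9);
      out[i + 8] = b[i];
    }
}
```

The two halves of the output correspond to the two operands of the vec_concat in the RTL dump from combine.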
Comment 5 GCC Commits 2021-03-08 15:06:21 UTC
The master branch has been updated by Kyrylo Tkachov <ktkachov@gcc.gnu.org>:

https://gcc.gnu.org/g:0d9a70ea3881c284b7689b691d54d047b55b486d

commit r11-7556-g0d9a70ea3881c284b7689b691d54d047b55b486d
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Mon Mar 8 15:05:21 2021 +0000

    aarch64: Fix PR99437 - tighten shift predicates for narrowing shift patterns
    
    In this bug combine forms the (R)SHRN(2) instructions with an invalid shift amount.
    The intrinsic expanders for these patterns validate the right shift amount but if the
    final patterns end up being matched by combine (or other RTL passes I suppose) they
    still let the wrong const_vector through.
    
    This patch tightens up the predicates for the instructions involved by using predicates
    for the right shift amount const_vectors.
    
    gcc/ChangeLog:
    
            PR target/99437
            * config/aarch64/predicates.md (aarch64_simd_shift_imm_vec_qi): Define.
            (aarch64_simd_shift_imm_vec_hi): Likewise.
            (aarch64_simd_shift_imm_vec_si): Likewise.
            (aarch64_simd_shift_imm_vec_di): Likewise.
            * config/aarch64/aarch64-simd.md (aarch64_shrn<mode>_insn_le): Use
            predicate from above.
            (aarch64_shrn<mode>_insn_be): Likewise.
            (aarch64_rshrn<mode>_insn_le): Likewise.
            (aarch64_rshrn<mode>_insn_be): Likewise.
            (aarch64_shrn2<mode>_insn_le): Likewise.
            (aarch64_shrn2<mode>_insn_be): Likewise.
            (aarch64_rshrn2<mode>_insn_le): Likewise.
            (aarch64_rshrn2<mode>_insn_be): Likewise.
    
    gcc/testsuite/ChangeLog:
    
            PR target/99437
            * gcc.target/aarch64/simd/pr99437.c: New test.
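
The idea behind the tightened predicates can be sketched as a range check on the shift const_vector: all lanes must hold the same amount, and that amount must be encodable for the narrow element width (1 to 8 when narrowing to 8-bit lanes, as in the assembler message). The C below is a hedged model operating on a plain array. It is not the machine-description code from the commit, and the function name and parameters are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a narrowing-shift predicate: accept the shift
   vector only if every lane holds the same amount and that amount lies
   in [1, narrow_bits].  */
static bool
shift_vec_ok_for_narrowing (const int *lanes, size_t n, int narrow_bits)
{
  if (n == 0)
    return false;
  for (size_t i = 0; i < n; i++)
    if (lanes[i] != lanes[0])
      return false;
  return lanes[0] >= 1 && lanes[0] <= narrow_bits;
}
```

Under this model, the vector of eight 15s from the combine dump is rejected for narrow_bits of 8, so the fused pattern would not match and the shift and narrowing would stay as separate instructions.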
Comment 6 Khem Raj 2021-03-08 18:57:12 UTC
I can confirm that the above commit fixed the assembler error.