I'm seeing an ICE with GCC 3.2.2 (RH9 build) and GCC 3.4 (20031126 snapshot) for the following code:

-- CUT
typedef int v4hi __attribute__ ((mode(V4HI)));

int f(unsigned short n)
{
  v4hi vec = { 0, 0, 1, n };
  v4hi hw = __builtin_ia32_pmulhw(vec, vec);
  return (__builtin_ia32_pextrw(hw,0));
}
-- CUT

I'm seeing this:

$ gcc-3.4 -v
Reading specs from /usr/local/gcc-3.4-20031126/lib/gcc/i686-pc-linux-gnu/3.4/specs
Configured with: ../gcc-3.4-20031126/configure --prefix=/usr/local/gcc-3.4-20031126
Thread model: posix
gcc version 3.4 20031126 (experimental)
$ gcc-3.4 -O -msse -c bug.c
bug.c: In function `f':
bug.c:8: error: unable to find a register to spill in class `GENERAL_REGS'
bug.c:8: error: this is the insn:
(insn 13 11 15 0 (parallel [
            (set (subreg:SI (reg/v:V4HI %mm0 [orig:61 vec ] [61]) 0)
                (and:SI (subreg:SI (reg/v:V4HI %mm0 [orig:61 vec ] [61]) 0)
                    (const_int -65536 [0xffff0000])))
            (clobber (reg:CC %eflags))
        ]) 197 {*andsi_1} (insn_list 11 (nil))
    (expr_list:REG_UNUSED (reg:CC %eflags)
        (nil)))
bug.c:8: internal compiler error: in spill_failure, at reload1.c:1854
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.

The ICE with 3.2.2 appears to be basically the same (can't find a register to spill), but has less detail, and anyway I doubt anyone wants to fix bugs in 3.2.2.

This looks an awful lot like bug 9401, but that has been closed as fixed since last May.

The ICE is only with -O; -O2/-O3 seem to be fine.

BTW, change the '1' in the initializer of 'vec' to produce another ICE (in emit-rtl.c). This one does not show up with 3.2.2, but my 3.4 doesn't like that either.
Confirmed but not a regression (this was rejected before the ICE showed up).
The generic vector extensions do not work at all for i386, unfortunately.

Honza
I get a different ICE on the mainline:

t.c:1: warning: specifying vector types with __attribute__ ((mode)) is deprecated
t.c:1: warning: use __attribute__ ((vector_size)) instead
 f
t.c: In function `f':
t.c:8: error: unable to find a register to spill in class `GENERAL_REGS'
t.c:8: error: this is the insn:
(insn 14 13 16 0 (set (strict_low_part (subreg:HI (reg/v:V4HI 29 mm0 [orig:59 vec ] [59]) 0))
        (const_int 0 [0x0])) 43 {*movstricthi_1} (insn_list 13 (nil))
    (nil))
t.c:8: internal compiler error: in spill_failure, at reload1.c:1884
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.
The testcase from the description fails in the same way on current mainline:

gcc -O -msse pr13366.c
pr13366.c:1: warning: specifying vector types with __attribute__ ((mode)) is deprecated
pr13366.c:1: warning: use __attribute__ ((vector_size)) instead
pr13366.c: In function 'f':
pr13366.c:9: error: unable to find a register to spill in class 'GENERAL_REGS'
pr13366.c:9: error: this is the insn:
(insn 15 13 17 0 (parallel [
            (set (subreg:SI (reg/v:V4HI 29 mm0 [orig:59 vec ] [59]) 0)
                (and:SI (subreg:SI (reg/v:V4HI 29 mm0 [orig:59 vec ] [59]) 0)
                    (const_int -65536 [0xffff0000])))
            (clobber (reg:CC 17 flags))
        ]) 206 {*andsi_1} (insn_list:REG_DEP_TRUE 13 (nil))
    (expr_list:REG_UNUSED (reg:CC 17 flags)
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))
pr13366.c:9: internal compiler error: in spill_failure, at reload1.c:1873
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.

BTW: changing the 1 to 5, as suggested in the report, does not change the ICE.
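For reference, the deprecation warning above asks for the vector_size spelling of the type. Below is a minimal sketch of the same testcase written that way. It is GCC-specific and assumes an x86 target with MMX enabled and a GCC new enough to support vector subscripting (hw[0]); the removed __builtin_ia32_pextrw is replaced by a plain subscript, and the emms call is an addition that is not in the original testcase. It only illustrates the spelling and makes no claim about whether a compiler of that era would still ICE on it.

```c
/* Sketch only: the testcase with the non-deprecated vector_size
   attribute.  Assumes GCC on x86 with MMX available. */
typedef short v4hi __attribute__ ((vector_size (8)));  /* 4 x 16-bit */

int f(unsigned short n)
{
  v4hi vec = { 0, 0, 1, (short) n };
  v4hi hw = __builtin_ia32_pmulhw(vec, vec);  /* high 16 bits of each product */
  int r = (unsigned short) hw[0];             /* element 0, zero-extended */
  __builtin_ia32_emms();                      /* addition: reset MMX state */
  return r;
}
```

Element 0 of vec is always 0, so f returns 0 for every n; the function is interesting only as a compiler testcase.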
Could the info at http://gcc.gnu.org/ml/gcc-patches/2004-09/msg02453.html help to fix this bug?
Oops, my report on that was not clear. Change the 1 to a 0 to get a different ICE, at least in whatever random 3.4.0 snapshot I have installed (20040107). It's apparently a 0/!0 thing. I have not checked this on a 3.4 release or mainline, though.

Here is what I see after changing the 1 to a 0:

ice.c: In function `f':
ice.c:8: internal compiler error: in subreg_hard_regno, at emit-rtl.c:1026
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.
The equivalent SSE2 version works OK:

typedef int v8hi __attribute__ ((mode (V8HI)));

int f (unsigned short n)
{
  v8hi vec = { 0, 0, 0, 0, 0, 0, 1, n };
  v8hi hw = __builtin_ia32_pmulhw128 (vec, vec);
  return (__builtin_ia32_pextrw128 (hw, 0));
}

The SSE2 example produces the following RTL:

(insn 13 11 15 1 (set (reg/v:V8HI 59 [ vec ])
        (const_vector:V8HI [
                (const_int 0 [0x0])
                (const_int 0 [0x0])
                (const_int 0 [0x0])
                (const_int 0 [0x0])
                (const_int 0 [0x0])
                (const_int 0 [0x0])
                (const_int 0 [0x0])
                (const_int 0 [0x0])
            ])) -1 (nil)
    (nil))

(insn 15 13 16 1 (parallel [
            (set (subreg:SI (reg/v:V8HI 59 [ vec ]) 12)
                (and:SI (subreg:SI (reg/v:V8HI 59 [ vec ]) 12)
                    (const_int -65536 [0xffff0000])))
            (clobber (reg:CC 17 flags))
        ]) -1 (nil)
    (nil))
...

and the MMX version produces:

(insn 13 11 15 1 (clobber (reg/v:V4HI 59 [ vec ])) -1 (nil)
    (nil))

(insn 15 13 17 1 (parallel [
            (set (subreg:SI (reg/v:V4HI 59 [ vec ]) 0)
                (and:SI (subreg:SI (reg/v:V4HI 59 [ vec ]) 0)
                    (const_int -65536 [0xffff0000])))
            (clobber (reg:CC 17 flags))
        ]) -1 (nil)
    (nil))
...

The trouble is in (insn 13): the MMX version only clobbers reg 59, whereas the SSE2 version sets it to zero before the partial (subreg) updates.

Also, mainline does not ICE for

  v4hi vec = { 0, 0, 0, n };

and its SSE2 equivalent, as suggested in comment #6, for both the MMX and SSE2 versions.
Subject: Bug 13366

CVSROOT:      /cvs/gcc
Module name:  gcc
Changes by:   rth@gcc.gnu.org  2005-01-11 21:33:15

Modified files:
    gcc            : ChangeLog
    gcc/config/i386: emmintrin.h i386-protos.h i386.c i386.h mmintrin.h mmx.md pmmintrin.h predicates.md sse.md xmmintrin.h
Added files:
    gcc/testsuite/gcc.target/i386: pr13366.c

Log message:
    PR target/13366
    * config/i386/i386.h (enum ix86_builtins): Move ...
    * config/i386/i386.c: ... here.
    (IX86_BUILTIN_MOVDDUP, IX86_BUILTIN_MMX_ZERO, IX86_BUILTIN_PEXTRW, IX86_BUILTIN_PINSRW, IX86_BUILTIN_LOADAPS, IX86_BUILTIN_LOADSS, IX86_BUILTIN_STORESS, IX86_BUILTIN_SSE_ZERO, IX86_BUILTIN_PEXTRW128, IX86_BUILTIN_PINSRW128, IX86_BUILTIN_LOADAPD, IX86_BUILTIN_LOADSD, IX86_BUILTIN_STOREAPD, IX86_BUILTIN_STORESD, IX86_BUILTIN_STOREHPD, IX86_BUILTIN_STORELPD, IX86_BUILTIN_SETPD1, IX86_BUILTIN_SETPD, IX86_BUILTIN_CLRPD, IX86_BUILTIN_LOADPD1, IX86_BUILTIN_LOADRPD, IX86_BUILTIN_STOREPD1, IX86_BUILTIN_STORERPD, IX86_BUILTIN_LOADDQA, IX86_BUILTIN_STOREDQA, IX86_BUILTIN_CLRTI, IX86_BUILTIN_LOADDDUP): Remove.
    (IX86_BUILTIN_VEC_INIT_V2SI, IX86_BUILTIN_VEC_INIT_V4HI, IX86_BUILTIN_VEC_INIT_V8QI, IX86_BUILTIN_VEC_EXT_V2DF, IX86_BUILTIN_VEC_EXT_V2DI, IX86_BUILTIN_VEC_EXT_V4SF, IX86_BUILTIN_VEC_EXT_V8HI, IX86_BUILTIN_VEC_EXT_V4HI, IX86_BUILTIN_VEC_SET_V8HI, IX86_BUILTIN_VEC_SET_V4HI): New.
    (ix86_init_builtins): Make static.
    (ix86_init_mmx_sse_builtins): Update for changed builtins.
    (ix86_expand_binop_builtin): Only use ix86_fixup_binary_operands if all the modes match. Otherwise, fake it.
    (get_element_number, ix86_expand_vec_init_builtin, ix86_expand_vec_ext_builtin, ix86_expand_vec_set_builtin): New.
    (ix86_expand_builtin): Make static. Update for changed builtins.
    (ix86_expand_vector_move_misalign): Use sse2_loadlpd with zero operand instead of sse2_loadsd. Cast sse1 fallback to V4SFmode.
    (ix86_expand_vector_init_duplicate): New.
    (ix86_expand_vector_init_low_nonzero): New.
    (ix86_expand_vector_init_one_var, ix86_expand_vector_init_general): Split out from ix86_expand_vector_init; handle integer modes.
    (ix86_expand_vector_init): Use them.
    (ix86_expand_vector_set, ix86_expand_vector_extract): New.
    * config/i386/i386-protos.h: Update.
    * config/i386/predicates.md (reg_or_0_operand): New.
    * config/i386/mmx.md (mov<MMXMODEI>_internal): Add 'r' variants.
    (movv2sf_internal): Likewise. And a splitter to match them all.
    (vec_dupv2sf, mmx_concatv2sf, vec_setv2sf, vec_extractv2sf, vec_initv2sf, vec_dupv4hi, vec_dupv2si, mmx_concatv2si, vec_setv2si, vec_extractv2si, vec_initv2si, vec_setv4hi, vec_extractv4hi, vec_initv4hi, vec_setv8qi, vec_extractv8qi, vec_initv8qi): New.
    (mmx_pinsrw): Fix operand ordering.
    * config/i386/sse.md (movv4sf splitter): Use direct pattern, rather than sse_loadss expander.
    (movv2df splitter): Similarly.
    (sse_loadss, sse_loadlss): Remove.
    (vec_dupv4sf, sse_concatv2sf, sse_concatv4sf, vec_extractv4sf_0): New.
    (vec_setv4sf, vec_setv2df): Use ix86_expand_vector_set.
    (vec_extractv4sf, vec_extractv2df): Use ix86_expand_vector_extract.
    (sse3_movddup): Rename with '*'.
    (sse3_movddup splitter): Use gen_rtx_REG instead of gen_lowpart.
    (sse2_loadsd): Remove.
    (vec_dupv2df_sse3): Rename from sse3_loadddup.
    (vec_dupv2df, vec_concatv2df_sse3, vec_concatv2df): New.
    (sse2_pinsrw): Fix argument ordering.
    (sse2_loadld, sse2_loadq): Add sse1 alternatives.
    (sse2_stored): Remove 'r' destination.
    (vec_dupv4si, vec_dupv2di, sse2_concatv2si, sse1_concatv2si, vec_concatv4si_1, vec_concatv2di, vec_setv2di, vec_extractv2di, vec_initv2di, vec_setv4si, vec_extractv4si, vec_initv4si, vec_setv8hi, vec_extractv8hi, vec_initv8hi, vec_setv16qi, vec_extractv16qi, vec_initv16qi): New.
    * config/i386/emmintrin.h (__m128i, __m128d): Use typedef, not define.
    (_mm_set_sd, _mm_set1_pd, _mm_setzero_pd, _mm_set_epi64x, _mm_set_epi32, _mm_set_epi16, _mm_set_epi8, _mm_setzero_si128): Use constructor form.
    (_mm_load_pd, _mm_store_pd): Use plain dereference.
    (_mm_load_si128, _mm_store_si128): Likewise.
    (_mm_load1_pd): Use _mm_set1_pd.
    (_mm_load_sd): Use _mm_set_sd.
    (_mm_store_sd, _mm_storeh_pd): Use __builtin_ia32_vec_ext_v2df.
    (_mm_store1_pd, _mm_storer_pd): Use _mm_store_pd.
    (_mm_set_epi64): Use _mm_set_epi64x.
    (_mm_set1_epi64x, _mm_set1_epi64, _mm_set1_epi32, _mm_set_epi16, _mm_set1_epi8, _mm_setr_epi64, _mm_setr_epi32, _mm_setr_epi16, _mm_setr_epi8): Use _mm_set_foo form.
    (_mm_loadl_epi64, _mm_movpi64_epi64, _mm_move_epi64): Use _mm_set_epi64.
    (_mm_storel_epi64, _mm_movepi64_pi64): Use __builtin_ia32_vec_ext_v2di.
    (_mm_extract_epi16): Use __builtin_ia32_vec_ext_v8hi.
    (_mm_insert_epi16): Use __builtin_ia32_vec_set_v8hi.
    * config/i386/mmintrin.h (_mm_setzero_si64): Use plain cast.
    (_mm_set_pi32): Use __builtin_ia32_vec_init_v2si.
    (_mm_set_pi16): Use __builtin_ia32_vec_init_v4hi.
    (_mm_set_pi8): Use __builtin_ia32_vec_init_v8qi.
    (_mm_set1_pi16, _mm_set1_pi8): Use _mm_set_piN variant.
    * config/i386/pmmintrin.h (_mm_loaddup_pd): Use _mm_load1_pd.
    (_mm_movedup_pd): Use _mm_shuffle_pd.
    * config/i386/xmmintrin.h (_mm_setzero_ps, _mm_set_ss, _mm_set1_ps, _mm_set_ps, _mm_setr_ps): Use constructor form.
    (_mm_cvtpi16_ps, _mm_cvtpu16_ps, _mm_cvtpi8_ps, _mm_cvtpu8_ps, _mm_cvtps_pi8, _mm_cvtpi32x2_ps): Avoid __builtin_ia32_mmx_zero; Use _mm_setzero_ps.
    (_mm_load_ss, _mm_load1_ps): Use _mm_set* form.
    (_mm_load_ps, _mm_loadr_ps): Use raw dereference.
    (_mm_store_ss): Use __builtin_ia32_vec_ext_v4sf.
    (_mm_store_ps): Use raw dereference.
    (_mm_store1_ps): Use _mm_storeu_ps.
    (_mm_storer_ps): Use _mm_store_ps.
    (_mm_extract_pi16): Use __builtin_ia32_vec_ext_v4hi.
    (_mm_insert_pi16): Use __builtin_ia32_vec_set_v4hi.
Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.7095&r2=2.7096
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/i386/emmintrin.h.diff?cvsroot=gcc&r1=1.9&r2=1.10
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/i386/i386-protos.h.diff?cvsroot=gcc&r1=1.124&r2=1.125
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/i386/i386.c.diff?cvsroot=gcc&r1=1.773&r2=1.774
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/i386/i386.h.diff?cvsroot=gcc&r1=1.416&r2=1.417
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/i386/mmintrin.h.diff?cvsroot=gcc&r1=1.14&r2=1.15
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/i386/mmx.md.diff?cvsroot=gcc&r1=1.2&r2=1.3
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/i386/pmmintrin.h.diff?cvsroot=gcc&r1=1.4&r2=1.5
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/i386/predicates.md.diff?cvsroot=gcc&r1=1.12&r2=1.13
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/i386/sse.md.diff?cvsroot=gcc&r1=1.2&r2=1.3
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/i386/xmmintrin.h.diff?cvsroot=gcc&r1=1.31&r2=1.32
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/testsuite/gcc.target/i386/pr13366.c.diff?cvsroot=gcc&r1=NONE&r2=1.1
Fixed. No chance of a backport to 3.4. As a workaround, use _mm_set_pi16 instead of the explicit constructor.
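A minimal sketch of that workaround, assuming an x86 target with MMX and SSE enabled (e.g. -msse as in the report). It uses the documented intrinsics _mm_set_pi16, _mm_mulhi_pi16 and _mm_extract_pi16 in place of the raw builtins. The f_elem3 variant and the _mm_empty calls are additions for illustration and are not part of the original testcase.

```c
#include <xmmintrin.h>  /* _mm_extract_pi16 (SSE); also pulls in mmintrin.h */

/* The testcase with the vector built by _mm_set_pi16 instead of an
   explicit { ... } constructor.  _mm_set_pi16 takes its elements from
   highest to lowest, so { 0, 0, 1, n } becomes (n, 1, 0, 0). */
int f(unsigned short n)
{
  __m64 vec = _mm_set_pi16((short) n, 1, 0, 0);
  __m64 hw = _mm_mulhi_pi16(vec, vec);  /* pmulhw: high half of each product */
  int r = _mm_extract_pi16(hw, 0);      /* pextrw: element 0 */
  _mm_empty();                          /* emms: leave the MMX state clean */
  return r;
}

/* Hypothetical variant, not in the report: read element 3, the one that
   actually depends on n, so the result is observable. */
int f_elem3(unsigned short n)
{
  __m64 vec = _mm_set_pi16((short) n, 1, 0, 0);
  __m64 hw = _mm_mulhi_pi16(vec, vec);
  int r = _mm_extract_pi16(hw, 3);
  _mm_empty();
  return r;
}
```

As in the original, f always returns 0 because element 0 of the vector is 0. f_elem3(16384) gives 4096, the high 16 bits of 16384 * 16384.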