I lack words to describe what happens on x86-64 to <------------------------------------------------------> #include <emmintrin.h> __m128i foo(__m128i a) { return _mm_srli_si128(a, 8); } int main() { return 0; } <------------------------------------------------------> # /usr/local/gcc-4.2-20060916/bin/gcc -O1 pr-psrldq.c -o pr-psrldq 000000000040042e <foo>: 40042e: 66 0f 7f 44 24 d8 movdqa %xmm0,0xffffffffffffffd8(%rsp) 400434: 48 8b 54 24 e0 mov 0xffffffffffffffe0(%rsp),%rdx 400439: 48 89 d0 mov %rdx,%rax 40043c: 31 d2 xor %edx,%edx 40043e: 48 89 44 24 e8 mov %rax,0xffffffffffffffe8(%rsp) 400443: 48 89 54 24 f0 mov %rdx,0xfffffffffffffff0(%rsp) 400448: 66 0f 6f 44 24 e8 movdqa 0xffffffffffffffe8(%rsp),%xmm0 40044e: c3 retq gcc-4.3-20070105 is still that creative. As far as i know, it's specific to x86-64 but i'm not sure if other shifting ops or specific values also are pathologic.
s/gcc-4.3-20070105/gcc-4.3-20070608/
This is the fault of combine pass. It isn't obvious to me, why it is converting: (insn 7 6 13 2 shift.c:6 (set (subreg:TI (reg:V2DI 62) 0) (ashift:TI (subreg:TI (reg:V2DI 61 [ __a ]) 0) (const_int 64 [0x40]))) 1115 {sse2_ashlti3} (insn_list:REG_DEP_TRUE 6 (nil)) (expr_list:REG_DEAD (reg:V2DI 61 [ __a ]) (expr_list:REG_EQUAL (ashift:TI (subreg:TI (reg/v:V2DI 60 [ __a ]) 0) (const_int 64 [0x40])) (nil)))) ) to (insn 7 6 13 2 shift.c:6 (parallel [ (set (subreg:TI (reg:V2DI 62) 0) (ashift:TI (reg:TI 21 xmm0 [ __a ]) (const_int 64 [0x40]))) (clobber (reg:CC 17 flags)) ]) 412 {*ashlti3_2} (nil) (expr_list:REG_UNUSED (reg:CC 17 flags) (expr_list:REG_DEAD (reg:V2DI 21 xmm0 [ __a ]) (nil)))) Note the clobber in above pattern. So, what is wrong with: (define_insn "sse2_ashlti3" [(set (match_operand:TI 0 "register_operand" "=x") (ashift:TI (match_operand:TI 1 "register_operand" "0") (match_operand:SI 2 "const_0_to_255_mul_8_operand" "n")))] "TARGET_SSE2"
Because you should not be have two patterns that could match (with one differs via a clobber). The way to fix this is have the sse2_ashlti3 one do what ashlti3_2 does for "r" constraints also. This is the normal two patterns doing the same thing problem. -- Pinski
Subject: Bug 32280 Author: uros Date: Mon Jun 11 10:13:00 2007 New Revision: 125615 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=125615 Log: PR target/32280 * config/i386/sse.md ("sse2_ashlti", "sse2_lshrti3"): Move ... * config/i386/i386.md ("sse2_ashlti", "sse2_lshrti3"): ... to here. testsuite/ChangeLog: PR target/32280 * gcc.target/i386/pr32280.c: New test. Added: trunk/gcc/testsuite/gcc.target/i386/pr32280.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/i386.md trunk/gcc/config/i386/sse.md trunk/gcc/testsuite/ChangeLog
Fixed.(In reply to comment #3) > Because you should not be have two patterns that could match (with one differs > via a clobber). The way to fix this is have the sse2_ashlti3 one do what > ashlti3_2 does for "r" constraints also. But we need clobber for "r" constraint, because in this case, insn is split into logical operations that clobber flags.
Fixed.
Subject: Bug 32280 Author: uros Date: Thu Dec 17 12:33:09 2009 New Revision: 155312 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=155312 Log: PR target/32280 * config/i386/i386-modes.def (V1TI): New vector mode. * config/i386/i386.h (VALID_SSE_REG_MODE): Add V1TImode. (SSE_REG_MODE_P): Ditto. * config/i386/sse.md (SSEMODE16): New mode iterator. (AVXMODE16): Ditto. (avxvecmode): Handle V1TI mode. (*avx_mov<mode>_internal): Use AVXMODE16 instead of AVXMODE. (mov<mode>): Use SSEMODE16 instead of SSEMODE. (*mov<mode>_internal): Ditto. (push<mode>1): Ditto. (movmisalign<mode>): Ditto. (sse2_ashlv1ti): Rename from sse2_ashlti. (sse2_lshrv1ti): Rename from sse2_lshrti. (*avx_ashlv1ti): Rename from *avx_ashlti and move from i386.md. (*avx_lshrv1ti): Rename from *avx_lshrti and move from i386.md. (vec_shl_<mode>): Convert operands to V1TImode and use V1TI shift. (vec_shr_<mode>): Ditto. (*sse2_mulv4si3): Update for renamed sse2_ashlv1ti3. (udot_prodv4si): Ditto. * config/i386/i386.c (classify_argument): Handle V1TImode. (function_arg_advance_32): Ditto. (function_arg_32): Ditto. (ix86_expand_sse4_unpack): Convert operands to V1TImode and update for renamed gen_sse2_lshrv1ti3. (ix86_expand_args_builtin) <V2DI_FTYPE_V2DI_INT_CONVERT>: Set rmode to V1TImode. (struct builtin_description) <__builtin_ia32_pslldqi128>: Update for renamed sse2_ashlv1ti3. <__builtin_ia32_psrldqi128>: Update for renamed sse2_lshrv1ti3. Revert: 2007-06-11 Uros Bizjak <ubizjak@gmail.com> PR target/32280 * config/i386/sse.md ("sse2_ashlti", "sse2_lshrti3"): Move ... * config/i386/i386.md ("sse2_ashlti", "sse2_lshrti3"): ... to here. testsuite/ChangeLog: PR target/32280 * gcc.target/i386/pr32280-1.c: New test. Added: trunk/gcc/testsuite/gcc.target/i386/pr32280-1.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/i386-modes.def trunk/gcc/config/i386/i386.c trunk/gcc/config/i386/i386.h trunk/gcc/config/i386/i386.md trunk/gcc/config/i386/sse.md trunk/gcc/testsuite/ChangeLog