This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH, i386]: Fix PR target/35714: x86 poor code with pmaddwd
- From: "H.J. Lu" <hjl dot tools at gmail dot com>
- To: "Uros Bizjak" <ubizjak at gmail dot com>
- Cc: "gcc patches" <gcc-patches at gcc dot gnu dot org>
- Date: Wed, 7 May 2008 11:49:14 -0700
- Subject: Re: [PATCH, i386]: Fix PR target/35714: x86 poor code with pmaddwd
- References: <5787cf470805070603w3465e039h2ffbdcc4ec6607f0@mail.gmail.com>
Hi Uros,
This patch may have caused:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36174
H.J.
On Wed, May 7, 2008 at 6:03 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> Hello!
>
> As mentioned in the PR, the problem is, that in
> ix86_expand_binop_builtin we handle all binop builtin functions with
> ix86_fixup_binary_operands (UNKNOWN, tmode, xops). The problem with
> UNKNOWN is, that it doesn't trigger commutative optimizations
> (matching op1/2 with src) for instructions with commutative operands.
>
> Unfortunatelly, a lot of patterns assume that their operands will be
> magically fixed for their ix86_binary_operator_ok insn constraint, so
> some work was needed to introduce correct expanders in this case.
> Attached patch also cleans a couple of invalid operand predicates and
> removes operand fixups in a couple of places.
>
> As for the testcase; for the code from PR:
>
> --cut here--
> #include <emmintrin.h>
>
> extern __m128i a;
>
> __m128i madd (__m128i b)
> {
> return _mm_madd_epi16(a, b);
> }
>
> __m128i madd_swapped (__m128i b)
> {
> return _mm_madd_epi16(b, a);
> }
> --cut here--
>
> compiles with -O2 -msse2 to
>
> madd:
> pushl %ebp
> movl %esp, %ebp
> subl $8, %esp
> movdqa a, %xmm1
> pmaddwd %xmm0, %xmm1
> movdqa %xmm1, %xmm0
> leave
> ret
>
> madd_swapped:
> pushl %ebp
> movl %esp, %ebp
> subl $8, %esp
> movdqa a, %xmm1
> leave
> pmaddwd %xmm1, %xmm0
> ret
>
> where patched gcc creates one insn sequence:
>
> madd:
> pushl %ebp
> movl %esp, %ebp
> subl $8, %esp
> pmaddwd a, %xmm0
> leave
> ret
>
> madd_swapped:
> pushl %ebp
> movl %esp, %ebp
> subl $8, %esp
> pmaddwd a, %xmm0
> leave
> ret
>
> 2008-05-06 Uros Bizjak <ubizjak@gmail.com>
>
> PR target/35714
> * config/i386/mmx.md (mmx_subv2sf3): New expander.
> (*mmx_subv2sf3): Rename from mmx_subv2sf3 insn pattern.
> (*mmx_eqv2sf3): Rename from mmx_eqv2sf3 insn pattern.
> (mmx_eqv2sf3): New expander. Use ix86_fixup_binary_operands_no_copy
> to handle nonimmediate operands.
> (*mmx_paddwd): Rename from mmx_paddwd insn pattern.
> (mmx_paddwd): New expander. Use ix86_fixup_binary_operands_no_copy
> to handle nonimmediate operands.
> (*mmx_pmulhrwv4hi3): Rename from mmx_pmulhrwv4hi3 insn pattern.
> (mmx_pmulhrwv4hi3): New expander. Use
> ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
> (*sse2_umulv1siv1di3): Rename from sse2_umulv1siv1di3 insn pattern.
> (sse2_umulv1siv1di3): New expander. Use
> ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
> (*mmx_eq<mode>3): Rename from mmx_eq<mode>3 insn pattern.
> (mmx_eq<mode>3): New expander. Use
> ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
> (*mmx_uavgv8qi3): Rename from mmx_uavgv8qi3 insn pattern.
> (mmx_uavgv8qi3): New expander. Use
> ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
> (*mmx_uavgv4hi3): Rename from mmx_uavgv4hi3 insn pattern.
> (mmx_uavgv4hi3): New expander. Use
> ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
>
> * config/i386/sse.md
> (*sse_movhlps): Rename from sse_movhlps insn pattern.
> (sse_movhlps): New expander. Use ix86_fixup_binary_operands
> to handle nonimmediate operands.
> (*sse_movlhps): Rename from sse_movlhps insn pattern.
> (sse_movlhps): New expander. Use ix86_fixup_binary_operands
> to handle nonimmediate operands.
> (*sse_loadhps): Rename from sse_loadhps insn pattern.
> (sse_loadhps): New expander. Use ix86_fixup_binary_operands
> to handle nonimmediate operands.
> (*sse_loadlps): Rename from sse_loadlps insn pattern.
> (sse_loadlps): New expander. Use ix86_fixup_binary_operands
> to handle nonimmediate operands.
> (*sse2_unpckhpd): Rename from sse2_unpckhpd insn pattern.
> (sse2_unpckhpd): New expander. Use
> ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
> (*sse2_unpcklpd): Rename from sse2_unpcklpd insn pattern.
> (sse2_unpcklpd): New expander. Use
> ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
> (*sse_loadhpd): Rename from sse_loadhpd insn pattern.
> (sse_loadhpd): New expander. Use ix86_fixup_binary_operands
> to handle nonimmediate operands.
> (*sse_loadlpd): Rename from sse_loadlpd insn pattern.
> (sse_loadlpd): New expander. Use ix86_fixup_binary_operands
> to handle nonimmediate operands.
> (*sse2_<plusminus_insn><mode>3): Rename from
> sse2_<plusminus_insn><mode>3 insn pattern.
> (sse2_<plusminus_insn><mode>3): New expander. Use
> ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
> (*sse2_umulv2siv2di3): Rename from sse2_umulv2siv2di3 insn pattern.
> (sse2_umulv2siv2di3): New expander. Use
> ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
> (*sse4_1_mulv2siv2di3): Rename from sse4_1_mulv2siv2di3 insn pattern.
> (sse4_1_mulv2siv2di3): New expander. Use
> ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
> (*sse2_pmaddwd): Rename from sse2_pmaddwd insn pattern.
> (sse2_pmaddwd): New expander. Use
> ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
> (*sse2_eq<mode>3): Rename from sse2_eq<mode>3 insn pattern.
> (sse2_eq<mode>3): New expander. Use
> ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
> (*sse4_1_eqv2di3): Rename from sse4_1_eqv2di3 insn pattern.
> (sse4_1_eqv2di3): New expander. Use
> ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
> (*sse2_uavgv16qi3): Rename from sse2_uavgv16qi3 insn pattern.
> (sse2_uavgv16qi3): New expander. Use
> ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
> (*sse2_uavgv16qi3): Rename from sse2_uavgv16qi3 insn pattern.
> (sse2_uavgv16qi3): New expander. Use
> ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
> (*sse2_uavgv8hi3): Rename from sse2_uavgv8hi3 insn pattern.
> (sse2_uavgv8hi3): New expander. Use
> ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
> (*ssse3_pmulhrswv8hi3): Rename from ssse3_pmulhrswv8hi3 insn pattern.
> (ssse3_pmulhrswv8hi3): New expander. Use
> ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
> (*ssse3_pmulhrswv4hi3): Rename from ssse3_pmulhrswv4hi3 insn pattern.
> (ssse3_pmulhrswv4hi3): New expander. Use
> ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
>
> (<sse>_vm<plusminus_insn><mode>3): Do not use ix86_binary_operator_ok.
> (<sse>_vmmul<mode>3): Ditto.
> (divv4sf3): Do not use ix86_fixup_binary_operands_no_copy.
> (divv2df3): Ditto.
> (ssse3_pmaddubsw128): Use register_operand for operand 1.
> (ssse3_pmaddubsw): Ditto.
>
> * config/i386/sse.md (ix86_fixup_binary_operands): Assert that src1
> and src2 must have the same mode when swapped.
> (ix86_expand_binop_builtin): Do not use ix86_fixup_binary_operands
> and ix86_binary_operator_ok. Do not force operands in registers
> when optimizing.
>
> testsuite/ChangeLog:
>
> * gcc.target/i386/pr35714.c: New test.
>
> The patch was bootstrapped and regression tested on i686-pc-linux-gnu
> as well as x86_64-pc-linux-gnu {,-m32}. Patch is committed to
> mainline.
>
> Uros.
>