[ARM] Use vector wide add for mixed-mode adds

Michael Collison michael.collison@linaro.org
Thu Oct 1 10:05:00 GMT 2015


Kyrill,

I have modified the patch to address your comments. I also modified 
check_effective_target_vect_widen_sum_hi_to_si_pattern in 
target-supports.exp to
indicate that arm neon supports vector widen sum of HImode to SImode. 
This resolved
several test suite failures.

Successfully tested on arm-none-eabi, arm-none-linux-gnueabihf. I have 
four related execution failure
tests on armeb-non-linux-gnueabihf with -flto only.

gcc.dg/vect/vect-outer-4f.c -flto -ffat-lto-objects execution test
gcc.dg/vect/vect-outer-4g.c -flto -ffat-lto-objects execution test
gcc.dg/vect/vect-outer-4k.c -flto -ffat-lto-objects execution test
gcc.dg/vect/vect-outer-4l.c -flto -ffat-lto-objects execution test


I am debugging but have not tracked down the root cause yet. Feedback?

2015-07-22  Michael Collison  <michael.collison@linaro.org>

     * config/arm/neon.md (widen_<us>sum<mode>): New patterns
     where mode is VQI to improve mixed mode vectorization.
     * config/arm/neon.md (vec_sel_widen_ssum_lo<VQI:mode><VW:mode>3): New
     define_insn to match low half of signed vaddw.
     * config/arm/neon.md (vec_sel_widen_ssum_hi<VQI:mode><VW:mode>3): New
     define_insn to match high half of signed vaddw.
     * config/arm/neon.md (vec_sel_widen_usum_lo<VQI:mode><VW:mode>3): New
     define_insn to match low half of unsigned vaddw.
     * config/arm/neon.md (vec_sel_widen_usum_hi<VQI:mode><VW:mode>3): New
     define_insn to match high half of unsigned vaddw.
     * testsuite/gcc.target/arm/neon-vaddws16.c: New test.
     * testsuite/gcc.target/arm/neon-vaddws32.c: New test.
     * testsuite/gcc.target/arm/neon-vaddwu16.c: New test.
     * testsuite/gcc.target/arm/neon-vaddwu32.c: New test.
     * testsuite/gcc.target/arm/neon-vaddwu8.c: New test.
     * testsuite/lib/target-supports.exp
     (check_effective_target_vect_widen_sum_hi_to_si_pattern): Indicate
     that arm neon support vector widen sum of HImode TO SImode.

On 09/23/2015 01:49 AM, Kyrill Tkachov wrote:
> Hi Michael,
>
> On 23/09/15 00:52, Michael Collison wrote:
>> This is a modified version of the previous patch that removes the
>> documentation and read-md.c fixes. These patches have been submitted
>> separately and approved.
>>
>> This patch is designed to address code that was not being vectorized due
>> to missing widening patterns in the ARM backend. Code such as:
>>
>> int t6(int len, void * dummy, short * __restrict x)
>> {
>>     len = len & ~31;
>>     int result = 0;
>>     __asm volatile ("");
>>     for (int i = 0; i < len; i++)
>>       result += x[i];
>>     return result;
>> }
>>
>> Validated on arm-none-eabi, arm-none-linux-gnueabi,
>> arm-none-linux-gnueabihf, and armeb-none-linux-gnueabihf.
>>
>> 2015-09-22  Michael Collison <michael.collison@linaro.org>
>>
>>       * config/arm/neon.md (widen_<us>sum<mode>): New patterns
>>       where mode is VQI to improve mixed mode add vectorization.
>>
>
> Please list all the new define_expands and define_insns
> in the changelog. Also, please add an ChangeLog entry for
> the testsuite additions.
>
> The approach looks ok to me with a few comments on some
> parts of the patch itself.
>
>
> +(define_insn "vec_sel_widen_ssum_hi<VQI:mode><VW:mode>3"
> +  [(set (match_operand:<VW:V_widen> 0 "s_register_operand" "=w")
> +    (plus:<VW:V_widen> (sign_extend:<VW:V_widen> (vec_select:VW 
> (match_operand:VQI 1 "s_register_operand" "%w")
> +                           (match_operand:VQI 2 
> "vect_par_constant_high" "")))
> +                (match_operand:<VW:V_widen> 3 "s_register_operand" 
> "0")))]
> +  "TARGET_NEON"
> +  "vaddw.<V_s_elem>\t%q0, %q3, %f1"
> +  [(set_attr "type" "neon_add_widen")
> +  (set_attr "length" "8")]
> +)
>
>
> This is a single instruction, and it has a length of 4, so no need to 
> override the length attribute.
> Same with the other define_insns in this patch.
>
>
> diff --git a/gcc/testsuite/gcc.target/arm/neon-vaddws16.c 
> b/gcc/testsuite/gcc.target/arm/neon-vaddws16.c
> new file mode 100644
> index 0000000..ed10669
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/neon-vaddws16.c
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_neon_hw } */
>
> The arm_neon_hw check is usually used when you want to run the tests.
> Since this is a compile-only tests you just need arm_neon_ok.
>
>  +/* { dg-add-options arm_neon_ok } */
> +/* { dg-options "-O3" } */
> +
> +
> +int
> +t6(int len, void * dummy, short * __restrict x)
> +{
> +  len = len & ~31;
> +  int result = 0;
> +  __asm volatile ("");
> +  for (int i = 0; i < len; i++)
> +    result += x[i];
> +  return result;
> +}
> +
> +/* { dg-final { scan-assembler "vaddw\.s16" } } */
> +
> +
> +
>
> Stray trailing newlines. Similar comments for the other testcases.
>
> Thanks,
> Kyrill
>

-- 
Michael Collison
Linaro Toolchain Working Group
michael.collison@linaro.org

-------------- next part --------------
A non-text attachment was scrubbed...
Name: tcwg-833-aarc32-upstream.patch
Type: text/x-patch
Size: 8791 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20151001/42096c11/attachment.bin>


More information about the Gcc-patches mailing list