This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: [PATCH] [ARM] Post-indexed addressing for NEON memory access

From: Ramana Radhakrishnan <ramana dot gcc at googlemail dot com>
To: Charles Baylis <charles dot baylis at linaro dot org>
Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, Richard Earnshaw <Richard dot Earnshaw at arm dot com>, Ramana Radhakrishnan <Ramana dot Radhakrishnan at arm dot com>
Date: Wed, 18 Jun 2014 11:06:20 +0100
Subject: Re: [PATCH] [ARM] Post-indexed addressing for NEON memory access
Authentication-results: sourceware.org; auth=none
References: <CADnVucAaG=uAZyxQGvyf5bqrmW8JfhfjCp84uCpVnf+=Tois5w at mail dot gmail dot com> <CAJA7tRbHVGuBJFB0cqXfZeuvJj6WO0BJrCtTOmQqi8tt7ZffPQ at mail dot gmail dot com> <CADnVucC_LXyyY1WrxbzdkmTYo=8Ur1Nvr09auZk3rNM7MnsqSg at mail dot gmail dot com>
Reply-to: ramrad01 at arm dot com

On Tue, Jun 17, 2014 at 4:03 PM, Charles Baylis
<charles.baylis@linaro.org> wrote:
> On 5 June 2014 07:27, Ramana Radhakrishnan <ramana.gcc@googlemail.com> wrote:
>> On Mon, Jun 2, 2014 at 5:47 PM, Charles Baylis
>> <charles.baylis@linaro.org> wrote:
>>> This patch adds support for post-indexed addressing for NEON structure
>>> memory accesses.
>>>
>>> For example VLD1.8 {d0}, [r0], r1
>>>
>>>
>>> Bootstrapped and checked on arm-unknown-gnueabihf using Qemu.
>>>
>>> Ok for trunk?
>>
>> This looks like a reasonable start but this work doesn't look complete
>> to me yet.
>>
>> Can you also look at the performance impact on a range of
>> benchmarks, especially a popular embedded one, to see how this
>> behaves, unless you have already done so?
>
> I ran a popular suite of embedded benchmarks, and there is no impact
> at all on Chromebook (including with the additional attached patch).

Thanks for the due diligence.

>
> The patch was developed to address a performance issue with a new
> version of libvpx which uses intrinsics instead of NEON assembler. The
> patch results in a 3% improvement for VP8 decode.

Good - 3% is not to be sneezed at.

>
>> POST_INC, POST_MODIFY usually have a funny way of biting you with
>> either ivopts or the way in which address costs work. I think there
>> may be further tweaks needed, but for a first step I'd like to know
>> what the performance impact is.
>
>> I would also suggest running this through clyon's neon intrinsics
>> testsuite to see if that catches any issues especially with the large
>> vector modes.

Thanks.

>
> No issues found in clyon's tests.

Please keep an eye out for any regressions.

>
> Your mention of larger vector modes prompted me to check that the
> patch has the desired result with them. In fact, the costs are
> estimated incorrectly which means the post_modify pattern is not used.
> The attached patch fixes that. (used in combination with my original
> patch)
>
>
> 2014-06-15  Charles Baylis  <charles.baylis@linaro.org>
>
>         * config/arm/arm.c (arm_new_rtx_costs): Reduce cost for mem with
>         embedded side effects.

I'm not too thrilled with putting more special cases that are not
table driven in there. Can you file a PR with some testcases that show
this, so that we don't forget, and CC me on it please?


Ramana
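
For readers following the thread: the post-indexed form quoted above
(VLD1.8 {d0}, [r0], r1) loads eight bytes from the address in r0 and
then adds r1 to r0 as part of the same instruction. In intrinsics code
this typically corresponds to a load followed by a pointer bump, for
example vld1_u8(p) followed by p += stride. The sketch below models only
those semantics in portable C. It is not taken from the patch; the names
d_reg, load8_post_index and sum_rows are hypothetical and exist purely
for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 64-bit NEON D register holding eight bytes. */
typedef struct { uint8_t lane[8]; } d_reg;

/* Models "VLD1.8 {d0}, [r0], r1": load eight bytes from *base, then
   advance *base by stride (the post-index step). */
static d_reg load8_post_index(const uint8_t **base, ptrdiff_t stride)
{
    d_reg d;
    assert(base != NULL && *base != NULL);
    memcpy(d.lane, *base, sizeof d.lane);
    *base += stride;            /* address writeback happens after the load */
    return d;
}

/* Typical use: walk the rows of an image, reading eight bytes from each
   row and stepping to the next row by stride. */
static unsigned sum_rows(const uint8_t *src, ptrdiff_t stride, int rows)
{
    unsigned total = 0;
    for (int r = 0; r < rows; r++) {
        d_reg d = load8_post_index(&src, stride);
        for (int i = 0; i < 8; i++)
            total += d.lane[i];
    }
    return total;
}
```

Without a post-indexed pattern available, the compiler would need a
separate add instruction to update the pointer on each iteration, which
is presumably where the VP8 decode improvement mentioned above comes
from.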

Follow-Ups:
- Re: [PATCH] [ARM] Post-indexed addressing for NEON memory access
  - From: Charles Baylis

References:
- [PATCH] [ARM] Post-indexed addressing for NEON memory access
  - From: Charles Baylis
- Re: [PATCH] [ARM] Post-indexed addressing for NEON memory access
  - From: Ramana Radhakrishnan
- Re: [PATCH] [ARM] Post-indexed addressing for NEON memory access
  - From: Charles Baylis
