This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

Re: RFC: [ARM] Disable peeling

From: Richard Biener <richard dot guenther at gmail dot com>
To: Christophe Lyon <christophe dot lyon at linaro dot org>
Cc: Richard Earnshaw <rearnsha at arm dot com>, "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
Date: Mon, 10 Dec 2012 19:59:44 +0100
Subject: Re: RFC: [ARM] Disable peeling
References: <CAKdteOZb66r_0t1LLUdToQkJFo8UnX8f671pduuc4i7vOcL6qQ@mail.gmail.com> <50C227C5.4010601@arm.com> <CAFiYyc2LNa=MRAn5S0CZV_=Ds0SAsvqH9w1MOi7of1GFhp=ABQ@mail.gmail.com> <CAKdteObbdmZ+DsNnwsZ3ehEYVO_88tpJS=i8OD44P2Ra82uqMQ@mail.gmail.com>

On Mon, Dec 10, 2012 at 4:42 PM, Christophe Lyon
<christophe.lyon@linaro.org> wrote:
> On 10 December 2012 10:02, Richard Biener <richard.guenther@gmail.com> wrote:
>> On Fri, Dec 7, 2012 at 6:30 PM, Richard Earnshaw <rearnsha@arm.com> wrote:
>>> On 07/12/12 15:13, Christophe Lyon wrote:
>>>>
>>>> Hi,
>>>>
>>>> As ARM supports unaligned vector accesses for almost no penalty, I'd
>>>> like to disable loop peeling on ARM targets.
>>>>
>>>> I have run benchmarks on Cortex-A9 (hard-float) and noticed these
>>>> significant improvements:
>>>> * 1.5% improvement on a popular embedded benchmark (with peaks at +20% and
>>>> +29%)
>>>> * 2.1% on spec2k mesa
>>>> * 9.2% on spec2k eon
>>>> * up to 3.4% on some part of another embedded benchmark
>>>>
>>>> The largest regression I noticed is 1%.
>>>>
>>>> I have attached a preliminary patch to discuss how acceptable it would
>>>> be, and to discuss the needed changes in the testsuite. Indeed, quite
>>>> a few tests now fail because they count the number of "vectorizing an
>>>> unaligned access" and "alignment of access forced using peeling"
>>>> occurrences in the vectorizer traces.
>>>>
>>>> I could add a property to target-supports.exp, which would currently
>>>> be only true on ARM to select whether to rely on peeling or not, and
>>>> update all the affected tests accordingly.
>>>>
>>>> As there are quite a few tests to update, I'd like opinions first.
>>>>
>>>> Thanks,
>>>>
>>>> Christophe.
>>>>
>>>
>>> This feels a bit like a sledge-hammer for a nut that really needs just a
>>> normal hammer.  I guess the crux of the question comes down to: do we know
>>> how many times the loop will be executed?  If the answer is no, then OK we
>>> assume that the execution count will be small and don't peel.  If the answer
>>> is yes (or we know the minimum iteration count), then we should be able to
>>> work out what the saving will be by peeling to reach alignment.
>>>
>>> So I think your hook should pass the known (minimum) iteration count as well
>>> -- with 0 indicating that we don't know what the minimum is.
>>>
>>> Note, it may be that today we can't work out what the minimum will be and
>>> that for now we always pass zero.  But that doesn't mean we shouldn't code
>>> for the time when we can work this out.
>>
>> I agree that this is a sledgehammer.  If aligned/unaligned loads/stores have
>> the same cost then reflect that in the vectorized stmt cost hook.  If that
>
> I am not sure I understand which hook you are referring to.

builtin_vectorization_cost or its more modern set of cost hooks,
init_cost, add_stmt_cost and finish_cost.
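
To illustrate what "reflect that in the cost hook" could mean, here is a
minimal standalone sketch. It is not GCC code: the enum, the cost table and
both function names are hypothetical stand-ins for a target cost hook and for
the check the vectorizer could make before considering peeling for alignment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical statement kinds, loosely modelled on the idea of
   per-statement vectorizer costs.  Not GCC's actual enum.  */
enum stmt_kind
{
  ALIGNED_LOAD,
  UNALIGNED_LOAD,
  ALIGNED_STORE,
  UNALIGNED_STORE,
  N_STMT_KINDS
};

/* Hypothetical per-target cost table, indexed by stmt_kind.  */
struct target_costs
{
  int cost[N_STMT_KINDS];
};

/* Stand-in for a target cost hook: cost of one vector statement.  */
static int
stmt_cost (const struct target_costs *t, enum stmt_kind kind)
{
  assert (kind < N_STMT_KINDS);
  return t->cost[kind];
}

/* Peeling for alignment can only pay off if some unaligned access is
   strictly more expensive than its aligned counterpart.  */
static bool
peeling_for_alignment_can_help (const struct target_costs *t)
{
  return stmt_cost (t, UNALIGNED_LOAD) > stmt_cost (t, ALIGNED_LOAD)
         || stmt_cost (t, UNALIGNED_STORE) > stmt_cost (t, ALIGNED_STORE);
}
```

With a table such as { 1, 1, 1, 1 } (aligned and unaligned accesses cost the
same) the check returns false, so no separate "disable peeling" hook would be
needed in this model.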

> My understanding of vect_enhance_data_refs_alignment() is that it uses
> cost to check whether the target's misaligned stores are more expensive than
> misaligned loads, but at this point it has already decided to perform
> peeling. On simple loops, it has no reason to later decide not to
> perform peeling.

If that is so, that's certainly a reason to improve this heuristic.  Of course
it has to be seen in the context of necessary epilogue peeling or other
reasons for prologue peeling (like peeling for number of iterations).

>> alone does not prevent peeling for alignment to happen then the fix is to
>> not consider doing peeling for alignment if aligned/unaligned costs are the
>> same, not adding a new hook.
>>
> I thought that a new hook could enable target variations on this: if
> the cost is very slightly different, it might be worth peeling or not,
> depending on the peeling amount or the number of iterations as Richard
> Earnshaw mentioned.

The new cost infrastructure can be a way to express this (of course as you
mentioned, peeling is not driven by cost but by necessity decisions).
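
As a rough sketch of a cost-driven decision that also uses the (minimum)
iteration count discussed earlier in the thread, with 0 meaning "unknown":
the function below is purely illustrative. Its name, its parameters and the
simple linear cost model are assumptions, not anything that exists in GCC.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical profitability check for peeling to reach alignment.

   min_iters            known minimum scalar iteration count, 0 = unknown
   peel_iters           scalar iterations peeled to reach alignment
   vf                   vectorization factor (scalar iters per vector iter)
   saving_per_vec_iter  cost saved per vector iteration by using aligned
                        instead of unaligned accesses
   peel_overhead        fixed cost of the peeled prologue  */
static bool
should_peel (unsigned min_iters, unsigned peel_iters, unsigned vf,
             int saving_per_vec_iter, int peel_overhead)
{
  assert (vf > 0);

  /* Unknown trip count: assume the loop is short and do not peel.
     Equal aligned/unaligned costs: nothing to gain.  */
  if (min_iters == 0 || saving_per_vec_iter <= 0)
    return false;

  /* The peeled iterations would consume the whole loop.  */
  if (min_iters <= peel_iters)
    return false;

  /* Vector iterations that benefit from aligned accesses.  */
  unsigned vec_iters = (min_iters - peel_iters) / vf;
  long total_saving = (long) vec_iters * saving_per_vec_iter;

  return total_saving > peel_overhead;
}
```

In this model a target with near-free unaligned accesses passes a saving of 0
and never peels, while a target with a small penalty peels only for loops
known to run long enough to amortize the prologue.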

Richard.

> Thanks for your comments,
>
> Christophe.

