This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: [RFC][PATCH] Remove a bad use of SLOW_UNALIGNED_ACCESS

From: Richard Biener <richard dot guenther at gmail dot com>
To: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>
Cc: Jeff Law <law at redhat dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>, nd <nd at arm dot com>
Date: Wed, 2 Nov 2016 14:58:13 +0100
Subject: Re: [RFC][PATCH] Remove a bad use of SLOW_UNALIGNED_ACCESS
Authentication-results: sourceware.org; auth=none
References: <AM5PR0802MB2610405E0020EE80CF9099F383A10@AM5PR0802MB2610.eurprd08.prod.outlook.com> <e8afbaab-03b7-00ba-00e2-1e0c72ede7d2@redhat.com> <AM5PR0802MB26109BD66B2DE30806EDBCA983A10@AM5PR0802MB2610.eurprd08.prod.outlook.com> <CAFiYyc23DA-KCwcRoDVTQNVEGrik5CF4C+LoLeBig-byZP4W4g@mail.gmail.com> <AM5PR0802MB2610FBD39828C4FABB73495583A00@AM5PR0802MB2610.eurprd08.prod.outlook.com>

On Wed, Nov 2, 2016 at 2:43 PM, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
> Richard Biener wrote:
> On Tue, Nov 1, 2016 at 10:39 PM, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
>
>> > If bswap is false no byte swap is needed, so we found a native endian load
>> > and it will always perform the optimization by inserting an unaligned load.
>>
>> Yes, the general agreement is that the expander can do best and thus we
>> should canonicalize accesses to larger ones even for SLOW_UNALIGNED_ACCESS.
>> The expander will generate the canonical best code (hopefully...).
>
> Right, but there are cases where you have to choose between unaligned or aligned
> accesses and you need to know whether the unaligned access is fast.
>
> A good example is memcpy expansion: if you have fast unaligned accesses, you
> should use them to deal with the last few bytes, but if they get expanded, using
> several aligned accesses is much faster than a single unaligned access.

Yes.  That's RTL expansion at which point you of course have to look
at SLOW_UNALIGNED_ACCESS.
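
To make the memcpy trade-off above concrete, here is a rough source-level
sketch in C. It is only an illustration, not what the RTL expander emits:
the function names are hypothetical, and the fixed 8-byte chunk size is an
assumption. The first variant finishes with one overlapping 8-byte access
that is usually unaligned; the second finishes with narrow accesses instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical variant for targets with fast unaligned accesses:
   copy 8-byte chunks, then cover the remaining bytes with a single
   8-byte access that overlaps the previous chunk.  Requires n >= 8
   and non-overlapping buffers.  */
static void copy_overlapping_tail(unsigned char *dst,
                                  const unsigned char *src, size_t n)
{
    assert(n >= 8);
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        uint64_t w;
        memcpy(&w, src + i, 8);
        memcpy(dst + i, &w, 8);
    }
    if (i < n) {
        /* One wide access ending exactly at the last byte.  */
        uint64_t w;
        memcpy(&w, src + n - 8, 8);
        memcpy(dst + n - 8, &w, 8);
    }
}

/* Hypothetical variant for targets where a wide unaligned access would
   be expanded into many instructions: finish byte by byte.  */
static void copy_narrow_tail(unsigned char *dst,
                             const unsigned char *src, size_t n)
{
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        uint64_t w;
        memcpy(&w, src + i, 8);
        memcpy(dst + i, &w, 8);
    }
    for (; i < n; i++)
        dst[i] = src[i];
}
```

Both variants produce the same bytes in dst; which one is cheaper depends on
whether the target handles the overlapping wide access in hardware.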

>> > This apparently works on all targets, and doesn't cause alignment traps or
>> > huge slowdowns via trap emulation claimed by SLOW_UNALIGNED_ACCESS.
>> > So I'm at a loss what these macros are supposed to mean and how I can query
>> > whether a backend supports fast unaligned access for a particular mode.
>> >
>> > What I actually want to write is something like:
>> >
>> >  if (!FAST_UNALIGNED_LOAD (mode, align)) return false;
>> >
>> > And know that it only accepts unaligned accesses that are efficient on the target.
>> > Maybe we need a new hook like this and get rid of the old one?
>>
>> No, we don't need another hook.
>>
>> Note there is another similar user in gimple-fold.c when folding small
>> memcpy/memmove to single load/store pairs (patch posted but not applied
>> by me -- I've asked for strict-align target maintainer feedback but got
>> none).
>
> I didn't find it, do you have a link?

https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00598.html
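
As a source-level picture of folding a small memcpy to a single load/store
pair, the sketch below shows a 4-byte copy written both ways. It is an
illustration only and does not reproduce the linked patch: the type and
function names are hypothetical, and the typedef relies on the GNU C
`aligned` and `may_alias` attributes so that the compiler treats the access
as possibly unaligned.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: a 32-bit integer with alignment 1 that may
   alias any other type (GNU C extension).  */
typedef uint32_t u32_unaligned __attribute__((aligned(1), may_alias));

/* Before: a small fixed-size copy written as a library call.  */
static void copy4_call(void *dst, const void *src)
{
    memcpy(dst, src, 4);
}

/* After: the same copy expressed as one 32-bit load and one 32-bit
   store.  How these are emitted for a strict-alignment target is then
   up to the expander.  */
static void copy4_pair(void *dst, const void *src)
{
    *(u32_unaligned *)dst = *(const u32_unaligned *)src;
}
```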

>> Now - for bswap I'm only 99% sure that unaligned load + bswap is
>> better than piecewise loads plus manual swap.
>
> It depends on whether unaligned loads and bswap are expanded or not. Even if we
> assume the expansion is at least as efficient as doing it explicitly (definitely true
> for modes larger than the native integer size - as we found out in PR77308!),
> if both the unaligned load and bswap are expanded it seems better not to make the
> transformation for modes up to the word size. But there is no way to find out as
> SLOW_UNALIGNED_ACCESS must be true whenever STRICT_ALIGNMENT is true.

The case I was thinking about is a target where the bswap load operates only
on aligned memory and the "regular" register bswap is "faked" by first
spilling to an aligned stack slot and then loading from that.

Maybe a bit far-fetched.
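
For reference, the two shapes being compared (piecewise loads plus manual
swap versus one unaligned load plus bswap) look roughly like this in C. This
is a sketch under assumptions: the function names are hypothetical, and it
relies on the GCC builtin `__builtin_bswap32` and the predefined
`__BYTE_ORDER__` macros.

```c
#include <stdint.h>
#include <string.h>

/* Piecewise form: four byte loads combined with shifts and ORs.  This
   reads a big-endian 32-bit value from any address.  */
static uint32_t load_be32_piecewise(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Wide form: one possibly unaligned 32-bit load, followed by a byte
   swap only when the host is little-endian.  On a big-endian host the
   load alone is already the native-endian case with no swap needed.  */
static uint32_t load_be32_wide(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    v = __builtin_bswap32(v);
#endif
    return v;
}
```

Both functions return the same value for the same input bytes; the open
question in the thread is only which form is cheaper once the target has
expanded the unaligned load and the swap.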

>> But generally I'm always in favor of removing SLOW_UNALIGNED_ACCESS /
>> STRICT_ALIGNMENT checks from the GIMPLE side of the compiler.
>
> I sort of agree because the purpose of these macros is unclear - the documentation
> is insufficient and out of date. I do believe however we need an accurate way to find out
> whether a target supports fast unaligned accesses as that is required to generate good
> target code.

I believe the target macros are solely for RTL expansion and say that it
has to avoid unaligned ops as those would trap.

Richard.

> Wilco

References:
- [RFC][PATCH] Remove a bad use of SLOW_UNALIGNED_ACCESS
  - From: Wilco Dijkstra
- Re: [RFC][PATCH] Remove a bad use of SLOW_UNALIGNED_ACCESS
  - From: Jeff Law
- Re: [RFC][PATCH] Remove a bad use of SLOW_UNALIGNED_ACCESS
  - From: Wilco Dijkstra
- Re: [RFC][PATCH] Remove a bad use of SLOW_UNALIGNED_ACCESS
  - From: Richard Biener
- Re: [RFC][PATCH] Remove a bad use of SLOW_UNALIGNED_ACCESS
  - From: Wilco Dijkstra
