This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Fix PR54733 Optimize endian independent load/store
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: "Thomas Preud'homme" <thomas dot preudhomme at arm dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 20 May 2014 11:06:17 +0200
- Subject: Re: [PATCH] Fix PR54733 Optimize endian independent load/store
- References: <006f01cf6b71$1cf10df0$56d329d0$ at arm dot com> <000001cf70ee$9aa2ed90$cfe8c8b0$ at arm dot com> <CAFiYyc1-5KbvVXqiQKu3aVn_X0RKvvtJn4hBtADp5eA3QFEb4A at mail dot gmail dot com> <EF3B84D2-BB18-405B-8CE3-3C1F2A792473 at gmail dot com> <CAFiYyc360hKJvypP+qDwWF-7JM8dVj-gsVpnwGFMgNYo=taqMQ at mail dot gmail dot com> <000801cf73d5$be55b530$3b011f90$ at arm dot com>
On Tue, May 20, 2014 at 4:46 AM, Thomas Preud'homme
<thomas.preudhomme@arm.com> wrote:
>> From: Richard Biener [mailto:richard.guenther@gmail.com]
>>
>> Agreed, but I am happy with doing that as a followup. Btw,
>> a very simple one would be to reject unaligned loads when
>> SLOW_UNALIGNED_ACCESS (TYPE_MODE (load_type), align) is true.
>> [of course that may be true on MIPS even for the cases where
>> a "reasonable" fast unaligned variant exists - hardly any target
>> defines that macro in a very fancy way]
>
> Indeed, it's defined to 1 without consideration of the mode or alignment
> on at least ARM, alpha, tilegx, tilepro and all targets with
> STRICT_ALIGNMENT, since that's the default value for the
> SLOW_UNALIGNED_ACCESS macro. Thus mips should be in there too, for instance.
>
> However, I fail to see how the code produced to do an unaligned load
> could be worse than the manual load done in the original bitwise
> expression. It might be worse for load + bswap though. Maybe I could
> skip the optimization based on this macro only for bswap?

The manual code may do three aligned loads (char, short, char) and
combine them, while a single unaligned int load may end up being slower.
Though very probably the RTL expansion machinery for unaligned loads
is far better at emitting an optimal sequence than a programmer is.
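
For illustration only (this is not code from the patch), a minimal C
sketch of the two access styles being compared. The function names
load_le32_bytes and load_le32_word are made up for this example, and the
word version assumes a GCC-compatible compiler that provides
__BYTE_ORDER__ and __builtin_bswap32; without __BYTE_ORDER__ it silently
assumes a little-endian host.

```c
#include <stdint.h>
#include <string.h>

/* Manual, endian-independent little-endian read: four byte loads
   combined with shifts and ORs.  Works for any alignment of p. */
uint32_t load_le32_bytes(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* The same value obtained from one (possibly unaligned) 32-bit load,
   written portably with memcpy.  On a big-endian host the loaded word
   is byte-swapped, i.e. the "load + bswap" case.  If __BYTE_ORDER__ is
   not defined, a little-endian host is assumed. */
uint32_t load_le32_word(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    v = __builtin_bswap32(v);
#endif
    return v;
}
```

Both functions return the same value for any p; whether the second form
is cheaper on a given target is exactly the cost question raised here.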

Anyway, as said before, please consider addressing any cost issues
as a followup - just make sure to properly emit unaligned loads via
the sequence I suggested.

Thanks,
Richard.
> Best regards,
>
> Thomas
>
>