This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Fix PR54733 Optimize endian independent load/store
- From: pinskia at gmail dot com
- To: Richard Biener <richard dot guenther at gmail dot com>
- Cc: Thomas Preud'homme <thomas dot preudhomme at arm dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Fri, 16 May 2014 03:56:40 -0700
- Subject: Re: [PATCH] Fix PR54733 Optimize endian independent load/store
- Authentication-results: sourceware.org; auth=none
- References: <006f01cf6b71$1cf10df0$56d329d0$ at arm dot com> <000001cf70ee$9aa2ed90$cfe8c8b0$ at arm dot com> <CAFiYyc1-5KbvVXqiQKu3aVn_X0RKvvtJn4hBtADp5eA3QFEb4A at mail dot gmail dot com>
> On May 16, 2014, at 3:48 AM, Richard Biener <richard.guenther@gmail.com> wrote:
>
> On Fri, May 16, 2014 at 12:07 PM, Thomas Preud'homme
> <thomas.preudhomme@arm.com> wrote:
>> Ping?
>
> Sorry ...
>
>> Best regards,
>>
>> Thomas Preud'homme
>>
>>> -----Original Message-----
>>> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-
>>> owner@gcc.gnu.org] On Behalf Of Thomas Preud'homme
>>> Sent: Friday, May 09, 2014 6:26 PM
>>> To: GCC Patches
>>> Subject: RE: [PATCH] Fix PR54733 Optimize endian independent load/store
>>>
>>> Sorry, took longer than expected as I got distracted by some other patch.
>>> I merged the whole patchset in a single patch as I was told the current setup
>>> is actually more difficult to read.
>>>
>>> Here are the updated ChangeLogs:
>>>
>>> *** gcc/ChangeLog ***
>>>
>>> 2014-05-09 Thomas Preud'homme <thomas.preudhomme@arm.com>
>>>
>>> PR tree-optimization/54733
>>> * expr.c (get_inner_reference): Add a parameter to control whether
>>> a
>>> MEM_REF should be split into base + offset.
>>> * tree.h (get_inner_reference): Default new parameter to false.
>>> * tree-ssa-math-opts.c (nop_stats): New "bswap_stats" structure.
>>> (CMPNOP): Define.
>>> (find_bswap_or_nop_load): New.
>>> (find_bswap_1): Renamed to ...
>>> (find_bswap_or_nop_1): This. Also add support for memory source.
>>> (find_bswap): Renamed to ...
>>> (find_bswap_or_nop): This. Also add support for memory source and
>>> detection of bitwise operations equivalent to load in host endianness.
>>> (execute_optimize_bswap): Likewise. Also move its leading
>>> comment back
>>> in place and split statement transformation into ...
>>> (bswap_replace): This. Add assert when updating bswap_stats.
>>>
>>> *** gcc/testsuite/ChangeLog ***
>>>
>>> 2014-05-09 Thomas Preud'homme <thomas.preudhomme@arm.com>
>>>
>>> PR tree-optimization/54733
>>> * gcc.dg/optimize-bswapdi-3.c: New test to check extension of
>>> bswap
>>> optimization to support memory sources and bitwise operations
>>> equivalent to load in host endianness.
>>> * gcc.dg/optimize-bswaphi-1.c: Likewise.
>>> * gcc.dg/optimize-bswapsi-2.c: Likewise.
>>> * gcc.c-torture/execute/bswap-2.c: Likewise.
>>>
>>> Ok for trunk?
>
> Ok, I now decided otherwise and dislike the new parameter to
> get_inner_reference. Can you please revert that part and just
> deal with a MEM_REF result in your only caller?
>
> And (of course) I found another possible issue. The way you
> compute load_type and use it here:
>
> + /* Perform the load. */
> + load_offset_ptr = build_int_cst (n->alias_set, 0);
> + val_expr = fold_build2 (MEM_REF, load_type, addr_tmp,
> + load_offset_ptr);
>
> makes the load always appear aligned according to the mode of
> load_type. On strict-alignment targets this may cause faults.
>
> So what you have to do is either (simpler)
>
> unsigned int align = get_pointer_alignment (addr_tmp);
> tree al_load_type = load_type;
> if (align < TYPE_ALIGN (load_type))
> al_load_type = build_aligned_type (load_type, align);
> ...
> val_expr = fold_build2 (MEM_REF, al_load_type, addr_tmp,
> load_offset_ptr);
>
> or keep track of the "first" actual load and use
>
> unsigned int align = get_object_alignment (that_first_load);
>
> "first" in the one that corresponds to addr_tmp. From that there
> is a much better chance to derive good alignment values.
>
> Of course on STRICT_ALIGNMENT targets a not aligned load
> will be decomposed again, so eventually doing the transformation
> may no longer be profitable(?).
Not always decomposed. On MIPS, it should using the load/store left/right instructions for unaligned load/stores which is normally better than decomposed load/stores. So having a cost model would be nice.
Thanks,
Andrew
>
> Thanks and sorry again for the delay.
>
> Otherwise the patch looks good to me.
>
> Richard.
>
>>> Best regards,
>>>
>>> Thomas
>>>
>>>> -----Original Message-----
>>>> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-
>>>> owner@gcc.gnu.org] On Behalf Of Thomas Preud'homme
>>>> Sent: Monday, May 05, 2014 7:30 PM
>>>> To: GCC Patches
>>>> Subject: RE: [PATCH][2/3] Fix PR54733 Optimize endian independent
>>>> load/store
>>>>
>>>> I found a way to improve the function find_bswap/find_bswap_or_nop
>>>> and reduce its size. Please hold for the review, I will post an updated
>>>> version as soon as I finish testing.
>>>>
>>>> Best regards,
>>>>
>>>> Thomas Preud'homme
>>
>>