This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH] Fix PR54733 Optimize endian independent load/store


On Fri, May 16, 2014 at 12:56 PM,  <pinskia@gmail.com> wrote:
>
>
>> On May 16, 2014, at 3:48 AM, Richard Biener <richard.guenther@gmail.com> wrote:
>>
>> On Fri, May 16, 2014 at 12:07 PM, Thomas Preud'homme
>> <thomas.preudhomme@arm.com> wrote:
>>> Ping?
>>
>> Sorry ...
>>
>>> Best regards,
>>>
>>> Thomas Preud'homme
>>>
>>>> -----Original Message-----
>>>> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-
>>>> owner@gcc.gnu.org] On Behalf Of Thomas Preud'homme
>>>> Sent: Friday, May 09, 2014 6:26 PM
>>>> To: GCC Patches
>>>> Subject: RE: [PATCH] Fix PR54733 Optimize endian independent load/store
>>>>
>>>> Sorry, took longer than expected as I got distracted by some other patch.
>>>> I merged the whole patchset into a single patch as I was told the current
>>>> setup is actually more difficult to read.
>>>>
>>>> Here are the updated ChangeLogs:
>>>>
>>>> *** gcc/ChangeLog ***
>>>>
>>>> 2014-05-09  Thomas Preud'homme  <thomas.preudhomme@arm.com>
>>>>
>>>>      PR tree-optimization/54733
>>>>      * expr.c (get_inner_reference): Add a parameter to control
>>>>      whether a MEM_REF should be split into base + offset.
>>>>      * tree.h (get_inner_reference): Default new parameter to false.
>>>>      * tree-ssa-math-opts.c (nop_stats): New "bswap_stats" structure.
>>>>      (CMPNOP): Define.
>>>>      (find_bswap_or_nop_load): New.
>>>>      (find_bswap_1): Renamed to ...
>>>>      (find_bswap_or_nop_1): This. Also add support for memory source.
>>>>      (find_bswap): Renamed to ...
>>>>      (find_bswap_or_nop): This. Also add support for memory source and
>>>>      detection of bitwise operations equivalent to load in host endianness.
>>>>      (execute_optimize_bswap): Likewise. Also move its leading
>>>>      comment back in place and split statement transformation into ...
>>>>      (bswap_replace): This. Add assert when updating bswap_stats.
>>>>
>>>> *** gcc/testsuite/ChangeLog ***
>>>>
>>>> 2014-05-09  Thomas Preud'homme  <thomas.preudhomme@arm.com>
>>>>
>>>>      PR tree-optimization/54733
>>>>      * gcc.dg/optimize-bswapdi-3.c: New test to check extension of
>>>>      bswap optimization to support memory sources and bitwise
>>>>      operations equivalent to load in host endianness.
>>>>      * gcc.dg/optimize-bswaphi-1.c: Likewise.
>>>>      * gcc.dg/optimize-bswapsi-2.c: Likewise.
>>>>      * gcc.c-torture/execute/bswap-2.c: Likewise.
>>>>
>>>> Ok for trunk?
>>
>> Ok, I now decided otherwise and dislike the new parameter to
>> get_inner_reference.  Can you please revert that part and just
>> deal with a MEM_REF result in your only caller?
>>
>> And (of course) I found another possible issue.  The way you
>> compute load_type and use it here:
>>
>> +      /* Perform the load.  */
>> +      load_offset_ptr = build_int_cst (n->alias_set, 0);
>> +      val_expr = fold_build2 (MEM_REF, load_type, addr_tmp,
>> +                             load_offset_ptr);
>>
>> makes the load always appear aligned according to the mode of
>> load_type.  On strict-alignment targets this may cause faults.
>>
>> So what you have to do is either (simpler)
>>
>>   unsigned int align = get_pointer_alignment (addr_tmp);
>>   tree al_load_type = load_type;
>>   if (align < TYPE_ALIGN (load_type))
>>     al_load_type = build_aligned_type (load_type, align);
>> ...
>>    val_expr = fold_build2 (MEM_REF, al_load_type, addr_tmp,
>>                                     load_offset_ptr);
>>
>> or keep track of the "first" actual load and use
>>
>>   unsigned int align = get_object_alignment (that_first_load);
>>
>> "first" in the one that corresponds to addr_tmp.  From that there
>> is a much better chance to derive good alignment values.
>>
>> Of course on STRICT_ALIGNMENT targets a not aligned load
>> will be decomposed again, so eventually doing the transformation
>> may no longer be profitable(?).
>
> Not always decomposed. On MIPS, it should use the load/store left/right instructions for unaligned loads/stores, which are normally better than decomposed loads/stores. So having a cost model would be nice.

Agreed, but I am happy with doing that as a followup.  Btw, a very
simple cost model would be to reject the transformation when
SLOW_UNALIGNED_ACCESS (TYPE_MODE (load_type), align) is true.
[of course that may be true on MIPS even for cases where a
"reasonable" fast unaligned variant exists - nearly no target
defines that macro in a too fancy way]

Richard.

> Thanks,
> Andrew
>
>>
>> Thanks and sorry again for the delay.
>>
>> Otherwise the patch looks good to me.
>>
>> Richard.
>>
>>>> Best regards,
>>>>
>>>> Thomas
>>>>
>>>>> -----Original Message-----
>>>>> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-
>>>>> owner@gcc.gnu.org] On Behalf Of Thomas Preud'homme
>>>>> Sent: Monday, May 05, 2014 7:30 PM
>>>>> To: GCC Patches
>>>>> Subject: RE: [PATCH][2/3] Fix PR54733 Optimize endian independent
>>>>> load/store
>>>>>
>>>>> I found a way to improve the function find_bswap/find_bswap_or_nop
>>>>> and reduce its size. Please hold off on the review; I will post an
>>>>> updated version as soon as I finish testing.
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Thomas Preud'homme
>>>
>>>

