This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Fix PR54733 Optimize endian independent load/store



> On May 16, 2014, at 4:13 AM, Richard Biener <richard.guenther@gmail.com> wrote:
> 
> On Fri, May 16, 2014 at 1:03 PM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Fri, May 16, 2014 at 12:56 PM,  <pinskia@gmail.com> wrote:
>>> 
>>> 
>>>> On May 16, 2014, at 3:48 AM, Richard Biener <richard.guenther@gmail.com> wrote:
>>>> 
>>>> On Fri, May 16, 2014 at 12:07 PM, Thomas Preud'homme
>>>> <thomas.preudhomme@arm.com> wrote:
>>>>> Ping?
>>>> 
>>>> Sorry ...
>>>> 
>>>>> Best regards,
>>>>> 
>>>>> Thomas Preud'homme
>>>>> 
>>>>>> -----Original Message-----
>>>>>> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-
>>>>>> owner@gcc.gnu.org] On Behalf Of Thomas Preud'homme
>>>>>> Sent: Friday, May 09, 2014 6:26 PM
>>>>>> To: GCC Patches
>>>>>> Subject: RE: [PATCH] Fix PR54733 Optimize endian independent load/store
>>>>>> 
>>>>>> Sorry, took longer than expected as I got distracted by some other patch.
>>>>>> I merged the whole patchset in a single patch as I was told the current setup
>>>>>> is actually more difficult to read.
>>>>>> 
>>>>>> Here are the updated ChangeLogs:
>>>>>> 
>>>>>> *** gcc/ChangeLog ***
>>>>>> 
>>>>>> 2014-05-09  Thomas Preud'homme  <thomas.preudhomme@arm.com>
>>>>>> 
>>>>>>     PR tree-optimization/54733
>>>>>>     * expr.c (get_inner_reference): Add a parameter to control whether
>>>>>> a
>>>>>>     MEM_REF should be split into base + offset.
>>>>>>     * tree.h (get_inner_reference): Default new parameter to false.
>>>>>>     * tree-ssa-math-opts.c (nop_stats): New "bswap_stats" structure.
>>>>>>     (CMPNOP): Define.
>>>>>>     (find_bswap_or_nop_load): New.
>>>>>>     (find_bswap_1): Renamed to ...
>>>>>>     (find_bswap_or_nop_1): This. Also add support for memory source.
>>>>>>     (find_bswap): Renamed to ...
>>>>>>     (find_bswap_or_nop): This. Also add support for memory source and
>>>>>>     detection of bitwise operations equivalent to load in host endianness.
>>>>>>     (execute_optimize_bswap): Likewise. Also move its leading
>>>>>> comment back
>>>>>>     in place and split statement transformation into ...
>>>>>>     (bswap_replace): This. Add assert when updating bswap_stats.
>>>>>> 
>>>>>> *** gcc/testsuite/ChangeLog ***
>>>>>> 
>>>>>> 2014-05-09  Thomas Preud'homme  <thomas.preudhomme@arm.com>
>>>>>> 
>>>>>>     PR tree-optimization/54733
>>>>>>     * gcc.dg/optimize-bswapdi-3.c: New test to check extension of
>>>>>> bswap
>>>>>>     optimization to support memory sources and bitwise operations
>>>>>>     equivalent to load in host endianness.
>>>>>>     * gcc.dg/optimize-bswaphi-1.c: Likewise.
>>>>>>     * gcc.dg/optimize-bswapsi-2.c: Likewise.
>>>>>>     * gcc.c-torture/execute/bswap-2.c: Likewise.
>>>>>> 
>>>>>> Ok for trunk?
>>>> 
>>>> Ok, I now decided otherwise and dislike the new parameter to
>>>> get_inner_reference.  Can you please revert that part and just
>>>> deal with a MEM_REF result in your only caller?
>>>> 
>>>> And (of course) I found another possible issue.  The way you
>>>> compute load_type and use it here:
>>>> 
>>>> +      /* Perform the load.  */
>>>> +      load_offset_ptr = build_int_cst (n->alias_set, 0);
>>>> +      val_expr = fold_build2 (MEM_REF, load_type, addr_tmp,
>>>> +                             load_offset_ptr);
>>>> 
>>>> makes the load always appear aligned according to the mode of
>>>> load_type.  On strict-alignment targets this may cause faults.
>>>> 
>>>> So what you have to do is either (simpler)
>>>> 
>>>>  unsigned int align = get_pointer_alignment (addr_tmp);
>>>>  tree al_load_type = load_type;
>>>>  if (align < TYPE_ALIGN (load_type))
>>>>    al_load_type = build_aligned_type (load_type, align);
>>>> ...
>>>>   val_expr = fold_build2 (MEM_REF, al_load_type, addr_tmp,
>>>>                                    load_offset_ptr);
>>>> 
>>>> or keep track of the "first" actual load and use
>>>> 
>>>>  unsigned int align = get_object_alignment (that_first_load);
>>>> 
>>>> "first" in the one that corresponds to addr_tmp.  From that there
>>>> is a much better chance to derive good alignment values.
>>>> 
>>>> Of course on STRICT_ALIGNMENT targets a not aligned load
>>>> will be decomposed again, so eventually doing the transformation
>>>> may no longer be profitable(?).
>>> 
>>> Not always decomposed. On MIPS, it should using the load/store left/right instructions for unaligned load/stores which is normally better than decomposed load/stores. So having a cost model would be nice.
>> 
>> Agreed, but I am happy with doing that as a followup.  Btw,
>> a very simple one would be to reject unaligned
>> SLOW_UNALIGNED_ACCESS (TYPE_MODE (load_type), align).
>> [of course that may be true on MIPS even for the cases where
>> a "reasonable" fast unalgined variant exists - nearly no target
>> defines that macro in a too fancy way]
> 
> Oh, and what happens for
> 
> unsigned foo (unsigned char *x)
> {
>  return x[0] << 24 | x[2] << 8 | x[3];
> }
> 
> ?  We could do an unsigned int load from x and zero byte 3
> with an AND.  Enhancement for a followup, similar to also
> considering vector types for the load (also I'm not sure
> that uint64_type_node always has non-BLKmode for all
> targets).

No we cannot if x[4] is on a different page which is not mapped in, we get a fault. Not something we want. 

Thanks,
Andrew


> 
> Richard.
> 
>> Richard.
>> 
>>> Thanks,
>>> Andrew
>>> 
>>>> 
>>>> Thanks and sorry again for the delay.
>>>> 
>>>> Otherwise the patch looks good to me.
>>>> 
>>>> Richard.
>>>> 
>>>>>> Best regards,
>>>>>> 
>>>>>> Thomas
>>>>>> 
>>>>>>> -----Original Message-----
>>>>>>> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-
>>>>>>> owner@gcc.gnu.org] On Behalf Of Thomas Preud'homme
>>>>>>> Sent: Monday, May 05, 2014 7:30 PM
>>>>>>> To: GCC Patches
>>>>>>> Subject: RE: [PATCH][2/3] Fix PR54733 Optimize endian independent
>>>>>>> load/store
>>>>>>> 
>>>>>>> I found a way to improve the function find_bswap/find_bswap_or_nop
>>>>>>> and reduce its size. Please hold for the review, I will post an updated
>>>>>>> version as soon as I finish testing.
>>>>>>> 
>>>>>>> Best regards,
>>>>>>> 
>>>>>>> Thomas Preud'homme
>>>>> 
>>>>> 


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]