This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Re: [PATCH, rs6000 8/9] enable gimple folding for vec_xl, vec_xst

From: Bill Schmidt <wschmidt at linux dot ibm dot com>
To: Richard Biener <richard dot guenther at gmail dot com>
Cc: will_schmidt at vnet dot ibm dot com, Segher Boessenkool <segher at kernel dot crashing dot org>, "William J. Schmidt" <wschmidt at linux dot vnet dot ibm dot com>, David Edelsohn <dje dot gcc at gmail dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
Date: Fri, 1 Jun 2018 10:44:53 -0500
Subject: Re: [PATCH, rs6000 8/9] enable gimple folding for vec_xl, vec_xst
References: <1527794851.6620.20.camel@brimstone.rchland.ibm.com> <1527796780.15912.32.camel@brimstone.rchland.ibm.com> <CAFiYyc1p5_UwXUpMyP71QR6ch=kocynnAvMG91gPTMDjGpzBng@mail.gmail.com> <1527865886.15912.57.camel@brimstone.rchland.ibm.com> <9AEF10BA-9148-41A2-931A-95B3B12B6312@linux.ibm.com> <00C44272-C08B-4D87-9265-CB90FA8BF79E@gmail.com>

On Jun 1, 2018, at 10:35 AM, Richard Biener <richard.guenther@gmail.com> wrote:
> 
> On June 1, 2018 5:15:58 PM GMT+02:00, Bill Schmidt <wschmidt@linux.ibm.com> wrote:
>> On Jun 1, 2018, at 10:11 AM, Will Schmidt <will_schmidt@vnet.ibm.com>
>> wrote:
>>> 
>>> On Fri, 2018-06-01 at 08:53 +0200, Richard Biener wrote:
>>>> On Thu, May 31, 2018 at 9:59 PM Will Schmidt
>> <will_schmidt@vnet.ibm.com> wrote:
>>>>> 
>>>>> Hi,
>>>>> Add support for gimple folding for unaligned vector loads and
>> stores.
>>>>> testcases posted separately in this thread.
>>>>> 
>>>>> Regtest completed across a variety of systems, P6,P7,P8,P9.
>>>>> 
>>>>> OK for trunk?
>>>>> Thanks,
>>>>> -Will
>>>>> 
>>>>> [gcc]
>>>>> 
>>>>> 2018-05-31 Will Schmidt <will_schmidt@vnet.ibm.com>
>>>>> 
>>>>>       * config/rs6000/rs6000.c: (rs6000_builtin_valid_without_lhs)
>> Add vec_xst
>>>>>       variants to the list.  (rs6000_gimple_fold_builtin) Add
>> support for
>>>>>       folding unaligned vector loads and stores.
>>>>> 
>>>>> diff --git a/gcc/config/rs6000/rs6000.c
>> b/gcc/config/rs6000/rs6000.c
>>>>> index d62abdf..54b7de2 100644
>>>>> --- a/gcc/config/rs6000/rs6000.c
>>>>> +++ b/gcc/config/rs6000/rs6000.c
>>>>> @@ -15360,10 +15360,16 @@ rs6000_builtin_valid_without_lhs (enum
>> rs6000_builtins fn_code)
>>>>>    case ALTIVEC_BUILTIN_STVX_V8HI:
>>>>>    case ALTIVEC_BUILTIN_STVX_V4SI:
>>>>>    case ALTIVEC_BUILTIN_STVX_V4SF:
>>>>>    case ALTIVEC_BUILTIN_STVX_V2DI:
>>>>>    case ALTIVEC_BUILTIN_STVX_V2DF:
>>>>> +    case VSX_BUILTIN_STXVW4X_V16QI:
>>>>> +    case VSX_BUILTIN_STXVW4X_V8HI:
>>>>> +    case VSX_BUILTIN_STXVW4X_V4SF:
>>>>> +    case VSX_BUILTIN_STXVW4X_V4SI:
>>>>> +    case VSX_BUILTIN_STXVD2X_V2DF:
>>>>> +    case VSX_BUILTIN_STXVD2X_V2DI:
>>>>>      return true;
>>>>>    default:
>>>>>      return false;
>>>>>    }
>>>>> }
>>>>> @@ -15869,10 +15875,77 @@ rs6000_gimple_fold_builtin
>> (gimple_stmt_iterator *gsi)
>>>>>       gimple_set_location (g, loc);
>>>>>       gsi_replace (gsi, g, true);
>>>>>       return true;
>>>>>      }
>>>>> 
>>>>> +    /* unaligned Vector loads.  */
>>>>> +    case VSX_BUILTIN_LXVW4X_V16QI:
>>>>> +    case VSX_BUILTIN_LXVW4X_V8HI:
>>>>> +    case VSX_BUILTIN_LXVW4X_V4SF:
>>>>> +    case VSX_BUILTIN_LXVW4X_V4SI:
>>>>> +    case VSX_BUILTIN_LXVD2X_V2DF:
>>>>> +    case VSX_BUILTIN_LXVD2X_V2DI:
>>>>> +      {
>>>>> +        arg0 = gimple_call_arg (stmt, 0);  // offset
>>>>> +        arg1 = gimple_call_arg (stmt, 1);  // address
>>>>> +        lhs = gimple_call_lhs (stmt);
>>>>> +        location_t loc = gimple_location (stmt);
>>>>> +        /* Since arg1 may be cast to a different type, just use
>> ptr_type_node
>>>>> +           here instead of trying to enforce TBAA on pointer
>> types.  */
>>>>> +        tree arg1_type = ptr_type_node;
>>>>> +        tree lhs_type = TREE_TYPE (lhs);
>>>>> +        /* POINTER_PLUS_EXPR wants the offset to be of type
>> 'sizetype'.  Create
>>>>> +           the tree using the value from arg0.  The resulting type
>> will match
>>>>> +           the type of arg1.  */
>>>>> +        gimple_seq stmts = NULL;
>>>>> +        tree temp_offset = gimple_convert (&stmts, loc, sizetype,
>> arg0);
>>>>> +        tree temp_addr = gimple_build (&stmts, loc,
>> POINTER_PLUS_EXPR,
>>>>> +                                      arg1_type, arg1,
>> temp_offset);
>>>>> +        gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
>>>>> +        /* Use the build2 helper to set up the mem_ref.  The
>> MEM_REF could also
>>>>> +           take an offset, but since we've already incorporated
>> the offset
>>>>> +           above, here we just pass in a zero.  */
>>>>> +        gimple *g;
>>>>> +        g = gimple_build_assign (lhs, build2 (MEM_REF, lhs_type,
>> temp_addr,
>>>>> +                                               build_int_cst
>> (arg1_type, 0)));
>>>> 
>>>> So in GIMPLE the type of the MEM_REF specifies the alignment so my
>> question
>>>> is what type does the lhs usually have here?  I'd simply guess V4SF,
>> etc.?  In
>>> 
>>> yes.  (double-checking).  my reference for the intrinsic signatures
>>> shows the lhs is a vector of type.  The rhs can be either *type or
>>> *vector of type. 
>>> 
>>> vector double vec_vsx_ld (int, const vector double *);
>>> vector double vec_vsx_ld (int, const double *);
>>> With similar/same for the assorted other types.
>>> 
>>> These are also on my list as 'unaligned' vector loads.  I'm not
>> certain
>>> if that adds a twist to how I should answer the below.. 
>>> 
>>> Bill?
>> 
>> 'unaligned' means not necessarily aligned on a vector boundary.
>> They are guaranteed to be aligned on an element boundary.
>>> 
>>>> this case you are missing a
>>>> tree ltype = build_aligned_type (lhs_type, desired-alignment);
>>>> 
>>>> and use that ltype for building the MEM_REF.  I suppose in this case
>> the known
>>>> alignment is either BITS_PER_UNIT or element alignment (thus
>>>> TYPE_ALIGN (TREE_TYPE (lhs_type)))?
>>> 
>>> I'd think element alignment.  but no longer certain.  :-)
>> 
>> Yep, element alignment.
> 
> Note the x86 unaligned intrinsics support arbitrary unaligned loads. So that's not available for power? Does the HW implementation require element alignment?

I had to go look this up again...

Actually, the required alignment is 4 bytes regardless of the data type.  I thought
it was 8 bytes for V2DF/V2DI accesses, but that's not correct.  But we don't support
arbitrary alignment at the byte level.
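
For anyone following along, the alignment point can be illustrated at the
source level.  The sketch below is not part of the patch; it uses GCC's generic
vector extensions in plain C, and the names v4sf_elt_aligned and
load_v4sf_elt_aligned are made up for illustration.  It assumes a V4SF access,
where the 4-byte requirement and the element alignment happen to coincide.  The
typedef with reduced alignment plays roughly the role that build_aligned_type
would play for the MEM_REF type in the folding code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Ordinary vector of four floats; naturally 16-byte aligned.  */
typedef float v4sf __attribute__ ((vector_size (16)));

/* Same layout, but the compiler may only assume 4-byte alignment.
   may_alias permits overlaying the type on plain element storage.
   (Hypothetical name, for illustration only.)  */
typedef float v4sf_elt_aligned
  __attribute__ ((vector_size (16), aligned (4), may_alias));

/* Load 16 bytes from BASE + BYTE_OFFSET, mirroring the (offset, address)
   argument order of the builtin.  (Hypothetical helper.)  */
v4sf
load_v4sf_elt_aligned (size_t byte_offset, const float *base)
{
  /* Address arithmetic in bytes, like POINTER_PLUS_EXPR with a sizetype
     offset.  */
  const char *addr = (const char *) base + byte_offset;

  /* The access must be 4-byte aligned; arbitrary byte alignment is not
     supported.  */
  assert (((uintptr_t) addr & 3) == 0);

  /* The pointed-to type carries the reduced alignment, so the compiler
     does not assume a 16-byte boundary for this load.  */
  return *(const v4sf_elt_aligned *) addr;
}
```

Called as load_v4sf_elt_aligned (sizeof (float), buf) on a float buf[8], this
reads buf[1] through buf[4], an address that is element aligned but in general
not vector aligned.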

Thanks!
Bill
> 
> Richard. 
> 
>> Thanks,
>> Bill
>>> 
>>>> Or is the type of the load the element types?
>>> 
>>> 
>>> So, In any case..  I'll build up / modify some tests to look at data
>>> being loaded, and see if I can see alignment issues here.
>>> 
>>> Thanks,
>>> -Will 
>>> 
>>> 
>>> 
>>>> Richard.
>>>> 
>>>>> +        gimple_set_location (g, loc);
>>>>> +        gsi_replace (gsi, g, true);
>>>>> +        return true;
>>>>> +      }
>>>>> +
>>>>> +    /* unaligned Vector stores.  */
>>>>> +    case VSX_BUILTIN_STXVW4X_V16QI:
>>>>> +    case VSX_BUILTIN_STXVW4X_V8HI:
>>>>> +    case VSX_BUILTIN_STXVW4X_V4SF:
>>>>> +    case VSX_BUILTIN_STXVW4X_V4SI:
>>>>> +    case VSX_BUILTIN_STXVD2X_V2DF:
>>>>> +    case VSX_BUILTIN_STXVD2X_V2DI:
>>>>> +      {
>>>>> +        arg0 = gimple_call_arg (stmt, 0); /* Value to be stored. 
>> */
>>>>> +        arg1 = gimple_call_arg (stmt, 1); /* Offset.  */
>>>>> +        tree arg2 = gimple_call_arg (stmt, 2); /* Store-to
>> address.  */
>>>>> +        location_t loc = gimple_location (stmt);
>>>>> +        tree arg0_type = TREE_TYPE (arg0);
>>>>> +        /* Use ptr_type_node (no TBAA) for the arg2_type.  */
>>>>> +        tree arg2_type = ptr_type_node;
>>>>> +        /* POINTER_PLUS_EXPR wants the offset to be of type
>> 'sizetype'.  Create
>>>>> +           the tree using the value from arg1.  The resulting type
>> will match
>>>>> +           the type of arg2.  */
>>>>> +        gimple_seq stmts = NULL;
>>>>> +        tree temp_offset = gimple_convert (&stmts, loc, sizetype,
>> arg1);
>>>>> +        tree temp_addr = gimple_build (&stmts, loc,
>> POINTER_PLUS_EXPR,
>>>>> +                                      arg2_type, arg2,
>> temp_offset);
>>>>> +        /* Mask off any lower bits from the address.  */
>>>>> +        gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
>>>>> +        gimple *g;
>>>>> +        g = gimple_build_assign (build2 (MEM_REF, arg0_type,
>> temp_addr,
>>>>> +                                          build_int_cst
>> (arg2_type, 0)), arg0);
>>>>> +        gimple_set_location (g, loc);
>>>>> +        gsi_replace (gsi, g, true);
>>>>> +        return true;
>>>>> +      }
>>>>> +
>>>>>    /* Vector Fused multiply-add (fma).  */
>>>>>    case ALTIVEC_BUILTIN_VMADDFP:
>>>>>    case VSX_BUILTIN_XVMADDDP:
>>>>>    case ALTIVEC_BUILTIN_VMLADDUHM:
>>>>>      {
